Hello,

Short story:
Many hours after a hardware crash that was handled correctly, DRBD over LVM no longer uses the correct device size (629145600 bytes instead of 6291456000, or 614400KB instead of 6144000KB). The device size reported by LVM seems correct.

Long story:
A device size problem occurred yesterday on a test setup:

  /home/postgres -> /dev/drbd1   -> /dev/bases/bases  -> /dev/sda2
  (reiserfs)        (DRBD 0.7.5)    (LVM from 2.4.27)    actual disk

The actual partition is 17.20GB, of which 5.86GB are allocated by LVM and available to DRBD.
Everything was working cleanly with this setup until yesterday.

I first added a second disk (/dev/sdb) to the hot-plug SCSI bus, then tested it (an old disk that had errors). The errors on SDB led to a complete halt of SDA, the replicated disk (I can't make it respond to SCSI commands any more: it seems totally dead).
/dev/sda[12] were replicated by DRBD 0.7.5 on kernel 2.4.27 (Debian).

DRBD placed the device in the ClientWithoutDisk state when SDA crashed (this mode was new to me, so I left it that way for a few hours and searched for information about this state). I then switched all services to the peer server and disconnected DRBD to install a new disk. OK for this part.

Many hours later, Postgres (which resides on DRBD) tried to access a sector beyond the end of the device:

  "ERROR: cannot read block 142 of tbl_carac_cle_doc_code_classif_: Input/output error"

I immediately stopped Postgres and investigated.
This happened "many hours later", but I don't know whether that is because Postgres was not really used until then, or because the actual cause only appeared many hours after the crash of the peer's disk.

It appears that DRBD only sees 629145600 of the 6291456000 bytes on the LVM device (exactly one tenth), and refuses to extend to the whole space.

I upgraded DRBD from 0.7.5 to 0.7.10 before reporting this problem, just to be sure it was not a bug in 0.7.5. LVM is still the same: the one in the stock 2.4.27 kernel.
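As a sanity check on the figures quoted above, the unit conversions can be redone in a few lines of Python. This is only a sketch: the variable and function names are made up for illustration, and it assumes that "KB", "MB" and "GB" in the tool output mean powers of 1024.

```python
# Sizes quoted in the report, in bytes.
lvm_bytes = 6291456000   # size of the LVM device /dev/bases/bases
drbd_bytes = 629145600   # size DRBD actually uses for /dev/drbd1

KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def describe(size_bytes):
    """Convert a byte count to the units used in the report (assumed binary)."""
    return {
        "KB": size_bytes // KIB,
        "MB": size_bytes / MIB,
        "GB": round(size_bytes / GIB, 2),
    }


lvm = describe(lvm_bytes)
drbd = describe(drbd_bytes)
ratio = lvm_bytes / drbd_bytes

print("LVM :", lvm)
print("DRBD:", drbd)
print("ratio:", ratio)
```

Under that assumption the numbers are consistent with each other: the LVM device comes out at 6144000 KB (5.86 GB), the DRBD device at 614400 KB (600 MB), and the ratio between them is exactly 10.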
This is the result of a "dd if=the_LVM_device_then_the_DRBD_device", showing what a standard command reads as the actual device size:

  6291456000 Feb  3 20:10 dev_bases_bases.dd
   629145600 Feb  3 16:01 dev_drbd1.dd

LVM says the device is still 6GB:

  [root@malauzat:root]# lvdisplay /dev/bases/bases
  --- Logical volume ---
  LV Name                /dev/bases/bases
  VG Name                bases
  LV Write Access        read/write
  LV Status              available
  LV #                   1
  # open                 2
  LV Size                5.86 GB
  Current LE             1500
  Allocated LE           1500
  Allocation             next free
  Read ahead sectors     1024
  Block device           58:1

When I set up DRBD, I get nothing on stdout, but lines in syslog tell me that the size of the device is one tenth of its actual size:

  drbd1: resync bitmap: bits=153600 words=4800
  drbd1: size = 600 MB (614400 KB)
  drbd1: 600 MB marked out-of-sync by on disk bit-map.
  drbd1: Found 6 transactions (324 active extents) in activity log.
  drbd1: drbdsetup [8051]: cstate Unconfigured --> StandAlone
  drbd1: drbdsetup [8054]: cstate StandAlone --> Unconnected
  drbd1: drbd1_receiver [8055]: cstate Unconnected --> WFConnection

When I mount the FS over DRBD, ReiserFS correctly complains about device size vs. FS size, because the FS was created on, and worked with, a 6GB device:

  Feb  4 14:43:09 malauzat kernel: reiserfs: found format "3.6" with standard journal
  Feb  4 14:43:09 malauzat kernel: Filesystem on 93:01 cannot be mounted because it is bigger than the device
  Feb  4 14:43:09 malauzat kernel: You may need to run fsck or increase size of your LVM partition
  Feb  4 14:43:09 malauzat kernel: Or may be you forgot to reboot after fdisk when it told you to

When I try to force the size of the DRBD device, it refuses:

  [root@malauzat:root]# /sbin/drbdsetup /dev/drbd1 disk /dev/bases/bases /dev/hda10 1 --on-io-error=detach -d 6144000
  drbd1: Requested disk size is too big (6144000 > 614400)
  drbd1: size = 600 MB (614400 KB)
  drbd1: 600 MB marked out-of-sync by on disk bit-map.
  drbd1: Found 6 transactions (324 active extents) in activity log.
  drbd1: drbdsetup [8788]: cstate Unconfigured --> StandAlone

Thanks for any advice, clues, etc.

-- 
Nicolas Huillard
Directeur Technique
GHS Solutions Interactives
38, rue du Texel - 75014 PARIS - FRANCE
Tél. 01 43 21 16 66 - Fax 01 56 54 02 18
E-mail : nhuillard at ghs.fr - URL : http://www.ghs.fr