Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hello Jeff,

I had the same problem as you on my FC7 cluster: there is a bug in the
metadata conversion; DRBD 8.2.3/8.2.4 corrected it.

Best regards,
Francis
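For anyone who hits this before upgrading, one cautious way to follow that
advice (a sketch only; the resource, device, and backing-disk names are taken
from Jeff's configs quoted below, and the dump file name is arbitrary) is to
keep a copy of the old metadata before letting any tool convert it:

    # with drbd stopped on the node, save the v07 metadata first
    # (dump-md only reads; it never modifies the metadata block)
    drbdmeta /dev/drbd0 v07 /dev/md2 internal dump-md > /root/data1-md-v07.dump

    # with drbd 8.2.3 or later installed, create-md should recognise the v07
    # magic and offer an in-place conversion rather than a destructive create
    drbdadm create-md data1

If the conversion still misbehaves, the dump at least preserves the generation
counts that show-gi reports, so the result can be checked afterwards.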
Jeff Goris wrote:
> Hi,
>
> I have two nodes running DRBD v0.7.25 on Fedora 7 with kernel
> 2.6.22.9-91.fc7. DRBD has two resources: "data1" (/dev/drbd0 on /dev/md2)
> and "data2" (/dev/drbd1 on /dev/md3), both running on software RAID. In
> order to use a 2.6.23 kernel it appears that I need to upgrade to DRBD 8.
> I read the step-by-step guide on Florian's blog for info on how to perform
> a DRBD 0.7 to DRBD 8 upgrade.
>
> The procedure I used to upgrade DRBD from 0.7 to 8 was as follows:
>
> 1. Stop heartbeat on both nodes.
> 2. Ensure that DRBD is secondary on both nodes with consistent data.
> 3. Stop DRBD on both nodes.
> 4. Install DRBD 8.0.8 on the first node (I compiled DRBD myself).
> 5. Upgrade the metadata on the first node. This is where I hit a snag, as
>    drbdmeta did not detect that the existing metadata was v07. Here's the
>    output, plus a few other commands I used that may be helpful to anyone
>    reading this.
>
> # drbdadm create-md data1
> v08 Magic number not found
> v07 Magic number not found
> About to create a new drbd meta data block
> on /dev/md2.
>
> ==> This might destroy existing data! <==
>
> Do you want to proceed?
> [need to type 'yes' to confirm] no
>
> Operation cancelled.
>
> # drbdadm dump-md data1
> v08 Magic number not found
> Command 'drbdmeta /dev/drbd0 v08 /dev/md2 internal dump-md' terminated with
> exit code 255
> drbdadm aborting
>
> # drbdmeta /dev/drbd0 v07 /dev/md2 internal dump-md
> version "v07";
>
> gc {
>     5; 8; 11; 223; 647;
> }
> la-size-sect 39755520;
> # bm-bytes 621184;
> # bits-set 0;
> bm {
>     # at 0kB
>     0x0000000000000000; 0x0000000000000000; 0x0000000000000000; 0x0000000000000000;
>     77644 times 0x0000000000000000;
>     0x0000000000000000;
> }
>
> # drbdmeta /dev/drbd0 v07 /dev/md2 internal show-gi
>
>                                         WantFullSync |
>                                   ConnectedInd |     |
>                                lastState |     |     |
>                       ArbitraryCnt |     |     |     |
>                 ConnectedCnt |     |     |     |     |
>             TimeoutCnt |     |     |     |     |     |
>         HumanCnt |     |     |     |     |     |     |
> Consistent |     |     |     |     |     |     |     |
>    --------+-----+-----+-----+-----+-----+-----+-----+
>        1/c |   8 |  11 | 223 | 647 | 0/s | 1/c | 0/n
>
> last agreed size: 18 GB
> 0 bits set in the bitmap [ 0 KB out of sync ]
>
> According to the v07 metadata dump, it is v07 metadata. However, I've not
> used the dump-md command before, so I don't know what the output should
> look like. I tried to upgrade the metadata on the resource "data2" and this
> failed. On the other node I then tried upgrading to DRBD 8.0.8, and the
> metadata upgrade also failed to detect that the resources' metadata were
> v07. Finally, I installed and booted into the latest Fedora 7 kernel
> (2.6.23.12-52.fc7) and attempted the upgrade, with the same issue. I rolled
> back to kernel 2.6.22.9-91.fc7 and DRBD 0.7.25, and my cluster is running
> fine again.
>
> Does anyone have any ideas as to why the "drbdadm create-md <resource>"
> commands are failing to detect that the existing metadata is v07? Here are
> my DRBD configs from both nodes for both DRBD 0.7.25 and DRBD 8.0.8. I
> think that they are pretty basic.
>
> Thanks,
> Jeff.
>
>
> DRBD 0.7.25 Config
> ==================
>
> resource data1 {
>     protocol C;
>     incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
>     startup {
>         wfc-timeout 0;         # Infinite!
>         degr-wfc-timeout 120;  # 2 minutes.
>     }
>     disk {
>         on-io-error detach;
>     }
>     net {
>         # timeout 60;          # 6 seconds (unit = 0.1 seconds)
>         # connect-int 10;      # 10 seconds (unit = 1 second)
>         # ping-int 10;         # 10 seconds (unit = 1 second)
>         # max-buffers 2048;
>         # max-epoch-size 2048;
>         # ko-count 4;
>         # on-disconnect reconnect;
>     }
>     syncer {
>         rate 20M;              # 20 MByte/s
>         group 1;
>         # al-extents 257;
>     }
>     on sauron.whiterabbit.com.au {
>         device /dev/drbd0;
>         disk /dev/md2;
>         address 172.16.0.10:7788;
>         meta-disk internal;
>     }
>     on shelob.whiterabbit.com.au {
>         device /dev/drbd0;
>         disk /dev/md2;
>         address 172.16.0.11:7788;
>         meta-disk internal;
>     }
> }
>
> resource data2 {
>     protocol C;
>     incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
>     startup {
>         wfc-timeout 0;         # Infinite!
>         degr-wfc-timeout 120;  # 2 minutes.
>     }
>     disk {
>         on-io-error detach;
>     }
>     net {
>         # timeout 60;          # 6 seconds (unit = 0.1 seconds)
>         # connect-int 10;      # 10 seconds (unit = 1 second)
>         # ping-int 10;         # 10 seconds (unit = 1 second)
>         # max-buffers 2048;
>         # max-epoch-size 2048;
>         # ko-count 4;
>         # on-disconnect reconnect;
>     }
>     syncer {
>         rate 20M;              # 20 MByte/s
>         group 2;
>         # al-extents 257;
>     }
>     on sauron.whiterabbit.com.au {
>         device /dev/drbd1;
>         disk /dev/md3;
>         address 172.16.0.10:7789;
>         meta-disk internal;
>     }
>     on shelob.whiterabbit.com.au {
>         device /dev/drbd1;
>         disk /dev/md3;
>         address 172.16.0.11:7789;
>         meta-disk internal;
>     }
> }
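Aside from the 0.7 incon-degr-cmd line giving way to the (here commented-out)
handlers section, the one substantive change in the DRBD 8 configs below is in
the syncer section: DRBD 8 drops 0.7's numeric sync groups in favour of an
explicit resource dependency. In these configs that amounts to:

    group 2;         # drbd 0.7: groups resync in ascending numeric order
    after "data1";   # drbd 8:   data2 resyncs only once data1 has finished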
> DRBD 8.0.8 Config
> =================
>
> resource data1 {
>     protocol C;
>     #handlers {
>     #    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>     #    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>     #    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
>     #    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
>     #}
>     startup {
>         wfc-timeout 0;         # Infinite!
>         degr-wfc-timeout 120;  # 2 minutes.
>     }
>     disk {
>         on-io-error detach;
>     }
>     net {
>         # timeout 60;          # 6 seconds (unit = 0.1 seconds)
>         # connect-int 10;      # 10 seconds (unit = 1 second)
>         # ping-int 10;         # 10 seconds (unit = 1 second)
>         # max-buffers 2048;
>         # max-epoch-size 2048;
>         # ko-count 4;
>         # on-disconnect reconnect;
>     }
>     syncer {
>         rate 20M;              # 20 MByte/s
>         # al-extents 257;
>     }
>     on sauron.whiterabbit.com.au {
>         device /dev/drbd0;
>         disk /dev/md2;
>         address 172.16.0.10:7788;
>         meta-disk internal;
>     }
>     on shelob.whiterabbit.com.au {
>         device /dev/drbd0;
>         disk /dev/md2;
>         address 172.16.0.11:7788;
>         meta-disk internal;
>     }
> }
>
> resource data2 {
>     protocol C;
>     #handlers {
>     #    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>     #    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>     #    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
>     #    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
>     #}
>     startup {
>         wfc-timeout 0;         # Infinite!
>         degr-wfc-timeout 120;  # 2 minutes.
>     }
>     disk {
>         on-io-error detach;
>     }
>     net {
>         # timeout 60;          # 6 seconds (unit = 0.1 seconds)
>         # connect-int 10;      # 10 seconds (unit = 1 second)
>         # ping-int 10;         # 10 seconds (unit = 1 second)
>         # max-buffers 2048;
>         # max-epoch-size 2048;
>         # ko-count 4;
>         # on-disconnect reconnect;
>     }
>     syncer {
>         rate 20M;              # 20 MByte/s
>         after "data1";
>         # al-extents 257;
>     }
>     on sauron.whiterabbit.com.au {
>         device /dev/drbd1;
>         disk /dev/md3;
>         address 172.16.0.10:7789;
>         meta-disk internal;
>     }
>     on shelob.whiterabbit.com.au {
>         device /dev/drbd1;
>         disk /dev/md3;
>         address 172.16.0.11:7789;
>         meta-disk internal;
>     }
> }
>
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
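One last note on step 2 of the procedure quoted above: the metadata conversion
should only ever be attempted on data that is secondary and fully in sync on
both nodes, so it is worth verifying that explicitly first. A minimal check,
reusing the commands from Jeff's transcript (device and backing-disk names as
in the configs above):

    # before stopping drbd 0.7: both resources should show Secondary/Secondary
    # and a consistent disk state
    cat /proc/drbd

    # once drbd is stopped: inspect the v07 generation counts per resource;
    # "0 bits set in the bitmap" means nothing was left out of sync
    drbdmeta /dev/drbd0 v07 /dev/md2 internal show-gi
    drbdmeta /dev/drbd1 v07 /dev/md3 internal show-gi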