[DRBD-user] Upgrade from 0.7.25 to 8.0.8 fails during create-md

Francis SOUYRI francis.souyri at apec.fr
Mon Jan 21 08:29:08 CET 2008

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hello Jeff,

    I had the same problem on my FC7 cluster: there is a bug in the
v07-to-v08 metadata conversion, which was fixed in DRBD 8.2.3/8.2.4.
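
For reference, with a fixed drbdmeta the old metadata is detected
automatically and you are offered a conversion. The exchange looks
roughly like this (output paraphrased from memory, so the exact
wording may differ):

# drbdadm create-md data1
Valid v07 meta-data found, convert to v08?
[need to type 'yes' to confirm] yes

instead of the two "Magic number not found" lines you are seeing.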

Best regards.

Francis
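
P.S. Whichever version you end up using, I would save a copy of the
v07 metadata before converting, for example (the file names here are
just examples):

# drbdmeta /dev/drbd0 v07 /dev/md2 internal dump-md > /root/data1-v07.md
# drbdmeta /dev/drbd1 v07 /dev/md3 internal dump-md > /root/data2-v07.md

Should you ever be forced to create fresh v08 metadata instead of
converting, the data itself is untouched (internal metadata sits at
the end of the backing device), but DRBD loses the sync history, so
you would have to declare one node the good copy and force a full
resync, roughly like this (untested on your setup):

# drbdadm create-md data1                             (on both nodes)
# drbdadm up data1                                    (on both nodes)
# drbdadm -- --overwrite-data-of-peer primary data1   (on the good node only)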

Jeff Goris wrote:
> Hi,
>
> I have two nodes running DRBD v0.7.25 on Fedora 7 with kernel 2.6.22.9-91.fc7.
> DRBD has two resources: "data1" (/dev/drbd0 on /dev/md2) and "data2" (/dev/drbd1
> on /dev/md3), both on software RAID. In order to use a 2.6.23 kernel it
> appears that I need to upgrade to DRBD 8. I read the step-by-step guide from
> Florian's blog for info on how to perform a DRBD 0.7 to DRBD 8 upgrade.
>
> The procedure I used to upgrade DRBD from 0.7 to 8 was as follows:
>
> 1. Stop heartbeat on both nodes.
> 2. Ensure that DRBD is secondary on both nodes with consistent data.
> 3. Stop DRBD on both nodes.
> 4. Install DRBD 8.0.8 on the first node. I compiled DRBD myself.
> 5. Upgrade the metadata on the first node. This is where I hit a snag, as
> drbdmeta did not detect that the existing metadata was v07. Here's the output,
> plus a few other commands I used that may be helpful to anyone reading this.
>
> # drbdadm create-md data1
> v08 Magic number not found
> v07 Magic number not found
> About to create a new drbd meta data block
> on /dev/md2.
>
>  ==> This might destroy existing data! <==
>
> Do you want to proceed?
> [need to type 'yes' to confirm] no
>
> Operation cancelled.
> # drbdadm dump-md data1
> v08 Magic number not found
> Command 'drbdmeta /dev/drbd0 v08 /dev/md2 internal dump-md' terminated with exit
> code 255
> drbdadm aborting
> # drbdmeta /dev/drbd0 v07 /dev/md2 internal dump-md
> version "v07";
>
> gc {
>     5; 8; 11; 223; 647;
> }
> la-size-sect 39755520;
> # bm-bytes 621184;
> # bits-set 0;
> bm {
>    # at 0kB
>     0x0000000000000000; 0x0000000000000000; 0x0000000000000000; 0x0000000000000000;
>     77644 times 0x0000000000000000;
>     0x0000000000000000;
> }
> # drbdmeta /dev/drbd0 v07 /dev/md2 internal show-gi
>
>                                         WantFullSync |
>                                   ConnectedInd |     |
>                                lastState |     |     |
>                       ArbitraryCnt |     |     |     |
>                 ConnectedCnt |     |     |     |     |
>             TimeoutCnt |     |     |     |     |     |
>         HumanCnt |     |     |     |     |     |     |
> Consistent |     |     |     |     |     |     |     |
>    --------+-----+-----+-----+-----+-----+-----+-----+
>        1/c |   8 |  11 | 223 | 647 | 0/s | 1/c | 0/n
>
> last agreed size: 18 GB
> 0 bits set in the bitmap [ 0 KB out of sync ]
>
> According to the v07 metadata dump, the existing metadata is indeed v07.
> However, I've not used the dump-md command before, so I don't know what the
> output should look like. I tried to upgrade the metadata on the resource
> "data2" and that failed too. On the other node I then tried upgrading to DRBD
> 8.0.8, and the metadata upgrade also failed to detect that the resources'
> metadata was v07. Finally, I installed and booted into the latest Fedora 7
> kernel (2.6.23.12-52.fc7) and attempted the upgrade, with the same result. I
> rolled back to kernel 2.6.22.9-91.fc7 and DRBD 0.7.25, and my cluster is
> running fine again.
>
> Does anyone have any idea why the "drbdadm create-md <resource>" commands
> fail to detect that the existing metadata is v07? Here are my DRBD configs
> from both nodes for both DRBD 0.7.25 and DRBD 8.0.8. I think that they are
> pretty basic.
>
> Thanks,
> Jeff.
>
>
> DRBD 0.7.25 Config
> ==================
>
> resource data1 {
>   protocol C;
>   incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
>   startup {
>     wfc-timeout         0;  # Infinite!
>     degr-wfc-timeout  120;  # 2 minutes.
>   }
>   disk {
>     on-io-error   detach;
>   }
>   net {
>     # timeout         60;   #  6 seconds  (unit = 0.1 seconds)
>     # connect-int     10;   # 10 seconds  (unit = 1 second)
>     # ping-int        10;   # 10 seconds  (unit = 1 second)
>     # max-buffers     2048;
>     # max-epoch-size  2048;
>     # ko-count        4;
>     # on-disconnect   reconnect;
>   }
>   syncer {
>     rate 20M;  # 20 MByte/s
>     group 1;
>     # al-extents 257;
>   }
>   on sauron.whiterabbit.com.au {
>     device     /dev/drbd0;
>     disk       /dev/md2;
>     address    172.16.0.10:7788;
>     meta-disk  internal;
>   }
>   on shelob.whiterabbit.com.au {
>     device    /dev/drbd0;
>     disk      /dev/md2;
>     address   172.16.0.11:7788;
>     meta-disk internal;
>   }
> }
>
> resource data2 {
>   protocol C;
>   incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
>   startup {
>     wfc-timeout         0;  # Infinite!
>     degr-wfc-timeout  120;  # 2 minutes.
>   }
>   disk {
>     on-io-error   detach;
>   }
>   net {
>     # timeout         60;   #  6 seconds  (unit = 0.1 seconds)
>     # connect-int     10;   # 10 seconds  (unit = 1 second)
>     # ping-int        10;   # 10 seconds  (unit = 1 second)
>     # max-buffers     2048;
>     # max-epoch-size  2048;
>     # ko-count        4;
>     # on-disconnect   reconnect;
>   }
>   syncer {
>     rate 20M;  # 20 MByte/s
>     group 2;
>     # al-extents 257;
>   }
>   on sauron.whiterabbit.com.au {
>     device     /dev/drbd1;
>     disk       /dev/md3;
>     address    172.16.0.10:7789;
>     meta-disk  internal;
>   }
>   on shelob.whiterabbit.com.au {
>     device    /dev/drbd1;
>     disk      /dev/md3;
>     address   172.16.0.11:7789;
>     meta-disk internal;
>   }
> }
>
>
> DRBD 8.0.8 Config
> =================
>
> resource data1 {
>   protocol C;
>   #handlers {
>   #  pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>   #  pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>   #  local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
>   #  outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
>   #}
>   startup {
>     wfc-timeout         0;  # Infinite!
>     degr-wfc-timeout  120;  # 2 minutes.
>   }
>   disk {
>     on-io-error   detach;
>   }
>   net {
>     # timeout         60;   #  6 seconds  (unit = 0.1 seconds)
>     # connect-int     10;   # 10 seconds  (unit = 1 second)
>     # ping-int        10;   # 10 seconds  (unit = 1 second)
>     # max-buffers     2048;
>     # max-epoch-size  2048;
>     # ko-count        4;
>     # on-disconnect   reconnect;
>   }
>   syncer {
>     rate 20M;  # 20 MByte/s
>     # al-extents 257;
>   }
>   on sauron.whiterabbit.com.au {
>     device     /dev/drbd0;
>     disk       /dev/md2;
>     address    172.16.0.10:7788;
>     meta-disk  internal;
>   }
>   on shelob.whiterabbit.com.au {
>     device    /dev/drbd0;
>     disk      /dev/md2;
>     address   172.16.0.11:7788;
>     meta-disk internal;
>   }
> }
>
> resource data2 {
>   protocol C;
>   #handlers {
>   #  pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>   #  pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>   #  local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
>   #  outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
>   #}
>   startup {
>     wfc-timeout         0;  # Infinite!
>     degr-wfc-timeout  120;  # 2 minutes.
>   }
>   disk {
>     on-io-error   detach;
>   }
>   net {
>     # timeout         60;   #  6 seconds  (unit = 0.1 seconds)
>     # connect-int     10;   # 10 seconds  (unit = 1 second)
>     # ping-int        10;   # 10 seconds  (unit = 1 second)
>     # max-buffers     2048;
>     # max-epoch-size  2048;
>     # ko-count        4;
>     # on-disconnect   reconnect;
>   }
>   syncer {
>     rate 20M;  # 20 MByte/s
>     after "data1";
>     # al-extents 257;
>   }
>   on sauron.whiterabbit.com.au {
>     device     /dev/drbd1;
>     disk       /dev/md3;
>     address    172.16.0.10:7789;
>     meta-disk  internal;
>   }
>   on shelob.whiterabbit.com.au {
>     device    /dev/drbd1;
>     disk      /dev/md3;
>     address   172.16.0.11:7789;
>     meta-disk internal;
>   }
> }
>
>


