Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi Lars,

thanks for this tip (and sorry for the long answer in advance). With the command "drbdadm -- --overwrite-data-of-peer primary db" I'm able to bring the secondary into the primary role. If I then power on the former primary, I get a split brain (well, as expected, I guess). Now I run "drbdadm -- --discard-my-data connect all" on the secondary and the changes (200 MB) are synced from the primary to the secondary - everything works fine then. Thanks!

The test for a simulated data center burn with node recovery from the old data center was like this:

- Node1 = Primary
- Node2 = Standby
- Disconnect replication link on Node1
- Heartbeat uses dopd to outdate the peer disk on Node2
- Power off Node1 (hard power off and put the network cable back)
- Node2 tries to take over resources (fails, outdated disk)
- drbdadm -- --overwrite-data-of-peer primary db on Node2
- /usr/lib/heartbeat/ResourceManager takegroup drbddisk on Node2
- Write 200 MB of data to the drbd device (file on the mountpoint)
- Power on Node1
- Split brain
- Try drbdadm -- --discard-my-data connect all on Node1
.. drbd syncs the 200 MB of changes ... END

When running the same test with a fresh DRBD device in Node1 after the reboot (= the server burned down, I bring in a replacement machine and restore it from tape), everything syncs as expected (full sync).

The test for a simulated data center burn without node recovery from the old data center was like this:

- Node1 = Primary
- Node2 = Standby
- Disconnect replication link on Node2 (and keep the network pulled)
- Heartbeat uses dopd to outdate the peer disk on Node2
- Power off Node1 (hard power off)
- Node2 tries to take over resources (fails, outdated disk)
- drbdadm -- --overwrite-data-of-peer primary db on Node2
- /usr/lib/heartbeat/ResourceManager takegroup drbddisk on Node2
- Write 200 MB of data to the drbd device (file on the mountpoint)
- Power on Node1 (and keep the network pulled)
- drbdadm down db on Node1
- drbdadm wipe-md db on Node1
- drbdadm create-md db on Node1
- drbdadm up db on Node1
.. drbd syncs ALL the data (full sync) ... END

Here's a short summary of this thread for the list archive:

==== Thread Summary =====

Dopd and failover
-------------------------
An administrator can use dopd to invalidate (outdate) the remote disk via all available heartbeat communication channels. DRBD currently only supports a single IP/interface as the replication interface. To make the replication link redundant, bonding can be used; however, clusters usually use a second, fully independent link (redundant heartbeat) for communication. This link can be used to outdate the disk on the standby side in case the replication link dies, which avoids working with old data. (A configuration sketch follows below.)

If you use dopd for outdating remote disks and you have a node failure while your standby disk is outdated, you have to run "drbdadm -- --overwrite-data-of-peer primary RESOURCE" on the standby side to enable the resource. If your former primary comes back with the old meta data, you will get a split brain situation. To resolve this, the standard split brain recovery can be used (--discard-my-data connect); see the sketch below.
=====================
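For the archive, here is a minimal configuration sketch of such a dopd setup. This is only a hedged illustration (assuming DRBD 8.x together with heartbeat; the resource name "db" and the /usr/lib/heartbeat paths match the tests above, but may differ on other distributions).

In /etc/ha.d/ha.cf, heartbeat starts and authorizes the outdate daemon:

  respawn hacluster /usr/lib/heartbeat/dopd
  apiauth dopd gid=haclient uid=hacluster

In /etc/drbd.conf, the resource asks dopd to outdate the peer when replication breaks:

  resource db {
    disk {
      # on loss of the replication link, call the fence-peer handler
      fencing resource-only;
    }
    handlers {
      # drbd-peer-outdater contacts dopd on the peer over the remaining
      # heartbeat links and marks the peer's disk Outdated, so it can no
      # longer be promoted with stale data
      fence-peer "/usr/lib/heartbeat/drbd-peer-outdater";
    }
    # ... usual net/syncer/on sections ...
  }

And a sketch of the standard split brain recovery mentioned in the summary, again assuming resource "db" and that the former primary (Node1 in the tests) is the node whose changes should be thrown away:

  # on the split brain victim (former primary), if it is not already Secondary:
  drbdadm secondary db
  drbdadm -- --discard-my-data connect db

  # on the other node, only if it dropped to StandAlone:
  drbdadm connect db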
Thanks again,
Robert

Lars Ellenberg wrote:
> On Fri, Aug 15, 2008 at 05:27:02PM +0200, Robert wrote:
>
>> Ok, found a way:
>>
>> debnode2:~# drbdadm down db
>> debnode2:~# drbdadm -- :::::1:::: set-gi db
>> previously
>> 1E65F6C2CB7D5B5C:0000000000000000:E9ACAE2A5D5F54A8:0173BFF26274A71F:1:0:0:0:0:0
>> set GI to
>> 1E65F6C2CB7D5B5C:0000000000000000:E9ACAE2A5D5F54A8:0173BFF26274A71F:1:1:0:0:0:0
>>
>> Write new GI to disk?
>> [need to type 'yes' to confirm] yes
>>
>> debnode2:~# drbdadm up db
>> debnode2:~# /usr/lib/heartbeat/ResourceManager takegroup drbddisk
>>
>> Why not add a simple "drbdadm uptodate" to drbdadm to issue the
>> corresponding commands?
>
> drbdadm -- --overwrite-data-of-peer primary db
>
> is expected to do that for you.
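(Note for the archive: with that, the manual down / set-gi / up workaround quoted above is no longer needed; the takeover on the standby node with the outdated disk is just the sequence already used in the tests, with the resource and heartbeat group names from above:

  drbdadm -- --overwrite-data-of-peer primary db
  /usr/lib/heartbeat/ResourceManager takegroup drbddisk
)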