I'm having a problem getting my secondary node to notice that data has changed on the primary and that it needs to resync. It happens for me on both 0.7 and 0.8. I'm sure it's just something I'm doing wrong, but I'd appreciate a pointer.

Here's what I've done to reproduce the problem:

  1. Bring up both nodes.
  2. Force one into being primary. The secondary node notices this and the two start syncing. This happened when I first set DRBD up.
  3. If I leave things be, the two seem to remain in sync just fine. Catting /proc/drbd shows the numbers changing. Good.
  4. But if I do a /etc/init.d/drbd stop and then a start on the secondary, things never sync again *unless* I force the secondary to be marked inconsistent.

My backing devices are LVM volumes, so in effect I'm running DRBD on top of LVM. My primary server is xen04 and my secondary is xen05.

Here is a snippet from my drbd.conf:

    resource logserver {
      protocol C;
      startup {
        degr-wfc-timeout 120;
      }
      disk {
        on-io-error detach;
      }
      net {}
      syncer {
        rate 10M;
      }
      on xen05 {
        device    /dev/drbd0;
        disk      /dev/mapper/domu-logserver;
        address   192.168.0.60:7788;
        meta-disk /dev/domu/drbdmeta[1];
      }
      on xen04 {
        device    /dev/drbd0;
        disk      /dev/mapper/domu-logserver;
        address   192.168.0.72:7788;
        meta-disk /dev/domu/drbdmeta[1];
      }

My /proc/drbd earlier in the day:

    xen05:/usr/src/drbd-8.0.3# cat /proc/drbd
    version: 8.0.3 (api:86/proto:86)
    SVN Revision: 2881 build by root@xen05, 2007-06-11 11:57:14
     0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
        ns:0 nr:20480000 dw:20480000 dr:0 al:0 bm:1250 lo:0 pe:0 ua:0 ap:0
            resync: used:0/31 hits:1278750 misses:1250 starving:0 dirty:0 changed:1250
            act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

And then, after I've done /etc/init.d/drbd stop followed by /etc/init.d/drbd start on the secondary machine:
    xen05:/usr/src/drbd-8.0.3# cat /proc/drbd
    version: 8.0.3 (api:86/proto:86)
    SVN Revision: 2881 build by root@xen05, 2007-06-11 11:57:14
     0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
        ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
            resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
            act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

This seems like it should be a fairly painless thing to do: take away the secondary, wait a little bit, reattach it, and watch it sync back up.

As a side note, the secondary truly is a secondary. While it is detached, I'm not making any changes to its backing disks. I just wanted to test the ability to remove the secondary machine and re-add it later. I ran into a problem over the weekend that made me want to fail over to the secondary machine, but I found that it was a month out of sync.

If I need to provide anything else, please let me know.

Thanks!
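For anyone comparing the two /proc/drbd dumps above, here is a minimal Python sketch that parses them and lists which fields differ. It assumes the 8.0.3 layout shown in this post (a status line with cs:/st:/ds: followed by a counter line starting with ns:). The helper names parse_proc_drbd and changed_fields are made up for illustration and are not part of DRBD.

```python
import re

# Status line of one device, as printed in the dumps in this post, e.g.
# " 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---"
STATUS_RE = re.compile(
    r"\s*(\d+):\s+cs:(\S+)\s+st:([^/\s]+)/(\S+)\s+ds:([^/\s]+)/(\S+)"
)


def parse_proc_drbd(text):
    """Return {minor: fields} for each device in /proc/drbd-style text.

    Hypothetical helper: only the status line and the 'ns: ...' counter
    line are read; the resync/act_log lines are ignored.
    """
    devices = {}
    current = None
    for line in text.splitlines():
        m = STATUS_RE.match(line)
        if m:
            current = {
                "cs": m.group(2),
                "st_local": m.group(3), "st_peer": m.group(4),
                "ds_local": m.group(5), "ds_peer": m.group(6),
            }
            devices[int(m.group(1))] = current
        elif current is not None and line.lstrip().startswith("ns:"):
            for key, value in re.findall(r"(\w+):(\d+)", line):
                current[key] = int(value)
    return devices


def changed_fields(before, after):
    """Sorted names of fields whose values differ between two snapshots."""
    return sorted(k for k in before if before[k] != after.get(k))


# The two dumps from this post (status and counter lines only).
BEFORE = """\
 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
    ns:0 nr:20480000 dw:20480000 dr:0 al:0 bm:1250 lo:0 pe:0 ua:0 ap:0
"""
AFTER = """\
 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
"""

if __name__ == "__main__":
    before = parse_proc_drbd(BEFORE)[0]
    after = parse_proc_drbd(AFTER)[0]
    print(changed_fields(before, after))  # prints ['bm', 'dw', 'nr']
```

On these two dumps the connection, role, and disk states are identical (Connected, Secondary/Primary, UpToDate/UpToDate). Only the nr, dw, and bm counters differ. The sketch only reports that difference and makes no claim about what a restart is expected to reset.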