[DRBD-user] Problem getting a secondary node to notice that it needs to sync.

Craig Hoffman craig.hoffman at acs-inc.com
Wed Jun 13 20:30:38 CEST 2007


Lars Ellenberg wrote:
> On Tue, Jun 12, 2007 at 01:51:04PM -0400, Craig Hoffman wrote:
>   
>> I'm having a problem getting my secondary node to see that data has
>> changed on the primary and that it needs to sync up.
>> The problem seems to be happening to me on both 0.7 and 0.8.  I'm sure
>> that it's just something that I'm doing, but I'd like a little pointer.
>> Here's what I've done to replicate the problem...
>>
>> Bring up both nodes.  Force one into being a primary.  The secondary
>> node notices this and they start syncing up.  This happened when I first
>> set DRBD up.  If I leave things be, the two seem to remain in sync just
>> fine.  Catting /proc/drbd shows the numbers changing.  Good.  But, if I
>> do a /etc/init.d/drbd stop and then a start on the secondary, things
>> never sync again *unless* I force the secondary as inconsistent.
>>
>> My backend devices are LVM.  So, in effect, I'm running drbd on top of
>> LVM.  My primary server is named xen04 and my secondary is xen05.
>>
>> Here is a snippet from my drbd.conf:
>>
>> resource logserver {
>> protocol C;
>> startup {
>>  degr-wfc-timeout 120;
>> }
>> disk {
>>  on-io-error detach;
>> }
>> net {}
>> syncer {
>>  rate 10M;
>> }
>> on xen05 {
>>  device /dev/drbd0;
>>  disk /dev/mapper/domu-logserver;
>>  address 192.168.0.60:7788;
>>  meta-disk /dev/domu/drbdmeta[1];
>> }
>> on xen04 {
>>  device /dev/drbd0;
>>  disk /dev/mapper/domu-logserver;
>>  address 192.168.0.72:7788;
>>  meta-disk /dev/domu/drbdmeta[1];
>> }
>>     
>
> you are sure you want to have a drbd between two domU ?
>
> you are sure that the "disk = []" in your xen configs is ok,
> no copy'n'paste error? (sorry, I have to ask, that _does_ happen)
>   
I believe that I'm sure.  ;)
The domU on xen05 is not operational.  It is meant to be a cold failover.

The disk line in my xen config file looks like this:
disk = [ 'phy:domu/logserver,hda1,w','phy:domu/logserver_swap,hda2,w' ]

To further elaborate, my xen config works just fine.  And when I first 
set up DRBD, the LVs were mirrored as I expected - so I assume 
that my configs are fine.
>
>   
>> My /proc/drbd earlier in the day:
>>
>> xen05:/usr/src/drbd-8.0.3# cat /proc/drbd
>> version: 8.0.3 (api:86/proto:86)
>> SVN Revision: 2881 build by root at xen05, 2007-06-11 11:57:14
>> 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
>>    ns:0 nr:20480000 dw:20480000 dr:0 al:0 bm:1250 lo:0 pe:0 ua:0 ap:0
>>        resync: used:0/31 hits:1278750 misses:1250 starving:0 dirty:0 changed:1250
>>        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
>>
>> And then after I've done a
>> /etc/init.d/drbd stop
>> /etc/init.d/drbd start
>> on the secondary machine.
>>
>> xen05:/usr/src/drbd-8.0.3# cat /proc/drbd
>> version: 8.0.3 (api:86/proto:86)
>> SVN Revision: 2881 build by root at xen05, 2007-06-11 11:57:14
>> 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
>>    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
>>        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
>>        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
>
> what makes you think there is something wrong with this output?
>   
Well, prior to stopping the DRBD service on xen05, I saw those numbers 
change.  Now all of them show "0".  Take "hits", for example: prior to 
stopping the service, there were 1278750 hits.  After I stopped the 
service and restarted it, my hits stayed at 0.  Also, if I stop the 
service, mount that LV and look at the data, nothing new is in there.  
If I reconnect my DRBD devices and let them sit for a month, they are 
not up to date, even though the status says UpToDate.  If xen04 (my 
primary) were to die, I would not be able to bring xen05 up with current 
data.  The data would only be current as of the last time I manually 
marked the secondary (xen05) as inconsistent.  I could write a script 
that would do this once a day, but that's not much better than a typical 
backup.

While writing this, I just took another snapshot of my /proc/drbd.  It 
looks identical to the one I sent yesterday - and this is a pretty busy 
server.

From xen04 (primary):

 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:0 nr:0 dw:0 dr:20480000 al:0 bm:1250 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:1278750 misses:1250 starving:0 dirty:0 changed:1250
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

From xen05 (secondary):

 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
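
For anyone who wants to check this without eyeballing the output, here is
a rough Python sketch that compares two snapshots of the counter line
(the one starting with "ns:") from /proc/drbd.  The function names are
made up for illustration, and the two sample lines are the xen05 counter
lines quoted earlier in this message.  Since the counters start over at 0
when the service is restarted, zeros right after a restart are not
conclusive on their own; the interesting comparison is between two
snapshots taken some time apart while the primary is busy.

```python
import re

# Counter lines from xen05, copied from the /proc/drbd output in this thread.
BEFORE_RESTART = "ns:0 nr:20480000 dw:20480000 dr:0 al:0 bm:1250 lo:0 pe:0 ua:0 ap:0"
AFTER_RESTART = "ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0"


def parse_counters(line):
    """Turn a 'name:number' counter line into a dict of integers."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


def changed_counters(old, new):
    """Return {name: (old_value, new_value)} for counters that differ."""
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


if __name__ == "__main__":
    before = parse_counters(BEFORE_RESTART)
    after = parse_counters(AFTER_RESTART)
    # Differences between the two snapshots shown above.
    print(changed_counters(before, after))
    # Two identical snapshots taken a day apart give an empty dict,
    # which is what I am seeing on the secondary.
    print(changed_counters(after, after))
```

If two snapshots taken a while apart on the secondary give an empty
result while the primary is being written to, then nothing has been
received in between.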



Thanks!


