[DRBD-user] Problem getting a secondary node to notice that it needs to sync.

Lars Ellenberg lars.ellenberg at linbit.com
Thu Jun 14 12:24:13 CEST 2007


On Wed, Jun 13, 2007 at 02:30:38PM -0400, Craig Hoffman wrote:
> 
> 
> Lars Ellenberg wrote:
> >On Tue, Jun 12, 2007 at 01:51:04PM -0400, Craig Hoffman wrote:
> >>I'm having a problem getting my secondary node to see that data has
> >>changed on the primary and that it needs to sync up.
> >>The problem seems to be happening to me on both 0.7 and 0.8.  I'm sure
> >>that it's just something that I'm doing, but I'd like a little
> >>pointer.
> >>
> >>Here's what I've done to replicate the problem...
> >>
> >>Bring up both nodes.  Force one into being a primary.  The secondary
> >>node notices this and they start syncing up.  This happened when I
> >>first set DRBD up.  If I leave things be, the two seem to remain in sync
> >>just fine.  Catting /proc/drbd shows the numbers changing.  Good.  But, if
> >>I do a /etc/init.d/drbd stop and then a start on the secondary, things
> >>never sync again *unless* I force the secondary as inconsistent.
> >>
> >>My backend devices are LVM.  So, in effect, I'm running drbd on top of
> >>LVM.  My primary server is xen04 and my secondary is xen05.
> >>
> >>Here is a snippet from my drbd.conf:
> >>
> >>resource logserver {
> >>  protocol C;
> >>  startup {
> >>    degr-wfc-timeout 120;
> >>  }
> >>  disk {
> >>    on-io-error detach;
> >>  }
> >>  net {}
> >>  syncer {
> >>    rate 10M;
> >>  }
> >>  on xen05 {
> >>    device /dev/drbd0;
> >>    disk /dev/mapper/domu-logserver;
> >>    address 192.168.0.60:7788;
> >>    meta-disk /dev/domu/drbdmeta[1];
> >>  }
> >>  on xen04 {
> >>    device /dev/drbd0;
> >>    disk /dev/mapper/domu-logserver;
> >>    address 192.168.0.72:7788;
> >>    meta-disk /dev/domu/drbdmeta[1];
> >>  }
> >>}
> >
> >you are sure you want to have a drbd between two domU ?
> >
> >you are sure that the "disk = []" in your xen configs is ok,
> >no copy'n'paste error? (sorry, I have to ask, that _does_ happen)
> >  
> I believe that I'm sure.  ;)
> The domU on xen05 is not operational.  It is meant to be a cold failover.
> 
> The disk line in my xen config file looks like this:
> disk = [ 'phy:domu/logserver,hda1,w','phy:domu/logserver_swap,hda2,w' ]

soo...
why does that not read phy:drbd0, then?
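
For illustration only: a disk line that sends the domU's writes through DRBD
might look like the sketch below. It assumes the DRBD device is /dev/drbd0, as
in the drbd.conf quoted above, and leaves the swap volume as it was; whether
swap should be replicated as well is a separate decision. Xen domU config files
use Python syntax, and phy: paths are taken relative to /dev.

```python
# Hypothetical domU config fragment, not taken from the original setup.
# The first entry points at the DRBD device instead of the backing LV,
# so every write from the guest goes through DRBD and gets replicated.
disk = [ 'phy:drbd0,hda1,w', 'phy:domu/logserver_swap,hda2,w' ]
```

With a line like this the guest can only be started on the node where the
resource is currently Primary.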

> To further elaborate, my xen config works just fine.  And when I first
> set up drbd, the LVs were mirrored as I expected them to be - so I assume
> that my configs are fine.
> >
> >>My /proc/drbd earlier in the day:
> >>
> >>xen05:/usr/src/drbd-8.0.3# cat /proc/drbd
> >>version: 8.0.3 (api:86/proto:86)
> >>SVN Revision: 2881 build by root@xen05, 2007-06-11 11:57:14
> >>0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
> >>   ns:0 nr:20480000 dw:20480000 dr:0 al:0 bm:1250 lo:0 pe:0 ua:0 ap:0
> >>       resync: used:0/31 hits:1278750 misses:1250 starving:0 dirty:0 changed:1250
> >>       act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
> >>
> >>And then after I've done a
> >>/etc/init.d/drbd stop
> >>/etc/init.d/drbd start
> >>on the secondary machine.
> >>
> >>xen05:/usr/src/drbd-8.0.3# cat /proc/drbd
> >>version: 8.0.3 (api:86/proto:86)
> >>SVN Revision: 2881 build by root@xen05, 2007-06-11 11:57:14
> >>0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
> >>   ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
> >>       resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
> >>       act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
> >
> >what makes you think there is something wrong with this output?
> >  
> Well, prior to stopping the DRBD service on xen05, I saw those numbers
> change.  Now all of them show "0".  Take "hits", for example: prior to
> stopping the service, there were 1278750 hits.  After I stopped the
> service and restarted it, my hits stayed at 0.

forget about the "hits" and that stuff...
the interesting counters are 
ns:0 nr:0 dw:0 dr:0 al:0 bm:0
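
A minimal sketch of how one might watch those counters, assuming you copy the
counter line from /proc/drbd on the secondary twice while the primary is
writing. The helper names parse_counters and counter_delta are made up for this
example. ns/nr are network send/receive, dw/dr are disk write/read, al and bm
count activity-log and bitmap updates. The counters start from zero when drbd
is restarted, so zeros right after a restart are expected; what matters is
whether nr and dw grow afterwards.

```python
import re

# The six counters named above.
INTERESTING = ("ns", "nr", "dw", "dr", "al", "bm")

def parse_counters(line):
    """Return {name: int} for every two-letter 'xx:number' pair in line."""
    return {k: int(v) for k, v in re.findall(r"\b([a-z]{2}):(\d+)", line)}

def counter_delta(before, after, names=INTERESTING):
    """Return how much each named counter grew between two samples."""
    return {n: after[n] - before[n] for n in names}

if __name__ == "__main__":
    # Sample lines copied from the /proc/drbd output quoted in this thread.
    first = parse_counters("ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0")
    second = parse_counters(
        "ns:0 nr:20480000 dw:20480000 dr:0 al:0 bm:1250 lo:0 pe:0 ua:0 ap:0"
    )
    print(counter_delta(first, second))
```

If nr and dw on the secondary stay flat while the primary is known to be
writing, those writes are most likely not passing through the drbd device.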

> Also, if I stop the
> service, mount that LV and look at the data, nothing new is in there.
> If I reconnect my DRBD devices and let them sit for a month, even
> though it says that they're UpToDate, they're not.  If xen04 (my
> primary) were to die, I would not be able to bring xen05 up with current
> data.  The data would only be current as of the last time I manually
> marked the secondary (xen05) as inconsistent.  I could write a script
> that would do this once a day, but that's not much better than a typical
> backup.

I really don't know what is wrong with your config, but that sounds
really broken, and I doubt it is drbd.

maybe you bypass drbd with your writes?
the disk line in your xen config points in that direction, too.
I have no idea why xen would allow that, but apparently it does....


-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


