[DRBD-user] Truck replication

Mon Nov 9 23:26:07 CET 2009

Hi All:

I am hitting strange behavior during truck replication (so far it has occurred on one machine out of 10).  I have two machines, node1 and node2, and I am restoring metadata on node2.  Whenever I run the following (as described on the truck replication page), a resynchronization takes place.  Any idea what could be causing this, or any clues on how to debug such issues?

drbdsetup 1 new-current-uuid --clear-bitmap
drbdadm detach res
drbdmeta_cmd=$(drbdadm -d dump-md res)
drbdadm detach res 
${drbdmeta_cmd/dump-md/restore-md} /var/testmeta

dmesg output:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
block drbd1: conn( Unconnected -> WFConnection ) 
block drbd1: Handshake successful: Agreed network protocol version 91
block drbd1: conn( WFConnection -> WFReportParams ) 
block drbd1: Starting asender thread (from drbd1_receiver [24837])
block drbd1: data-integrity-alg: <not-used>
block drbd1: drbd_sync_handshake:
block drbd1: self C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:669554939 flags:0
block drbd1: peer C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:0 flags:0
block drbd1: uuid_compare()=0 by rule 40
block drbd1: No resync, but 669554939 bits in bitmap!
block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> Inconsistent ) 
block drbd1: peer( Secondary -> Primary ) pdsk( Inconsistent -> UpToDate ) 
block drbd1: drbd_sync_handshake:
block drbd1: self C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:669554939 flags:0
block drbd1: peer 40BD0C3BDCF22CB9:C48D0F6F05011D66:0000000000000004:0000000000000000 bits:0 flags:0
block drbd1: uuid_compare()=-1 by rule 50
block drbd1: Becoming sync target due to disk states.
block drbd1: conn( Connected -> WFBitMapT ) 
block drbd1: conn( WFBitMapT -> WFSyncUUID ) 
block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
block drbd1: conn( WFSyncUUID -> SyncTarget ) 
block drbd1: Began resync as SyncTarget (will sync 2678219756 KB [669554939 bits set]).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
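For anyone reading the two drbd_sync_handshake blocks above, here is a minimal sketch of how the self/peer UUID lines differ between the first and the second handshake. It assumes the four colon-separated fields are current:bitmap:history:history, and it only reproduces the two outcomes visible in this log. It is not DRBD's actual uuid_compare(), which has more rules and also masks flag bits. The helper names uuid_field and compare_uuids are made up for this illustration.

```shell
#!/usr/bin/env bash
# Simplified illustration only: mirrors the two outcomes seen in the log above,
# not DRBD's real uuid_compare() implementation.

# Hypothetical helper: print the Nth (1-based) colon-separated UUID field.
uuid_field() {
    printf '%s\n' "$1" | cut -d: -f"$2"
}

# Hypothetical helper: classify a self/peer UUID pair by plain string comparison.
compare_uuids() {
    local self=$1 peer=$2
    local self_cur peer_cur peer_bm
    self_cur=$(uuid_field "$self" 1)   # assumed: field 1 = current UUID
    peer_cur=$(uuid_field "$peer" 1)
    peer_bm=$(uuid_field "$peer" 2)    # assumed: field 2 = bitmap UUID

    if [ "$self_cur" = "$peer_cur" ]; then
        echo "same-current"      # log reports: uuid_compare()=0 by rule 40
    elif [ "$self_cur" = "$peer_bm" ]; then
        echo "peer-is-ahead"     # log reports: uuid_compare()=-1 by rule 50
    else
        echo "unclassified"      # anything else is outside this sketch
    fi
}

# UUID strings copied from the dmesg output above.
self="C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000"
peer1="C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000"
peer2="40BD0C3BDCF22CB9:C48D0F6F05011D66:0000000000000004:0000000000000000"

compare_uuids "$self" "$peer1"   # first handshake
compare_uuids "$self" "$peer2"   # second handshake, after the peer became Primary
```

In the second handshake the peer's current UUID has changed and its bitmap field now holds the value this node still has as current, which is the situation the log labels "rule 50".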
version: 8.3.4 (api:88/proto:86-91)
GIT-hash: 70a645ae080411c87b4482a135847d69dc90a6a2 build by rmake-chroot at localhost.localdomain, 2009-10-27 15:57:44

 1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r----
    ns:0 nr:9735808 dw:9731552 dr:0 al:0 bm:592 lo:135 pe:29068 ua:133 ap:0 ep:1 wo:b oos:2668488204
        [>....................] sync'ed:  0.4% (2605944/2615448)M
        finish: 62:56:09 speed: 11,768 (11,460) K/sec
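
As a sanity check on the numbers above, the following sketch redoes the arithmetic. It assumes each bitmap bit stands for 4 KB, which is what the "will sync 2678219756 KB [669554939 bits set]" line implies. The function name bits_to_kb is just an illustrative helper.

```shell
#!/usr/bin/env bash
# Assumption: one bitmap bit covers 4 KB (implied by the log line above).
bits_to_kb() {
    echo $(( $1 * 4 ))
}

bits=669554939       # "bits:" value from the handshake lines
oos_kb=2668488204    # "oos:" value from /proc/drbd

total_kb=$(bits_to_kb "$bits")
done_kb=$(( total_kb - oos_kb ))

echo "total_kb=$total_kb"   # 2678219756, same as the "will sync" line
echo "done_kb=$done_kb"     # 9731552, same as the dw: counter in this snapshot

# Percentage with one decimal place, computed in awk since bash has no floats.
awk -v d="$done_kb" -v t="$total_kb" 'BEGIN { printf "synced=%.1f%%\n", 100 * d / t }'
```

So the bitmap on node2 is marking essentially the whole device as out of sync, rather than just a few changed blocks.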

From: tech_j at hotmail.com
To: lars.ellenberg at linbit.com; drbd-user at lists.linbit.com
Date: Mon, 2 Nov 2009 19:29:01 +0000
Subject: Re: [DRBD-user] Truck replication

That's correct.  The problem was my perception of how it is supposed to work versus the actual behavior.  Truck-based replication worked perfectly for me.  After the initial UpToDate/UpToDate message, I untarred my directory structure onto the DRBD-mounted device and monitored the dw (disk write) counter, from which I could see disk writes happening.  Before this I was hoping to see an UpToDate/Inconsistent (or Outdated) ds message, but that was just my perception.
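
For reference, a minimal sketch of how the dw counter can be pulled out of /proc/drbd-style text. The helper name drbd_dw is made up for this example, and the sample line is the one from the /proc/drbd output quoted in this thread. It only looks at the first dw: field it finds, so it assumes a single device of interest.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first dw: counter found in
# /proc/drbd-style text read from stdin.
drbd_dw() {
    grep -o 'dw:[0-9]*' | head -n 1 | cut -d: -f2
}

# Sample counters line taken from the /proc/drbd output in this thread.
sample='    ns:0 nr:9735808 dw:9731552 dr:0 al:0 bm:592 lo:135 pe:29068 ua:133 ap:0 ep:1 wo:b oos:2668488204'
printf '%s\n' "$sample" | drbd_dw

# On a live node (assumes /proc/drbd exists), poll every 5 seconds:
# while sleep 5; do drbd_dw < /proc/drbd; done
```

A dw value that keeps growing while files are copied is the sign that writes are reaching the disk, even though the ds field stays UpToDate/UpToDate.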

Thanks a lot for the help.

Regards,
Jay

> Date: Sat, 31 Oct 2009 10:17:22 +0100
> From: lars.ellenberg at linbit.com
> To: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] Truck replication
> 
> On Thu, Oct 29, 2009 at 07:41:16PM +0000, jay b wrote:
> > 
> > Hi All:
> > 
> > I am stuck with truck based replication.  I have the following setup:
> > 
> > node1 (primary):
> > /dev/sda7 [4TB]
> > /dev/drbd1
> > I have also created filesystem ext3 on drbd1
> > 
> > node2:
> > <same configuration as above, but there is NO filesystem>
> > 
> > I am basically trying to avoid the initial sync time (if possible).
> > So, now node1 is primary and if I follow steps written on this page
> > http://www.drbd.org/users-guide/s-using-truck-based-replication.html
> > I can get both servers to display uptodate/uptodate state.
> > 
> > But, when I mount /dev/drbd1 to /mnt/drbd (on node1 which is primary),
> > and copy files to /mnt/drbd, I do not see Uptodate/Inconsistent
> > message (via cat /proc/drbd on node1)
> 
> Why would you expect node2 to become "Inconsistent"
> during normal operation?
> 
> > At this point I would assume that node2 should get some data and try
> > to sync with primary.  Am I missing any steps?
> 
> When you _connect_ the second node,
> it will do some (bitmap based) resync.
> 
> Once connected, and "Connected Uptodate",
> it will do _online replication_.
> 
> If there is a problem, I don't see it.
> Either you or me or DRBD misunderstood something, I guess.
> 
> -- 
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> 
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
> __
> please don't Cc me, but send to list   --   I'm subscribed
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
