[DRBD-user] Skipping initial sync, and full sync after node failure

Thu Oct 22 20:44:45 CEST 2009

Hi Ian,

when creating a new resource which doesn't have any data that you want 
to keep on either node, you can use the following:

# drbdadm -- --clear-bitmap new-current-uuid drbd0

You can see the documentation here and make sure this applies to your 
situation:
http://www.drbd.org/users-guide/re-drbdadm.html
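
As a rough sketch of where that command fits for a brand-new, empty
resource: the usual create-md and up steps run on both nodes first, and
the new-current-uuid call then runs on one node only. The function names
and the DRBD_EXECUTE switch are hypothetical helpers for illustration, not
part of drbdadm, and whether your installed DRBD version accepts
--clear-bitmap is something to check against the documentation above. By
default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRBD_EXECUTE=1 is set.
# DRBD_EXECUTE, run, prepare_node and skip_initial_sync are hypothetical
# names invented for this example.

run() {
    if [ "${DRBD_EXECUTE:-0}" = "1" ]; then
        "$@"                      # really execute the command
    else
        printf '%s\n' "$*"        # dry run: just show it
    fi
}

# Run on BOTH nodes: initialise metadata and bring the resource up.
prepare_node() {
    run drbdadm create-md "$1"
    run drbdadm up "$1"
}

# Run on ONE node only, once the peers are connected and neither side
# holds data you want to keep.
skip_initial_sync() {
    run drbdadm -- --clear-bitmap new-current-uuid "$1"
}

prepare_node drbd0
skip_initial_sync drbd0
```

Afterwards both sides should report UpToDate in /proc/drbd without any
resync having run.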

Then, your second issue should not occur, as both nodes will now have a 
synchronized bitmap. At least that's how I understand it.
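
For context on that second issue, the figures in the quoted log below can
be cross-checked with a little arithmetic. This is only a sanity check on
the numbers as logged, assuming the capacity is counted in 512-byte
sectors and the bitmap uses 64-bit words. Neither assumption is stated in
the log itself.

```shell
#!/bin/sh
# Values copied from the quoted scurry4 kernel log.
capacity_sectors=21462221048   # "drbd_bm_resize called with capacity == ..."
bitmap_bits=2682777631         # "resync bitmap: bits=..."
out_of_sync_bits=2682777631    # "(... bits) marked out-of-sync"

# Assumption: 512-byte sectors, so two sectors make one KB.
size_kb=$((capacity_sectors / 2))

# Assumption: 64-bit bitmap words, rounded up to whole words.
words=$(( (bitmap_bits + 63) / 64 ))

echo "device size:  ${size_kb} KB"
echo "KB per bit:   $((size_kb / bitmap_bits))"
echo "bitmap words: ${words}"

# If every bit in the bitmap is set, the whole device needs resyncing.
if [ "$out_of_sync_bits" = "$bitmap_bits" ]; then
    echo "every bit is set: the whole device is marked out-of-sync"
fi
```

The results agree with the log's own 10731110524 KB and words=41918401,
with each bit covering 4 KB. The out-of-sync count equals the total bit
count, which is why the node asks for a full sync rather than a partial
one.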

Regards,
-- 
Jean-François Chevrette [iWeb]

On 09-10-22 8:57 AM, Ian Marlier wrote:
> Hi, All --
>
> I'm working on getting DRBD up and running for a large storage array --
> around 10TB.  I'm having two issues that I suspect are related, and am
> hoping that someone might be able to help me out with them.
> Specifically, I'm wondering whether these issues are related; and if so,
> whether there is a method that can be used to allow the desired behavior.
>
> First of all, 10TB requires a fair amount of time to sync.  Even if a
> 10Gbps network is used, drbd's limit of 650MB/s means that a full sync
> would take between 4 and 5 hours.  With a 1Gbps network, that time rises
> to roughly 22 hours, since the effective speed is at most 125MB/s.
>
> Because of this, I'm hoping to avoid the initial sync phase.  I'm
> starting with empty disks, and so I don't need the bitmap to be
> synchronized for data preservation or anything like that.  In googling
> around, I found the following command, which does in fact have the
> effect of causing both nodes to report that they are UpToDate:
>      drbdadm -- 6::::1 set-gi resource
>
> So, the first question is whether this is, in fact, the appropriate
> command to use if one wants to avoid the initial sync.  Is there
> another method that's preferred?  Is it simply not possible to skip the
> initial sync any longer?  What I really want is a way to tell drbd to
> sync the bitmap without actually syncing data, since there isn't data
> that I care about.
>
> The second issue that I'm having is that having established both nodes
> as UpToDate using the command above, and having swapped the Primary role
> back and forth between the hosts successfully, if one node fails (or is
> rebooted), it requires a full sync after coming back online.  This
> happens even if the other node was primary at the time that the local
> machine went down, and if no changes have been made to the local node.
> It appears that there is something going on with the size of the bitmap
> changing on the rebooted host, based on the logs, though that doesn't
> really make all that much sense to me:
>      Oct 21 17:16:11 scurry4 kernel: drbd0: No usable activity log found.
>      Oct 21 17:16:11 scurry4 kernel: drbd0: max_segment_size ( = BIO size ) = 32768
>      Oct 21 17:16:11 scurry4 kernel: drbd0: drbd_bm_resize called with capacity == 21462221048
>      Oct 21 17:16:11 scurry4 kernel: drbd0: resync bitmap: bits=2682777631 words=41918401
>      Oct 21 17:16:11 scurry4 kernel: drbd0: size = 10 TB (10731110524 KB)
>      Oct 21 17:16:11 scurry4 kernel: drbd0: Writing the whole bitmap, size changed
>      Oct 21 17:16:11 scurry4 kernel: drbd0: writing of bitmap took 468 jiffies
>      Oct 21 17:16:11 scurry4 kernel: drbd0: 10 TB (2682777631 bits) marked out-of-sync by on disk bit-map.
>      Oct 21 17:16:12 scurry4 kernel: drbd0: reading of bitmap took 289 jiffies
>      Oct 21 17:16:12 scurry4 kernel: drbd0: recounting of set bits took additional 271 jiffies
>      Oct 21 17:16:12 scurry4 kernel: drbd0: 10 TB (2682777631 bits) marked out-of-sync by on disk bit-map.
>      Oct 21 17:16:12 scurry4 kernel: drbd0: disk( Attaching -> Inconsistent )
>      Oct 21 17:16:12 scurry4 kernel: drbd0: Writing meta data super block now.
>
> The remote host, which remained up, shows this in its logs:
>      Oct 21 17:16:48 scurry24 kernel: drbd0: Becoming sync source due to disk states.
>      Oct 21 17:16:48 scurry24 kernel: drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
>      Oct 21 17:16:48 scurry24 kernel: drbd0: Writing meta data super block now.
>
> I'm wondering whether it's possible that this behavior is related to the
> skipped sync documented above, or if it may be related in some way to
> the size of the device being synced.  Has anyone seen this before, or
> can anyone shed some light on that?
>
> Basic info: OS is CentOS x86_64.  Kernel version is
> 2.6.18-128.1.10.el5.  DRBD is version 8.2.6-2.
>
> Thanks for any help,
>
> Ian