[DRBD-user] Question: how to avoid full resync

Eli Dorfman (Voltaire) dorfman.eli at gmail.com
Mon May 18 16:18:41 CEST 2009


Lars Ellenberg wrote:
> On Sun, May 17, 2009 at 05:12:50PM +0300, Eli Dorfman (Voltaire) wrote:
>> Lars Ellenberg wrote:
>>> On Thu, May 14, 2009 at 04:45:44PM +0300, Eli Dorfman (Voltaire) wrote:
>>>> Lars Ellenberg wrote:
>>>>> On Wed, May 13, 2009 at 06:35:43PM +0300, Eli Dorfman (Voltaire) wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Assume a setup of 2 nodes: primary A and secondary B.
>>>>>> After primary node A reboots and B becomes primary, is there a way to avoid a full resync
>>>>>> from B to A?
>>>>> Uhm, well, yes: just do nothing and let DRBD do its job ;)
>>>>> Bitmap based resync is the "normal" resync operation.
>>>>>
>>>> After A was rebooted, B had almost no changes,
>>>> yet when A comes up again it seems that B (drbd) performs a full resync.
>>> what makes you think so?
>> That's what I see in /proc/drbd.
>> We have a 20GB partition, and judging from the link speed and the time until the process completes,
>> it seems that drbd performs a full resync.
>>
>>>> Is this correct?
>>> it would not be expected.
>> So what could be the reason for this full resync?
> 
> I still doubt the full sync, though.
> care to show logs?
> 

After rebooting node A these are the messages on B:

May 18 19:12:23 optimus2 kernel: drbd0: PingAck did not arrive in time.
May 18 19:12:23 optimus2 kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 18 19:12:23 optimus2 kernel: drbd0: asender terminated
May 18 19:12:23 optimus2 kernel: drbd0: Terminating asender thread
May 18 19:12:23 optimus2 kernel: drbd0: short read expecting header on sock: r=-512
May 18 19:12:23 optimus2 kernel: drbd0: Writing meta data super block now.
May 18 19:12:23 optimus2 kernel: drbd0: tl_clear()
May 18 19:12:23 optimus2 kernel: drbd0: Connection closed
May 18 19:12:23 optimus2 kernel: drbd0: conn( NetworkFailure -> Unconnected )
May 18 19:12:23 optimus2 kernel: drbd0: receiver terminated
May 18 19:12:23 optimus2 kernel: drbd0: receiver (re)started
May 18 19:12:23 optimus2 kernel: drbd0: conn( Unconnected -> WFConnection )
May 18 19:13:16 optimus2 ResourceManager[30674]: [30685]: info: Acquiring resource group: optimus2 172.30.3.99/24/eth0/172.30.3.255 drbddisk::ufmdb Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
May 18 19:13:16 optimus2 kernel: drbd0: role( Secondary -> Primary )
May 18 19:13:16 optimus2 kernel: drbd0: Writing meta data super block now.
May 18 19:13:16 optimus2 kernel: drbd0: Creating new current UUID
May 18 19:13:16 optimus2 kernel: drbd0: Writing meta data super block now.
May 18 19:13:16 optimus2 ResourceManager[30674]: [30965]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start
May 18 19:13:16 optimus2 Filesystem[30978]: [31008]: INFO: Running start for /dev/drbd0 on /opt/ufm/files
May 18 19:13:16 optimus2 kernel: EXT3 FS on drbd0, internal journal
May 18 19:13:56 optimus2 kernel: drbd0: Handshake successful: Agreed network protocol version 88
May 18 19:13:56 optimus2 kernel: drbd0: conn( WFConnection -> WFReportParams )
May 18 19:13:56 optimus2 kernel: drbd0: Starting asender thread (from drbd0_receiver [22112])
May 18 19:13:56 optimus2 kernel: drbd0: data-integrity-alg: <not-used>
May 18 19:13:56 optimus2 kernel: drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
May 18 19:13:56 optimus2 kernel: drbd0: Writing meta data super block now.
May 18 19:13:56 optimus2 kernel: drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
May 18 19:13:56 optimus2 kernel: drbd0: Began resync as SyncSource (will sync 20482176 KB [5120544 bits set]).
May 18 19:13:56 optimus2 kernel: drbd0: Writing meta data super block now.
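
The "Began resync" line above can be cross-checked with a little arithmetic. This is only a minimal sketch, assuming DRBD's bitmap tracks one bit per 4 KiB of device (which is consistent with the two figures in the log line itself); the function name parse_resync_line is made up for illustration and is not part of any DRBD tooling:

```python
import re

# Assumption: one bitmap bit covers 4 KiB of the backing device.
KIB_PER_BIT = 4


def parse_resync_line(line):
    """Extract (kb_to_sync, bits_set) from a DRBD 'Began resync' log line."""
    m = re.search(r"will sync (\d+) KB \[(\d+) bits set\]", line)
    if m is None:
        raise ValueError("not a 'Began resync' line")
    return int(m.group(1)), int(m.group(2))


# Log line copied from the messages on B.
line = ("May 18 19:13:56 optimus2 kernel: drbd0: Began resync as SyncSource "
        "(will sync 20482176 KB [5120544 bits set]).")
kb_to_sync, bits_set = parse_resync_line(line)

# Under the 4 KiB-per-bit assumption the two figures agree with each other.
assert bits_set * KIB_PER_BIT == kb_to_sync

print(f"{kb_to_sync / 1024 / 1024:.2f} GiB marked out of sync")
```

That comes to roughly 19.5 GiB marked out of sync, i.e. essentially the whole 20GB partition rather than the few blocks B changed while A was down.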


[root@optimus2 ~]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn at c5-x8664-build, 2008-06-21 08:48:13
 0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:2244236 nr:482168 dw:483008 dr:2256073 al:18 bm:136 lo:3 pe:5 ua:253 ap:0 oos:18238236
        [=>..................] sync'ed: 11.0% (17810/20002)M
        finish: 0:24:30 speed: 12,320 (11,504) K/sec


It looks as if DRBD is performing a full resync of node A: the log reports 20482176 KB to sync, which is the whole 20GB partition. Why?
The link is 100 Mbit/s, so DRBD is actually using the maximum available bandwidth.
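
To put numbers on that, here is a rough sketch using the figures from the /proc/drbd output above. It assumes that "K/sec" means KiB per second and that the link really is 100 Mbit/s; the variable names are only illustrative:

```python
# Figures copied from the /proc/drbd output and the "Began resync" log line.
oos_kb = 18238236      # oos: KB still out of sync
total_kb = 20482176    # KB that the resync was started with
speed_kb_s = 12320     # current resync speed in K/sec

# Assumption: "K" means KiB (1024 bytes).
speed_mbit_s = speed_kb_s * 1024 * 8 / 1_000_000
synced_pct = 100 * (1 - oos_kb / total_kb)
remaining_min = oos_kb / speed_kb_s / 60

print(f"resync speed : {speed_mbit_s:.1f} Mbit/s")
print(f"synced so far: {synced_pct:.1f} %")
print(f"remaining    : {remaining_min:.1f} min at the current speed")
```

The speed works out to about 100 Mbit/s, the synced percentage matches the 11.0% shown by DRBD, and the remaining time of roughly 25 minutes is in line with the "finish" estimate, so the resync is limited by the link and covers the whole device.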

Thanks,
Eli

