Lars Ellenberg wrote:
> On Sun, May 17, 2009 at 05:12:50PM +0300, Eli Dorfman (Voltaire) wrote:
>> Lars Ellenberg wrote:
>>> On Thu, May 14, 2009 at 04:45:44PM +0300, Eli Dorfman (Voltaire) wrote:
>>>> Lars Ellenberg wrote:
>>>>> On Wed, May 13, 2009 at 06:35:43PM +0300, Eli Dorfman (Voltaire) wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Assuming a setup of 2 nodes primary A and secondary B.
>>>>>> After primary node A reboot and B became primary, is there a way to avoid full resync
>>>>>> from B to A?
>>>>> Uhm, well, yes: just do nothing and let DRBD do its job ;)
>>>>> Bitmap based resync is the "normal" resync operation.
>>>>>
>>>> After A was rebooted and B had almost no changes -
>>>> yet when A goes up again it seems that B (drbd) performs full resync.
>>> what makes you think so?
>> That's what I see in /proc/drbd.
>> we have a partition of 20GB and from the link speed and the time till process is completed,
>> it seems that drbd performs full resync.
>>
>>>> Is this correct?
>>> it would not be expected.
>> So what could be the reason for this full resync?
>
> I still doubt the full sync, though.
> care to show logs?
>

After rebooting node A these are the messages on B:

May 18 19:12:23 optimus2 kernel: drbd0: PingAck did not arrive in time.
May 18 19:12:23 optimus2 kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 18 19:12:23 optimus2 kernel: drbd0: asender terminated
May 18 19:12:23 optimus2 kernel: drbd0: Terminating asender thread
May 18 19:12:23 optimus2 kernel: drbd0: short read expecting header on sock: r=-512
May 18 19:12:23 optimus2 kernel: drbd0: Writing meta data super block now.
May 18 19:12:23 optimus2 kernel: drbd0: tl_clear()
May 18 19:12:23 optimus2 kernel: drbd0: Connection closed
May 18 19:12:23 optimus2 kernel: drbd0: conn( NetworkFailure -> Unconnected )
May 18 19:12:23 optimus2 kernel: drbd0: receiver terminated
May 18 19:12:23 optimus2 kernel: drbd0: receiver (re)started
May 18 19:12:23 optimus2 kernel: drbd0: conn( Unconnected -> WFConnection )
May 18 19:13:16 optimus2 ResourceManager[30674]: [30685]: info: Acquiring resource group: optimus2 172.30.3.99/24/eth0/172.30.3.255 drbddisk::ufmdb Filesystem::/dev/drbd0::/opt/ufm/files/::ext3 mysqld ufmd
May 18 19:13:16 optimus2 kernel: drbd0: role( Secondary -> Primary )
May 18 19:13:16 optimus2 kernel: drbd0: Writing meta data super block now.
May 18 19:13:16 optimus2 kernel: drbd0: Creating new current UUID
May 18 19:13:16 optimus2 kernel: drbd0: Writing meta data super block now.
May 18 19:13:16 optimus2 ResourceManager[30674]: [30965]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt/ufm/files/ ext3 start
May 18 19:13:16 optimus2 Filesystem[30978]: [31008]: INFO: Running start for /dev/drbd0 on /opt/ufm/files
May 18 19:13:16 optimus2 kernel: EXT3 FS on drbd0, internal journal
May 18 19:13:56 optimus2 kernel: drbd0: Handshake successful: Agreed network protocol version 88
May 18 19:13:56 optimus2 kernel: drbd0: conn( WFConnection -> WFReportParams )
May 18 19:13:56 optimus2 kernel: drbd0: Starting asender thread (from drbd0_receiver [22112])
May 18 19:13:56 optimus2 kernel: drbd0: data-integrity-alg: <not-used>
May 18 19:13:56 optimus2 kernel: drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
May 18 19:13:56 optimus2 kernel: drbd0: Writing meta data super block now.
May 18 19:13:56 optimus2 kernel: drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
May 18 19:13:56 optimus2 kernel: drbd0: Began resync as SyncSource (will sync 20482176 KB [5120544 bits set]).
May 18 19:13:56 optimus2 kernel: drbd0: Writing meta data super block now.

[root@optimus2 ~]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-06-21 08:48:13
 0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:2244236 nr:482168 dw:483008 dr:2256073 al:18 bm:136 lo:3 pe:5 ua:253 ap:0 oos:18238236
        [=>..................] sync'ed: 11.0% (17810/20002)M
        finish: 0:24:30 speed: 12,320 (11,504) K/sec

It seems as if drbd performs full resync of node A - why?
The link is 100 Mb so drbd is actually using the maximal bandwidth.

Thanks,
Eli
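
For anyone who wants to check the numbers, here is a rough sketch of the arithmetic behind the "full resync" reading of the log. It only uses figures quoted above: the "Began resync" line and the 20002 MB total from the /proc/drbd progress line. The 4 KB covered by each bitmap bit is an assumption, inferred from the log line itself (5120544 x 4 = 20482176), and the function name is made up for illustration. It does not explain why the whole bitmap was set.

```python
import re

# Log line copied from the messages on node B.
LOG_LINE = ("drbd0: Began resync as SyncSource "
            "(will sync 20482176 KB [5120544 bits set]).")

# Assumption: each bit in the bitmap covers 4 KB of the device.
KB_PER_BIT = 4

# Total size taken from the progress line "(17810/20002)M" in /proc/drbd.
DEVICE_MB = 20002


def parse_resync(line):
    """Return (kb_to_sync, bits_set) from a 'Began resync' log line."""
    m = re.search(r"will sync (\d+) KB \[(\d+) bits set\]", line)
    if m is None:
        raise ValueError("not a 'Began resync' line")
    return int(m.group(1)), int(m.group(2))


sync_kb, bits_set = parse_resync(LOG_LINE)

# Fraction of the device that is marked out of sync at resync start.
fraction = (sync_kb / 1024) / DEVICE_MB

print("bits set * %d KB = %d KB" % (KB_PER_BIT, bits_set * KB_PER_BIT))
print("amount to sync   = %d KB" % sync_kb)
print("share of device  = %.1f%%" % (fraction * 100))
```

With these inputs the amount to sync comes out at essentially the whole 20 GB device, which is consistent with every bit in the bitmap being set when the resync started.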