[DRBD-user] RE: How to do partial re-sync -> Sorry for not being detailed enough

Vic Berdin vic at digi.com.ph
Thu Aug 18 15:23:38 CEST 2005


Hi Lars,

> ------------------------------
> I am sure drbd is fine with any library and bash compatible shell.
> I rather suspect you doing something "stupid" (sorry for directness).

1. First off, my environment does not have "/bin/bash" (and I do not plan to
use "bash"). Instead, all it has is /bin/sh -> /bin/busybox. Thus, I changed
all calls to /bin/bash into calls to /bin/sh.

2. My ip -> /bin/busybox doesn't support route_module, so I get "No
route from me (192.168.2.2) to peer (192.168.2.1)" messages when I run
/etc/rc.d/init.d/drbd start (from scripts/drbd; also tweaked to work on my
environment). So I made some tweaks to prefer ifconfig -> /bin/busybox
and work around the absence of `ip route`.

Note: The drbd.o module loads and /proc/drbd is available even if I do not
do (2). I just thought I had to do it to be sure, and to be free of these
"No route..." errors.
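A minimal sketch of the kind of change described in (1), assuming a script
only needs its interpreter line switched. The function name `to_sh_shebang`
is made up for illustration, and any bash-only syntax inside the script body
would still have to be fixed by hand.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a "#!/bin/bash" interpreter line to "#!/bin/sh".
# Reads a script on stdin and writes the result to stdout.
# Only line 1 is touched; bash-only syntax in the body is NOT converted.
to_sh_shebang() {
    sed '1s|^#! */bin/bash|#!/bin/sh|'
}

# Example (commented out so that nothing is modified by accident):
# to_sh_shebang < scripts/drbd > /tmp/drbd.sh
```

Writing to a separate output file keeps the original script intact for
comparison.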

> if you talk "bash" to me, that is,
> starting from no drbd module loaded,
> step by step the commands that you issue,
> indicating on which node, and your expectations,
> including /proc/drbd and/or dmesg | tail | grep -i drbd
> (if it changed from the step before),
> then probably the list or I can point out where the actual thinko is.

After doing (1) and (2) above, drbd starts cleanly (w/o errors) on my
environment, and I now have the following `cat /proc/drbd` results on the
Secondary node:

Results after a previous (successful) full-sync:
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:2048256 dw:2048256 dr:0 al:0 bm:252 lo:0 pe:0 ua:0 ap:0

On-going results during a full-sync (invoked by `drbdadm invalidate all`):
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:2350500 dw:2350500 dr:0 al:0 bm:396 lo:64 pe:768 ua:64 ap:0
	[==>.................] sync'ed: 15.0% (1746268/2048256)K
	finish: 0:13:13 speed: 2,148 (1,908) K/sec

...and still on-going...
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:3036368 dw:3036368 dr:0 al:0 bm:438 lo:93 pe:768 ua:93 ap:0
	[=========>..........] sync'ed: 48.6% (1060516/2048256)K
	finish: 0:07:57 speed: 2,176 (1,916) K/sec

...almost...
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:3928032 dw:3928032 dr:0 al:0 bm:492 lo:84 pe:768 ua:84 ap:0
	[==================>.] sync'ed: 91.9% (168816/2048256)K
	finish: 0:01:06 speed: 2,412 (1,924) K/sec
	
...and finally:
------------------------------------------------
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:4096512 dw:4096512 dr:0 al:0 bm:504 lo:0 pe:0 ua:0 ap:0
------------------------------------------------

From the above, and after mounting and inspecting the target device on the
Secondary, I am sure that full-sync did its job.
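The counters agree with this: between the snapshot taken before the
invalidate and the final one, nr grew by the same figure that the progress
lines show as the total (2048256 K). A small sketch of that arithmetic
follows; `drbd_counter` is a hypothetical helper name, and the two sample
lines are copied from the snapshots above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "name:value" counter found in
# /proc/drbd-style text on stdin (first match only).
drbd_counter() {
    tr ' ' '\n' | sed -n "s/^$1:\([0-9][0-9]*\)\$/\1/p" | head -n 1
}

# Counter lines copied from the first and the last snapshot.
before='    ns:0 nr:2048256 dw:2048256 dr:0 al:0 bm:252 lo:0 pe:0 ua:0 ap:0'
after='    ns:0 nr:4096512 dw:4096512 dr:0 al:0 bm:504 lo:0 pe:0 ua:0 ap:0'

nr_before=$(printf '%s\n' "$before" | drbd_counter nr)
nr_after=$(printf '%s\n' "$after" | drbd_counter nr)

# prints: received during resync: 2048256 K
echo "received during resync: $((nr_after - nr_before)) K"
```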

Next, I unmounted the target device on the Secondary and re-enabled DRBD on
it, once more as Secondary, while the other node remained Primary. I then
dumped some data onto the Primary's source device (simple `cp`+`sync`
commands, just to add data to it and make sure it is not only in the buffer).

While running the above `cp`+`sync` commands, I constantly checked the
hardware LEDs of both nodes. Only one node, the Primary, shows activity.
Lots of data is continuously being created on the Primary, yet there is
still no activity on the Secondary's HDD LED.

Meanwhile, I was also running a series of
"cat /proc/drbd >>/tmp/copying.res;" and
"dmesg|tail|grep -i drbd>>/tmp/copying.res" on the Secondary. After the HDD
LED on the Primary stopped blinking, I opened the file, and here are the
results:
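The sampling described above can be sketched as one small function. The
name `snapshot`, the optional second argument and the 5-second interval in
the commented loop are assumptions made for illustration; the two commands
inside are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: append one sample of DRBD state to a log file.
#   $1 = log file to append to
#   $2 = status file to read (defaults to /proc/drbd)
snapshot() {
    log=$1
    status=${2:-/proc/drbd}
    cat "$status" >> "$log"
    # grep exits non-zero when nothing matches; that is not an error here.
    dmesg 2>/dev/null | tail | grep -i drbd >> "$log" || true
    echo >> "$log"    # blank line between samples
}

# Example: sample repeatedly while the copy runs on the Primary
# (stop with Ctrl-C).
# while true; do snapshot /tmp/copying.res; sleep 5; done
```

Taking the status file as an argument makes it possible to try the function
against a saved copy of /proc/drbd on a machine without the module loaded.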

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

[The same snapshot was captured, unchanged, five more times.]

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: I am(S): 1:00000002:00000004:00000019:00000002:01
drbd0: Peer(P): 1:00000002:00000004:00000019:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:WFConnection st:Secondary/Unknown ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: PingAck did not arrive in time.
drbd0: drbd0_asender [5715]: cstate Connected --> NetworkFailure
drbd0: asender terminated
drbd0: drbd0_receiver [5708]: cstate NetworkFailure --> BrokenPipe
drbd0: short read expecting header on sock: r=-512
drbd0: worker terminated
drbd0: drbd0_receiver [5708]: cstate BrokenPipe --> Unconnected
drbd0: Connection lost.
drbd0: drbd0_receiver [5708]: cstate Unconnected --> WFConnection

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000003:01
drbd0: Peer(P): 1:00000002:00000004:0000001a:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:00000019:00000003:01
drbd0: Peer(P): 1:00000002:00000004:0000001a:00000003:10
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Primary
drbd0: drbd0_receiver [5708]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [5708]: cstate SyncTarget --> Connected

version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at linuxmachine, 2005-08-12 10:20:57
 0: cs:Connected st:Secondary/Secondary ld:Consistent
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
drbd0: drbd0_receiver [5708]: cstate Unconnected --> WFConnection
drbd0: drbd0_receiver [5708]: cstate WFConnection --> WFReportParams
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000004:0000001a:00000003:01
drbd0: Peer(S): 1:00000002:00000004:0000001a:00000003:01
drbd0: drbd0_receiver [5708]: cstate WFReportParams --> Connected
drbd0: Secondary/Unknown --> Secondary/Secondary


Note that the results are the same throughout the "copying_on_primary"
event/process until it finishes: nr and dw stay at 0 on the Secondary. If I
am already able to perform a full-sync each time I want to, what stops DRBD
from performing partial-sync during a "Connected" state?

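The symptom can be checked mechanically against the collected log: if the
Secondary were receiving data, the nr value would differ between samples. A
minimal sketch follows, with `distinct_nr` and `nr_changed` as hypothetical
helper names and /tmp/copying.res being the log file mentioned earlier.

```shell
#!/bin/sh
# Hypothetical helper: read a log of /proc/drbd samples on stdin and print
# the distinct nr: values seen, one per line, in ascending order.
distinct_nr() {
    tr ' ' '\n' | sed -n 's/^nr:\([0-9][0-9]*\)$/\1/p' | sort -n -u
}

# Hypothetical helper: succeed only if the log on stdin contains more than
# one distinct nr value, i.e. the receive counter moved between samples.
nr_changed() {
    n=$(distinct_nr | wc -l)
    [ $((n)) -gt 1 ]
}

# Example (commented out because the log file may not exist):
# if nr_changed < /tmp/copying.res; then
#     echo "nr moved: the Secondary received data"
# else
#     echo "nr never moved: nothing was received"
# fi
```
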
Once more, I have already seen DRBD-0.6.13 work (with automatic
partial-sync) a long time ago. That is why I expect this version (0.7.11)
to work the same way.

> or, to start with,
> just point out were the "drbd quickstart guide" does not work for you.

Your guides/how-tos are fine. DRBD is fine, I am sure. That is why I am
trying to figure out whether my thin environment is the culprit for this
failure.

And by the way, if it matters, I'm implementing DRBD on the Nodes' internal
device. That is, I do not have the actual hostnames (`hostname`) as part of
my /etc/hosts.
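For completeness, this is a sketch of how the statement about /etc/hosts can
be verified on each node. `host_in_hosts` is a hypothetical helper name, and
whether the missing entry matters to DRBD at all is exactly the open
question; the check only confirms the state of the file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a host name appears as a name field
# (any field after the address) in a hosts-style file.
#   $1 = host name to look for
#   $2 = file to search (defaults to /etc/hosts)
host_in_hosts() {
    name=$1
    file=${2:-/etc/hosts}
    # Strip comments first, then compare every field after the address.
    sed 's/#.*//' "$file" |
        awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                          END { exit !found }'
}

# Example (commented out; run it on each node):
# host_in_hosts "$(hostname)" || echo "$(hostname) is not in /etc/hosts"
```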

Best regards,
Vic
