[DRBD-user] sync doesn't start

Per Liden per at fukt.bth.se
Mon Nov 29 13:03:18 CET 2004



Hi,

Another issue I've come across is that invalidating the secondary node 
doesn't automatically start a resync. Here is the sequence of commands 
that triggers the bug:

Proc1:~ # cat /proc/drbd 
version: 0.7.4 (api:76/proto:74)
SVN Revision: 1539 build by lmb at chip, 2004-09-14 10:21:07
 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:3004 nr:0 dw:3004 dr:4600 al:37 bm:4024 lo:0 pe:0 ua:0 ap:0

Proc1:~ # drbdadm invalidate all
ioctl(,INVALIDATE,) failed: Operation now in progress
Only in 'Connected' cstate possible.
Command '/sbin/drbdsetup /dev/drbd0 invalidate' terminated with exit code 20
drbdsetup exited with code 20

Proc1:~ # cat /proc/drbd 
version: 0.7.4 (api:76/proto:74)
SVN Revision: 1539 build by lmb at chip, 2004-09-14 10:21:07
 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:3004 nr:0 dw:3004 dr:4600 al:37 bm:4024 lo:0 pe:0 ua:0 ap:0


OK, fair enough, we don't want to invalidate the primary side. 
Let's try the same thing on the other side.


Proc2:~ # cat /proc/drbd 
version: 0.7.4 (api:76/proto:74)
SVN Revision: 1539 build by lmb at chip, 2004-09-14 10:21:07
 0: cs:Connected st:Secondary/Primary ld:Consistent
    ns:0 nr:3004 dw:3004 dr:0 al:0 bm:4024 lo:0 pe:0 ua:0 ap:0

Proc2:~ # drbdadm invalidate all

Proc2:~ # cat /proc/drbd 
version: 0.7.4 (api:76/proto:74)
SVN Revision: 1539 build by lmb at chip, 2004-09-14 10:21:07
 0: cs:WFBitMapT st:Secondary/Primary ld:Inconsistent
    ns:0 nr:3004 dw:3004 dr:0 al:0 bm:8048 lo:0 pe:0 ua:0 ap:0

Proc2: /var/log/messages
Nov 26 18:02:15 Proc2 kernel: drbd0: drbdsetup [3415]: cstate Connected --> WFBitMapT
Nov 26 18:02:16 Proc2 kernel: drbd0: 65928176 KB now marked out-of-sync by on disk bit-map.


Looks OK: the secondary has gone to WFBitMapT and marked the whole device 
out-of-sync, so it is ready to sync. Now let's look at the primary side.


Proc1:~ # cat /proc/drbd 
version: 0.7.4 (api:76/proto:74)
SVN Revision: 1539 build by lmb at chip, 2004-09-14 10:21:07
 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:3004 nr:0 dw:3004 dr:4600 al:37 bm:4024 lo:0 pe:0 ua:0 ap:0

Proc1: /var/log/messages
(contains nothing new)


This is where the problem starts. Even though the nodes are connected, 
the primary node has no idea that the secondary node wants to synchronize.
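
The mismatch described above (one node in WFBitMapT while its peer still 
reports Connected) could be checked from a script. What follows is only a 
minimal sketch: the function names drbd_cstate and resync_stuck are made up 
for illustration, it only looks at device 0, and it assumes the /proc/drbd 
layout shown in this mail (0.7.4).

```shell
#!/bin/sh

# Print the connection state (the "cs:" field) of device 0,
# reading /proc/drbd-style text from stdin.
drbd_cstate() {
    sed -n 's/^ *0: cs:\([A-Za-z]*\).*/\1/p'
}

# Hypothetical helper: succeeds (exit status 0) when one side is waiting
# for the bitmap exchange while the other side still reports Connected.
#   $1 = cstate on one node, $2 = cstate on its peer
resync_stuck() {
    case "$1/$2" in
        WFBitMapT/Connected|Connected/WFBitMapT) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a node one would run something like `drbd_cstate < /proc/drbd`, collect 
the peer's value by whatever means is available, and pass both values to 
resync_stuck.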


Proc1:~ # drbdadm connect all

Proc1:~ # cat /proc/drbd 
version: 0.7.4 (api:76/proto:74)
SVN Revision: 1539 build by lmb at chip, 2004-09-14 10:21:07
 0: cs:SyncSource st:Primary/Secondary ld:Consistent
    ns:11100 nr:0 dw:3004 dr:19960 al:37 bm:4024 lo:0 pe:42 ua:1065 ap:0
        [>...................] sync'ed:  0.1% (64372/64382)M
        finish: 1:38:05 speed: 10,936 (10,936) K/sec


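Running "drbdadm connect" on the already connected primary side seems to 
help drbd understand that the other side is waiting for the sync. As the 
logs below show, the connection is dropped and re-established, and the 
resync starts after the new handshake. Once the sync has started, it 
completes just fine.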


Nov 26 18:05:02 Proc1 kernel: drbd0: drbd0_receiver [1099]: cstate Connected --> BrokenPipe
Nov 26 18:05:02 Proc1 kernel: drbd0: short read expecting header on sock: r=-512
Nov 26 18:05:02 Proc1 kernel: drbd0: worker terminated
Nov 26 18:05:02 Proc1 kernel: drbd0: asender terminated
Nov 26 18:05:02 Proc1 kernel: drbd0: drbd0_receiver [1099]: cstate BrokenPipe --> StandAlone
Nov 26 18:05:02 Proc1 kernel: drbd0: Connection lost.
Nov 26 18:05:02 Proc1 kernel: drbd0: receiver terminated
Nov 26 18:05:02 Proc1 kernel: drbd0: drbdsetup [3820]: cstate StandAlone --> Unconnected
Nov 26 18:05:02 Proc1 kernel: drbd0: drbd0_receiver [3822]: cstate Unconnected --> WFConnection
Nov 26 18:05:02 Proc1 kernel: drbd0: drbd0_receiver [3822]: cstate WFConnection --> WFReportParams
Nov 26 18:05:02 Proc1 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Nov 26 18:05:02 Proc1 kernel: drbd0: Connection established.
Nov 26 18:05:02 Proc1 kernel: drbd0: I am(P): 1:00000007:00000001:00000021:0000000c:10
Nov 26 18:05:02 Proc1 kernel: drbd0: Peer(S): 0:00000007:00000001:00000020:0000000c:01
Nov 26 18:05:02 Proc1 kernel: drbd0: drbd0_receiver [3822]: cstate WFReportParams --> WFBitMapS
Nov 26 18:05:02 Proc1 kernel: drbd0: Primary/Unknown --> Primary/Secondary
Nov 26 18:05:03 Proc1 kernel: drbd0: drbd0_receiver [3822]: cstate WFBitMapS --> SyncSource
Nov 26 18:05:03 Proc1 kernel: drbd0: Resync started as SyncSource (need to sync 65928176 KB [16482044 bits set]).

Proc2:~ # cat /proc/drbd 
version: 0.7.4 (api:76/proto:74)
SVN Revision: 1539 build by lmb at chip, 2004-09-14 10:21:07
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:753244 dw:753244 dr:0 al:0 bm:8093 lo:0 pe:1368 ua:5 ap:0
        [>...................] sync'ed:  1.2% (63650/64382)M
        finish: 1:28:40 speed: 12,224 (10,872) K/sec

Nov 26 18:02:15 Proc2 kernel: drbd0: drbdsetup [3415]: cstate Connected --> WFBitMapT
Nov 26 18:02:16 Proc2 kernel: drbd0: 65928176 KB now marked out-of-sync by on disk bit-map.
Nov 26 18:05:02 Proc2 kernel: drbd0: sock was shut down by peer
Nov 26 18:05:02 Proc2 kernel: drbd0: drbd0_receiver [1111]: cstate WFBitMapT --> BrokenPipe
Nov 26 18:05:02 Proc2 kernel: drbd0: short read expecting header on sock: r=0
Nov 26 18:05:02 Proc2 kernel: drbd0: meta connection shut down by peer.
Nov 26 18:05:02 Proc2 kernel: drbd0: asender terminated
Nov 26 18:05:02 Proc2 kernel: drbd0: worker terminated
Nov 26 18:05:02 Proc2 kernel: drbd0: drbd0_receiver [1111]: cstate BrokenPipe --> Unconnected
Nov 26 18:05:02 Proc2 kernel: drbd0: Connection lost.
Nov 26 18:05:02 Proc2 kernel: drbd0: drbd0_receiver [1111]: cstate Unconnected --> WFConnection
Nov 26 18:05:02 Proc2 kernel: drbd0: drbd0_receiver [1111]: cstate WFConnection --> WFReportParams
Nov 26 18:05:02 Proc2 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Nov 26 18:05:02 Proc2 kernel: drbd0: Connection established.
Nov 26 18:05:02 Proc2 kernel: drbd0: I am(S): 0:00000007:00000001:00000020:0000000c:01
Nov 26 18:05:02 Proc2 kernel: drbd0: Peer(P): 1:00000007:00000001:00000021:0000000c:10
Nov 26 18:05:02 Proc2 kernel: drbd0: drbd0_receiver [1111]: cstate WFReportParams --> WFBitMapT
Nov 26 18:05:02 Proc2 kernel: drbd0: Secondary/Unknown --> Secondary/Primary
Nov 26 18:05:03 Proc2 kernel: drbd0: drbd0_receiver [1111]: cstate WFBitMapT --> SyncTarget
Nov 26 18:05:03 Proc2 kernel: drbd0: Resync started as SyncTarget (need to sync 65928176 KB [16482044 bits set]).

Nov 26 19:46:19 Proc1 kernel: drbd0: Resync done (total 6076 sec; paused 0 sec; 10848 K/sec)
Nov 26 19:46:19 Proc1 kernel: drbd0: drbd0_worker [3821]: cstate SyncSource --> Connected

Nov 26 19:46:19 Proc2 kernel: drbd0: Resync done (total 6076 sec; paused 0 sec; 10848 K/sec)
Nov 26 19:46:19 Proc2 kernel: drbd0: drbd0_worker [3432]: cstate SyncTarget --> Connected
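
As a rough cross-check of the numbers in the logs above, the "need to sync" 
line can be compared against the total shown in /proc/drbd. This is just 
shell arithmetic on values copied from the log lines, not something drbd 
itself prints; the variable names are my own.

```shell
#!/bin/sh

# Values copied from the "Resync started" log line.
kb=65928176       # KB that need to be synced
bits=16482044     # bits set in the bitmap

kb_per_bit=$((kb / bits))   # amount of data covered by one bitmap bit
remainder=$((kb % bits))    # 0 means the two figures divide evenly
total_mb=$((kb / 1024))     # same amount in MB (integer division)

echo "$kb_per_bit KB per bit (remainder $remainder), $total_mb MB total"
```

The KB figure divides evenly by the bit count, and the MB total agrees with 
the 64382M shown in the sync'ed line of /proc/drbd.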


/Per


