[DRBD-user] DRBD sync stalled

R Johnson robert.robertjohnson at gmail.com
Tue Nov 5 19:51:25 CET 2013


I am running DRBD 0.7.22 with Heartbeat on two nodes, both SLES10 SP4 Xen
virtual servers. Here are the configs:

3255:/etc # drbdadm dump all
resource r0 {
    protocol               C;
    incon-degr-cmd       "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
    on 3255 {
        device           /dev/drbd0;
        disk             /dev/DATA/RHAPSODY;
        address          172.xx.xx.22:7788;
        meta-disk        /dev/DATA/DRBD_METADATA [0];
    }
    on 3256 {
        device           /dev/drbd0;
        disk             /dev/DATA/RHAPSODY;
        address          172.xx.xx.33:7788;
        meta-disk        /dev/DATA/DRBD_METADATA [0];
    }
    disk {
        on-io-error      detach;
    }
    syncer {
        rate             50M;
        group              1;
        al-extents       257;
    }
}

and

3256:/ # drbdadm dump all
resource r0 {
    protocol               C;
    incon-degr-cmd       "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
    on 3256 {
        device           /dev/drbd0;
        disk             /dev/DATA/RHAPSODY;
        address          172.xx.xx.33:7788;
        meta-disk        /dev/DATA/DRBD_METADATA [0];
    }
    on 3255 {
        device           /dev/drbd0;
        disk             /dev/_DATA/RHAPSODY;
        address          172.xx.xx.22:7788;
        meta-disk        /dev/DATA/DRBD_METADATA [0];
    }
    disk {
        on-io-error      detach;
    }
    syncer {
        rate             50M;
        group              1;
        al-extents       257;
    }
}

Everything works fine for a very short while, and then the sync between
the nodes stalls with the following:

3255:/etc # cat  /proc/drbd
version: 0.7.22 (api:79/proto:74)
SVN Revision: 2572 build by lmb at dale, 2006-10-25 18:17:21
 0: cs:SyncSource st:Primary/Secondary ld:Consistent
    ns:63181996 nr:0 dw:62680576 dr:2644442 al:446 bm:585 lo:0 pe:0 ua:0 ap:0
        [>...................] sync'ed:  0.1% (18100/18100)M
        stalled

3256:/ # cat  /proc/drbd
version: 0.7.22 (api:79/proto:74)
SVN Revision: 2572 build by lmb at dale, 2006-10-25 18:17:21
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:78284 dw:78284 dr:0 al:0 bm:0 lo:0 pe:1280 ua:0 ap:0
        [>...................] sync'ed:  0.1% (18100/18100)M
        stalled
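
In case it helps anyone compare the two nodes, here is a minimal sketch for
pulling the two-letter counters out of the /proc/drbd output above. It
assumes the 0.7 layout exactly as pasted here; the function name
parse_drbd_status is just an illustration, not part of DRBD.

```python
import re

# Sample copied from the SyncTarget node's /proc/drbd output above.
SAMPLE = """\
version: 0.7.22 (api:79/proto:74)
 0: cs:SyncTarget st:Secondary/Primary ld:Inconsistent
    ns:0 nr:78284 dw:78284 dr:0 al:0 bm:0 lo:0 pe:1280 ua:0 ap:0
        [>...................] sync'ed:  0.1% (18100/18100)M
        stalled
"""

# Two-letter key, colon, value with no space in between (cs:, st:, pe:, ...).
FIELD = re.compile(r"\b([a-z]{2}):(\S+)")


def parse_drbd_status(text):
    """Return the key:value fields as strings, plus a 'stalled' flag."""
    fields = dict(FIELD.findall(text))
    # The 0.7 output prints a bare "stalled" line under the progress bar.
    fields["stalled"] = any(
        line.strip() == "stalled" for line in text.splitlines()
    )
    return fields


status = parse_drbd_status(SAMPLE)
print(status["cs"], status["pe"], status["stalled"])
```

Run against the sample, this reports the connection state, the pe counter
(1280 on the SyncTarget, 0 on the SyncSource) and whether the sync is marked
as stalled.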

I see the following errors in /var/log/messages:

Nov  5 11:50:05 w583s3255 kernel: drbd0: drbd0_asender [5880]: cstate SyncSource --> NetworkFailure
Nov  5 11:50:05 w583s3255 kernel: drbd0: drbd0_receiver [2909]: cstate NetworkFailure --> Unconnected
Nov  5 11:50:06 w583s3255 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Nov  5 11:55:08 w583s3255 kernel: drbd0: drbd0_asender [4856]: cstate SyncSource --> NetworkFailure
Nov  5 11:55:08 w583s3255 kernel: drbd0: drbd0_receiver [2909]: cstate NetworkFailure --> Unconnected
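
To see how regularly the connection drops, the gap between the
"--> NetworkFailure" lines can be measured. This is only a rough sketch: it
assumes the syslog format shown above, with each message on one line, and
the helper name failure_gaps is made up for illustration.

```python
import re
from datetime import datetime

# Log lines copied from /var/log/messages above, one message per line.
LOG = """\
Nov  5 11:50:05 w583s3255 kernel: drbd0: drbd0_asender [5880]: cstate SyncSource --> NetworkFailure
Nov  5 11:50:05 w583s3255 kernel: drbd0: drbd0_receiver [2909]: cstate NetworkFailure --> Unconnected
Nov  5 11:50:06 w583s3255 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Nov  5 11:55:08 w583s3255 kernel: drbd0: drbd0_asender [4856]: cstate SyncSource --> NetworkFailure
Nov  5 11:55:08 w583s3255 kernel: drbd0: drbd0_receiver [2909]: cstate NetworkFailure --> Unconnected
"""

# Capture the HH:MM:SS stamp of lines that end in a drop to NetworkFailure.
DROP = re.compile(r"(\d\d:\d\d:\d\d) .*--> NetworkFailure$")


def failure_gaps(log):
    """Seconds between consecutive drops (ignores midnight rollover)."""
    times = []
    for line in log.splitlines():
        match = DROP.search(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


print(failure_gaps(LOG))  # [303.0]
```

With only two drops in the excerpt this gives a single gap of 303 seconds,
so a longer stretch of the log would be needed to tell whether the interval
is really constant.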


However, tcpdump shows traffic flowing between the servers, and the
connection appears healthy.

Please let me know if any other information is required.


RJ