[DRBD-user] drbd sync stalled

Miguel Olivares miguel.olivares at evistel.com
Tue Jul 12 20:42:19 CEST 2011



Hi,

I'm using DRBD 8.3 on Red Hat 5.5, and when I try to sync the two nodes,
the sync stalls indefinitely. I looked for help but couldn't find any.


rpm -qa |grep drbd
drbd83-8.3.8-1.el5.centos
kmod-drbd83-8.3.8-1.el5.centos

rpm -qa |grep kernel
kernel-headers-2.6.18-194.el5
kernel-devel-2.6.18-194.el5
kernel-2.6.18-194.el5



cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild at builder10.centos.org, 2010-06-04 08:04:09
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
    ns:83712000 nr:0 dw:0 dr:83712000 al:0 bm:5109 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:209292364
        [>....................] sync'ed:  0.1% (204384/204384)M delay_probe: 31064
        stalled
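
To make the status above easier to compare between runs, here is a minimal Python sketch that pulls the `key:value` tokens out of the resource lines. The function name `parse_drbd_fields` and the `SAMPLE` string are illustrative only (the sample is copied from the output above), and the regular expression is an assumption about this 8.3-style layout, not an official parser.

```python
import re

# Resource lines copied from the /proc/drbd output in this post.
SAMPLE = """\
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
    ns:83712000 nr:0 dw:0 dr:83712000 al:0 bm:5109 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:209292364
"""

def parse_drbd_fields(text):
    """Return the short key:value tokens (cs, ro, ds, ns, oos, ...) as a dict of strings.

    Hypothetical helper: it matches 2-3 lowercase letters followed by ':' and a
    non-space value, and skips keys that are preceded by a word character or '/'.
    """
    return dict(re.findall(r"(?<![\w/])([a-z]{2,3}):(\S+)", text))

fields = parse_drbd_fields(SAMPLE)
oos_kb = int(fields["oos"])  # out-of-sync amount, in KB as reported by DRBD
print(fields["cs"], fields["ds"], oos_kb)
```

Running this against successive snapshots of `/proc/drbd` would show whether `oos` is actually decreasing.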


drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild at builder10.centos.org, 2010-06-04 08:04:09
m:res    cs          ro                 ds                     p             mounted  fstype
stalled
...      sync'ed:    0.1%               (204384/204384)M       delay_probe:
0:cdrs   SyncSource  Primary/Secondary  UpToDate/Inconsistent  C


[/var/log/messages on server1]
Jul 12 18:27:47 LMS1 kernel: block drbd0: peer( Secondary -> Unknown ) conn( SyncSource -> TearDown )
Jul 12 18:27:47 LMS1 kernel: block drbd0: meta connection shut down by peer.
Jul 12 18:27:47 LMS1 kernel: block drbd0: asender terminated
Jul 12 18:27:47 LMS1 kernel: block drbd0: Terminating asender thread
Jul 12 18:27:47 LMS1 kernel: block drbd0: Connection closed
Jul 12 18:27:47 LMS1 kernel: block drbd0: conn( TearDown -> Unconnected )
Jul 12 18:27:47 LMS1 kernel: block drbd0: receiver terminated
Jul 12 18:27:47 LMS1 kernel: block drbd0: Restarting receiver thread
Jul 12 18:27:47 LMS1 kernel: block drbd0: receiver (re)started
Jul 12 18:27:47 LMS1 kernel: block drbd0: conn( Unconnected -> WFConnection )
Jul 12 18:28:48 LMS1 kernel: block drbd0: Handshake successful: Agreed network protocol version 94
Jul 12 18:28:48 LMS1 kernel: block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 12 18:28:48 LMS1 kernel: block drbd0: conn( WFConnection -> WFReportParams )
Jul 12 18:28:48 LMS1 kernel: block drbd0: Starting asender thread (from drbd0_receiver [12058])
Jul 12 18:28:48 LMS1 kernel: block drbd0: data-integrity-alg: <not-used>
Jul 12 18:28:48 LMS1 kernel: block drbd0: drbd_sync_handshake:
Jul 12 18:28:48 LMS1 kernel: block drbd0: self D015F68F0A997E7F:3FE1237214F4A834:0000000000000004:0000000000000000 bits:52323091 flags:0
Jul 12 18:28:48 LMS1 kernel: block drbd0: peer 3FE1237214F4A834:0000000000000000:0000000000000000:0000000000000000 bits:52323091 flags:0
Jul 12 18:28:48 LMS1 kernel: block drbd0: uuid_compare()=1 by rule 70
Jul 12 18:28:48 LMS1 kernel: block drbd0: Becoming sync source due to disk states.
Jul 12 18:28:48 LMS1 kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS )
Jul 12 18:28:49 LMS1 kernel: block drbd0: conn( WFBitMapS -> SyncSource )
Jul 12 18:28:49 LMS1 kernel: block drbd0: Began resync as SyncSource (will sync 209292364 KB [52323091 bits set]).


[/var/log/messages on server2]
Jul 12 18:16:02 LMS2 kernel: device eth0 entered promiscuous mode
Jul 12 18:16:21 LMS2 kernel: device eth0 left promiscuous mode
Jul 12 18:16:43 LMS2 kernel: device eth0 entered promiscuous mode
Jul 12 18:16:52 LMS2 kernel: device eth0 left promiscuous mode
Jul 12 18:27:58 LMS2 kernel: block drbd0: peer( Primary -> Unknown ) conn( SyncTarget -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Jul 12 18:27:58 LMS2 kernel: block drbd0: short read expecting header on sock: r=-512
Jul 12 18:27:58 LMS2 kernel: block drbd0: asender terminated
Jul 12 18:27:58 LMS2 kernel: block drbd0: Terminating asender thread
Jul 12 18:27:58 LMS2 kernel: block drbd0: Connection closed
Jul 12 18:27:58 LMS2 kernel: block drbd0: conn( Disconnecting -> StandAlone )
Jul 12 18:27:58 LMS2 kernel: block drbd0: receiver terminated
Jul 12 18:27:58 LMS2 kernel: block drbd0: Terminating receiver thread
Jul 12 18:28:38 LMS2 kernel: block drbd0: conn( StandAlone -> Unconnected )
Jul 12 18:28:38 LMS2 kernel: block drbd0: Starting receiver thread (from drbd0_worker [14427])
Jul 12 18:28:38 LMS2 kernel: block drbd0: receiver (re)started
Jul 12 18:28:38 LMS2 kernel: block drbd0: conn( Unconnected -> WFConnection )
Jul 12 18:28:59 LMS2 kernel: block drbd0: Handshake successful: Agreed network protocol version 94
Jul 12 18:28:59 LMS2 kernel: block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Jul 12 18:28:59 LMS2 kernel: block drbd0: conn( WFConnection -> WFReportParams )
Jul 12 18:28:59 LMS2 kernel: block drbd0: Starting asender thread (from drbd0_receiver [24260])
Jul 12 18:28:59 LMS2 kernel: block drbd0: data-integrity-alg: <not-used>
Jul 12 18:28:59 LMS2 kernel: block drbd0: drbd_sync_handshake:
Jul 12 18:28:59 LMS2 kernel: block drbd0: self 3FE1237214F4A834:0000000000000000:0000000000000000:0000000000000000 bits:52323091 flags:0
Jul 12 18:28:59 LMS2 kernel: block drbd0: peer D015F68F0A997E7F:3FE1237214F4A834:0000000000000004:0000000000000000 bits:52323091 flags:0
Jul 12 18:28:59 LMS2 kernel: block drbd0: uuid_compare()=-1 by rule 50
Jul 12 18:28:59 LMS2 kernel: block drbd0: Becoming sync target due to disk states.
Jul 12 18:28:59 LMS2 kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Jul 12 18:28:59 LMS2 kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID )
Jul 12 18:28:59 LMS2 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
Jul 12 18:28:59 LMS2 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Jul 12 18:28:59 LMS2 kernel: block drbd0: conn( WFSyncUUID -> SyncTarget )
Jul 12 18:28:59 LMS2 kernel: block drbd0: Began resync as SyncTarget (will sync 209292364 KB [52323091 bits set]).
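
As a consistency check on the figures both nodes log, the short Python sketch below recomputes the resync size from the bitmap count. It assumes DRBD's bitmap granularity of one bit per 4 KiB block; the variable names are illustrative and the two numbers are copied from the "Began resync" lines above.

```python
# Assumption: each set bit in the DRBD bitmap marks one 4 KiB block as out of sync.
BITMAP_BLOCK_KB = 4

bits_set = 52323091      # "bits:52323091" / "[52323091 bits set]" in both logs
reported_kb = 209292364  # "will sync 209292364 KB" in both logs

computed_kb = bits_set * BITMAP_BLOCK_KB
matches = computed_kb == reported_kb
print(computed_kb, matches)
```

If that assumption holds, the product equals the `oos:209292364` value in `/proc/drbd`, which would mean essentially the whole device is still marked out of sync.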




[drbd.conf]

global {
    usage-count no;
}

common {
    syncer {
        rate 500M;           # in MBytes
    }

    net {

        cram-hmac-alg sha1;
        shared-secret "password";

        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
}


resource cdrs {
    protocol C;

    on LMS1{
        device    /dev/drbd0;
        disk      /dev/cciss/c0d1p1;
        address   192.168.1.1:7790;
        meta-disk internal;
    }

    on LMS2{
        device    /dev/drbd0;
        disk      /dev/cciss/c0d1p1;
        address   192.168.1.2:7790;
        meta-disk internal;
    }
}