<br><br>Hi,<br><br>I'm using DRBD 8.3 on Red Hat 5.5, and <span>when I try to sync the two nodes, the sync stalls indefinitely. I tried to find some help but couldn't find any.</span><br>
<br><br>rpm -qa |grep drbd<br>drbd83-8.3.8-1.el5.centos<br>kmod-drbd83-8.3.8-1.el5.centos<br><br>rpm -qa |grep kernel<br>kernel-headers-2.6.18-194.el5<br>kernel-devel-2.6.18-194.el5<br>kernel-2.6.18-194.el5<br><br><br><br>
cat /proc/drbd<br>version: 8.3.8 (api:88/proto:86-94)<br>GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:09<br>
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----<br> ns:83712000 nr:0 dw:0 dr:83712000 al:0 bm:5109 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:209292364<br> [>....................] sync'ed: 0.1% (204384/204384)M delay_probe: 31064<br>
stalled<br><br><br>drbd driver loaded OK; device status:<br>version: 8.3.8 (api:88/proto:86-94)<br>GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:09<br>
m:res cs ro ds p mounted fstype<br>stalled<br><br>... sync'ed: 0.1% (204384/204384)M delay_probe:<br>0:cdrs SyncSource Primary/Secondary UpToDate/Inconsistent C<br>
<br><br>[/var/log/messages on server1]<br>Jul 12 18:27:47 LMS1 kernel: block drbd0: peer( Secondary -> Unknown ) conn( SyncSource -> TearDown )<br>Jul 12 18:27:47 LMS1 kernel: block drbd0: meta connection shut down by peer.<br>
Jul 12 18:27:47 LMS1 kernel: block drbd0: asender terminated<br>Jul 12 18:27:47 LMS1 kernel: block drbd0: Terminating asender thread<br>Jul 12 18:27:47 LMS1 kernel: block drbd0: Connection closed<br>Jul 12 18:27:47 LMS1 kernel: block drbd0: conn( TearDown -> Unconnected )<br>
Jul 12 18:27:47 LMS1 kernel: block drbd0: receiver terminated<br>Jul 12 18:27:47 LMS1 kernel: block drbd0: Restarting receiver thread<br>Jul 12 18:27:47 LMS1 kernel: block drbd0: receiver (re)started<br>Jul 12 18:27:47 LMS1 kernel: block drbd0: conn( Unconnected -> WFConnection )<br>
Jul 12 18:28:48 LMS1 kernel: block drbd0: Handshake successful: Agreed network protocol version 94<br>Jul 12 18:28:48 LMS1 kernel: block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC<br>Jul 12 18:28:48 LMS1 kernel: block drbd0: conn( WFConnection -> WFReportParams )<br>
Jul 12 18:28:48 LMS1 kernel: block drbd0: Starting asender thread (from drbd0_receiver [12058])<br>Jul 12 18:28:48 LMS1 kernel: block drbd0: data-integrity-alg: <not-used><br>Jul 12 18:28:48 LMS1 kernel: block drbd0: drbd_sync_handshake:<br>
Jul 12 18:28:48 LMS1 kernel: block drbd0: self D015F68F0A997E7F:3FE1237214F4A834:0000000000000004:0000000000000000 bits:52323091 flags:0<br>Jul 12 18:28:48 LMS1 kernel: block drbd0: peer 3FE1237214F4A834:0000000000000000:0000000000000000:0000000000000000 bits:52323091 flags:0<br>
Jul 12 18:28:48 LMS1 kernel: block drbd0: uuid_compare()=1 by rule 70<br>Jul 12 18:28:48 LMS1 kernel: block drbd0: Becoming sync source due to disk states.<br>Jul 12 18:28:48 LMS1 kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS )<br>
Jul 12 18:28:49 LMS1 kernel: block drbd0: conn( WFBitMapS -> SyncSource )<br>Jul 12 18:28:49 LMS1 kernel: block drbd0: Began resync as SyncSource (will sync 209292364 KB [52323091 bits set]).<br><br><br>[/var/log/messages on server2]<br>
Jul 12 18:16:02 LMS2 kernel: device eth0 entered promiscuous mode<br>Jul 12 18:16:21 LMS2 kernel: device eth0 left promiscuous mode<br>Jul 12 18:16:43 LMS2 kernel: device eth0 entered promiscuous mode<br>Jul 12 18:16:52 LMS2 kernel: device eth0 left promiscuous mode<br>
Jul 12 18:27:58 LMS2 kernel: block drbd0: peer( Primary -> Unknown ) conn( SyncTarget -> Disconnecting ) pdsk( UpToDate -> DUnknown )<br>Jul 12 18:27:58 LMS2 kernel: block drbd0: short read expecting header on sock: r=-512<br>
Jul 12 18:27:58 LMS2 kernel: block drbd0: asender terminated<br>Jul 12 18:27:58 LMS2 kernel: block drbd0: Terminating asender thread<br>Jul 12 18:27:58 LMS2 kernel: block drbd0: Connection closed<br>Jul 12 18:27:58 LMS2 kernel: block drbd0: conn( Disconnecting -> StandAlone )<br>
Jul 12 18:27:58 LMS2 kernel: block drbd0: receiver terminated<br>Jul 12 18:27:58 LMS2 kernel: block drbd0: Terminating receiver thread<br>Jul 12 18:28:38 LMS2 kernel: block drbd0: conn( StandAlone -> Unconnected )<br>Jul 12 18:28:38 LMS2 kernel: block drbd0: Starting receiver thread (from drbd0_worker [14427])<br>
Jul 12 18:28:38 LMS2 kernel: block drbd0: receiver (re)started<br>Jul 12 18:28:38 LMS2 kernel: block drbd0: conn( Unconnected -> WFConnection )<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: Handshake successful: Agreed network protocol version 94<br>
Jul 12 18:28:59 LMS2 kernel: block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: conn( WFConnection -> WFReportParams )<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: Starting asender thread (from drbd0_receiver [24260])<br>
Jul 12 18:28:59 LMS2 kernel: block drbd0: data-integrity-alg: <not-used><br>Jul 12 18:28:59 LMS2 kernel: block drbd0: drbd_sync_handshake:<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: self 3FE1237214F4A834:0000000000000000:0000000000000000:0000000000000000 bits:52323091 flags:0<br>
Jul 12 18:28:59 LMS2 kernel: block drbd0: peer D015F68F0A997E7F:3FE1237214F4A834:0000000000000004:0000000000000000 bits:52323091 flags:0<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: uuid_compare()=-1 by rule 50<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: Becoming sync target due to disk states.<br>
Jul 12 18:28:59 LMS2 kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID )<br>
Jul 12 18:28:59 LMS2 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)<br>
Jul 12 18:28:59 LMS2 kernel: block drbd0: conn( WFSyncUUID -> SyncTarget )<br>Jul 12 18:28:59 LMS2 kernel: block drbd0: Began resync as SyncTarget (will sync 209292364 KB [52323091 bits set]).<br><br>
<br><br><br>[drbd.conf]<br><br>global {<br> usage-count no;<br>}<br><br>common {<br> syncer {<br> rate 500M; # in MBytes<br> }<br><br> net {<br><br> cram-hmac-alg sha1;<br> shared-secret "password";<br>
<br> after-sb-0pri discard-zero-changes;<br> after-sb-1pri discard-secondary;<br> after-sb-2pri disconnect;<br> }<br>}<br><br><br>resource cdrs {<br> protocol C;<br><br> on LMS1{<br> device /dev/drbd0;<br>
disk /dev/cciss/c0d1p1;<br> address 192.168.1.1:7790;<br> meta-disk internal;<br> }<br><br> on LMS2{<br> device /dev/drbd0;<br>
disk /dev/cciss/c0d1p1;<br>
address 192.168.1.2:7790;<br> meta-disk internal;<br> }<br>}<br>
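
For anyone who wants to check the numbers above, here is a minimal Python sketch that parses the `key:value` fields of the /proc/drbd device entry quoted in this post and compares two samples to see whether the resync is moving. The function names (`parse_drbd_status`, `out_of_sync_matches`, `looks_stalled`) are hypothetical helpers, not part of DRBD, and the 4 KiB-per-bitmap-bit factor is an assumption (it does agree with the "will sync 209292364 KB [52323091 bits set]" log line above).

```python
import re

# Device entry copied from the /proc/drbd output in this post.
SAMPLE = """\
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
    ns:83712000 nr:0 dw:0 dr:83712000 al:0 bm:5109 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:209292364
"""

# Assumption: one bit in the DRBD bitmap covers 4 KiB of the backing device.
KB_PER_BIT = 4


def parse_drbd_status(text):
    """Return the two/three-letter key:value fields of a device entry as a dict."""
    fields = {}
    for key, value in re.findall(r"\b([a-z]{2,3}):(\S+)", text):
        # Counters become ints; states such as "SyncSource" stay strings.
        fields[key] = int(value) if value.isdigit() else value
    return fields


def out_of_sync_matches(fields, bits_set):
    """Check that the oos counter (in KB) equals the bitmap bits reported in the log."""
    return fields["oos"] == bits_set * KB_PER_BIT


def looks_stalled(before, after):
    """True if the device is still resyncing but oos did not shrink between samples."""
    syncing = after["cs"] in ("SyncSource", "SyncTarget")
    return syncing and after["oos"] >= before["oos"]


if __name__ == "__main__":
    status = parse_drbd_status(SAMPLE)
    print(status["cs"], status["ds"], status["oos"])
    print("oos matches bitmap:", out_of_sync_matches(status, 52323091))
```

In practice you would read /proc/drbd twice, some seconds apart, and pass both parsed samples to `looks_stalled`; the sample above only covers a single snapshot.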