<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 12pt;
font-family:Calibri
}
--></style></head>
<body class='hmmessage'><div dir='ltr'>Hi Friends,<br><br>We are running DRBD 8.3.13 on RHEL 6.4 in a two-node cluster. Yesterday we applied OS patches to these servers and rebooted them into the new kernel. Since the restart, the DRBD sync stalls at 100%. I tried booting into the old kernel as well, but the issue is the same. I also tried drbdadm disconnect --force r0 followed by a connect, but the sync still stalls at 100%. Below are the /proc/drbd output from both nodes, my config file, and the logs.<br><br>Primary :<br><br>cat /proc/drbd<br>version: 8.3.13 (api:88/proto:86-96)<br>GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by dag@Build64R6, 2012-09-04 12:06:10<br> 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----<br> ns:1303160 nr:0 dw:1303160 dr:5501409 al:614 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:360<br> [===================>] sync'ed:100.0% (360/360)K<br> finish: 0:53:10 speed: 0 (0) K/sec<br><br>Secondary :<br><br>cat /proc/drbd<br>version: 8.3.13 (api:88/proto:86-96)<br>GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by dag@Build64R6, 2012-09-04 12:06:10<br> 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----<br> ns:0 nr:58460 dw:3583548 dr:0 al:0 bm:26 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:360<br> [===================>] sync'ed:100.0% (360/360)K<br> finish: 1:05:06 speed: 0 (0) want: 30 K/sec<br><br>drbd.conf :<br><br>skip {<br> As you can see, you can also comment chunks of text<br> with a 'skip[optional nonsense]{ skipped text }' section.<br> This comes in handy, if you just want to comment out<br> some 'resource &lt;some name&gt; {...}' section:<br> just precede it with 'skip'.<br><br> The basic format of option assignment is<br> &lt;option name&gt;&lt;linear whitespace&gt;&lt;value&gt;;<br><br> It should be obvious from the examples below,<br> but if you really care to know the details:<br><br> &lt;option name&gt; :=<br> valid options in the respective scope<br> &lt;value&gt; := &lt;num&gt;|&lt;string&gt;|&lt;choice&gt;|...<br> depending on the set of allowed values<br> for the respective option.<br> &lt;num&gt; := 
[0-9]+, sometimes with an optional suffix of K,M,G<br> &lt;string&gt; := (&lt;name&gt;|\"([^\"\\\n]*|\\.)*\")+<br> &lt;name&gt; := [/_.A-Za-z0-9-]+<br>}<br><br>#<br># At most ONE global section is allowed.<br># It must precede any resource section.<br>#<br>global {<br> # By default we load the module with a minor-count of 32. In case you<br> # have more devices in your config, the module gets loaded with<br> # a minor-count that ensures that you have 10 minors spare.<br> # In case 10 spare minors are too little for you, you can set the<br> # minor-count explicitly here. ( Note, in contrast to DRBD-0.7 an<br> # unused, spare minor has only a very little overhead of allocated<br> # memory (a single pointer to be exact). )<br> #<br> # minor-count 64;<br><br> # The user dialog counts and displays the seconds it waited so<br> # far. You might want to disable this if you have the console<br> # of your server connected to a serial terminal server with<br> # limited logging capacity.<br> # The Dialog will print the count each 'dialog-refresh' seconds,<br> # set it to 0 to disable redrawing completely. [ default = 1 ]<br> #<br> # dialog-refresh 5; # 5 seconds<br><br> # You might disable one of drbdadm's sanity check.<br> # disable-ip-verification;<br><br> # Participate in DRBD's online usage counter at http://usage.drbd.org<br> # possible options: ask, yes, no. Default is ask. 
In case you do not<br> # know, set it to ask, and follow the on screen instructions later.<br> usage-count no;<br>}<br><br><br>#<br># The common section can have all the sections a resource can have but<br># not the host section (started with the "on" keyword).<br># The common section must precede all resources.<br># All resources inherit the settings from the common section.<br># Whereas settings in the resources have precedence over the common<br># setting.<br>#<br><br>common {<br> syncer { rate 3M; }<br>}<br><br>resource r0 {<br> protocol C;<br> #incon-degr-cmd "halt -f";<br> startup {<br> degr-wfc-timeout 120; # 2 minutes.<br> }<br> disk {<br> on-io-error detach;<br> }<br> handlers<br> {<br> split-brain "/root/splitbrain.sh root";<br> }<br> net {<br> }<br> syncer {<br> rate 30;<br> #group 1;<br> al-extents 257;<br> }<br> on Primary {<br> device /dev/drbd0;<br> meta-disk /dev/sdb1[0];<br> disk /dev/sdb2;<br> address xxx.xxx.xxx.xxx:7788;<br> }<br> on Secondary {<br> device /dev/drbd0;<br> meta-disk /dev/sdb1[0];<br> disk /dev/sdb2;<br> address xxx.xxx.xxx.xxx:7788;<br> }<br>}<br><br><br>logs :<br><br>Sep 28 08:16:30 secondary kernel: block drbd0: peer( Primary -> Unknown ) conn( SyncTarget -> Disconnecting ) pdsk( UpToDate -> DUnknown )<br>Sep 28 08:16:30 secondary kernel: block drbd0: asender terminated<br>Sep 28 08:16:30 secondary kernel: block drbd0: Terminating asender thread<br>Sep 28 08:16:30 secondary kernel: block drbd0: bitmap WRITE of 1599 pages took 34 jiffies<br>Sep 28 08:16:30 secondary kernel: block drbd0: 360 KB (90 bits) marked out-of-sync by on disk bit-map.<br>Sep 28 08:16:30 secondary kernel: block drbd0: Connection closed<br>Sep 28 08:16:30 secondary kernel: block drbd0: conn( Disconnecting -> StandAlone )<br>Sep 28 08:16:30 secondary kernel: block drbd0: receiver terminated<br>Sep 28 08:16:30 secondary kernel: block drbd0: Terminating receiver thread<br>Sep 28 08:16:33 secondary kernel: block drbd0: conn( StandAlone -> Unconnected )<br>Sep 
28 08:16:33 secondary kernel: block drbd0: Starting receiver thread (from drbd0_worker [1765])<br>Sep 28 08:16:33 secondary kernel: block drbd0: receiver (re)started<br>Sep 28 08:16:33 secondary kernel: block drbd0: conn( Unconnected -> WFConnection )<br>Sep 28 08:16:33 secondary kernel: block drbd0: Handshake successful: Agreed network protocol version 96<br>Sep 28 08:16:33 secondary kernel: block drbd0: conn( WFConnection -> WFReportParams )<br>Sep 28 08:16:33 secondary kernel: block drbd0: Starting asender thread (from drbd0_receiver [29181])<br>Sep 28 08:16:33 secondary kernel: block drbd0: data-integrity-alg: &lt;not-used&gt;<br>Sep 28 08:16:33 secondary kernel: block drbd0: drbd_sync_handshake:<br>Sep 28 08:16:33 secondary kernel: block drbd0: self 5F0D0794C3189654:0000000000000000:31D1206D1558C3A2:31D0206D1558C3A3 bits:90 flags:0<br>Sep 28 08:16:33 secondary kernel: block drbd0: peer EF964F9B847F7A89:5F0D0794C3189655:5F0C0794C3189655:5F0B0794C3189655 bits:90 flags:0<br>Sep 28 08:16:33 secondary kernel: block drbd0: uuid_compare()=-1 by rule 50<br>Sep 28 08:16:33 secondary kernel: block drbd0: Becoming sync target due to disk states.<br>Sep 28 08:16:33 secondary kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )<br>Sep 28 08:16:33 secondary kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID )<br>Sep 28 08:16:33 secondary kernel: block drbd0: updated sync uuid 5F0E0794C3189654:0000000000000000:31D1206D1558C3A2:31D0206D1558C3A3<br>Sep 28 08:16:33 secondary kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0<br>Sep 28 08:16:33 secondary kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)<br>Sep 28 08:16:33 secondary kernel: block drbd0: conn( WFSyncUUID -> SyncTarget )<br>Sep 28 08:16:33 secondary kernel: block drbd0: Began resync as SyncTarget (will sync 360 KB [90 bits set]).<br><br>I would appreciate any help.<br><br>Thanks,<br>Vjay<br>
                                        </div></body>
</html>