Hi Friends,
We are running DRBD 8.3.13 on RHEL 6.4 in a two-node cluster. Yesterday we applied OS patches on these servers and rebooted them into the new kernel. Since the restart, the DRBD resync stalls at 100% with 360 KB still out of sync. I tried booting into the old kernel, but the issue is the same. I also tried drbdadm disconnect --force r0 followed by connect, but it still stalls at 100%. Below are the /proc/drbd output from both nodes, my config file, and the logs.
Primary :
cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by dag at Build64R6, 2012-09-04 12:06:10
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:1303160 nr:0 dw:1303160 dr:5501409 al:614 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:360
[===================>] sync'ed:100.0% (360/360)K
finish: 0:53:10 speed: 0 (0) K/sec
Secondary :
cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by dag at Build64R6, 2012-09-04 12:06:10
0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
ns:0 nr:58460 dw:3583548 dr:0 al:0 bm:26 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:360
[===================>] sync'ed:100.0% (360/360)K
finish: 1:05:06 speed: 0 (0) want: 30 K/sec
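For anyone who wants to check for this condition from a script, here is a minimal Python sketch. It assumes the /proc/drbd layout shown above (8.3.x format) and simply flags a resync that still has out-of-sync data but reports a speed of 0. The function names parse_drbd_status and looks_stalled are made up for illustration and are not part of any DRBD tool.

```python
import re

def parse_drbd_status(text):
    """Pull connection state, out-of-sync KiB and resync speed out of /proc/drbd text.

    Assumes the 8.3.x layout shown in this post; other versions may differ.
    """
    status = {}
    m = re.search(r"cs:(\S+)", text)
    if m:
        status["cs"] = m.group(1)
    m = re.search(r"oos:(\d+)", text)
    if m:
        status["oos_kib"] = int(m.group(1))
    # The speed field may contain thousands separators, e.g. "speed: 4,104 (4,008)".
    m = re.search(r"speed:\s*([\d,]+)", text)
    if m:
        status["speed_kib_s"] = int(m.group(1).replace(",", ""))
    return status

def looks_stalled(status):
    """True if a resync is in progress, data is still out of sync, and speed is 0."""
    return (
        status.get("cs") in ("SyncSource", "SyncTarget")
        and status.get("oos_kib", 0) > 0
        and status.get("speed_kib_s") == 0
    )

# Sample taken from the Secondary output above.
SAMPLE = """\
 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:58460 dw:3583548 dr:0 al:0 bm:26 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:360
    [===================>] sync'ed:100.0% (360/360)K
    finish: 1:05:06 speed: 0 (0) want: 30 K/sec
"""

if __name__ == "__main__":
    st = parse_drbd_status(SAMPLE)
    print(st, looks_stalled(st))
```

On a live node the same functions could be fed the contents of /proc/drbd instead of SAMPLE. A single reading with speed 0 is only a hint; sampling twice a few seconds apart and comparing oos would be a more reliable check.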
drbd.conf :
skip {
As you can see, you can also comment chunks of text
with a 'skip[optional nonsense]{ skipped text }' section.
This comes in handy, if you just want to comment out
some 'resource <some name> {...}' section:
just precede it with 'skip'.
The basic format of option assignment is
<option name><linear whitespace><value>;
It should be obvious from the examples below,
but if you really care to know the details:
<option name> :=
valid options in the respective scope
<value> := <num>|<string>|<choice>|...
depending on the set of allowed values
for the respective option.
<num> := [0-9]+, sometimes with an optional suffix of K,M,G
<string> := (<name>|\"([^\"\\\n]*|\\.)*\")+
<name> := [/_.A-Za-z0-9-]+
}
#
# At most ONE global section is allowed.
# It must precede any resource section.
#
global {
# By default we load the module with a minor-count of 32. In case you
# have more devices in your config, the module gets loaded with
# a minor-count that ensures that you have 10 minors spare.
# In case 10 spare minors are too little for you, you can set the
# minor-count explicitly here. ( Note, in contrast to DRBD-0.7 an
# unused, spare minor has only a very little overhead of allocated
# memory (a single pointer to be exact). )
#
# minor-count 64;
# The user dialog counts and displays the seconds it waited so
# far. You might want to disable this if you have the console
# of your server connected to a serial terminal server with
# limited logging capacity.
# The Dialog will print the count each 'dialog-refresh' seconds,
# set it to 0 to disable redrawing completely. [ default = 1 ]
#
# dialog-refresh 5; # 5 seconds
# You might disable one of drbdadm's sanity checks.
# disable-ip-verification;
# Participate in DRBD's online usage counter at http://usage.drbd.org
# possible options: ask, yes, no. Default is ask. In case you do not
# know, set it to ask, and follow the on screen instructions later.
usage-count no;
}
#
# The common section can have all the sections a resource can have but
# not the host section (started with the "on" keyword).
# The common section must precede all resources.
# All resources inherit the settings from the common section.
# Whereas settings in the resources have precedence over the common
# setting.
#
common {
syncer { rate 3M; }
}
resource r0 {
protocol C;
#incon-degr-cmd "halt -f";
startup {
degr-wfc-timeout 120; # 2 minutes.
}
disk {
on-io-error detach;
}
handlers
{
split-brain "/root/splitbrain.sh root";
}
net {
}
syncer {
rate 30;
#group 1;
al-extents 257;
}
on Primary {
device /dev/drbd0;
meta-disk /dev/sdb1[0];
disk /dev/sdb2;
address xxx.xxx.xxx.xxx:7788;
}
on Secondary {
device /dev/drbd0;
meta-disk /dev/sdb1[0];
disk /dev/sdb2;
address xxx.xxx.xxx.xxx:7788;
}
}
logs :
Sep 28 08:16:30 secondary kernel: block drbd0: peer( Primary -> Unknown ) conn( SyncTarget -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Sep 28 08:16:30 secondary kernel: block drbd0: asender terminated
Sep 28 08:16:30 secondary kernel: block drbd0: Terminating asender thread
Sep 28 08:16:30 secondary kernel: block drbd0: bitmap WRITE of 1599 pages took 34 jiffies
Sep 28 08:16:30 secondary kernel: block drbd0: 360 KB (90 bits) marked out-of-sync by on disk bit-map.
Sep 28 08:16:30 secondary kernel: block drbd0: Connection closed
Sep 28 08:16:30 secondary kernel: block drbd0: conn( Disconnecting -> StandAlone )
Sep 28 08:16:30 secondary kernel: block drbd0: receiver terminated
Sep 28 08:16:30 secondary kernel: block drbd0: Terminating receiver thread
Sep 28 08:16:33 secondary kernel: block drbd0: conn( StandAlone -> Unconnected )
Sep 28 08:16:33 secondary kernel: block drbd0: Starting receiver thread (from drbd0_worker [1765])
Sep 28 08:16:33 secondary kernel: block drbd0: receiver (re)started
Sep 28 08:16:33 secondary kernel: block drbd0: conn( Unconnected -> WFConnection )
Sep 28 08:16:33 secondary kernel: block drbd0: Handshake successful: Agreed network protocol version 96
Sep 28 08:16:33 secondary kernel: block drbd0: conn( WFConnection -> WFReportParams )
Sep 28 08:16:33 secondary kernel: block drbd0: Starting asender thread (from drbd0_receiver [29181])
Sep 28 08:16:33 secondary kernel: block drbd0: data-integrity-alg: <not-used>
Sep 28 08:16:33 secondary kernel: block drbd0: drbd_sync_handshake:
Sep 28 08:16:33 secondary kernel: block drbd0: self 5F0D0794C3189654:0000000000000000:31D1206D1558C3A2:31D0206D1558C3A3 bits:90 flags:0
Sep 28 08:16:33 secondary kernel: block drbd0: peer EF964F9B847F7A89:5F0D0794C3189655:5F0C0794C3189655:5F0B0794C3189655 bits:90 flags:0
Sep 28 08:16:33 secondary kernel: block drbd0: uuid_compare()=-1 by rule 50
Sep 28 08:16:33 secondary kernel: block drbd0: Becoming sync target due to disk states.
Sep 28 08:16:33 secondary kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Sep 28 08:16:33 secondary kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID )
Sep 28 08:16:33 secondary kernel: block drbd0: updated sync uuid 5F0E0794C3189654:0000000000000000:31D1206D1558C3A2:31D0206D1558C3A3
Sep 28 08:16:33 secondary kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
Sep 28 08:16:33 secondary kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Sep 28 08:16:33 secondary kernel: block drbd0: conn( WFSyncUUID -> SyncTarget )
Sep 28 08:16:33 secondary kernel: block drbd0: Began resync as SyncTarget (will sync 360 KB [90 bits set]).
Appreciate any help.
Thanks,
Vjay