Hello,
One node of our two-node DRBD setup crashed.
I am trying to resync the crashed node, but I can't get it working.
The setup is pretty simple: two nodes, one disk on each node, in
primary/secondary mode.
Here is the config from the 'good node' (the one that did not fail :) ),
referred to as node01. The config file is the same on node02:
global {
    usage-count no;
}
resource cluster {
    protocol C;
    startup {
        wfc-timeout 120;
        degr-wfc-timeout 120;
    }
    disk {
        on-io-error detach;
    }
    net {
        after-sb-0pri discard-younger-primary;
        after-sb-1pri consensus;
        after-sb-2pri disconnect;
    }
    syncer {
        rate 10M;
        al-extents 257;
    }
    on node01 {
        device /dev/drbd0;
        disk /dev/sda6;
        address 10.160.1.39:7788;
        meta-disk internal;
    }
    on node02 {
        device /dev/drbd0;
        disk /dev/sda5;
        address 10.160.1.41:7788;
        meta-disk internal;
    }
}
Here are the logs from both nodes:
Node01:
Jul 5 16:40:42 node01 drbd0: receiver (re)started
Jul 5 16:40:42 node01 drbd0: conn( Unconnected -> WFConnection )
Jul 5 16:40:42 node01 drbd0: Handshake successful: DRBD Network
Protocol version 86
Jul 5 16:40:42 node01 drbd0: conn( WFConnection -> WFReportParams )
Jul 5 16:40:42 node01 drbd0: Starting asender thread (from
drbd0_receiver [2732])
Jul 5 16:40:42 node01 drbd0: Becoming sync source due to disk states.
Jul 5 16:40:42 node01 drbd0: Writing the whole bitmap, full sync
required after drbd_sync_handshake.
Jul 5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul 5 16:40:42 node01 drbd0: writing of bitmap took 2 jiffies
Jul 5 16:40:42 node01 drbd0: 9750 MB (2496005 bits) marked out-of-sync
by on disk bit-map.
Jul 5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul 5 16:40:42 node01 drbd0: peer( Unknown -> Secondary ) conn(
WFReportParams -> WFBitMapS )
Jul 5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul 5 16:40:42 node01 drbd0: sock was shut down by peer
Jul 5 16:40:42 node01 drbd0: peer( Secondary -> Unknown ) conn(
WFBitMapS -> BrokenPipe )
Jul 5 16:40:42 node01 drbd0: short read expecting header on sock: r=0
Jul 5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul 5 16:40:42 node01 drbd0: meta connection shut down by peer.
Jul 5 16:40:42 node01 drbd0: asender terminated
Jul 5 16:40:42 node01 drbd0: Terminating asender thread
Jul 5 16:40:42 node01 drbd0: tl_clear()
Jul 5 16:40:42 node01 drbd0: Connection closed
Jul 5 16:40:42 node01 drbd0: conn( BrokenPipe -> Unconnected )
Jul 5 16:40:42 node01 drbd0: receiver terminated
Node02:
Jul 5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: Restarting
receiver thread
Jul 5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: receiver
(re)started
Jul 5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: conn(
Unconnected -> WFConnection )
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Handshake
successful: Agreed network protocol version 86
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: conn(
WFConnection -> WFReportParams )
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Starting
asender thread (from drbd0_receiver [5394])
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Resize while
not connected was forced by the user!
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0:
drbd_sync_handshake:
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: self
0000000000000004:0000000000000000:0000000000000000:0000000000000000
bits:2514078 flags:0
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: peer
6EDAE75AEEC95209:BF4166F1310890B5:75A43B3B0306F310:0000000000000004
bits:2496005 flags:2
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0:
uuid_compare()=-2 by rule 20
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Becoming sync
target due to disk states.
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Writing the
whole bitmap, full sync required after drbd_sync_handshake.
Jul 5 16:36:46 node02 kernel: [ 6704.944230] block drbd0: 9821 MB
(2514078 bits) marked out-of-sync by on disk bit-map.
Jul 5 16:36:46 node02 kernel: [ 6704.955555] block drbd0: peer( Unknown
-> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown ->
UpToDate )
Jul 5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: peer( Primary
-> Unknown ) conn( WFBitMapT -> ProtocolError ) pdsk( UpToDate ->
DUnknown )
Jul 5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: asender
terminated
Jul 5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: Terminating
asender thread
Jul 5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: Connection
closed
Jul 5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: conn(
ProtocolError -> Unconnected )
Jul 5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: receiver
terminated
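One detail that can be read directly from the two logs: the nodes report different out-of-sync bit counts (2496005 on node01, 2514078 on node02), and node02 also logs "Resize while not connected was forced by the user!". The short Python sketch below only extracts and compares those two figures. It assumes that one bitmap bit covers 4 KiB, which is consistent with the MB values printed in the logs; the variable and function names are made up for illustration. Whether this difference is what triggers the ProtocolError is not established here.

```python
import re

# Log excerpts copied from the messages above (wrapped lines rejoined).
NODE01_LOG = "drbd0: 9750 MB (2496005 bits) marked out-of-sync by on disk bit-map."
NODE02_LOG = "block drbd0: 9821 MB (2514078 bits) marked out-of-sync by on disk bit-map."

# Assumption: one bitmap bit tracks 4 KiB of storage.
KIB_PER_BIT = 4

PATTERN = re.compile(r"(\d+) MB \((\d+) bits\) marked out-of-sync")


def out_of_sync(line):
    """Return (reported_mb, bits) parsed from a 'marked out-of-sync' line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("no out-of-sync figure in line: %r" % line)
    return int(match.group(1)), int(match.group(2))


def bits_to_mib(bits):
    """Convert a bitmap bit count to MiB under the 4 KiB-per-bit assumption."""
    return bits * KIB_PER_BIT / 1024


mb01, bits01 = out_of_sync(NODE01_LOG)
mb02, bits02 = out_of_sync(NODE02_LOG)
diff_bits = bits02 - bits01

print("node01: %d bits, about %.1f MiB" % (bits01, bits_to_mib(bits01)))
print("node02: %d bits, about %.1f MiB" % (bits02, bits_to_mib(bits02)))
print("difference: %d bits, about %.1f MiB" % (diff_bits, bits_to_mib(diff_bits)))
```

The same two bit counts also appear in the drbd_sync_handshake lines on node02 (bits:2514078 for self, bits:2496005 for peer), so the two nodes do not seem to agree on the size of the device.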
I have tried a lot of things, and I never seem to be able to get any
other message.
Does anyone have an idea of what I am doing wrong?
Eulaerts Gregory
IT Security Admin
________________________________________________
STIB-MIVB - FAL - DSI
Rue Royale, 76
1000 Bruxelles
eulaertsg at stib.irisnet.be