[DRBD-user] syncing problem after node reinstall

Eulaerts Gregory EULAERTSG at STIB.IRISNET.BE
Mon Jul 5 16:48:51 CEST 2010


Hello,

One node of our two-node DRBD setup crashed and was reinstalled.
I am trying to resync the crashed node, but I cannot get it to work.

The setup is pretty simple: two nodes, one disk on each node, in
primary/secondary mode.

Here is the config from the 'good node' (the one that did not fail :)),
referred to as node01. The config file is the same on node02:

global {
  usage-count no;
}

resource cluster {
  protocol C;

  startup {
    wfc-timeout 120;
    degr-wfc-timeout 120;
  }

  disk {
    on-io-error detach;
  }

  net {
    after-sb-0pri discard-younger-primary;
    after-sb-1pri consensus;
    after-sb-2pri disconnect;
  }

  syncer {
    rate 10M;
    al-extents 257;
  }

  on node01 {
    device /dev/drbd0;
    disk /dev/sda6;
    address 10.160.1.39:7788;
    meta-disk internal;
  }

  on node02 {
    device /dev/drbd0;
    disk /dev/sda5;
    address 10.160.1.41:7788;
    meta-disk internal;
  }
}

Here are the logs from both nodes:

Node01:

Jul  5 16:40:42 node01 drbd0: receiver (re)started
Jul  5 16:40:42 node01 drbd0: conn( Unconnected -> WFConnection )
Jul  5 16:40:42 node01 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul  5 16:40:42 node01 drbd0: conn( WFConnection -> WFReportParams )
Jul  5 16:40:42 node01 drbd0: Starting asender thread (from drbd0_receiver [2732])
Jul  5 16:40:42 node01 drbd0: Becoming sync source due to disk states.
Jul  5 16:40:42 node01 drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
Jul  5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul  5 16:40:42 node01 drbd0: writing of bitmap took 2 jiffies
Jul  5 16:40:42 node01 drbd0: 9750 MB (2496005 bits) marked out-of-sync by on disk bit-map.
Jul  5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul  5 16:40:42 node01 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS )
Jul  5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul  5 16:40:42 node01 drbd0: sock was shut down by peer
Jul  5 16:40:42 node01 drbd0: peer( Secondary -> Unknown ) conn( WFBitMapS -> BrokenPipe )
Jul  5 16:40:42 node01 drbd0: short read expecting header on sock: r=0
Jul  5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul  5 16:40:42 node01 drbd0: meta connection shut down by peer.
Jul  5 16:40:42 node01 drbd0: asender terminated
Jul  5 16:40:42 node01 drbd0: Terminating asender thread
Jul  5 16:40:42 node01 drbd0: tl_clear()
Jul  5 16:40:42 node01 drbd0: Connection closed
Jul  5 16:40:42 node01 drbd0: conn( BrokenPipe -> Unconnected )
Jul  5 16:40:42 node01 drbd0: receiver terminated


Node02:

 

Jul  5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: Restarting receiver thread
Jul  5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: receiver (re)started
Jul  5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: conn( Unconnected -> WFConnection )
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Handshake successful: Agreed network protocol version 86
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: conn( WFConnection -> WFReportParams )
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Starting asender thread (from drbd0_receiver [5394])
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Resize while not connected was forced by the user!
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: drbd_sync_handshake:
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: self 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:2514078 flags:0
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: peer 6EDAE75AEEC95209:BF4166F1310890B5:75A43B3B0306F310:0000000000000004 bits:2496005 flags:2
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: uuid_compare()=-2 by rule 20
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Becoming sync target due to disk states.
Jul  5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
Jul  5 16:36:46 node02 kernel: [ 6704.944230] block drbd0: 9821 MB (2514078 bits) marked out-of-sync by on disk bit-map.
Jul  5 16:36:46 node02 kernel: [ 6704.955555] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Jul  5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: peer( Primary -> Unknown ) conn( WFBitMapT -> ProtocolError ) pdsk( UpToDate -> DUnknown )
Jul  5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: asender terminated
Jul  5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: Terminating asender thread
Jul  5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: Connection closed
Jul  5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: conn( ProtocolError -> Unconnected )
Jul  5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: receiver terminated
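
One detail that stands out when comparing the two logs: the nodes report
different bitmap sizes (2496005 bits on node01, 2514078 bits on node02).
Assuming DRBD's usual granularity of one bitmap bit per 4 KiB block, the
small Python sketch below converts those bit counts into sizes. The
function and variable names are made up for illustration only, and whether
this mismatch is what breaks the bitmap exchange is just a guess.

```python
# Assumption: each bit in DRBD's on-disk bitmap tracks one 4 KiB block.
BYTES_PER_BIT = 4096


def bits_to_mib(bits: int) -> float:
    """Convert a DRBD bitmap bit count to a size in MiB."""
    return bits * BYTES_PER_BIT / (1024 * 1024)


# Bit counts copied from the "marked out-of-sync" log lines of each node.
node01_bits = 2496005
node02_bits = 2514078

node01_mib = bits_to_mib(node01_bits)
node02_mib = bits_to_mib(node02_bits)
diff_mib = node02_mib - node01_mib

print(f"node01: {node01_mib:.1f} MiB")
print(f"node02: {node02_mib:.1f} MiB")
print(f"node02 is larger by {diff_mib:.1f} MiB")
```

Rounded, the results match the 9750 MB and 9821 MB figures in the logs, so
the device on node02 appears to be roughly 70 MB larger than the one on
node01.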

 

 

 

I have tried a lot of things, but I always end up with the same
messages.

Does anyone have an idea of what I am doing wrong?

 

Eulaerts Gregory
IT Security Admin
________________________________________________
STIB-MIVB - FAL - DSI
Rue Royale, 76
1000 Bruxelles
eulaertsg at stib.irisnet.be

 
