Hello,

We had a crash of one node in our two-node DRBD setup. I am trying to resync the crashed node and can't seem to get it working.

The setup is pretty simple: two nodes, one disk on each node, in primary/secondary mode.

Here is the config from the 'good node' (the one that did not fail :)), referred to as node01. The config file is the same on node02:

global { usage-count no; }
resource cluster {
  protocol C;
  startup {
    wfc-timeout 120;
    degr-wfc-timeout 120;
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri discard-younger-primary;
    after-sb-1pri consensus;
    after-sb-2pri disconnect;
  }
  syncer {
    rate 10M;
    al-extents 257;
  }
  on node01 {
    device /dev/drbd0;
    disk /dev/sda6;
    address 10.160.1.39:7788;
    meta-disk internal;
  }
  on node02 {
    device /dev/drbd0;
    disk /dev/sda5;
    address 10.160.1.41:7788;
    meta-disk internal;
  }
}

Here are the log files:

Node01:
Jul 5 16:40:42 node01 drbd0: receiver (re)started
Jul 5 16:40:42 node01 drbd0: conn( Unconnected -> WFConnection )
Jul 5 16:40:42 node01 drbd0: Handshake successful: DRBD Network Protocol version 86
Jul 5 16:40:42 node01 drbd0: conn( WFConnection -> WFReportParams )
Jul 5 16:40:42 node01 drbd0: Starting asender thread (from drbd0_receiver [2732])
Jul 5 16:40:42 node01 drbd0: Becoming sync source due to disk states.
Jul 5 16:40:42 node01 drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
Jul 5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul 5 16:40:42 node01 drbd0: writing of bitmap took 2 jiffies
Jul 5 16:40:42 node01 drbd0: 9750 MB (2496005 bits) marked out-of-sync by on disk bit-map.
Jul 5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul 5 16:40:42 node01 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS )
Jul 5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul 5 16:40:42 node01 drbd0: sock was shut down by peer
Jul 5 16:40:42 node01 drbd0: peer( Secondary -> Unknown ) conn( WFBitMapS -> BrokenPipe )
Jul 5 16:40:42 node01 drbd0: short read expecting header on sock: r=0
Jul 5 16:40:42 node01 drbd0: Writing meta data super block now.
Jul 5 16:40:42 node01 drbd0: meta connection shut down by peer.
Jul 5 16:40:42 node01 drbd0: asender terminated
Jul 5 16:40:42 node01 drbd0: Terminating asender thread
Jul 5 16:40:42 node01 drbd0: tl_clear()
Jul 5 16:40:42 node01 drbd0: Connection closed
Jul 5 16:40:42 node01 drbd0: conn( BrokenPipe -> Unconnected )
Jul 5 16:40:42 node01 drbd0: receiver terminated

Node02:
Jul 5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: Restarting receiver thread
Jul 5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: receiver (re)started
Jul 5 16:36:45 node02 kernel: [ 6704.452406] block drbd0: conn( Unconnected -> WFConnection )
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Handshake successful: Agreed network protocol version 86
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: conn( WFConnection -> WFReportParams )
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Starting asender thread (from drbd0_receiver [5394])
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Resize while not connected was forced by the user!
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: drbd_sync_handshake:
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: self 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:2514078 flags:0
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: peer 6EDAE75AEEC95209:BF4166F1310890B5:75A43B3B0306F310:0000000000000004 bits:2496005 flags:2
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: uuid_compare()=-2 by rule 20
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Becoming sync target due to disk states.
Jul 5 16:36:46 node02 kernel: [ 6704.775221] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
Jul 5 16:36:46 node02 kernel: [ 6704.944230] block drbd0: 9821 MB (2514078 bits) marked out-of-sync by on disk bit-map.
Jul 5 16:36:46 node02 kernel: [ 6704.955555] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
Jul 5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: peer( Primary -> Unknown ) conn( WFBitMapT -> ProtocolError ) pdsk( UpToDate -> DUnknown )
Jul 5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: asender terminated
Jul 5 16:36:46 node02 kernel: [ 6704.981517] block drbd0: Terminating asender thread
Jul 5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: Connection closed
Jul 5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: conn( ProtocolError -> Unconnected )
Jul 5 16:36:46 node02 kernel: [ 6705.040937] block drbd0: receiver terminated

I have tried a lot of things, and I never seem to get any other message. Does anyone have an idea of what I am doing wrong?

Eulaerts Gregory
IT Security Admin
________________________________________________
STIB-MIVB - FAL - DSI
Rue Royale, 76
1000 Bruxelles
eulaertsg at stib.irisnet.be
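For anyone reading the logs above, here is a minimal sketch that pulls the out-of-sync bitmap sizes each node reports and compares them. It assumes each bitmap bit covers 4 KiB, which appears consistent with the MB figures printed in the log lines; the function names and the regular expression are hypothetical helpers, not part of DRBD. The two nodes report different bit counts, which may be worth checking against the sizes of /dev/sda6 and /dev/sda5.

```python
import re

# Assumption: one DRBD bitmap bit represents 4 KiB of storage.
BYTES_PER_BIT = 4096

# The two "marked out-of-sync" lines quoted from the logs in this message.
LOG_LINES = [
    "Jul 5 16:40:42 node01 drbd0: 9750 MB (2496005 bits) "
    "marked out-of-sync by on disk bit-map.",
    "Jul 5 16:36:46 node02 kernel: [ 6704.944230] block drbd0: 9821 MB "
    "(2514078 bits) marked out-of-sync by on disk bit-map.",
]

# Hypothetical pattern: capture the host name and the bit count.
PATTERN = re.compile(r"(node\d+) .*\((\d+) bits\) marked out-of-sync")


def out_of_sync_bits(lines):
    """Return {hostname: bit count} for every matching log line."""
    result = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            result[match.group(1)] = int(match.group(2))
    return result


def bits_to_mib(bits):
    """Convert a bitmap bit count to MiB under the 4 KiB-per-bit assumption."""
    return bits * BYTES_PER_BIT / (1024 * 1024)


bits = out_of_sync_bits(LOG_LINES)
for node, count in sorted(bits.items()):
    print(f"{node}: {count} bits, about {bits_to_mib(count):.0f} MiB")
print("difference:", bits["node02"] - bits["node01"], "bits")
```

Under that assumption, the gap between the two counts corresponds to roughly 70 MiB, so the two backing devices do not seem to be presenting the same size to DRBD.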