[Drbd-dev] [CASE-26] node unexpectedly changes to "StandAlone" after re-connect

김재헌 jhkim at mantech.co.kr
Sat Feb 27 09:02:11 CET 2016


Hi,

Please check.

Ver:
  - drbd-9.0.1-1
  - CentOS 7

Env:
  - 4 nodes configured, but only 3 nodes in use
  - synchronous replication (protocol C)
  - VM network bandwidth: 100 Mbps (slow)

Test:
 01) set up node1, node2, node3; all UpToDate
 02) node1: primary
 03) node1: mount /dev/drbd1 /mnt
 04) node1: copy 1Gfile to /mnt (takes about two and a half minutes)
 05) during the copy:
 06) node2: disconnect
 07) node3: down
 08) node2: connect
 09) node3: up
 10) node2: changed from Connecting to StandAlone
 11) node2: disconnect again
 12) node2: connect again
 13) node2: changed to the normal Connected state (the second connect attempt succeeds)
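
For anyone who wants to repeat the test, the steps above could be written down roughly as follows. This is only a dry-run sketch in Python that prints a command list. The drbdadm subcommands are an assumed mapping for "primary", "disconnect", "connect", "down" and "up" (the report does not state the exact commands used), the resource name r0 is taken from the kernel log, and 1Gfile is a placeholder file name.

```python
# Dry-run sketch of the test sequence: it only prints commands, it runs nothing.
RESOURCE = "r0"  # resource name as seen in the node2 kernel log

# (step, node, command). The drbdadm mapping is ASSUMED, not taken from the
# report; "1Gfile" is a placeholder name for the 1 GB test file.
PLAN = [
    ("02", "node1", f"drbdadm primary {RESOURCE}"),
    ("03", "node1", "mount /dev/drbd1 /mnt"),
    ("04", "node1", "cp 1Gfile /mnt"),  # steps 06-09 happen while this runs
    ("06", "node2", f"drbdadm disconnect {RESOURCE}"),
    ("07", "node3", f"drbdadm down {RESOURCE}"),
    ("08", "node2", f"drbdadm connect {RESOURCE}"),
    ("09", "node3", f"drbdadm up {RESOURCE}"),
    ("11", "node2", f"drbdadm disconnect {RESOURCE}"),
    ("12", "node2", f"drbdadm connect {RESOURCE}"),
]


def render(plan):
    """Format each planned step as 'NN) node: command'."""
    return [f"{step}) {node}: {cmd}" for step, node, cmd in plan]


if __name__ == "__main__":
    print("\n".join(render(PLAN)))
```

Steps 01, 05, 10 and 13 are setup or observations rather than commands, so they are left out of the plan.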

Notes:
 - Please check the node2 log at the time of step Test-10:

34378 Feb 27 15:32:52 drbd9-02 kernel: drbd r0 drbd9-01: conn( Unconnected -> Connecting )
34379 Feb 27 15:32:52 drbd9-02 kernel: drbd r0 tcp:drbd9-01: initial packet S crossed
34380 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Handshake successful: Agreed network protocol version 111
34381 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
34382 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Starting ack_recv thread (from drbd_r_r0 [6617])
34383 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: meta connection shut down by peer.
34384 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: conn( Connecting -> NetworkFailure )
34385 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: ack_receiver terminated
34386 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Terminating ack_recv thread
34387 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: sock was shut down by peer
34388 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Connection closed
34389 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: conn( NetworkFailure -> Unconnected )
34390 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Restarting receiver thread
34391 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: conn( Unconnected -> Connecting )
34392 Feb 27 15:32:54 drbd9-02 kernel: drbd r0 tcp:drbd9-01: initial packet S crossed
34393 Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Handshake successful: Agreed network protocol version 111
34394 Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
34395 Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Starting ack_recv thread (from drbd_r_r0 [6617])
34396 Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Preparing remote state change 56227800 (primary_nodes=0, weak_nodes=0)
34397 Feb 27 15:33:25 drbd9-02 kernel: drbd r0: Two-phase commit 56227800 timeout
34398 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1: peer does not support WRITE_SAME
34399 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: drbd_sync_handshake:
34400 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: self E22074C65B6BE5A2:0000000000000000:15F5D43C9A0B7F26:351C9E61C6B06FB6 bits:0 flags:0
34401 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: peer 4784B438C8271CE9:E22074C65B6BE5A2:15F5D43C9A0B7F26:0000000000000000 bits:427126 flags:120
34402 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: uuid_compare()=-2 by rule 50
34403 Feb 27 15:33:46 drbd9-02 kernel: drbd r0: State change failed: Need a connection to start verify or resync
34404 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Failed: conn( Connecting -> Connected ) peer( Unknown -> Primary )
34405 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: Failed: pdsk( DUnknown -> UpToDate ) repl( Off -> WFBitMapT )
34406 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: conn( Connecting -> Disconnecting )
34407 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: error receiving P_STATE, e: -5 l: 0!
34408 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: ack_receiver terminated
34409 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Terminating ack_recv thread
34410 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Connection closed
34411 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: conn( Disconnecting -> StandAlone )
34412 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Terminating receiver thread
34413 Feb 27 15:33:49 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34414 Feb 27 15:33:55 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34415 Feb 27 15:34:04 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34416 Feb 27 15:34:15 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34417 Feb 27 15:34:26 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34418 Feb 27 15:34:36 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34419 Feb 27 15:34:47 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-03: unexpected
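
To make the sequence in the excerpt above easier to follow, the conn( A -> B ) entries can be pulled out with a short script. This is a minimal Python sketch, not a DRBD tool: the function name conn_transitions and the regular expression are illustrative assumptions, and the sample lines are copied from the node2 log above.

```python
import re

# Matches "conn( A -> B )", optionally preceded by "Failed: " as in the log.
CONN_RE = re.compile(r"(Failed: )?conn\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def conn_transitions(log_text):
    """Return (old_state, new_state, succeeded) for each conn( .. ) entry.

    Long log lines may arrive hard-wrapped (for example when pasted into
    mail), so all whitespace runs are collapsed before matching.
    """
    flat = " ".join(log_text.split())
    return [(m.group(2), m.group(3), m.group(1) is None)
            for m in CONN_RE.finditer(flat)]


# Sample lines copied from the node2 log, including one hard-wrapped entry.
SAMPLE = """\
34391 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: conn( Unconnected
-> Connecting )
34404 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Failed: conn( Connecting -> Connected ) peer( Unknown -> Primary )
34406 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: conn( Connecting -> Disconnecting )
34411 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: conn( Disconnecting -> StandAlone )
"""

if __name__ == "__main__":
    for old, new, ok in conn_transitions(SAMPLE):
        print(f"{old} -> {new}" + ("" if ok else "  (failed)"))
```

Run against the full node2 log, this would list every connection-state change in order, with the rejected Connecting -> Connected attempt marked as failed just before the switch to StandAlone.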


The full log files are here:
 - node1: http://pastebin.com/BbUGy2Yn
 - node2: http://pastebin.com/2z7auSA2
 - node3: http://pastebin.com/jn5JLg1P

Please see my comments interspersed in the log files above.

I think the first disconnect/connect sequence (Test-06, Test-08) on node2 is
normal.
Nevertheless, after uuid_compare, node2 changed to "StandAlone" instead of
"Connected".
Why?


Thanks.