[Drbd-dev] [CASE-26] some node be changed to unexpected "Standalone" after re-connect
김재헌
jhkim at mantech.co.kr
Sat Feb 27 09:02:11 CET 2016
Hi,
Please check.
Ver:
- drbd-9.0.1-1
- CentOS 7
Env:
- 4 nodes configured, but only 3 in use
- Synchronous replication (protocol C)
- VM network bandwidth: 100 Mbps (slow)
Test:
01) set up node1, node2, and node3; all UpToDate
02) node1: primary
03) node1: mount /dev/drbd1 /mnt
04) node1: copy a 1G file to /mnt (takes about two and a half minutes)
05) while the copy is running:
06) node2: disconnect
07) node3: down
08) node2: connect
09) node3: up
10) node2: changes from Connecting to StandAlone
11) node2: disconnect again
12) node2: connect again
13) node2: changes to the normal Connected state (the second connect attempt succeeds)
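The steps above could be scripted roughly as follows. This is only a sketch under assumptions: the resource name r0 is taken from the kernel log below, "1Gfile" is a hypothetical file name, and "disconnect", "connect", "down", and "up" in the steps are assumed to map to the drbdadm subcommands of the same name. The script prints the plan and executes nothing; each command would have to be run on the node named in front of it.

```python
# Reproduction plan for the test steps, as (node, command) pairs.
# Assumptions (not confirmed by the report): resource name "r0" (seen in the
# kernel log), a source file called "1Gfile" in the working directory, and
# plain drbdadm subcommands for disconnect/connect/down/up.
RESOURCE = "r0"

STEPS = [
    ("node1", f"drbdadm primary {RESOURCE}"),
    ("node1", "mount /dev/drbd1 /mnt"),
    ("node1", "cp 1Gfile /mnt &"),  # copy keeps running during the next steps
    ("node2", f"drbdadm disconnect {RESOURCE}"),
    ("node3", f"drbdadm down {RESOURCE}"),
    ("node2", f"drbdadm connect {RESOURCE}"),
    ("node3", f"drbdadm up {RESOURCE}"),
    ("node2", f"drbdadm status {RESOURCE}"),  # check the connection state
]


def format_plan(steps):
    """Return the plan as numbered 'NN) node: command' lines."""
    return [f"{i:02d}) {node}: {cmd}" for i, (node, cmd) in enumerate(steps, 1)]


if __name__ == "__main__":
    print("\n".join(format_plan(STEPS)))
```

Keeping the plan as data makes it easy to reorder or repeat steps (for example the second disconnect/connect on node2) when trying to narrow down the timing.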
Notes:
- Please check the node2 log at the time of step 10:
34378 Feb 27 15:32:52 drbd9-02 kernel: drbd r0 drbd9-01: conn( Unconnected -> Connecting )
34379 Feb 27 15:32:52 drbd9-02 kernel: drbd r0 tcp:drbd9-01: initial packet S crossed
34380 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Handshake successful: Agreed network protocol version 111
34381 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
34382 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Starting ack_recv thread (from drbd_r_r0 [6617])
34383 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: meta connection shut down by peer.
34384 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: conn( Connecting -> NetworkFailure )
34385 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: ack_receiver terminated
34386 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Terminating ack_recv thread
34387 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: sock was shut down by peer
34388 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Connection closed
34389 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: conn( NetworkFailure -> Unconnected )
34390 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: Restarting receiver thread
34391 Feb 27 15:32:53 drbd9-02 kernel: drbd r0 drbd9-01: conn( Unconnected -> Connecting )
34392 Feb 27 15:32:54 drbd9-02 kernel: drbd r0 tcp:drbd9-01: initial packet S crossed
34393 Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Handshake successful: Agreed network protocol version 111
34394 Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
34395 Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Starting ack_recv thread (from drbd_r_r0 [6617])
34396 Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Preparing remote state change 56227800 (primary_nodes=0, weak_nodes=0)
34397 Feb 27 15:33:25 drbd9-02 kernel: drbd r0: Two-phase commit 56227800 timeout
34398 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1: peer does not support WRITE_SAME
34399 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: drbd_sync_handshake:
34400 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: self E22074C65B6BE5A2:0000000000000000:15F5D43C9A0B7F26:351C9E61C6B06FB6 bits:0 flags:0
34401 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: peer 4784B438C8271CE9:E22074C65B6BE5A2:15F5D43C9A0B7F26:0000000000000000 bits:427126 flags:120
34402 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: uuid_compare()=-2 by rule 50
34403 Feb 27 15:33:46 drbd9-02 kernel: drbd r0: State change failed: Need a connection to start verify or resync
34404 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Failed: conn( Connecting -> Connected ) peer( Unknown -> Primary )
34405 Feb 27 15:33:46 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-01: Failed: pdsk( DUnknown -> UpToDate ) repl( Off -> WFBitMapT )
34406 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: conn( Connecting -> Disconnecting )
34407 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: error receiving P_STATE, e: -5 l: 0!
34408 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: ack_receiver terminated
34409 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Terminating ack_recv thread
34410 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Connection closed
34411 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: conn( Disconnecting -> StandAlone )
34412 Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Terminating receiver thread
34413 Feb 27 15:33:49 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34414 Feb 27 15:33:55 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34415 Feb 27 15:34:04 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34416 Feb 27 15:34:15 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34417 Feb 27 15:34:26 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34418 Feb 27 15:34:36 drbd9-02 kernel: drbd r0 tcp:drbd9-05: Closing unexpected connection from 100.100.10.11
34419 Feb 27 15:34:47 drbd9-02 kernel: drbd r0/0 drbd1 drbd9-03: unexpected
The full log files are here:
- node1: http://pastebin.com/BbUGy2Yn
- node2: http://pastebin.com/2z7auSA2
- node3: http://pastebin.com/jn5JLg1P
Please see my comments interspersed in the log files above.
I think the first disconnect/connect sequence on node2 (steps 06 and 08) is
normal.
Nevertheless, after uuid_compare, node2 went to "StandAlone" instead of
"Connected".
Why?
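
To make the sequence easier to follow, the connection-state transitions can be pulled out of the node2 log with a short script. This is only a minimal sketch: it assumes each log entry is on a single line (not wrapped as in this mail), and the function name conn_transitions is made up for illustration. The sample lines are copied from the node2 log quoted above.

```python
import re

# Matches DRBD kernel messages such as
#   "... drbd r0 drbd9-01: conn( Connecting -> NetworkFailure )"
# A leading "Failed: " marks a state change that was attempted but rejected.
CONN_RE = re.compile(
    r"(?P<failed>Failed: )?conn\( (?P<old>\w+) -> (?P<new>\w+) \)"
)


def conn_transitions(lines):
    """Return (old, new, failed) for every conn( A -> B ) entry, in log order."""
    result = []
    for line in lines:
        m = CONN_RE.search(line)
        if m:
            result.append((m["old"], m["new"], m["failed"] is not None))
    return result


# A few entries from the node2 log, one entry per line.
SAMPLE = [
    "Feb 27 15:32:55 drbd9-02 kernel: drbd r0 drbd9-01: Preparing remote state change 56227800 (primary_nodes=0, weak_nodes=0)",
    "Feb 27 15:33:25 drbd9-02 kernel: drbd r0: Two-phase commit 56227800 timeout",
    "Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: Failed: conn( Connecting -> Connected ) peer( Unknown -> Primary )",
    "Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: conn( Connecting -> Disconnecting )",
    "Feb 27 15:33:46 drbd9-02 kernel: drbd r0 drbd9-01: conn( Disconnecting -> StandAlone )",
]

if __name__ == "__main__":
    for old, new, failed in conn_transitions(SAMPLE):
        print(f"{old:>14} -> {new}" + ("  (FAILED)" if failed else ""))
```

On these entries the script reports a failed Connecting -> Connected attempt followed by Connecting -> Disconnecting -> StandAlone, which is the part of the log I am asking about.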
Thanks.