[DRBD-user] 'Unable to bind sock' and strange error.

Jon Nelson jnelson-drbd at jamponi.net
Thu Mar 22 21:57:57 CET 2007



I had something strange happen in my drbd testing environment.
The machine 'machineA' is the primary and 'machineB' is the secondary.
'machineA' is up 24/7 and 'machineB' is on and off throughout the day.

However, I can't really explain the problem I encountered.  I am not
aware of any process that runs at 3:29 that would cause this. The
machine did not produce this error on any day before today, and it has
run with an unchanged configuration for a while.

First, the logs from the primary (with comments intermingled):

Mar 22 03:29:01 machineA kernel: drbd2: Unable to bind sock2 (-98)
Mar 22 03:29:01 machineA kernel: drbd2: drbd2_receiver [17538]: cstate WFConnection --> Unconnected
Mar 22 03:29:01 machineA kernel: drbd2: worker terminated
Mar 22 03:29:01 machineA kernel: drbd2: drbd2_receiver [17538]: cstate Unconnected --> Unconnected
Mar 22 03:29:01 machineA kernel: drbd2: Connection lost.
Mar 22 03:29:01 machineA kernel: drbd2: Discarding network configuration.
Mar 22 03:29:01 machineA kernel: drbd2: drbd2_receiver [17538]: cstate Unconnected --> StandAlone
Mar 22 03:29:01 machineA kernel: drbd2: receiver terminated
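
For reference, the connection-state changes in an excerpt like the one above
can be pulled out mechanically. This is only a minimal Python sketch that
assumes the exact log format shown here; the name cstate_transitions and the
regular expression are made up for illustration and are not part of drbd.

```python
import re

# Matches lines such as:
#   "... kernel: drbd2: drbd2_receiver [17538]: cstate WFConnection --> Unconnected"
# The pattern is an assumption based on the log lines quoted in this message.
CSTATE_RE = re.compile(
    r"kernel: (?P<dev>drbd\d+): (?P<actor>\S+) \[(?P<pid>\d+)\]: "
    r"cstate (?P<old>\w+) --> (?P<new>\w+)"
)


def cstate_transitions(lines):
    """Return (device, actor, old_state, new_state) for each cstate line."""
    found = []
    for line in lines:
        match = CSTATE_RE.search(line)
        if match:
            found.append(
                (match["dev"], match["actor"], match["old"], match["new"])
            )
    return found


log = [
    "Mar 22 03:29:01 machineA kernel: drbd2: Unable to bind sock2 (-98)",
    "Mar 22 03:29:01 machineA kernel: drbd2: drbd2_receiver [17538]: cstate WFConnection --> Unconnected",
    "Mar 22 03:29:01 machineA kernel: drbd2: drbd2_receiver [17538]: cstate Unconnected --> Unconnected",
    "Mar 22 03:29:01 machineA kernel: drbd2: drbd2_receiver [17538]: cstate Unconnected --> StandAlone",
]

for dev, actor, old, new in cstate_transitions(log):
    print(f"{dev} ({actor}): {old} -> {new}")
```

Lines without a cstate transition, such as the "Unable to bind" message, are
skipped. On Linux, errno 98 is EADDRINUSE ("Address already in use"), which
may be a hint about why the bind failed, although the logs do not say what
was holding the address.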

Around 9:00 the secondary ('machineB') came up, but nothing happened on
the primary. This is unexpected: the primary should have had data to
synchronize.

At 9:09:35 I restarted drbd on the primary ('machineA'):

Mar 22 09:09:35 machineA kernel: drbd2: Primary/Unknown --> Secondary/Unknown
Mar 22 09:09:35 machineA kernel: drbd2: drbdsetup [14092]: cstate StandAlone --> Unconnected
Mar 22 09:09:35 machineA kernel: drbd2: drbdsetup [14092]: cstate Unconnected --> StandAlone
Mar 22 09:09:35 machineA kernel: drbd2: drbdsetup [14092]: cstate StandAlone --> Unconfigured
Mar 22 09:09:35 machineA kernel: drbd2: worker terminated
Mar 22 09:09:37 machineA kernel: drbd2: resync bitmap: bits=7340032 words=229376
Mar 22 09:09:37 machineA kernel: drbd2: size = 28 GB (29360128 KB)
Mar 22 09:09:37 machineA kernel: drbd2: 243 MB marked out-of-sync by on disk bit-map.
Mar 22 09:09:37 machineA kernel: drbd2: Found 6 transactions (324 active extents) in activity log.
Mar 22 09:09:37 machineA kernel: drbd2: drbdsetup [14136]: cstate Unconfigured --> StandAlone
Mar 22 09:09:37 machineA kernel: drbd2: drbdsetup [14142]: cstate StandAlone --> Unconnected
Mar 22 09:09:37 machineA kernel: drbd2: drbd2_receiver [14143]: cstate Unconnected --> WFConnection
Mar 22 09:09:37 machineA kernel: drbd2: drbd2_receiver [14143]: cstate WFConnection --> WFReportParams
Mar 22 09:09:37 machineA kernel: drbd2: Handshake successful: DRBD Network Protocol version 74
Mar 22 09:09:37 machineA kernel: drbd2: Connection established.
Mar 22 09:09:37 machineA kernel: drbd2: I am(S): 1:00000007:00000001:0000001d:00000004:00
Mar 22 09:09:37 machineA kernel: drbd2: Peer(S): 1:00000007:00000001:0000001b:00000004:00
Mar 22 09:09:37 machineA kernel: drbd2: drbd2_receiver [14143]: cstate WFReportParams --> WFBitMapS
Mar 22 09:09:37 machineA kernel: drbd2: Secondary/Unknown --> Secondary/Secondary
Mar 22 09:09:37 machineA kernel: drbd2: drbd2_receiver [14143]: cstate WFBitMapS --> SyncSource
Mar 22 09:09:37 machineA kernel: drbd2: Resync started as SyncSource (need to sync 249476 KB [62369 bits set]).
Mar 22 09:09:49 machineA kernel: drbd2: Resync done (total 11 sec; paused 0 sec; 22676 K/sec)
Mar 22 09:09:49 machineA kernel: drbd2: drbd2_worker [14137]: cstate SyncSource --> Connected
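
As a sanity check, the bitmap and resync figures in the log above are
consistent with each other if each bitmap bit covers 4 KB and each bitmap
word holds 32 bits. Both constants are inferences from the numbers in this
log, not values taken from drbd documentation, and bitmap_summary is a
hypothetical helper name. A short Python sketch of the arithmetic:

```python
KB_PER_BIT = 4      # assumption: inferred from size / bits in the log
BITS_PER_WORD = 32  # assumption: inferred from bits / words in the log


def bitmap_summary(bits, bits_set):
    """Recompute the sizes the log reports from the raw bit counts."""
    size_kb = bits * KB_PER_BIT
    out_of_sync_kb = bits_set * KB_PER_BIT
    return {
        "words": bits // BITS_PER_WORD,
        "size_kb": size_kb,
        "size_gb": size_kb // (1024 * 1024),
        "out_of_sync_kb": out_of_sync_kb,
        "out_of_sync_mb": out_of_sync_kb // 1024,  # rounded down
    }


summary = bitmap_summary(bits=7340032, bits_set=62369)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With the values from the log this gives 229376 words, 29360128 KB (28 GB),
and 249476 KB (243 MB, rounded down) to resync, matching the logged lines.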

I then had to tell it that it was, in fact, the primary again:

Mar 22 09:10:30 machineA kernel: drbd2: Secondary/Secondary --> Primary/Secondary

Here are the logs from the secondary:

Mar 22 09:01:23 machineB kernel: drbd: initialised. Version: 0.7.22 (api:79/proto:74)
Mar 22 09:01:23 machineB kernel: drbd: SVN Revision: 2554 build by lmb at dale, 2006-10-30 22:52:11
Mar 22 09:01:23 machineB kernel: drbd: registered as block device major 147
Mar 22 09:01:24 machineB kernel: drbd0: resync bitmap: bits=7340032 words=229376
Mar 22 09:01:24 machineB kernel: drbd0: size = 28 GB (29360128 KB)
Mar 22 09:01:24 machineB kernel: drbd0: 0 KB marked out-of-sync by on disk bit-map.
Mar 22 09:01:24 machineB kernel: drbd0: No usable activity log found.
Mar 22 09:01:24 machineB kernel: drbd0: drbdsetup [3564]: cstate Unconfigured --> StandAlone
Mar 22 09:01:24 machineB kernel: drbd0: drbdsetup [3592]: cstate StandAlone --> Unconnected
Mar 22 09:01:24 machineB kernel: drbd0: drbd0_receiver [3593]: cstate Unconnected --> WFConnection

Here I manually restarted drbd on the primary ('machineA'):

Mar 22 09:09:37 machineB kernel: drbd0: drbd0_receiver [3593]: cstate WFConnection --> WFReportParams
Mar 22 09:09:37 machineB kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Mar 22 09:09:37 machineB kernel: drbd0: Connection established.
Mar 22 09:09:37 machineB kernel: drbd0: I am(S): 1:00000007:00000001:0000001b:00000004:00
Mar 22 09:09:37 machineB kernel: drbd0: Peer(S): 1:00000007:00000001:0000001d:00000004:00
Mar 22 09:09:37 machineB kernel: drbd0: drbd0_receiver [3593]: cstate WFReportParams --> WFBitMapT
Mar 22 09:09:37 machineB kernel: drbd0: Secondary/Unknown --> Secondary/Secondary
Mar 22 09:09:37 machineB kernel: drbd0: drbd0_receiver [3593]: cstate WFBitMapT --> SyncTarget
Mar 22 09:09:37 machineB kernel: drbd0: Resync started as SyncTarget (need to sync 249476 KB [62369 bits set]).
Mar 22 09:09:49 machineB kernel: drbd0: Resync done (total 11 sec; paused 0 sec; 22676 K/sec)
Mar 22 09:09:49 machineB kernel: drbd0: drbd0_worker [3580]: cstate SyncTarget --> Connected
Mar 22 09:10:30 machineB kernel: drbd0: Secondary/Secondary --> Secondary/Primary

Questions:

0. What caused the transition from 'Primary' to 'StandAlone' on
   'machineA'?
1. On 'machineA', if it was not the primary, why did it synchronize?
2. Why didn't it move back from 'StandAlone' to 'Primary' once it had
   determined that its peer was definitely already Secondary?

--
Jon Nelson <jnelson-drbd at jamponi.net>


