[DRBD-user] "become-primary-on both; " fails when both start up at the same time

Tue Jul 22 00:58:55 CEST 2008

Hello,

I'm now setting up a fairly standard Primary/Primary configuration:
OCFS2 on top of DRBD, on two machines running Debian testing
(Linux 2.6.25, DRBD 8.0.12 as packaged).  It works fine most of
the time so far, except one thing: when both boxes are rebooted.
I noticed this when testing shutdowns with Network UPS Tools.

 - node1 is NUT master (with RS232 connection to the UPS),
   node2 is NUT slave (gets UPS status over the network),
   both are powered from the same UPS
 - run "upsmon -c fsd" on node1 to simulate low battery condition
 - node2 (NUT slave) shuts down first, does failover to node1
 - node1 (NUT master) shuts down last (after 1 minute FINALDELAY)
 - UPS powers down
 - UPS powers up again, both nodes start
 - node1 becomes Primary, node2 becomes Secondary (fails to mount OCFS2!)
 - manually reboot node2 - now Primary, mounts OCFS2 as it should
 - reboot any single node - no problem, still Primary/Primary

I verified everything shuts down cleanly before the power is cut.
What could be wrong, why is the extra reboot of node2 necessary to
make it Primary again?  What are these TOO_SMALL errors below?

Kernel messages from node2 when the problem occurs:

 disk( Diskless -> Attaching )
 Starting worker thread (from cqueue [1750])
 Found 4 transactions (16 active extents) in activity log.
 max_segment_size ( = BIO size ) = 32768
 drbd_bm_resize called with capacity == 209708728
 resync bitmap: bits=26213591 words=409588
 size = 100 GB (104854364 KB)
 reading of bitmap took 13 jiffies
 recounting of set bits took additional 1 jiffies
 0 KB (0 bits) marked out-of-sync by on disk bit-map.
 disk( Attaching -> UpToDate )
 Writing meta data super block now.
 Barriers not supported on meta data device - disabling
 conn( StandAlone -> Unconnected )
 Starting receiver thread (from drbd0_worker [2166])
 receiver (re)started
 conn( Unconnected -> WFConnection )
 Handshake successful: DRBD Network Protocol version 86
 Peer authenticated using 20 bytes of 'sha1' HMAC
 conn( WFConnection -> WFReportParams )
 Starting asender thread (from drbd0_receiver [2180])
 peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
 Writing meta data super block now.
 Requested state change failed by peer: TOO_SMALL
 Requested state change failed by peer: TOO_SMALL
 State change failed: State changed was refused by peer node
   state = { cs:WFBitMapT st:Secondary/Secondary ds:UpToDate/UpToDate r--- }
  wanted = { cs:WFBitMapT st:Primary/Secondary ds:UpToDate/UpToDate r--- }
 conn( WFBitMapT -> WFSyncUUID )
 peer( Secondary -> Primary )
 conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
 Began resync as SyncTarget (will sync 48 KB [12 bits set]).
 Writing meta data super block now.
 Resync done (total 1 sec; paused 0 sec; 48 K/sec)
 conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
 Writing meta data super block now.
 local disk flush failed with status -95

Kernel messages from node2 on subsequent reboot which fixes it:

 disk( Diskless -> Attaching )
 Starting worker thread (from cqueue [1736])
 Found 4 transactions (16 active extents) in activity log.
 max_segment_size ( = BIO size ) = 32768
 drbd_bm_resize called with capacity == 209708728
 resync bitmap: bits=26213591 words=409588
 size = 100 GB (104854364 KB)
 reading of bitmap took 13 jiffies
 recounting of set bits took additional 1 jiffies
 0 KB (0 bits) marked out-of-sync by on disk bit-map.
 disk( Attaching -> UpToDate )
 Writing meta data super block now.
 Barriers not supported on meta data device - disabling
 conn( StandAlone -> Unconnected )
 Starting receiver thread (from drbd0_worker [2152])
 receiver (re)started
 conn( Unconnected -> WFConnection )
 Handshake successful: DRBD Network Protocol version 86
 Peer authenticated using 20 bytes of 'sha1' HMAC
 conn( WFConnection -> WFReportParams )
 Starting asender thread (from drbd0_receiver [2166])
 peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
 Writing meta data super block now.
 role( Secondary -> Primary )
 Writing meta data super block now.
 conn( WFBitMapT -> WFSyncUUID )
 conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
 Began resync as SyncTarget (will sync 4 KB [1 bits set]).
 Writing meta data super block now.
 Resync done (total 1 sec; paused 0 sec; 4 K/sec)
 conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
 Writing meta data super block now.

Suspicious lines (only when node2 fails to become Primary):

 Requested state change failed by peer: TOO_SMALL
 Requested state change failed by peer: TOO_SMALL
 State change failed: State changed was refused by peer node
   state = { cs:WFBitMapT st:Secondary/Secondary ds:UpToDate/UpToDate r--- }
  wanted = { cs:WFBitMapT st:Primary/Secondary ds:UpToDate/UpToDate r--- }

 local disk flush failed with status -95

Thanks,
Marek