[DRBD-user] one of drbd disk need full resync after reboot

chatchai jantaraprim chatchai.j at gmail.com
Sat May 30 06:39:53 CEST 2009



Hi,
     Our servers run Debian lenny. We recently upgraded the kernel on
them, and the drbd module along with it. The current kernel version is
linux 2.6.26-2-686. When we restarted one of the servers for
maintenance, one of the drbd disks did a full resync. We have seen this
twice. For the last occurrence we have this in the syslog:

May 28 13:21:06 nnksvr01 kernel: [  145.948274] drbd: initialised.
Version: 8.0.14 (api:86/proto:86)
May 28 13:21:06 nnksvr01 kernel: [  145.948274] drbd: GIT-hash:
bb447522fc9a87d0069b7e14f0234911ebdab0f7 build by phil at fat-tyre,
2008-11-12 16:40:33
May 28 13:21:06 nnksvr01 kernel: [  145.948274] drbd: registered as
block device major 147
May 28 13:21:06 nnksvr01 kernel: [  145.948274] drbd: minor_table @ 0xf796e3c0
May 28 13:21:07 nnksvr01 kernel: [  147.176534] drbd6: disk( Diskless
-> Attaching )
May 28 13:21:07 nnksvr01 kernel: [  147.176534] drbd6: Starting worker
thread (from cqueue [2609])
May 28 13:21:07 nnksvr01 kernel: [  147.265232] drbd6: Found 6
transactions (234 active extents) in activity log.
May 28 13:21:07 nnksvr01 kernel: [  147.265238] drbd6: Backing
device's merge_bvec_fn() = f8c9c069
May 28 13:21:07 nnksvr01 kernel: [  147.265241] drbd6:
max_segment_size ( = BIO size ) = 4096
May 28 13:21:07 nnksvr01 kernel: [  147.265244] drbd6: Adjusting my
ra_pages to backing device's (32 -> 64)
May 28 13:21:07 nnksvr01 kernel: [  147.265248] drbd6: drbd_bm_resize
called with capacity == 356181824
May 28 13:21:07 nnksvr01 kernel: [  147.266925] drbd6: resync bitmap:
bits=44522728 words=1391336
May 28 13:21:07 nnksvr01 kernel: [  147.266930] drbd6: size = 170 GB
(178090912 KB)
May 28 13:21:07 nnksvr01 kernel: [  147.868100] drbd6: recounting of
set bits took additional 2 jiffies
May 28 13:21:07 nnksvr01 kernel: [  147.868107] drbd6: 0 KB (0 bits)
marked out-of-sync by on disk bit-map.
May 28 13:21:07 nnksvr01 kernel: [  147.868112] drbd6: disk( Attaching
-> UpToDate )
May 28 13:21:07 nnksvr01 kernel: [  147.998236] drbd6: conn(
StandAlone -> Unconnected )
May 28 13:21:07 nnksvr01 kernel: [  148.002321] drbd6: Starting
receiver thread (from drbd6_worker [2667])
May 28 13:21:07 nnksvr01 kernel: [  148.002362] drbd6: receiver (re)started
May 28 13:21:07 nnksvr01 kernel: [  148.002362] drbd6: conn(
Unconnected -> WFConnection )
May 28 13:21:08 nnksvr01 kernel: [  148.169608] drbd6: Handshake
successful: DRBD Network Protocol version 86
May 28 13:21:08 nnksvr01 kernel: [  148.169608] drbd6: conn(
WFConnection -> WFReportParams )
May 28 13:21:08 nnksvr01 kernel: [  148.169608] drbd6: Starting
asender thread (from drbd6_receiver [2693])
May 28 13:21:08 nnksvr01 kernel: [  148.185007] drbd6: drbd_sync_handshake:
May 28 13:21:08 nnksvr01 kernel: [  148.185007] drbd6: self
F7668094BF050518:0000000000000000:E56A0B0B69CA4098:B4B5CF85BFA1686C
May 28 13:21:08 nnksvr01 kernel: [  148.185007] drbd6: peer
DA26B785F0FD01C9:F7668094BF050519:E56A0B0B69CA4099:B4B5CF85BFA1686C
May 28 13:21:08 nnksvr01 kernel: [  148.185007] drbd6:
uuid_compare()=-1 by rule 5
May 28 13:21:08 nnksvr01 kernel: [  148.194847] drbd6: peer( Unknown
-> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown ->
UpToDate )
May 28 13:21:08 nnksvr01 kernel: [  148.860198] drbd6: conn( WFBitMapT
-> WFSyncUUID )
May 28 13:21:08 nnksvr01 kernel: [  148.888532] drbd6: conn(
WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
May 28 13:21:08 nnksvr01 kernel: [  148.888542] drbd6: Began resync as
SyncTarget (will sync 4 KB [1 bits set]).
May 28 13:21:08 nnksvr01 kernel: [  149.340302] drbd6: conn(
SyncTarget -> PausedSyncT ) peer_isp( 0 -> 1 )
May 28 13:21:08 nnksvr01 kernel: [  149.340302] drbd6: Resync suspended
May 28 13:21:08 nnksvr01 kernel: [  149.346065] drbd6: aftr_isp( 0 -> 1 )
May 28 13:21:08 nnksvr01 kernel: [  149.408303] drbd6: Resync done
(total 1 sec; paused 0 sec; 4 K/sec)
May 28 13:21:08 nnksvr01 kernel: [  149.408303] drbd6: conn(
PausedSyncT -> Connected ) disk( Inconsistent -> UpToDate )
May 28 13:21:09 nnksvr01 kernel: [  150.488580] drbd6: unexpected
cstate (Connected) in receive_bitmap
May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6: drbd_sync_handshake:
May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6: self
DA26B785F0FD01C8:0000000000000000:2951B01A84735B7A:F7668094BF050519
May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6: peer
DA26B785F0FD01C9:0000000000000000:2951B01A84735B7B:F7668094BF050519
May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6:
uuid_compare()=0 by rule 4
May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6: No resync, but
44522728 bits in bitmap!

For some reason we got these messages:

      May 28 13:21:09 nnksvr01 kernel: [  150.488580] drbd6:
unexpected cstate (Connected) in receive_bitmap
      May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6:
drbd_sync_handshake:
      May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6: self
DA26B785F0FD01C8:0000000000000000:2951B01A84735B7A:F7668094BF050519
      May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6: peer
DA26B785F0FD01C9:0000000000000000:2951B01A84735B7B:F7668094BF050519
      May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6:
uuid_compare()=0 by rule 4
      May 28 13:21:09 nnksvr01 kernel: [  150.504885] drbd6: No
resync, but 44522728 bits in bitmap!

and then, when this server rebooted, this drbd device needed a full resync.
The configuration for the drbd6 resource is:
---------------------------------------------------------------------------
...
resource r6 {
  protocol C;

  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
  }

  startup {
    degr-wfc-timeout 120;
  }

  disk {
    on-io-error detach;
    no-disk-flushes;
    no-md-flushes;
  }

  net {
  }

  syncer {
    rate 100M;
    after "r5";
    al-extents 257;
  }

  on nnksvr01 {
    device     /dev/drbd6;
    disk       /dev/md7;
    address    10.0.0.1:7794;
    meta-disk  internal;
  }

  on nnksvr02 {
    device    /dev/drbd6;
    disk      /dev/md7;
    address   10.0.0.2:7794;
    meta-disk internal;
  }
}

---------------------------------------------------------------------------
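As a rough sanity check on the numbers in the log above, here is a small
Python sketch. It assumes one bitmap bit covers 4 KB (which is what the
"will sync 4 KB [1 bits set]" line suggests) and 512-byte sectors; the
constant and function names are only illustrative. Under those
assumptions, the 44522728 bits reported in "No resync, but 44522728 bits
in bitmap!" would correspond to the whole 170 GB device, not to a part
of it.

```python
# Figures copied from the drbd6 syslog lines quoted in this message.
BITS_IN_BITMAP = 44522728     # "No resync, but 44522728 bits in bitmap!"
TOTAL_BITMAP_BITS = 44522728  # "resync bitmap: bits=44522728"
DEVICE_SIZE_KB = 178090912    # "size = 170 GB (178090912 KB)"
CAPACITY_SECTORS = 356181824  # "drbd_bm_resize called with capacity == ..."

# Assumptions (not stated explicitly in the log):
KB_PER_BIT = 4      # inferred from "will sync 4 KB [1 bits set]"
SECTOR_BYTES = 512  # usual sector size for the capacity figure


def out_of_sync_kb(bits, kb_per_bit=KB_PER_BIT):
    """Return how many KB a given number of set bitmap bits represents."""
    return bits * kb_per_bit


marked_kb = out_of_sync_kb(BITS_IN_BITMAP)
capacity_kb = CAPACITY_SECTORS * SECTOR_BYTES // 1024
fraction = BITS_IN_BITMAP / TOTAL_BITMAP_BITS

print("marked out-of-sync: %d KB" % marked_kb)
print("device size (log):  %d KB" % DEVICE_SIZE_KB)
print("capacity (sectors): %d KB" % capacity_kb)
print("fraction of bitmap set: %.2f" % fraction)
```

If the assumptions hold, all three KB figures agree and the fraction is
1.00, i.e. every bit in the bitmap is set, which would match the full
resync we see after the next reboot.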

If the full log and the complete config are needed, please tell me and
I can provide them.

Regards,
cj


