[DRBD-user] drbd resyncing entire device after each reboot

Hanspeter Kunz hkunz at ifi.uzh.ch
Mon Oct 8 09:52:13 CEST 2018


On Sat, 2018-10-06 at 00:02 -0400, Digimer wrote:
> On 2018-10-05 04:02 PM, Hanspeter Kunz wrote:
> > Hi there,
> > 
> > I see a strange behavior on a freshly set up pair of machines (debian
> > stretch, drbd 8.4.7):
> > 
> > after each reboot, the whole drbd device is resynced from scratch, even
> > if both drbd devices report to be uptodate before the reboot. I never
> > experienced this on other drbd installations I have.
> > 
> > I just rebooted the secondary machine. After starting drbd, syslog
> > gives me the following information on that machine:
> > 
> > Oct  5 21:36:43 claire drbd[3578]: Starting DRBD resources:[
> > Oct  5 21:36:43 claire drbd[3578]:      create res: nfs
> > Oct  5 21:36:43 claire drbd[3578]:    prepare disk: nfs
> > Oct  5 21:36:43 claire kernel: [  379.663592] drbd nfs: Starting worker thread (from drbdsetup-84 [3596])
> > Oct  5 21:36:43 claire kernel: [  379.664004] block drbd0: disk( Diskless -> Attaching )
> > Oct  5 21:36:43 claire kernel: [  379.664629] drbd nfs: Method to ensure write ordering: flush
> > Oct  5 21:36:43 claire kernel: [  379.664634] block drbd0: max BIO size = 1048576
> > Oct  5 21:36:43 claire kernel: [  379.664642] block drbd0: drbd_bm_resize called with capacity == 53685452728
> > Oct  5 21:36:43 claire kernel: [  379.875816] block drbd0: resync bitmap: bits=6710681591 words=104854400 pages=204794
> > Oct  5 21:36:43 claire kernel: [  379.875819] block drbd0: size = 25 TB (26842726364 KB)
> > Oct  5 21:36:44 claire drbd[3578]:     adjust disk: nfs
> > Oct  5 21:36:44 claire kernel: [  381.510770] block drbd0: recounting of set bits took additional 32 jiffies
> > Oct  5 21:36:44 claire kernel: [  381.510772] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> > Oct  5 21:36:44 claire kernel: [  381.510778] block drbd0: disk( Attaching -> UpToDate )
> > Oct  5 21:36:44 claire kernel: [  381.510789] block drbd0: attached to UUIDs 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7
> > Oct  5 21:36:44 claire drbd[3578]:      adjust net: nfs
> > Oct  5 21:36:44 claire drbd[3578]: ]
> > Oct  5 21:36:44 claire kernel: [  381.516705] drbd nfs: conn( StandAlone -> Unconnected )
> > Oct  5 21:36:44 claire kernel: [  381.516756] drbd nfs: Starting receiver thread (from drbd_w_nfs [3598])
> > Oct  5 21:36:44 claire kernel: [  381.516823] drbd nfs: receiver (re)started
> > Oct  5 21:36:44 claire kernel: [  381.516883] drbd nfs: conn( Unconnected -> WFConnection )
> > Oct  5 21:36:45 claire kernel: [  382.250879] drbd nfs: Handshake successful: Agreed network protocol version 101
> > Oct  5 21:36:45 claire kernel: [  382.250884] drbd nfs: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
> > Oct  5 21:36:45 claire kernel: [  382.251202] drbd nfs: Peer authenticated using 20 bytes HMAC
> > Oct  5 21:36:45 claire kernel: [  382.251307] drbd nfs: conn( WFConnection -> WFReportParams )
> > Oct  5 21:36:45 claire kernel: [  382.251366] drbd nfs: Starting ack_recv thread (from drbd_r_nfs [3607])
> > Oct  5 21:36:45 claire kernel: [  382.310672] block drbd0: drbd_sync_handshake:
> > Oct  5 21:36:45 claire kernel: [  382.310680] block drbd0: self 0000000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7 bits:0 flags:0
> > Oct  5 21:36:45 claire kernel: [  382.310687] block drbd0: peer 06D17ADE18B89143:0000000000000005:B6D88D552E97D8B7:B6D78D552E97D8B7 bits:0 flags:0
> > Oct  5 21:36:45 claire kernel: [  382.310691] block drbd0: uuid_compare()=-2 by rule 20
> > Oct  5 21:36:45 claire kernel: [  382.310696] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
> > Oct  5 21:36:47 claire kernel: [  383.728620] block drbd0: bitmap WRITE of 204794 pages took 1228 ms
> > Oct  5 21:36:47 claire kernel: [  383.728626] block drbd0: 25 TB (6710681591 bits) marked out-of-sync by on disk bit-map.
> > Oct  5 21:36:47 claire kernel: [  383.728693] block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
> > Oct  5 21:36:47 claire drbd[3578]: WARN: stdin/stdout is not a TTY; using /dev/console.
> > Oct  5 21:36:47 claire systemd[1]: Started LSB: Control DRBD resources..
> > Oct  5 21:36:47 claire kernel: [  384.049775] block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
> > Oct  5 21:36:47 claire kernel: [  384.145044] block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
> > Oct  5 21:36:47 claire kernel: [  384.145049] block drbd0: conn( WFBitMapT -> WFSyncUUID )
> > Oct  5 21:36:47 claire kernel: [  384.275789] block drbd0: updated sync uuid 0001000000000004:0000000000000000:B6D88D552E97D8B6:B6D78D552E97D8B7
> > Oct  5 21:36:47 claire kernel: [  384.275945] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
> > Oct  5 21:36:47 claire kernel: [  384.279872] block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
> > Oct  5 21:36:47 claire kernel: [  384.279905] block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
> > Oct  5 21:36:47 claire kernel: [  384.279949] block drbd0: Began resync as SyncTarget (will sync 26842726364 KB [6710681591 bits set]).
> > 
> > Probably the explanation is simple, I just do not see it. 
> > 
> > If you need the configuration (although it should be identical to
> > similar drbd configs which are working without problems) I am happy
> > to provide it.
> > 
> > Best and many thanks if anybody could shed some light on this,
> > Hp
> 
> Can you share your config? Are you using thin LVM?

this is my config as reported by "drbdsetup show":

resource nfs {
    options {
    }
    net {
        max-buffers     	131072;
        cram-hmac-alg   	"sha1";
        shared-secret   	"REMOVED";
        verify-alg      	"sha1";
    }
    _remote_host {
        address			ipv4 192.168.3.182:7788;
    }
    _this_host {
        address			ipv4 192.168.3.181:7788;
        volume 0 {
            device			minor 0;
            disk			"/dev/storage/nfs";
            meta-disk			internal;
            disk {
                resync-rate     	122880k; # bytes/second
                al-extents      	3389;
                c-fill-target   	40960s; # bytes
                c-max-rate      	4096000k; # bytes/second
                c-min-rate      	81920k; # bytes/second
            }
        }
    }
}

this is the volume information for /dev/storage/nfs

lvdisplay /dev/storage/nfs
  --- Logical volume ---
  LV Path                /dev/storage/nfs
  LV Name                nfs
  VG Name                storage
  LV UUID                TcncF5-uhtd-d9ea-C1fO-cu4U-eo06-2Y0UCq
  LV Write Access        read/write
  LV Creation host, time claris, 2018-09-27 14:28:14 +0200
  LV Status              available
  # open                 2
  LV Size                25.00 TiB
  Current LE             6553600
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:0
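
As a sanity check on the numbers above, here is a small Python sketch that recomputes the DRBD device size and the bitmap geometry from the values in the syslog and lvdisplay output. It is only a consistency check under stated assumptions, not a statement of how DRBD computes these internally: I assume 512-byte sectors for the "capacity" value, 4 KiB of data per bitmap bit, 64-bit bitmap words, 4 KiB pages, and a 4 MiB LVM extent size (the extent size is not shown by lvdisplay, but 6553600 extents of 4 MiB give exactly 25 TiB). The function name drbd_sizes is made up for this example.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def drbd_sizes() -> dict:
    # Values copied from the syslog and lvdisplay output above.
    capacity_sectors = 53685452728  # "drbd_bm_resize called with capacity"
    lv_extents = 6553600            # "Current LE"

    # ASSUMPTIONS (not stated in the output above):
    sector_bytes = 512              # size of one sector
    kib_per_bit = 4                 # data covered by one bitmap bit
    bits_per_word = 64              # bitmap word width
    page_bytes = 4096               # page size
    extent_kib = 4096               # LVM extent size of 4 MiB

    drbd_kib = capacity_sectors * sector_bytes // 1024
    bits = ceil_div(drbd_kib, kib_per_bit)
    words = ceil_div(bits, bits_per_word)
    pages = ceil_div(words * (bits_per_word // 8), page_bytes)

    lv_kib = lv_extents * extent_kib
    return {
        "drbd_kib": drbd_kib,
        "bits": bits,
        "words": words,
        "pages": pages,
        "lv_kib": lv_kib,
        # Space on the LV that is not exposed through /dev/drbd0.
        "lv_minus_drbd_kib": lv_kib - drbd_kib,
    }


if __name__ == "__main__":
    for name, value in drbd_sizes().items():
        print(f"{name:>18}: {value}")
```

Under these assumptions the results match the log (26842726364 KB, bits=6710681591, words=104854400, pages=204794), and the LV is 819236 KiB larger than the DRBD device, which is in the range I would expect for internal meta-data on a device of this size. So the sizes look consistent to me.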

> Also, 8.4.7 is _ancient_. Nearly countless bug fixes since then, which
> may or may not relate. In any case, updating is _strongly_ recommended.

ok, I might give this a try (right now I use what is shipped with
debian stable). Remember, I have more or less exactly the same setup
running on quite a few other machines (for many years) without
problems, so I do not think that updating will solve the above problem.

Many thanks,
Hp


