[DRBD-user] Limit Syncer Speed

Jasmin J. jasmin at anw.at
Sat Dec 17 22:22:38 CET 2016

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi!

I am not sure what I did test yesterday, but today it is definitely not
working!
Now I am getting "drbd_rs_complete_io(,5576704 [=170]) called, but refcnt is
0!?" and "drbd_rs_complete_io() called, but extent not found" outputs in
syslog.

I changed the configuration to 8.4 syntax as advices by Igor in reply to
another question.

global_common.conf:
common {
    disk {
                 # do not rely on default of 1237
                 al-extents 2000;
                 # Full speed for 1GBit link
                 c-plan-ahead 0;
                 resync-rate 110M;

                 # best performance for RAID with cache and BBU
                 no-disk-barrier;
                 no-disk-flushes;
         }

     net {
                 max-buffers 8000;
                 max-epoch-size 8000;
                 sndbuf-size 512k;
                 cram-hmac-alg sha1;
                 shared-secret "xxxx";
                 verify-alg md5;
                 protocol A;
                 on-congestion pull-ahead;
                 # should be 90% of the TCP send buffer if no DRBD Proxy is used
                 congestion-fill 460k;
                 # should be 90% of the al-extents
                 congestion-extents 1800;

                 after-sb-0pri discard-zero-changes;
                 after-sb-1pri discard-secondary;
                 after-sb-2pri disconnect;
         }
}

vm_100_disk_root.res:
resource vm-100-disk-root {
	device /dev/drbd0;
	meta-disk internal;
	disk /dev/vg_vm_disks_A/vm_100_disk_root;
	on A {
		address 10.0.0.227:7788;
	}
	on B {
		address 10.0.0.226:7788;
	}
}

The VM is started on server A. When I now write to the disk I get a lot of
warnings on the replicated node B and also on A it looks different now.

Node B (secondary):
  Dec 17 18:44:01 B kernel: [15086.423229] block drbd1: conn( Connected -> 
Behind ) disk( UpToDate -> Outdated )
  Dec 17 18:44:01 B pveproxy[31495]: Clearing outdated entries from certificate 
cache
  Dec 17 18:44:02 B kernel: [15087.756755] block drbd1: updated sync uuid 
124E20C44E300D26:0000000000000000:066FFC07CEB258B8:              0265DAB48237BD3C
  Dec 17 18:44:02 B kernel: [15087.762865] block drbd1: helper command: 
/sbin/drbdadm before-resync-target minor-1
  Dec 17 18:44:02 B kernel: [15087.764706] block drbd1: helper command: 
/sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
  Dec 17 18:44:02 B kernel: [15087.764811] block drbd1: conn( Behind -> 
SyncTarget ) disk( Outdated -> Inconsistent )
  Dec 17 18:44:02 B kernel: [15087.764916] block drbd1: Began resync as 
SyncTarget (will sync 36792 KB [9198 bits set]).
  Dec 17 18:44:02 B kernel: [15088.239060] block drbd1: conn( SyncTarget -> 
Behind )
  Dec 17 18:44:04 B kernel: [15089.759160] block drbd1: updated sync uuid 
124F20C44E300D26:0000000000000000:066FFC07CEB258B8:              0265DAB48237BD3C
  Dec 17 18:44:04 B kernel: [15089.779443] block drbd1: helper command: 
/sbin/drbdadm before-resync-target minor-1
  Dec 17 18:44:04 B kernel: [15089.781228] block drbd1: helper command: 
/sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
  Dec 17 18:44:04 B kernel: [15089.781408] block drbd1: conn( Behind -> 
SyncTarget )
  Dec 17 18:44:04 B kernel: [15089.781477] block drbd1: Began resync as 
SyncTarget (will sync 104188 KB [26047 bits set]).
  Dec 17 18:44:04 B kernel: [15090.103001] block drbd1: 
drbd_rs_complete_io(,5541888 [=169]) called, but refcnt is 0!?
  Dec 17 18:44:04 B kernel: [15090.112360] block drbd1: 
drbd_rs_complete_io(,5543936 [=169]) called, but refcnt is 0!?
  Dec 17 18:44:04 B kernel: [15090.121661] block drbd1: 
drbd_rs_complete_io(,5545984 [=169]) called, but refcnt is 0!?
  Dec 17 18:44:06 B kernel: [15092.142175] block drbd1: conn( SyncTarget -> 
Behind )
  Dec 17 18:44:07 B kernel: [15093.263781] block drbd1: updated sync uuid 
125020C44E300D26:0000000000000000:066FFC07CEB258B8:              0265DAB48237BD3C
  Dec 17 18:44:07 B kernel: [15093.271052] block drbd1: helper command: 
/sbin/drbdadm before-resync-target minor-1
  Dec 17 18:44:07 B kernel: [15093.272906] block drbd1: helper command: 
/sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
  Dec 17 18:44:07 B kernel: [15093.273019] block drbd1: conn( Behind -> 
SyncTarget )
  Dec 17 18:44:07 B kernel: [15093.273092] block drbd1: Began resync as 
SyncTarget (will sync 146652 KB [36663 bits set]).
  Dec 17 18:44:07 B kernel: [15093.287939] block drbd1: 
drbd_rs_complete_io(,5576704 [=170]) called, but refcnt is 0!?
  Dec 17 18:44:07 B kernel: [15093.288249] block drbd1: drbd_rs_complete_io() 
called, but extent not found
  Dec 17 18:44:07 B kernel: [15093.297693] block drbd1: drbd_rs_complete_io() 
called, but extent not found
....
  Dec 17 18:44:09 B kernel: [15094.506888] block drbd1: Began resync as 
SyncTarget (will sync 186016 KB [46504 bits set]).
  Dec 17 18:44:11 B kernel: [15096.339852] block drbd1: Resync done (total 1 
sec; paused 0 sec; 186016 K/sec)
  Dec 17 18:44:11 B kernel: [15096.339940] block drbd1: updated UUIDs 
406A434F37639588:0000000000000000:125120C44E300D26:066FFC07CEB258B8
  Dec 17 18:44:11 B kernel: [15096.340034] block drbd1: conn( SyncTarget -> 
Connected ) disk( Inconsistent -> UpToDate )
  Dec 17 18:44:11 B kernel: [15096.387692] block drbd1: helper command: 
/sbin/drbdadm after-resync-target minor-1
  Dec 17 18:44:11 B kernel: [15096.390005] block drbd1: helper command: 
/sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)

Node A (primary):
  Dec 17 18:44:01 A kernel: [155462.657020] block drbd1: Congestion-fill 
threshold reached
  Dec 17 18:44:01 A kernel: [155462.657094] block drbd1: conn( Connected -> 
Ahead ) pdsk( UpToDate -> Consistent )
  Dec 17 18:44:02 A kernel: [155463.728133] block drbd1: helper command: 
/sbin/drbdadm before-resync-source minor-1
  Dec 17 18:44:02 A kernel: [155463.731161] block drbd1: helper command: 
/sbin/drbdadm before-resync-source minor-1 exit code 0 (0x0)
  Dec 17 18:44:02 A kernel: [155463.731275] block drbd1: conn( Ahead -> 
SyncSource ) pdsk( Consistent -> Inconsistent )
  Dec 17 18:44:02 A kernel: [155463.731393] block drbd1: Began resync as 
SyncSource (will sync 36792 KB [9198 bits set]).
  Dec 17 18:44:02 A kernel: [155463.807042] block drbd1: updated sync UUID 
406A434F37639589:124E20C44E300D27:066FFC07CEB258B8:             066EFC07CEB258B8
  Dec 17 18:44:02 A kernel: [155464.373423] block drbd1: Congestion-fill 
threshold reached
  Dec 17 18:44:02 A kernel: [155464.373492] block drbd1: conn( SyncSource -> 
Ahead )
  Dec 17 18:44:03 A kernel: [155465.520030] block drbd1: helper command: 
/sbin/drbdadm before-resync-source minor-1
  Dec 17 18:44:03 A kernel: [155465.522488] block drbd1: helper command: 
/sbin/drbdadm before-resync-source minor-1 exit code 0 (0x0)
  Dec 17 18:44:03 A kernel: [155465.522601] block drbd1: conn( Ahead -> 
SyncSource )
  Dec 17 18:44:03 A kernel: [155465.522681] block drbd1: Began resync as 
SyncSource (will sync 104188 KB [26047 bits set]).
  Dec 17 18:44:03 A kernel: [155465.522789] block drbd1: updated sync UUID 
406A434F37639589:124F20C44E300D27:124E20C44E300D27:             066FFC07CEB258B8
  Dec 17 18:44:06 A kernel: [155468.133306] block drbd1: Congestion-fill 
threshold reached
  Dec 17 18:44:06 A kernel: [155468.133377] block drbd1: conn( SyncSource -> 
Ahead )
  Dec 17 18:44:07 A kernel: [155469.424035] block drbd1: helper command: 
/sbin/drbdadm before-resync-source minor-1
  Dec 17 18:44:07 A kernel: [155469.427250] block drbd1: helper command: 
/sbin/drbdadm before-resync-source minor-1 exit code 0 (0x0)
  Dec 17 18:44:07 A kernel: [155469.427384] block drbd1: conn( Ahead -> 
SyncSource )
  Dec 17 18:44:07 A kernel: [155469.427472] block drbd1: Began resync as 
SyncSource (will sync 146652 KB [36663 bits set]).
  Dec 17 18:44:07 A kernel: [155469.427594] block drbd1: updated sync UUID 
406A434F37639589:125020C44E300D27:124F20C44E300D27:             124E20C44E300D27
  Dec 17 18:44:09 A pvestatd[22514]: status update time (6.528 seconds)
  Dec 17 18:44:11 A kernel: [155472.674189] block drbd1: Resync done (total 1 
sec; paused 0 sec; 186016 K/sec)
  Dec 17 18:44:11 A kernel: [155472.674282] block drbd1: updated UUIDs 
406A434F37639589:0000000000000000:125120C44E300D27:125020C44E300D27
  Dec 17 18:44:11 A kernel: [155472.674384] block drbd1: conn( SyncSource -> 
Connected ) pdsk( Inconsistent -> UpToDate )

I found no possibility to get back to the state from yesterday. I tried the
configuration from yesterday and did reload the Kernel driver and also reboot
the server.
Then I re-initalized the meta data from scratch but still the same behaviour.
I am doubting, if it ever worked yesterday ... .

I did a test with a sndbuf-size 10M with the same result.

I also googled the "called, but refcnt is 0!?" and "complete_io() called, but
extent not found" but I found only hints for an older DRDB version.
I am using the latest 8.4.9-1 Version from Git with the kernel_sendmsg fix.

Is this now an intended behaviour or did I trigger a hidded "feature" which I
can help to debug?

BR,
    Jasmin



More information about the drbd-user mailing list