Hello,

I've been using drbd for about 5 years now, and it has been working great. Recently we modified the setup and moved a couple of servers around, so drbd now has to replicate over a 20 Mbit/sec WAN line. I've changed from Protocol C to A and enabled Ahead/Behind mode.

It seems to work, but after some time some of the resources get stuck in Ahead/Behind mode and never resync again unless I disconnect and reconnect the resource. It looks like this on the Primary:

    cat /proc/drbd
    version: 8.4.1 (api:1/proto:86-100)
    GIT-hash: bb796da897912034a90003910f69ae0a2c10cf44 build by root at node1, 2012-06-04 13:02:39
    [..]
    13: cs:Ahead ro:Primary/Secondary ds:UpToDate/Inconsistent A r-----
        ns:9428820 nr:0 dw:446339296 dr:931364896 al:280851 bm:66801 lo:0 pe:0 ua:0 ap:0 ep:1 wo:n oos:1389708

A minute later:

    13: cs:Ahead ro:Primary/Secondary ds:UpToDate/Inconsistent A r-----
        ns:9428820 nr:0 dw:446340428 dr:931364948 al:280851 bm:66801 lo:0 pe:0 ua:0 ap:0 ep:1 wo:n oos:1389728

This seems like a bug to me, and it has already been reported by someone else in August:
http://lists.linbit.com/pipermail/drbd-user/2012-August/018934.html

I've also created a virtualised test setup with two nodes running 8.4.2, and I could reach this state there as well, so it is fairly reproducible. The problem seems to happen when the node switches from SyncSource to Ahead mode without finishing synchronization, i.e. I finish some writing to the drbd device, then wait a few seconds so the node starts to sync, then I start writing again. On the production system it happens on the resources which have the most writes.

Any help is appreciated.

Bye.
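
The stuck state is visible in the two samples above: the resource stays in cs:Ahead, ns stays at 9428820, and dw and oos keep growing. A minimal sketch of how that comparison could be automated follows. The function names (parse_proc_drbd, stuck_ahead, sample_and_check) are hypothetical helpers, not part of DRBD, and the check is only a heuristic: a resource under constant write load can legitimately remain in Ahead mode, so a hit means "look at this one", not "this one is stuck".

```python
import re
import time

# First line of a device entry in /proc/drbd (DRBD 8.4 layout assumed),
# e.g. "13: cs:Ahead ro:Primary/Secondary ds:UpToDate/Inconsistent A r-----"
DEVICE_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")
# Numeric counters such as "ns:9428820" or "oos:1389708"
COUNTER_RE = re.compile(r"\b([a-z]{2,3}):(\d+)\b")


def parse_proc_drbd(text):
    """Return {minor: {"cs": state, "ns": int, "oos": int, ...}}."""
    devices = {}
    current = None
    for line in text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current = {"cs": m.group(2)}
            devices[int(m.group(1))] = current
        if current is not None:
            # Collect every numeric counter that follows the device header.
            for key, value in COUNTER_RE.findall(line):
                current[key] = int(value)
    return devices


def stuck_ahead(before, after):
    """Heuristic: minors that are Ahead in both samples, sent nothing
    in between (ns unchanged) and whose out-of-sync amount did not shrink."""
    stuck = []
    for minor, new in after.items():
        old = before.get(minor)
        if old is None:
            continue
        if (old["cs"] == "Ahead" and new["cs"] == "Ahead"
                and new.get("ns") == old.get("ns")
                and new.get("oos", 0) > 0
                and new.get("oos", 0) >= old.get("oos", 0)):
            stuck.append(minor)
    return sorted(stuck)


def sample_and_check(path="/proc/drbd", interval=60):
    """Take two samples `interval` seconds apart and return suspect minors."""
    with open(path) as f:
        before = parse_proc_drbd(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_drbd(f.read())
    return stuck_ahead(before, after)
```

Calling sample_and_check() on the Primary would return a list of minor numbers to inspect; with the two samples above it would flag minor 13.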
The configuration:

    cat /usr/local/etc/drbd.d/global_common.conf
    global {
        usage-count no;
    }
    common {
        net {
            protocol A;
            max-buffers 2048;
            max-epoch-size 2048;
            verify-alg sha1;
            csums-alg sha1;
        }
        disk {
            disk-barrier no;
            disk-flushes no;
            md-flushes no;
            disk-drain no;
            al-extents 1801;
        }
        startup {
            wfc-timeout 180;
            degr-wfc-timeout 120;
        }
    }

    cat /usr/local/etc/drbd.d/r13.res
    resource r13 {
        net {
            protocol A;
            on-congestion pull-ahead;
            congestion-fill 200k;
            congestion-extents 1620;
        }
        disk {
            c-max-rate 1500k;
        }
        on node1 {
            device /dev/drbd13 minor 13;
            disk /dev/sda5;
            meta-disk internal;
            address ipv4 10.129.164.130:7801;
        }
        on node2 {
            device /dev/drbd13 minor 13;
            disk /dev/sdb7;
            meta-disk internal;
            address ipv4 10.129.166.125:7801;
        }
    }
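
For scale, the numbers in the post can be put side by side. The sketch below assumes that oos in /proc/drbd is reported in KiB, that "c-max-rate 1500k" means 1500 KiB/s, and that the 20 Mbit/sec line is 20 * 10^6 bits per second; resync_estimate is a hypothetical helper name. The result is only a lower bound on the resync time, because writes that arrive while the resource is still in Ahead mode add to oos.

```python
KIB = 1024  # bytes per KiB


def resync_estimate(oos_kib, c_max_rate_kib_s, wan_mbit_s):
    """Compare the configured resync ceiling with the WAN line and
    estimate the shortest possible time to clear the out-of-sync data."""
    rate_bytes_s = c_max_rate_kib_s * KIB          # resync ceiling in bytes/s
    wan_bytes_s = wan_mbit_s * 1_000_000 / 8       # WAN capacity in bytes/s
    return {
        "rate_share_of_wan": rate_bytes_s / wan_bytes_s,
        "min_resync_seconds": oos_kib / c_max_rate_kib_s,
    }


# Values taken from the post: oos:1389708, c-max-rate 1500k, 20 Mbit/sec WAN.
estimate = resync_estimate(1389708, 1500, 20)
print(estimate)
```

Under those assumptions the resync ceiling uses about 61% of the line, and clearing the reported backlog would take at least roughly 15 minutes of uninterrupted resync.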