Stanislav,

My system sends me an email when verify finds an out-of-sync condition. You can use the same handler if you like. In my global handlers section:

    out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh myemailaddress";

Are you resyncing after the error is detected (disconnect/connect the resource)?

Dan, in Atlanta

From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Stanislav German-Evtushenko
Sent: Sunday, March 24, 2013 7:00 AM
To: drbd-user at lists.linbit.com
Subject: [DRBD-user] Uncatchable DRBD out-of-sync issue

Dear all,

I'm trying to track down an out-of-sync issue and I'm stuck so far. Can anybody give me a hint about what I can check next?

Configuration:
- two nodes, Dell PowerEdge R710 (both nodes have the same hardware and the same configuration)
- drbd0 master-master (size is 900GiB)
- direct connection (two 1Gbit/s ethernet adapters in bonding balance-rr)
- data-integrity-alg is crc32c (it has been enabled for testing purposes)
- LVM on top of DRBD (LVM volumes are used by virtual machines)

Software:
- DRBD module version: 8.3.13
- kernel: Linux 2.6.32-19-pve #1 SMP x86_64 GNU/Linux

Problem:
- Each time I run online verification it finds some sectors out of sync (usually not many, about 5-15 messages after verification is done)
- These sectors really are out of sync (checked with dd and md5sum)
- data-integrity-alg doesn't produce any messages in the logs from the time "drbdadm connect all" is run until the verification process finds some sectors out of sync

Questions:
- How is that possible?
- Why doesn't data-integrity-alg catch the problem?
- How can I fix it?

*** extracts from kernel log ***
Mar 24 13:23:38 host1 kernel: block drbd0: conn( Connected -> VerifyS )
Mar 24 13:23:38 host1 kernel: block drbd0: Starting Online Verify from sector 0
Mar 24 14:13:17 host1 kernel: block drbd0: Out of sync: start=718996928, size=8 (sectors)
Mar 24 14:13:17 host1 kernel: block drbd0: Out of sync: start=718996984, size=8 (sectors)
Mar 24 14:13:17 host1 kernel: block drbd0: Out of sync: start=718997224, size=8 (sectors)
*********************************

*** check with dd and md5sum ***
# dd iflag=direct if=/dev/drbd0 bs=512 skip=718997224 count=8 | md5sum
host1: 669a5c2ba22fa931aac16cdd2f03e22a
host2: ceeac3bd59178ee13f94ce283e3a4de3
********************************

*** drbdadm /dev/drbd0 show ***
disk {
    size                    0s _is_default; # bytes
    on-io-error             pass_on _is_default;
    fencing                 dont-care _is_default;
    max-bio-bvecs           0 _is_default;
}
net {
    timeout                 60 _is_default; # 1/10 seconds
    max-epoch-size          2048 _is_default;
    max-buffers             2048 _is_default;
    unplug-watermark        128 _is_default;
    connect-int             10 _is_default; # seconds
    ping-int                10 _is_default; # seconds
    sndbuf-size             0 _is_default; # bytes
    rcvbuf-size             0 _is_default; # bytes
    ko-count                0 _is_default;
    allow-two-primaries;
    cram-hmac-alg           "sha1";
    shared-secret           "XXXXXXXXXXXXXXXXXXX";
    after-sb-0pri           discard-zero-changes;
    after-sb-1pri           discard-secondary;
    after-sb-2pri           disconnect _is_default;
    rr-conflict             disconnect _is_default;
    ping-timeout            5 _is_default; # 1/10 seconds
    data-integrity-alg      "crc32c";
    on-congestion           block _is_default;
    congestion-fill         0s _is_default; # byte
    congestion-extents      127 _is_default;
}
syncer {
    rate                    153600k; # bytes/second
    after                   -1 _is_default;
    al-extents              127 _is_default;
    verify-alg              "md5";
    on-no-data-accessible   io-error _is_default;
    c-plan-ahead            0 _is_default; # 1/10 seconds
    c-delay-target          10 _is_default; # 1/10 seconds
    c-fill-target           0s _is_default; # bytes
    c-max-rate              102400k _is_default; # bytes/second
    c-min-rate              4096k _is_default; # bytes/second
}
protocol C;
_this_host {
    device                  minor 0;
    disk                    "/dev/sda3";
    meta-disk               internal;
    address                 ipv4 172.23.10.1:7788;
}
_remote_host {
    address                 ipv4 172.23.10.2:7788;
}
# (89) unknown tag = (integer) 0 [len: 4]
# Found unknown tags, you should update your
# userland tools
*******************************

Best regards,
Stanislav