<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri","sans-serif";}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">Stanislav, my system sends me an email when verify finds an out-of-sync condition. You can use the same handler if you like.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">In my global, handlers section:<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:5.25pt"><span style="font-size:11.0pt;font-family:'Courier New';color:black">out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh myemailaddress";<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:5.25pt"><span style="font-size:11.0pt;font-family:'Courier New';color:black"><o:p> </o:p></span></p>
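<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">In case it helps to see what a handler like that does: below is a minimal sketch of a notify script, not the notify-out-of-sync.sh that ships with DRBD. It assumes DRBD hands the handler its context in the DRBD_RESOURCE and DRBD_MINOR environment variables and that a local mail command (mailx or similar) is available. The function names are made up for illustration.<o:p></o:p></span></p>
<pre>
```shell
#!/bin/sh
# Hypothetical stand-in for a notify handler; NOT the script shipped with DRBD.
# Assumption: DRBD exports DRBD_RESOURCE and DRBD_MINOR to the handler, and
# the recipient address arrives as the first argument (as in the config line).

build_subject() {
    # Fall back to "unknown" when the script is run by hand outside DRBD.
    printf 'DRBD out-of-sync: resource %s (minor %s)\n' \
        "${DRBD_RESOURCE:-unknown}" "${DRBD_MINOR:-unknown}"
}

notify() {
    recipient=$1
    # Assumption: a working local "mail" command that accepts -s SUBJECT.
    echo "Online verify found out-of-sync blocks; see the kernel log." |
        mail -s "$(build_subject)" "$recipient"
}

# Only send when a recipient was actually given on the command line.
if [ -n "${1:-}" ]; then
    notify "$1"
fi
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">Keeping the subject in its own function makes it easy to check the wording by hand before wiring the script into the handlers section.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>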
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">Are you resyncing after the error is detected (disconnect/connect the resource)?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>
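<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">If not: verify only marks the blocks as out of sync, and they get resynchronized the next time the resource connects. A rough sketch of that disconnect/connect step follows, assuming a resource named r0 (a placeholder, use your own resource name). The DRBDADM variable is my own addition so the function can be dry-run with echo. On a dual-primary setup like yours I would be careful that nothing writes on both nodes while the resource is disconnected.<o:p></o:p></span></p>
<pre>
```shell
#!/bin/sh
# Sketch: trigger a resync of blocks that online verify marked out of sync
# by bouncing the connection of one resource.
# DRBDADM can be overridden (e.g. DRBDADM=echo) to see what would be run.
DRBDADM=${DRBDADM:-drbdadm}

resync_resource() {
    res=$1
    # Stop here if the disconnect fails; do not attempt the connect.
    $DRBDADM disconnect "$res" || return 1
    $DRBDADM connect "$res"
}

# Example (r0 is a placeholder resource name):
#   resync_resource r0
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>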
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D">Dan, in Atlanta<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:'Tahoma','sans-serif'">From:</span></b><span style="font-size:10.0pt;font-family:'Tahoma','sans-serif'"> drbd-user-bounces@lists.linbit.com [mailto:drbd-user-bounces@lists.linbit.com]
<b>On Behalf Of </b>Stanislav German-Evtushenko<br>
<b>Sent:</b> Sunday, March 24, 2013 7:00 AM<br>
<b>To:</b> drbd-user@lists.linbit.com<br>
<b>Subject:</b> [DRBD-user] Uncatchable DRBD out-of-sync issue<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Dear all,<br>
<br>
I'm trying to track down an out-of-sync issue and I'm stuck. Can anybody give me a hint about what I can check next?<br>
<br>
Configuration:<br>
- two Dell PowerEdge R710 nodes (identical hardware, identical configuration)<br>
- drbd0 master-master (size is 900GiB)<br>
- direct connection (two 1Gbit/s ethernet adapters in bonding balance-rr)<br>
- data-integrity-alg is crc32c (it has been enabled for testing purposes)<br>
- LVM on top of DRBD (LVM volumes are used by virtual machines)<br>
<br>
Software:<br>
- DRBD module version: 8.3.13<br>
- kernel: Linux 2.6.32-19-pve #1 SMP x86_64 GNU/Linux<br>
<br>
Problem:<br>
- Every time I run online verification, it finds some sectors out of sync (usually not many, about 5-15 messages by the time verification is done)<br>
- These sectors really do differ between the nodes (checked with dd and md5sum)<br>
- data-integrity-alg logs nothing between the time the resources are connected (drbdadm connect all) and the time verification finds sectors out of sync<br>
<br>
Questions:<br>
- How is that possible?<br>
- Why doesn't data-integrity-alg catch the problem?<br>
- How can I fix it?<br>
<br>
*** extracts from kernel log ***<br>
Mar 24 13:23:38 host1 kernel: block drbd0: conn( Connected -> VerifyS )<br>
Mar 24 13:23:38 host1 kernel: block drbd0: Starting Online Verify from sector 0<br>
Mar 24 14:13:17 host1 kernel: block drbd0: Out of sync: start=718996928, size=8 (sectors)<br>
Mar 24 14:13:17 host1 kernel: block drbd0: Out of sync: start=718996984, size=8 (sectors)<br>
Mar 24 14:13:17 host1 kernel: block drbd0: Out of sync: start=718997224, size=8 (sectors)<br>
*********************************<br>
<br>
*** check with dd and md5sum ***<br>
# dd iflag=direct if=/dev/drbd0 bs=512 skip=718997224 count=8 | md5sum<br>
host1: 669a5c2ba22fa931aac16cdd2f03e22a<br>
host2: ceeac3bd59178ee13f94ce283e3a4de3<br>
********************************<br>
<br>
*** drbdsetup /dev/drbd0 show ***<br>
disk {<br>
size 0s _is_default; # bytes<br>
on-io-error pass_on _is_default;<br>
fencing dont-care _is_default;<br>
max-bio-bvecs 0 _is_default;<br>
}<br>
net {<br>
timeout 60 _is_default; # 1/10 seconds<br>
max-epoch-size 2048 _is_default;<br>
max-buffers 2048 _is_default;<br>
unplug-watermark 128 _is_default;<br>
connect-int 10 _is_default; # seconds<br>
ping-int 10 _is_default; # seconds<br>
sndbuf-size 0 _is_default; # bytes<br>
rcvbuf-size 0 _is_default; # bytes<br>
ko-count 0 _is_default;<br>
allow-two-primaries;<br>
cram-hmac-alg "sha1";<br>
shared-secret "XXXXXXXXXXXXXXXXXXX";<br>
after-sb-0pri discard-zero-changes;<br>
after-sb-1pri discard-secondary;<br>
after-sb-2pri disconnect _is_default;<br>
rr-conflict disconnect _is_default;<br>
ping-timeout 5 _is_default; # 1/10 seconds<br>
data-integrity-alg "crc32c";<br>
on-congestion block _is_default;<br>
congestion-fill 0s _is_default; # byte<br>
congestion-extents 127 _is_default;<br>
}<br>
syncer {<br>
rate 153600k; # bytes/second<br>
after -1 _is_default;<br>
al-extents 127 _is_default;<br>
verify-alg "md5";<br>
on-no-data-accessible io-error _is_default;<br>
c-plan-ahead 0 _is_default; # 1/10 seconds<br>
c-delay-target 10 _is_default; # 1/10 seconds<br>
c-fill-target 0s _is_default; # bytes<br>
c-max-rate 102400k _is_default; # bytes/second<br>
c-min-rate 4096k _is_default; # bytes/second<br>
}<br>
protocol C;<br>
_this_host {<br>
device minor 0;<br>
disk "/dev/sda3";<br>
meta-disk internal;<br>
address ipv4 172.23.10.1:7788;<br>
}<br>
_remote_host {<br>
address ipv4 172.23.10.2:7788;<br>
}<br>
# (89) unknown tag = (integer) 0 [len: 4]<br>
# Found unknown tags, you should update your<br>
# userland tools<br>
*******************************<br>
<br clear="all">
Best regards,<br>
Stanislav<o:p></o:p></p>
</div>
</div>
</body>
</html>