<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">"buffer modified by upper layers during
write" means that whatever sits on top of DRBD changed the data
while it was "in flight".<br>
Please search the list archives; this is a FAQ.<br>
Swap and some file systems do that, usually as some kind of
optimization. I suspect VMware VMs hosted on ext4 do that too.<br>
There's probably nothing wrong, but DRBD can't know. You should do
your data integrity checking at a higher level (fsck, for
example).<br>
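<br>
To make the "in flight" point concrete, here is a toy model in Python.
It is only a sketch and not DRBD's actual code: the function name
send_block is made up for illustration, and zlib.crc32 stands in for
the crc32c digest. It assumes the digest is taken when the write is
submitted and that the payload is copied for the peer slightly later.<br>
<pre>
```python
import zlib


def send_block(buf, mutate_in_flight=False):
    """Toy model of one replicated write (hypothetical, not DRBD code)."""
    # The sender takes a digest of the buffer when the write is submitted.
    # zlib.crc32 is a stand-in here; the real setting in the thread is crc32c.
    digest = zlib.crc32(bytes(buf))

    if mutate_in_flight:
        # An upper layer (swap, a file system, a VM) rewrites the page
        # before the payload has been copied for the peer.
        buf[0] ^= 0xFF

    # The payload that actually goes to the peer is read after that point.
    payload = bytes(buf)

    # The receiver recomputes the digest over what it got and compares.
    return zlib.crc32(payload) == digest


page = bytearray(b"A" * 4096)
stable_ok = send_block(page)
modified_ok = send_block(page, mutate_in_flight=True)
print(stable_ok, modified_ok)
```
</pre>
In this model the mismatch comes from the writer changing its own
buffer, not from the network or the disk, which is why the message by
itself does not prove that anything is corrupted.<br>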
Lionel.<br>
<br>
On 10/10/2014 01:45, aurelien panizza wrote:<br>
</div>
<blockquote
cite="mid:CAOhtac1CrUEBuQt2AOpHCjyjZh-fxRJU2F+00tWk0TBxJACb-g@mail.gmail.com"
type="cite">
<div dir="ltr">Hi all,
<div><br>
</div>
<div>I've got a problem in my environment.</div>
<div>I set up my primary server (pacemaker + drbd) which ran
alone for a while, and then I added the second server
(currently only DRBD).</div>
<div>Both servers can see each other and /proc/drbd reports
"uptodate/uptodate".</div>
<div>If I run a verify on that resource (right after the full
resync), it reports some blocks out of sync (generally from
100 to 1500 on my 80 GB LVM partition).</div>
<div>So I disconnect and reconnect the secondary, and oos then
reports 0 blocks.</div>
<div>When I run a verify again, some blocks are still out of sync.
What I've noticed is that it is almost always the
same blocks that are out of sync.</div>
<div>I tried to do a full resync multiple times but had the same
issue.</div>
<div>I also tried replacing the physical secondary server with a
virtual machine (to check whether the issue came from the
secondary server) but had the same issue.</div>
<div><br>
</div>
<div>I then activated "data-integrity-alg crc32c" and got a
couple of "Digest mismatch, buffer modified by upper layers
during write: 167134312s +4096" in the primary log.</div>
<div><br>
</div>
<div>I tried on a different network card but got the same
errors.</div>
<div><br>
</div>
<div>My full configuration file:</div>
<div><br>
</div>
<div> protocol C;</div>
<div> meta-disk internal;</div>
<div> device /dev/drbd0;</div>
<div> disk /dev/sysvg/drbd;</div>
<div><br>
</div>
<div> handlers {</div>
<div> split-brain "/usr/lib/drbd/notify-split-brain.sh
xxx@xxx";</div>
<div> out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh
xxx@xxx";</div>
<div> fence-peer "/usr/lib/drbd/crm-fence-peer.sh";</div>
<div> after-resync-target
"/usr/lib/drbd/crm-unfence-peer.sh";</div>
<div> }</div>
<div><br>
</div>
<div> net {</div>
<div> cram-hmac-alg "sha1";</div>
<div> shared-secret "drbd";</div>
<div> sndbuf-size 512k;</div>
<div> max-buffers 8000;</div>
<div> max-epoch-size 8000;</div>
<div> verify-alg md5;</div>
<div> after-sb-0pri disconnect;</div>
<div> after-sb-1pri disconnect;</div>
<div> after-sb-2pri disconnect;</div>
<div> data-integrity-alg crc32c;</div>
<div> }</div>
<div><br>
</div>
<div> disk {</div>
<div> al-extents 3389;</div>
<div> fencing resource-only;</div>
<div> }</div>
<div><br>
</div>
<div> syncer {</div>
<div> rate 90M;</div>
<div> }</div>
<div> on host1 {</div>
<div> address 10.110.1.71:7799;</div>
<div> }</div>
<div> on host2 {</div>
<div> address 10.110.1.72:7799;</div>
<div> }</div>
<div>}</div>
<div><br>
</div>
<div>My OS : Redhat6 2.6.32-431.20.3.el6.x86_64</div>
<div>DRBD version : drbd84-8.4.4-1</div>
<div><br>
</div>
<div>
<div>ethtool -k eth0</div>
<div>Features for eth0:</div>
<div>rx-checksumming: on</div>
<div>tx-checksumming: on</div>
<div>scatter-gather: on</div>
<div>tcp-segmentation-offload: on</div>
<div>udp-fragmentation-offload: off</div>
<div>generic-segmentation-offload: on</div>
<div>generic-receive-offload: off</div>
<div>large-receive-offload: off</div>
<div>ntuple-filters: off</div>
<div>receive-hashing: off</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div>The secondary server is currently not in the HA cluster
(pacemaker), but I don't think this is the problem.</div>
<div>I have another HA setup on 2 physical hosts with the exact
same configuration and DRBD/OS versions (but not the same server
model) and everything's OK.</div>
<div><br>
</div>
<div>As the primary server is in production, I can't stop the
application (a database) to check whether the alerts are false
positives.</div>
<div><br>
</div>
<div>Do you have any advice?</div>
<div>Could the primary server have corrupted blocks
or wrong metadata?</div>
<div><br>
</div>
<div>Regards,</div>
<div><br>
</div>
</div>
</blockquote>
<br>
</body>
</html>