<div dir="ltr">Greetings,<div><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 10, 2014 at 10:35 AM, Adam Błaszczykowski <span dir="ltr"><<a href="mailto:adam.blaszczykowski@gmail.com" target="_blank">adam.blaszczykowski@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div><div>Hello,<br><br></div>I am testing DRBD 8.4.3 with kernel 3.4.69 and I have error messages in dmesg on both nodes. I tested DRBD on different servers with new disks to make sure that this problem isn't related with hardware but still the same issue. During all of my tests I was copying the data to DRBD volume.<br>
<br></div>Please, I will be glad if you can help me with this problem.<br></div></blockquote><div><br></div><div><div>Did you find a solution or explanation for this? I am seeing the same thing on a 3.13 kernel (Ubuntu 14.04) with DRBD 8.4.3.</div><div><br></div><div>I am observing this on *one* pair of servers; the other servers seem to be fine. However, I did find hardware-related issues on this pair: CPU stuck messages during online verification. A BIOS update, which bumped the CPU microcode from 0x14 to 0x15, cured that. The CPU is a Xeon E5620.</div><div><br></div><div>However, now that I am resyncing (online verification marked lots of blocks as out of sync), I get the very same messages:</div><div><br></div><div>On Secondary:</div><div>block drbd1: BAD! sector=1771175320s enr=54051 rs_left=-51 rs_failed=0 count=128 cstate=SyncTarget<br></div><div><br></div><div>On Primary:</div><div>block drbd1: BAD! sector=1771175320s enr=54051 rs_left=-51 rs_failed=0 count=128 cstate=SyncSource<br></div><div><br></div></div><div>I get lots of these as the sync continues, from 3 to 10 per minute.</div><div><br></div><div>Before running the online verification, I had *silent* data corruption on the secondary. I tried to move services to the secondary in order to apply OS updates on the primary, and that is when I noticed the secondary was in fact out of sync: missing files, wrong disk space, ... clear signs of filesystem corruption, even though /proc/drbd showed UpToDate/UpToDate. I blame the initial corruption on the firmware-related hardware problems, as I have not observed this on any other pair of servers.</div><div> </div><div>Thanks!</div><div><br></div><div>Ildefonso</div><div><br></div><div><br></div></div>
</div></div>