Actually this looks a lot like our situation.<br>The similarities are just too close.<br><br>So I'm wondering if Ryan can test something.<br>We found a "sweet spot" for data corruption on MD devices - at about 183 megs.
<br>When transferring a file 183 megs in size (or larger), the data on the DRBD device would get corrupted.<br><br>It turned out to be exactly what Lars mentioned: don't run mkfs/fsck on the lower device (/dev/xdwhatever);<br>instead, run it on the /dev/drbdX device.
<br><br>This method may take a bit of getting used to (and may cause some confusion if you're trying to use HA), but it's much better.<br>Since we started using the proper method, everything has been working fine on our RAID devices.
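<br><br>If anyone wants to reproduce the size-dependent corruption and see where it starts, something like the rough Python sketch below might help. It writes a deterministic test file, copies it onto the filesystem under test, and reports the offset of the first 1 MiB block that differs. The function names are made up for illustration, and the read-back may be served from the page cache, so remounting the filesystem (or rebooting) before comparing is probably needed for a meaningful result.<br>

```python
import hashlib
import os
import shutil

CHUNK = 1024 * 1024  # compare in 1 MiB blocks


def write_test_file(path, size_bytes, seed=b"drbd-test"):
    """Write size_bytes of deterministic data to path (hypothetical helper)."""
    written = 0
    counter = 0
    with open(path, "wb") as f:
        while written < size_bytes:
            # Each block is a SHA-256 digest of seed+counter, repeated to 1 MiB,
            # so every block has different content and the file is reproducible.
            digest = hashlib.sha256(seed + str(counter).encode()).digest()
            block = digest * (CHUNK // len(digest))
            block = block[: min(CHUNK, size_bytes - written)]
            f.write(block)
            written += len(block)
            counter += 1
        f.flush()
        os.fsync(f.fileno())  # push the data out of Python and the OS buffers


def first_mismatch(path_a, path_b):
    """Return the start offset of the first differing block, or None if equal."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(CHUNK)
            block_b = b.read(CHUNK)
            if block_a != block_b:
                return offset
            if not block_a:  # both files ended at the same point
                return None
            offset += len(block_a)


def copy_and_verify(src, dst_dir):
    """Copy src into dst_dir and compare the copy against the original."""
    dst = os.path.join(dst_dir, os.path.basename(src))
    shutil.copyfile(src, dst)
    return first_mismatch(src, dst)
```

For the case above, one would call write_test_file("/tmp/probe.bin", 183 * 1024 * 1024) and then copy_and_verify("/tmp/probe.bin", "/mnt/drbd0"); both paths are placeholders, not anything from our setup. A result of None means the copy matched; a number is the byte offset of the first bad 1 MiB block.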
<br><br>Dan.<br><br><div><span class="gmail_quote">On 5/23/07, <b class="gmail_sendername">Lars Ellenberg</b> &lt;<a href="mailto:lars.ellenberg@linbit.com">lars.ellenberg@linbit.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Mon, May 21, 2007 at 05:20:40PM +0200, Lars Ellenberg wrote:<br>> On Fri, May 18, 2007 at 11:56:33AM -0400, Ryan Steele wrote:<br>> > I saw someone else post something similar to this a few weeks ago, but<br>> > didn't see any response to it. I've just set up DRBD
0.7.23 with<br>> > Heartbeat2 on two future database servers. However, DRBD seems to have<br>> > corrupted my multi-disk RAID1. I booted a Knoppix CD on the affected<br>> > machines, removed the DRBD rc.d
scripts, and rebooted and things were<br>> > fine. To verify, I ran update-rc.d to recreate the symbolic links, and<br>> > rebooted again to find that it again would not boot. Moreover, even<br>> > removing the
rc.d links did not help - the array is, I fear, irreparably<br>> > damaged.<br>> ><br>> > Is there any acknowledgement of this bug, or are there any suggestions<br>> > as to how one might go about fixing it? I can't even boot into the
<br>> > machine to run mdadm and repair the array, though maybe I can do that<br>> > from the Knoppix CD...<br>><br>> I think...<br>> the issue is that md raid5<br><br>wait. after reading your post closely again,
<br>you should not be affected by this.<br>you said you use raid _1_ not raid5?<br><br>hm... so there may be something else?<br><br>anyways. drbd does not corrupt md raid (or anything, for that matter;<br>I would have noticed for sure!) anywhere on boxes I was involved,
<br>or have access to. and that are quite a few.<br><br>so if this is a real issue for you, and you can reproduce this at will,<br>there is something "special" in your setup...<br>try to _not_ mkfs /dev/something , then resize,
<br>but to mkfs /dev/drbdX ...<br>try a different kernel, try with small arrays first, so you can more<br>easily verify (by comparing against checksums/images) how/where<br>the corruption takes place.<br><br>--<br>: Lars Ellenberg Tel +43-1-8178292-0 :
<br>: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :<br>: Vivenotgasse 48, A-1120 Vienna/Europe <a href="http://www.linbit.com">http://www.linbit.com</a> :<br>__<br>please use the "List-Reply" function of your email client.
<br></blockquote></div><br>