[DRBD-user] DRBD 0.7.23 and MD corruption

Dan Gahlinger dgahling at gmail.com
Wed May 23 18:05:28 CEST 2007


Actually this looks a lot like our situation.
The similarities are just too close.

So I'm wondering if Ryan can test something.
We found a "sweet spot" for data corruption on MD devices at about 183 MB.
When transferring a file of 183 MB or larger, the data on the DRBD device
would get corrupted.

It turned out to be exactly what Lars mentioned: don't run mkfs/fsck on the
lower device (/dev/xdwhatever);
run it on the /dev/drbdX device instead.

This method may take a bit of getting used to (and cause some confusion if
you're trying to use HA), but it's much better.
Since we started using the proper method, everything has been working fine
on our RAID devices.
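
If anyone wants to check for this kind of corruption, here is a rough sketch
of one way to do it, not the exact procedure we used. It compares a source
file against the copy that was written through DRBD, chunk by chunk, and
reports where the first difference is. The function names and the 1 MiB chunk
size are my own assumptions for illustration.

```python
import hashlib

CHUNK = 1024 * 1024  # compare in 1 MiB chunks (arbitrary choice)


def first_mismatch(path_a, path_b, chunk=CHUNK):
    """Return the starting byte offset of the first chunk that differs
    between the two files, or None if the files are identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(chunk)
            block_b = b.read(chunk)
            if block_a != block_b:
                # Also triggers when one file is shorter than the other.
                return offset
            if not block_a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(block_a)


def sha256_of(path, chunk=CHUNK):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()
```

For example, copy a test file of 200 MB or so onto the filesystem on
/dev/drbdX, then call first_mismatch() with the original path and the path of
the copy (both paths are whatever you choose). A result of None means the copy
matches; a number is the offset of the first bad chunk.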

Dan.

On 5/23/07, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
>
> On Mon, May 21, 2007 at 05:20:40PM +0200, Lars Ellenberg wrote:
> > On Fri, May 18, 2007 at 11:56:33AM -0400, Ryan Steele wrote:
> > > I saw someone else post something similar to this a few weeks ago, but
> > > didn't see any response to it.  I've just set up DRBD 0.7.23 with
> > > Heartbeat2 on two future database servers.  However, DRBD seems to have
> > > corrupted my multi-disk RAID1.  I booted a Knoppix CD on the affected
> > > machines, removed the DRBD rc.d scripts, and rebooted and things were
> > > fine.  To verify, I ran update-rc.d to recreate the symbolic links, and
> > > rebooted again to find that it again would not boot.  Moreover, even
> > > removing the rc.d links did not help - the array is, I fear, irreparably
> > > damaged.
> > >
> > > Is there any acknowledgement of this bug, or are there any suggestions
> > > as to how one might go about fixing it?  I can't even boot into the
> > > machine to run mdadm and repair the array, though maybe I can do that
> > > from the Knoppix CD...
> >
> > I think...
> > the issue is that md raid5
>
> wait. after reading your post closely again,
> you should not be affected by this.
> you said you use raid _1_ not raid5?
>
> hm...  so there may be something else?
>
> anyways.  drbd does not corrupt md raid (or anything, for that matter;
> I would have noticed for sure!) on any of the boxes I was involved with,
> or have access to.  and that is quite a few.
>
> so if this is a real issue for you, and you can reproduce this at will,
> there is something "special" in your setup...
> try to _not_ mkfs /dev/something , then resize,
> but to mkfs /dev/drbdX ...
> try a different kernel, try with small arrays first, so you can more
> easily verify (by comparing against checksums/images) how/where
> the corruption takes place.
>
> --
> : Lars Ellenberg                            Tel +43-1-8178292-0  :
> : LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
> : Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :