<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Dan,<br>
<br>
I think the person whose post I saw on the net was named "Gerry Reno".
Nonetheless, I used the "resize_reiserfs" tool to shrink the filesystem
to allow for metadata internal to the disk itself. I've had success
with this in the past, with older versions of DRBD (I have a production
system in place for which I've done this) and older kernels.<br>
<br>
The disks are actually mostly empty at the moment - they are going to
be production boxes at some point in the future, but right now they're
still in a testing environment. I haven't actually tried to sync
anything up except a test file created with "touch foo" - definitely
not 183 MB.<br>
<br>
Here are the steps, as best as I can restate them:<br>
<br>
I first resized the filesystem to allow for DRBD's internal metadata.
The documentation says DRBD requires only 128MB, but I have plenty of
disk space to spare, so I gave it double that:<br>
<br>
<tt>resize_reiserfs -s -256M /dev/md7</tt><br>
<br>
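A rough way to sanity-check that shrink, as a sketch only: the function
name <tt>spare_after_metadata</tt> is hypothetical, the 128MB figure is
the one from the docs mentioned above, and the sizes are passed in as
plain numbers so the arithmetic can be tried without touching a disk.<br>
<br>
```shell
#!/bin/sh
# Sketch only: check that a filesystem leaves room for DRBD internal metadata.
# Nothing here reads a real device; all sizes are supplied by the caller.

# spare_after_metadata DEVICE_BYTES FS_BYTES META_MB   (hypothetical helper)
# Prints the bytes left over once the filesystem and the metadata area are
# both placed on the device, and returns non-zero when they do not fit.
spare_after_metadata() {
    device_bytes=$1
    fs_bytes=$2
    meta_mb=$3
    meta_bytes=$((meta_mb * 1024 * 1024))
    spare_bytes=$((device_bytes - fs_bytes - meta_bytes))
    echo "$spare_bytes"
    if [ "$spare_bytes" -lt 0 ]; then
        return 1
    fi
    return 0
}
```
<br>
On a real box the device size could come from
<tt>blockdev --getsize64 /dev/md7</tt>; the filesystem size would have
to be read from the reiserfs superblock, which this sketch leaves out.<br>
<br>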
In my sources.list:<br>
<pre>deb-src <a class="moz-txt-link-freetext" href="http://http.us.debian.org/debian">http://http.us.debian.org/debian</a> sid main contrib non-free</pre>
I then built the module against my kernel:<br>
<br>
<tt>apt-get build-dep drbd0.7-module-source<br>
apt-get -b source drbd0.7-module-source<br>
<br>
</tt>Once I had the .debs that this created (it builds packages for both
the utils and the module source), I built the module:<br>
<pre>m-a a-b drbd0.7-module --kernel-dir=/usr/local/src/linux-2.6.18.8</pre>
Then, I installed the resulting .deb package, and loaded the module,
confirming it was loaded:<br>
<br>
<tt>modprobe drbd; lsmod</tt> <br>
<br>
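Reading <tt>lsmod</tt> output by eye works, but the same check can be
scripted. A minimal sketch, assuming the usual <tt>/proc/modules</tt>
layout (module name first on each line); <tt>module_loaded</tt> is a
hypothetical helper, and its optional second argument exists only so it
can be tried against a saved listing:<br>
<br>
```shell
#!/bin/sh
# Sketch only: confirm a kernel module is loaded by looking it up in a
# /proc/modules-style listing.

# module_loaded NAME [LISTING]   (hypothetical helper)
# LISTING defaults to /proc/modules.
module_loaded() {
    name=$1
    listing=${2:-/proc/modules}
    # Each line starts with the module name followed by a space, so anchor
    # the match there to avoid matching a longer or shorter name.
    grep -q "^${name} " "$listing"
}
```
<br>
For example, <tt>if module_loaded drbd; then cat /proc/drbd; fi</tt>
would print DRBD's status file only when the module is present.<br>
<br>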
I then installed the standard Etch heartbeat2 package and configured
it. Here is my drbd.conf and relevant heartbeat configs:<br>
<br>
### /etc/drbd.conf ###<br>
<tt>resource r0 {<br>
protocol C;<br>
incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ;
halt -f";<br>
startup { wfc-timeout 0; degr-wfc-timeout 120; }<br>
disk { on-io-error detach; }<br>
syncer {<br>
rate 60M;<br>
group 1; # sync when r2 is finished syncing.<br>
}<br>
on db1 {<br>
device /dev/drbd0;<br>
disk /dev/md7;<br>
address 192.168.101.26:7791;<br>
meta-disk internal;<br>
}<br>
on db2 {<br>
device /dev/drbd0;<br>
disk /dev/md7;<br>
address 192.168.101.27:7791;<br>
meta-disk internal;<br>
}<br>
}</tt><br>
<br>
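To compare the two <tt>on</tt> sections of a config like the one above
side by side, something along these lines may help. It is a sketch, not
a real drbd.conf parser: it assumes one setting per line, as in the
listing above, and <tt>list_drbd_hosts</tt> is a hypothetical name.<br>
<br>
```shell
#!/bin/sh
# Sketch only: print "host disk meta-disk address" for each "on HOST { ... }"
# section of a drbd.conf-style file laid out one setting per line.

# list_drbd_hosts CONF   (hypothetical helper)
list_drbd_hosts() {
    awk '
        # "on db1 {" opens a host section; remember the host name.
        $1 == "on" { host = $2; disk = ""; meta = ""; addr = ""; next }
        # Ignore everything outside a host section (for example the
        # resource-level "disk { ... }" block).
        host == "" { next }
        # Collect the settings, dropping the trailing semicolon.
        $1 == "disk"      { disk = $2; sub(/;$/, "", disk) }
        $1 == "meta-disk" { meta = $2; sub(/;$/, "", meta) }
        $1 == "address"   { addr = $2; sub(/;$/, "", addr) }
        # A closing brace ends the host section; print what was collected.
        $1 == "}" { print host, disk, meta, addr; host = "" }
    ' "$1"
}
```
<br>
Run as <tt>list_drbd_hosts /etc/drbd.conf</tt>, it prints one line per
host, which makes a mismatched disk or meta-disk setting easy to spot.<br>
<br>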
### /etc/ha.d/haresources ###<br>
<tt>db1 drbddisk::r0 Filesystem::/dev/drbd0::/home::reiserfs \<br>
192.168.101.25 mysql postgresql-8.1 \<br>
<a class="moz-txt-link-freetext" href="MailTo::sysadmin@our-domain.com::LVS-State_Change">MailTo::sysadmin@our-domain.com::LVS-State_Change</a></tt><br>
<br>
### /etc/ha.d/ha.cf ###<br>
<tt>logfacility daemon # Log to syslog as facility "daemon"<br>
node db1 db2 # List our cluster members<br>
keepalive 1 # Send one heartbeat each second<br>
deadtime 10 # Declare nodes dead after 10 seconds<br>
bcast eth0 eth1 # Broadcast heartbeats on eth0 and eth1
interfaces<br>
ping 192.168.101.1 # Ping our router to monitor ethernet
connectivity<br>
auto_failback no # Don't fail back to paul automatically<br>
respawn hacluster /usr/lib/heartbeat/ipfail # Failover on network
failures<br>
</tt><br>
If you see any error in anything I've done, let me know, but for
reference, here is the post from Gerry Reno:<br>
<br>
<div><b>Author: </b>Gerry Reno<br>
<b>Date: </b>
2007-04-27 23:54 -0400<br>
<b>To: </b>drbd-user<br>
<b>Subject: </b>[DRBD-user] drbd 0.7.23: md device corruption<br>
</div>
I am experiencing md device (raid1) corruption when using drbd 0.7.23.
At times the whole machine hangs and a hard reboot is the only way to
recover. Once rebooted the raid array that holds the root drive goes
into full resync. This has happened on several machines. If I stop
drbd I do not have the problem. In a previous thread there was
discussion of a fix in 2.6.20 kernel. My problem is that I cannot
upgrade to this kernel. Has this fix been backported at all? I am
using FC6 with kernel 2.6.19-1.2985.fc6xen. Or is there any type of
workaround?
<br>
<br>
<br>
Dan Gahlinger wrote:
<blockquote
cite="mid:439b44d10705180953h5998ef36pb82ec1a06e068898@mail.gmail.com"
type="cite">Are you referring to the issues I had?<br>
This is a common problem if you try to set up DRBD on an existing
partition and don't take the meta-disk into consideration.<br>
<br>
When we ran this, it would "look" ok, but when trying to copy a file of
a certain size it would corrupt.
<br>
It turned out the magic number was around 183 megs for us.<br>
<br>
The key to that was to build the partition on the DRBD device which is
the way it's supposed to be installed.<br>
<br>
If you can let us know the exact steps you use to set up the drbd device,
the partition, and your configuration file,
<br>
someone can probably help you.<br>
<br>
Dan.<br>
<br>
<div><span class="gmail_quote">On 5/18/07, <b
class="gmail_sendername">Ryan Steele</b> <<a moz-do-not-send="true"
href="mailto:steele@agora-net.com">steele@agora-net.com</a>> wrote:</span>
<blockquote class="gmail_quote"
style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I
saw someone else post something similar to this a few weeks ago, but<br>
didn't see any response to it. I've just set up DRBD 0.7.23 with<br>
Heartbeat2 on two future database servers. However, DRBD seems to have<br>
corrupted my multi-disk RAID1. I booted a Knoppix CD on the affected<br>
machines, removed the DRBD rc.d scripts, and rebooted and things were
<br>
fine. To verify, I ran update-rc.d to recreate the symbolic links, and<br>
rebooted again to find that it again would not boot. Moreover, even<br>
removing the rc.d links did not help - the array is, I fear,
irreparably
<br>
damaged.<br>
<br>
Is there any acknowledgement of this bug, or are there any suggestions<br>
as to how one might go about fixing it? I can't even boot into the<br>
machine to run mdadm and repair the array, though maybe I can do that
<br>
from the Knoppix CD...<br>
<br>
In any case, I just wanted to make people aware, and hopefully get a<br>
little feedback. Thanks.<br>
<br>
<br>
--<br>
Ryan Steele<br>
Systems Administrator<br>
<br>
<br>
</blockquote>
</div>
<br>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Ryan Steele
Systems Administrator <a class="moz-txt-link-abbreviated" href="mailto:steele@agora-net.com">steele@agora-net.com</a>
AgoraNet, Inc.                                  (302) 224-2475
314 E. Main Street, Suite 1 (302) 224-2552 (fax)
Newark, DE 19711 <a class="moz-txt-link-freetext" href="http://www.agora-net.com">http://www.agora-net.com</a></pre>
</body>
</html>