<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
</head>
<body bgcolor="#ffffff" text="#000000">
Hi all,<br>
<br>
Following the example in Florian's post
(<a class="moz-txt-link-freetext"
href="http://fghaas.wordpress.com/2007/10/01/an-underrated-cluster-admins-companion-dopd/">http://fghaas.wordpress.com/2007/10/01/an-underrated-cluster-admins-companion-dopd/</a>),
I'm testing the outdate-peer plugin.<br>
<br>
My scenario: two Debian machines (OV-HA1 primary, OV-HA2 secondary),
heartbeat + DRBD, one Ethernet link and one serial cable (the Ethernet link is
used both for DRBD replication and to expose services).<br>
I know that a dedicated Ethernet connection between the two nodes
is recommended for DRBD data synchronization, but for testing purposes this
is the setup :).<br>
Heartbeat is configured with ipfail, so when the Ethernet connection
goes down, heartbeat migrates the services to the other node.<br>
<br>
Obviously, in this configuration the trouble appears when I unplug the
OV-HA1 (primary) link. I'm testing the outdate-peer daemon described in the
post because, without this plugin, the secondary becomes primary
(and this is OK), but when I reconnect the Ethernet cable the two nodes are
"standalone" and do not resynchronize their DRBD partitions (this is the
DRBD split-brain case). <br>
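<br>
For reference, the dopd setup I mean looks roughly like the sketch below.
The resource name ovHA is the one from my logs; the paths under
/usr/lib/heartbeat are the usual ones and are an assumption here (they may
differ on other installations), and only the relevant parts are shown.<br>
<pre>
# /etc/ha.d/ha.cf (relevant lines only; paths are assumed)
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster

# /etc/drbd.conf (relevant sections only; path is assumed)
resource ovHA {
  handlers {
    # called by DRBD when the replication link is lost
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
  }
  disk {
    # ask the handler to outdate the peer before continuing as primary
    fencing resource-only;
  }
}
</pre>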
Now, with the configuration from the post: <br>
<ul>
  <li>in OV-HA2's ha-log I see this warning: <i>WARN:
check_drbd_peer: drbd peer OV-HA1 was not found;</i><br>
</li>
<li>however the plugin seems to work, because my OV-HA2 is now
outdated;</li>
<li>after the log message above, I see in OV-HA2's ha-log: <br>
<i>ResourceManager[6217]: 2007/10/15_14:54:47 ERROR: Return code
20 from /etc/ha.d/resource.d/drbddisk<br>
ResourceManager[6217]: 2007/10/15_14:54:47 CRIT: Giving up resources
due to failure of drbddisk::ovHA</i></li>
  <li>investigating the syslog, I see that OV-HA2 fails to become
primary:<br>
<i>Oct 15 14:54:47 localhost kernel: drbd0: State change failed:
Refusing to be Primary without at least one UpToDate disk<br>
Oct 15 14:54:47 localhost kernel: drbd0: state = { cs:WFConnection
st:Secondary/Unknown ds:Outdated/DUnknown r--- }<br>
Oct 15 14:54:47 localhost kernel: drbd0: wanted = { cs:WFConnection
st:Primary/Unknown ds:Outdated/DUnknown r--- }<br>
Oct 15 14:54:47 localhost kernel: ttyS0: 1 input overrun(s)<br>
Oct 15 14:54:47 localhost ResourceManager[6217]: debug:
/etc/ha.d/resource.d/drbddisk ovHA start done. RC=20<br>
Oct 15 14:54:47 localhost ResourceManager[6217]: ERROR: Return code 20
from /etc/ha.d/resource.d/drbddisk<br>
Oct 15 14:54:47 localhost ResourceManager[6217]: CRIT: Giving up
resources due to failure of drbddisk::ovHA</i><br>
</li>
</ul>
So, if I understand correctly, in my scenario now: <br>
<ul>
  <li>the plugin outdates the secondary when the Ethernet link fails;</li>
<li>the secondary fails to become primary because now it is marked
as "outdated" :)</li>
</ul>
<br>
<div id="result_box" dir="ltr">Is there a solution?</div>
<br>
Best regards,<br>
Matteo.<br>
<div class="moz-signature"><br>
</div>
</body>
</html>