<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

</head>

<body bgcolor="#ffffff" text="#000000">

Hi all,<br>

<br>

following the example in Florian's post

(<a class="moz-txt-link-freetext"

 href="http://fghaas.wordpress.com/2007/10/01/an-underrated-cluster-admins-companion-dopd/">http://fghaas.wordpress.com/2007/10/01/an-underrated-cluster-admins-companion-dopd/</a>)

I'm testing the outdate-peer plugin.<br>

<br>

My scenario: two Debian machines (OV-HA1 primary, OV-HA2 secondary),
heartbeat + DRBD, one ethernet link + one serial cable (the ethernet link
is used both for DRBD replication and to expose services).<br>
I know that a dedicated ethernet connection between the two nodes
is recommended for DRBD data synchronization, but for testing purposes
this is the setup :).<br>
Heartbeat is configured with ipfail, so when the ethernet connection
goes down, heartbeat migrates the services to the other node.<br>

<br>

Obviously, in this configuration the trouble appears when I unplug the
OV-HA1 (primary) link. I'm testing the outdate-peer daemon described in
your post because without this plugin the secondary becomes primary
(and this is OK), but when I reconnect the ethernet the two nodes are
"standalone" and do not resynchronize their DRBD partitions (this is the
"DRBD split brain" case). <br>

Now with your post's configuration: <br>

<ul>

  <li>in OV-HA2's ha-log  I see this warning  <i>WARN:

check_drbd_peer: drbd peer OV-HA1 was not found;</i><br>

  </li>

  <li>however the plugin seems to work, because my OV-HA2 is now

outdated;</li>

  <li>after the log message above, I see in OV-HA2's ha-log: <br>

    <i>ResourceManager[6217]:  2007/10/15_14:54:47 ERROR: Return code

20 from /etc/ha.d/resource.d/drbddisk<br>

ResourceManager[6217]:  2007/10/15_14:54:47 CRIT: Giving up resources

due to failure of drbddisk::ovHA</i></li>

  <li>investigating the syslog, I see that OV-HA2 fails to become primary:<br>

    <i>Oct 15 14:54:47 localhost kernel: drbd0: State change failed:

Refusing to be Primary without at least one UpToDate disk<br>

Oct 15 14:54:47 localhost kernel: drbd0:   state = { cs:WFConnection

st:Secondary/Unknown ds:Outdated/DUnknown r--- }<br>

Oct 15 14:54:47 localhost kernel: drbd0:  wanted = { cs:WFConnection

st:Primary/Unknown ds:Outdated/DUnknown r--- }<br>

Oct 15 14:54:47 localhost kernel: ttyS0: 1 input overrun(s)<br>

Oct 15 14:54:47 localhost ResourceManager[6217]: debug:

/etc/ha.d/resource.d/drbddisk ovHA start done. RC=20<br>

Oct 15 14:54:47 localhost ResourceManager[6217]: ERROR: Return code 20

from /etc/ha.d/resource.d/drbddisk<br>

Oct 15 14:54:47 localhost ResourceManager[6217]: CRIT: Giving up

resources due to failure of drbddisk::ovHA</i><br>

  </li>

</ul>
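For reference, the dopd setup from the post boils down to something like
the sketch below. It is an outline under assumptions, not a verbatim copy
of any real file: the /usr/lib/heartbeat paths, the hacluster/haclient
names and the -t 5 timeout are the usual defaults and may differ on other
installs; ovHA is the resource name seen in the logs above.<br>
<pre>
# /etc/ha.d/ha.cf (paths and user/group names are assumed defaults)
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster

# /etc/drbd.conf (resource name taken from the logs; the rest is assumed)
resource ovHA {
  disk {
    fencing resource-only;
  }
  handlers {
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
  }
  # ... remaining sections unchanged ...
}
</pre>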

Is it correct that, in my scenario: <br>

<ul>

  <li>the plugin outdates the secondary when the ethernet link fails;</li>
  <li>the secondary fails to become primary because it is now marked
as "outdated" :)</li>

</ul>

<br>

<div id="result_box" dir="ltr">Is there a solution?</div>

<br>

Best regards,<br>

Matteo.<br>


</body>

</html>