[DRBD-user] drbd peer outdated plugin

Mon Oct 15 16:00:17 CEST 2007

Hi all,

Following the example in Florian's post
(http://fghaas.wordpress.com/2007/10/01/an-underrated-cluster-admins-companion-dopd/),
I'm testing the outdate-peer plugin.

My scenario: two Debian machines (OV-HA1 primary, OV-HA2 secondary),
heartbeat+drbd, 1 ethernet + 1 serial cable (the ethernet is used both
for drbd replication and to expose services).
I know that a dedicated ethernet connection between the two nodes
is recommended for drbd data synchronization, but for testing purposes
this is the scenario :).
Heartbeat is configured with ipfail, so when the ethernet connection
goes down, heartbeat migrates the services to the other node.

Obviously, in this configuration the trouble appears when I unplug the
OV-HA1 (primary) link. I'm testing the outdate-peer daemon described in
your post because without this plugin the secondary becomes primary (and
this is OK), but when I reconnect the ethernet the 2 nodes are
"standalone" and do not resynchronize their drbd partitions (this is the
case of "drbd split brain").
Now, with the configuration from your post:

    * in OV-HA2's ha-log I see this warning: /WARN: check_drbd_peer:
      drbd peer OV-HA1 was not found;/
    * however the plugin seems to work, because my OV-HA2 is now outdated;
    * after the log message above, I see in OV-HA2's ha-log:
      /ResourceManager[6217]:  2007/10/15_14:54:47 ERROR: Return code 20
      from /etc/ha.d/resource.d/drbddisk
      ResourceManager[6217]:  2007/10/15_14:54:47 CRIT: Giving up
      resources due to failure of drbddisk::ovHA/
    * investigating the syslog, I see that OV-HA2 fails to become
      primary:
      /Oct 15 14:54:47 localhost kernel: drbd0: State change failed:
      Refusing to be Primary without at least one UpToDate disk
      Oct 15 14:54:47 localhost kernel: drbd0:   state = {
      cs:WFConnection st:Secondary/Unknown ds:Outdated/DUnknown r--- }
      Oct 15 14:54:47 localhost kernel: drbd0:  wanted = {
      cs:WFConnection st:Primary/Unknown ds:Outdated/DUnknown r--- }
      Oct 15 14:54:47 localhost kernel: ttyS0: 1 input overrun(s)
      Oct 15 14:54:47 localhost ResourceManager[6217]: debug:
      /etc/ha.d/resource.d/drbddisk ovHA start done. RC=20
      Oct 15 14:54:47 localhost ResourceManager[6217]: ERROR: Return
      code 20 from /etc/ha.d/resource.d/drbddisk
      Oct 15 14:54:47 localhost ResourceManager[6217]: CRIT: Giving up
      resources due to failure of drbddisk::ovHA/
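
To make the refused state change above easier to read, here is a minimal Python sketch that pulls the fields out of the "state = { ... }" line and applies the rule quoted in the kernel message ("Refusing to be Primary without at least one UpToDate disk"). The function names parse_drbd_state and can_become_primary are hypothetical helpers written for this illustration, not part of drbd or heartbeat, and the promotion check is a simplified assumption rather than drbd's full logic.

```python
import re

# Matches the state dump seen in the syslog excerpt, e.g.
# "cs:WFConnection st:Secondary/Unknown ds:Outdated/DUnknown r---"
STATE_RE = re.compile(
    r"cs:(?P<cs>\w+)\s+st:(?P<role>\w+)/(?P<peer_role>\w+)\s+"
    r"ds:(?P<disk>\w+)/(?P<peer_disk>\w+)"
)


def parse_drbd_state(line):
    """Return the fields of a state dump as a dict, or None if the line has none."""
    match = STATE_RE.search(line)
    return match.groupdict() if match else None


def can_become_primary(state):
    """Simplified assumption: promotion needs an UpToDate disk, local or peer."""
    return "UpToDate" in (state["disk"], state["peer_disk"])


# The "state =" line from the syslog excerpt, joined onto one line.
log_line = (
    "Oct 15 14:54:47 localhost kernel: drbd0:   state = { "
    "cs:WFConnection st:Secondary/Unknown ds:Outdated/DUnknown r--- }"
)
state = parse_drbd_state(log_line)
print(state["disk"], can_become_primary(state))  # prints: Outdated False
```

Read this way, the log says that the local disk on OV-HA2 is Outdated and the peer's disk state is unknown (DUnknown), so no UpToDate copy is visible and the switch to Primary is rejected.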

Is it correct that, in my scenario:

    * the plugin outdates the secondary when the ethernet fails;
    * the secondary fails to become primary because it is now marked
      as "outdated"? :)

Is there a solution?

Best regards,
Matteo.
