Scenario:
* You have a mailserver (postfix, exim4, whatever...) in an
HA cluster and mailserver spool directories are replicated through
a drbd resource.
* You have heartbeat installed and configured. Heartbeat communicates
over two interfaces, eth0 and eth1; eth1 is also the interface
used by drbd for replication.
* The DRBD resource containing the replicated directories is synchronized
using protocol C.
* You want to get a mail notification every time the fence-peer (or
outdate-peer) handler is called.
I created /usr/lib/drbd/notify-fence-peer.sh, based on the original
/usr/lib/drbd/notify.sh script. Then I modified
/etc/drbd.d/global_common.conf so that this script is called when
fence-peer is invoked:
fence-peer "/usr/lib/drbd/notify-fence-peer.sh; /usr/lib/heartbeat/drbd-peer-outdater -t 5";
If I bring down eth1 (the drbd interface) on the secondary peer and look at
/proc/drbd on the primary, instead of getting:
0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated C r----
I get:
0: cs:NetworkFailure ro:Primary/Unknown ds:UpToDate/DUnknown C r----
If I try to disconnect the resource via "drbdadm disconnect r0", it gets
stuck in the Disconnecting state. If I then run sync, that gets stuck
as well and never returns.
I discovered that this problem is due to the mail command inside
notify-fence-peer.sh. Since the mailserver spool directories are on the drbd
resource, whose I/O is frozen during the outdate procedure, the
mail server cannot send the email and the fence-peer handler gets stuck.
To solve this issue I modified the last line of notify-fence-peer.sh from:
echo "$BODY" | mail -s "$SUBJECT" $RECIPIENT
to:
sleep 10s && echo "$BODY" | mail -s "$SUBJECT" $RECIPIENT &
This way drbd can complete the fence-peer handler and I/O on the replicated
resource is unfrozen, so the mail command can complete successfully
after 10 seconds.
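For illustration, the same idea could be wrapped in a small function at
the end of notify-fence-peer.sh. This is only a sketch, not a tested
fix: it assumes BODY, SUBJECT and RECIPIENT are already set by the
script, and the names send_deferred, DELAY and MAILER are hypothetical
ones I made up here. Redirecting the standard streams of the background
job is an extra precaution; I have not verified that it is required.

```shell
#!/bin/sh
# Sketch only. BODY, SUBJECT and RECIPIENT are expected to be set
# earlier in notify-fence-peer.sh. DELAY and MAILER are hypothetical
# knobs added for illustration; they are not part of DRBD.

send_deferred() {
    # Run the delayed mail in a background subshell so that the
    # handler itself returns immediately.
    (
        sleep "${DELAY:-10}"
        # $RECIPIENT is left unquoted on purpose so that it may hold
        # several space-separated addresses.
        echo "$BODY" | "${MAILER:-mail}" -s "$SUBJECT" $RECIPIENT
    ) </dev/null >/dev/null 2>&1 &
}
```

The last line of the script would then simply be a call to
send_deferred, and the handler exits while the mail is still pending.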
Am I doing this the right way, or is there some issue I should consider?
Thank you
--
Dario Fiumicello