[DRBD-user] can't mount on secondary node

CAMPBELL Robert robert.campbell at morpho.com
Wed Feb 1 17:30:19 CET 2012

On 1-2-2012 16:56, Martin Gerhard Loschwitz wrote:
>> Martin,
>> I'm actually running a dual-active Samba server with a shared GFS2
>> file-system (block-replicated by DRBD). I'm also (ab)using it for an
>> Apache/Tomcat installation with session replication in Tomcat through a
>> shared file-system.
>> Of course, all this is balanced using RR-DNS, and when one node fails,
>> the cluster resource (IP address) is taken over by the surviving node to
>> re-establish service (at somewhat lower performance).
>> Did a kitten just die?
>> Robert Campbell
> Robert,
> what sort of STONITH do you use for this setup? How did the system react
> the last time the interconnect between the two nodes was broken
> but they were still up and running? And when did you last test the
> fail-over capabilities of your cluster?
> What happens, if one of your nodes fails and GFS decides that it has
> to fence it, until that fencing process is actually done?
> I'll get a shovel while waiting for the answer. ;-)
> Best regards
> Martin

STONITH is power-off through IPMI on HP iLO, so that should be sorted.
The system is still in the testing phase, but I'll test the loss of an
iLO connection followed by the failure of a node (will IPMI fencing
still report a successful fence?).

The last time (still during the initial build-up tests) the surviving
node STONITHed the node that lost its network connection. (It's
difficult to test a failure scenario other than a network failure. How
do you cause a kernel panic on purpose?) RHEL clustering reported a
successful fence and everything continued working happily on the
surviving node. The connection used for STONITH is made redundantly to
a pair of switches. Unfortunately the connection to the iLO is not
redundant, but losing it would be a failure on top of a failure (so not
really a concern for us).
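
On the kernel-panic question: one common way on Linux is the magic
SysRq trigger, where writing the character "c" to /proc/sysrq-trigger
as root crashes the kernel on the spot. A minimal Python sketch,
assuming a disposable test node; the function name and the path
parameter are hypothetical and only there for illustration:

```python
from pathlib import Path


def trigger_kernel_crash(sysrq_trigger="/proc/sysrq-trigger"):
    """Ask the kernel to crash itself via the magic SysRq interface.

    Hypothetical helper for fencing tests. Writing "c" to the real
    /proc/sysrq-trigger needs root and takes the node down immediately,
    without a clean shutdown, so only use it on a node you intend to lose.
    The path is a parameter so the function can be dry-run against an
    ordinary file.
    """
    Path(sysrq_trigger).write_text("c")


# Dry run against a scratch file (harmless):
#   trigger_kernel_crash("/tmp/fake-sysrq-trigger")
# Real crash, as root on the node that should be fenced:
#   trigger_kernel_crash()
```

The surviving node should then see its peer disappear with the network
links still physically up, which is a different scenario from pulling
a cable.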

During the build-up we also noticed that if there is no successful fence
(fencing was not configured at the time), the surviving node gets a
file-system time-out. Rather unfortunate. We're thinking of implementing
a manual fence mechanism in addition to the automatic one, so that I/O
can continue as soon as an administrator has dialed in to the cluster
after receiving the e-mail about the failure. But again, this would
require a failure on top of a failure.

I feel confident the shovel can be put back into the cupboard.

