[DRBD-user] Reply: Re: question about start drbd on single node after a power outage
Robert.Koeppl at knapp.com
Tue Jan 31 16:53:55 CET 2012
Hi!
You can force the node into primary by issuing
"drbdadm -- --overwrite-data-of-peer primary all". The option name is quite
self-explanatory regarding its consequences when the other node comes up again.
If you issue a "drbdadm disconnect all" after that, you make sure it does
not overwrite the other node when it comes up, so you can do some recovery
on the other node in case you are missing some data and are lucky.
"drbdadm connect all" should then re-establish the connection later on.
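
The sequence above could be wrapped in a small shell helper, roughly as sketched below for DRBD 8.3. The function names and the DRBDADM override variable are hypothetical and exist only for illustration; the drbdadm invocations themselves are the ones quoted in this message.

```shell
#!/bin/sh
# Sketch only: the function names and the DRBDADM variable are illustrative
# assumptions; the drbdadm commands are the ones quoted in this message.
DRBDADM="${DRBDADM:-drbdadm}"

# Promote the local node although the peer is unreachable, then disconnect
# so that the returning peer is not overwritten straight away.
force_primary_isolated() {
    res="${1:-all}"
    "$DRBDADM" -- --overwrite-data-of-peer primary "$res" || return 1
    "$DRBDADM" disconnect "$res"
}

# Re-establish the connection once any recovery on the peer is finished.
reconnect_peer() {
    "$DRBDADM" connect "${1:-all}"
}
```

A dry run with DRBDADM=echo prints the commands instead of executing them, which is a cheap way to check the argument order before touching a real resource.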
Best regards
Robert Köppl
Customer Support & Projects
Teamleader IT Support
KNAPP Systemintegration GmbH
Waltenbachstraße 9
8700 Leoben, Austria
Phone: +43 3842 805-322
Fax: +43 3842 82930-500
robert.koeppl at knapp.com
www.KNAPP.com
Commercial register number: FN 138870x
Commercial register court: Leoben
"Xing, Steven" <SXing at BroadViewNet.com>
Sent by: drbd-user-bounces at lists.linbit.com
31.01.2012 16:46
To: "Kaloyan Kovachev" <kkovachev at varna.net>, "Digimer" <linux at alteeve.com>
Cc: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] question about start drbd on single node after a power
outage
Both of your responses were fast ;)
Thanks, all. I understand that there is a high risk of split brain and data
loss if I force one node to come up alone.
But in my case, service recovery is much more important than avoiding data loss.
I have a 2-node pacemaker cluster, with drbd running as master/slave (controlled
by pacemaker).
When power is lost on both nodes (just like unplugging the power cable from both
nodes at the same time), there seems to be no chance for drbd to do anything
(such as fencing the peer). Thus, after I booted up only the previous primary node,
I saw pacemaker trying to promote drbd to master, but it failed.
When I run drbdadm status, it shows:
<drbd-status version="8.3.11" api="88">
<resources config_file="/etc/drbd.conf">
<resource minor="0" name="vksshared" cs="WFConnection" ro1="Secondary"
ro2="Unknown" ds1="Consistent" ds2="DUnknown" />
</resources>
</drbd-status>
I tried setting both wfc timeouts to 5 seconds, but that did not work.
As you can see, the drbd service started but cannot be promoted, since
ds1="Consistent"; only "UpToDate" will work.
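
For scripting around this state, the local disk state could be pulled out of the XML shown above with something like the sketch below. The helper names local_disk_states and all_up_to_date are hypothetical, and the parsing assumes the DRBD 8.3 status layout exactly as printed in this message (name and ds1 attributes on the resource element).

```shell
#!/bin/sh
# Sketch only: both helper names are illustrative assumptions.
# Reads `drbdadm status` XML (8.3 layout, as shown above) on stdin and
# prints "<resource-name> <ds1>" for every resource found.
local_disk_states() {
    awk '
        # Remember the most recent resource name attribute.
        match($0, /name="[^"]*"/) { name = substr($0, RSTART + 6, RLENGTH - 7) }
        # Print the pair once the local disk state attribute appears.
        match($0, /ds1="[^"]*"/)  { print name, substr($0, RSTART + 5, RLENGTH - 6) }
    '
}

# Succeeds only if at least one resource is listed and every ds1 is UpToDate.
all_up_to_date() {
    local_disk_states | awk '$2 != "UpToDate" { bad = 1 } END { exit (bad || NR == 0) }'
}
```

Typical use would be `drbdadm status | all_up_to_date`, whose exit status then says whether a promotion attempt has any chance of succeeding.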
Only when I boot up the other node, right after the two drbd instances
connect, does drbd declare the disk state as "UpToDate".
Without booting the other node, I have not found an automatic way, safe
or not, to force it to consider itself "UpToDate".
It seems drbd does not remember its previous role (primary or
slave), for some safety reason.
Do you have any ideas or comments on this? I looked into the docs but could not
find any setting that makes this happen, even an unsafe one.
If I upgrade drbd to the latest version, will it help?
-----Original Message-----
From: Kaloyan Kovachev [mailto:kkovachev at varna.net]
Sent: January-31-12 9:35 AM
To: Digimer
Cc: Xing, Steven; drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] question about start drbd on single node after a
power outage
You were faster than me :)
On Tue, 31 Jan 2012 09:04:49 -0500, Digimer <linux at alteeve.com> wrote:
>
> If you want to force the issue though, you can use 'wfc-timeout 300',
> which will tell DRBD to wait up to 5 minutes for its peer. After that
> time, it will consider itself primary. Please don't use this though until
> you've exhausted all other ways of starting safely.
There are two (well documented) options in drbd.conf: wfc-timeout and
degr-wfc-timeout. To avoid split-brain I set both to 0 (wait indefinitely).
If you need to skip the wait, you can do this manually from the console,
provided you start drbd standalone or before cman / pacemaker.
In my case the storage is exported via iSCSI (not as a cluster resource), so I
have an additional wait loop for both nodes to become UpToDate for _all_
configured resources before exporting any of them. 'No data' is better
than 'broken data'. Yes, I have been bitten by the latter (luckily
during the preproduction phase), and believe me, you don't want that on
production nodes (unless you have static read-only data).
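
A wait loop of the kind described above might look roughly like the following sketch. This is not the poster's actual script: wait_all_uptodate, DSTATE_CMD and POLL_INTERVAL are hypothetical names, and the code assumes that `drbdadm dstate all` prints one "local/peer" line per resource, such as "UpToDate/UpToDate".

```shell
#!/bin/sh
# Sketch only: wait_all_uptodate, DSTATE_CMD and POLL_INTERVAL are
# illustrative assumptions. `drbdadm dstate all` is assumed to print one
# "local/peer" disk state line per configured resource.
DSTATE_CMD="${DSTATE_CMD:-drbdadm dstate all}"

# Poll until every resource reports UpToDate/UpToDate, or give up after
# the given number of attempts (default 60). Returns 0 when ready.
wait_all_uptodate() {
    max_tries="${1:-60}"
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        states="$($DSTATE_CMD 2>/dev/null)"
        # Ready only if there is output and no line differs from UpToDate/UpToDate.
        if [ -n "$states" ] && ! printf '%s\n' "$states" | grep -qvx 'UpToDate/UpToDate'; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "${POLL_INTERVAL:-5}"
    done
    return 1
}
```

The export would then be started only after `wait_all_uptodate` returns success, so that a timeout leaves the storage unexported instead of serving possibly stale data.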
_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user