[DRBD-user] Reply: Re: question about start drbd on single node after a power outage

Tue Jan 31 17:11:50 CET 2012

On Tue, 31 Jan 2012 16:53:55 +0100, Robert.Koeppl at knapp.com wrote:
> Hi!
> You can force the node into primary by issuing
> "drbdadm -- --overwrite-data-of-peer primary all". The consequences for
> the other node when it comes up again are quite self-explanatory.
> If you issue a "drbdadm disconnect all" after that, you make sure it does
> not overwrite the other node when it comes up, so you can attempt some
> recovery on the other node in case you are missing some data and are lucky.
> A later "drbdadm connect all" should re-establish the connection.

You may also use 'invalidate-remote', but keep in mind that a full sync will
take place. Taking a snapshot in the 'before-resync-target' handler may help
restore some data without disconnecting.
Since you are absolutely sure that's what you want, for automatic recovery use:
after-sb-0pri discard-younger-primary
after-sb-1pri discard-secondary
after-sb-2pri violently-as0p
rr-conflict   violently
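
For context, a minimal sketch of where these settings would sit in drbd.conf,
assuming DRBD 8.3 syntax (where the option names are spelled after-sb-0pri and
rr-conflict). The resource name vksshared is taken from the status output
quoted below. The handler script paths are an assumption based on the LVM
snapshot helpers shipped with DRBD, and they only apply if the backing device
is an LVM volume with free space in its volume group:

```
resource vksshared {
  net {
    # automatic split-brain recovery; every one of these may discard data
    after-sb-0pri discard-younger-primary;
    after-sb-1pri discard-secondary;
    after-sb-2pri violently-as0p;
    rr-conflict   violently;
  }
  handlers {
    # assumed paths: snapshot the resync target before it gets overwritten,
    # and drop the snapshot again once the resync has finished
    before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh";
    after-resync-target  "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
  }
}
```

Run "drbdadm dump all" afterwards to check that the file still parses.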

> Best Regards
> 
> Robert Köppl
> 
> Customer Support & Projects 
> Teamleader IT Support
> 
> KNAPP Systemintegration GmbH
> Waltenbachstraße 9
> 8700 Leoben, Austria 
> Phone: +43 3842 805-322
> Fax: +43 3842 82930-500
> robert.koeppl at knapp.com 
> www.KNAPP.com 
> 
> Commercial register number: FN 138870x
> Commercial register court: Leoben
> 
> 
> "Xing, Steven" <SXing at BroadViewNet.com> 
> Sent by: drbd-user-bounces at lists.linbit.com
> 31.01.2012 16:46
> 
> To: "Kaloyan Kovachev" <kkovachev at varna.net>, "Digimer" <linux at alteeve.com>
> Cc: drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] question about start drbd on single node after a
> power outage
> 
> You both responded quickly ;)
> Thanks all. I understand that there is a high risk of split brain and data
> loss if I force one node to come up on its own.
> But in my case, service recovery is much more important than data loss.
> 
> I have a 2-node pacemaker cluster, with drbd running as master/slave
> (controlled by pacemaker).
> When there is a power outage on both nodes (just like unplugging the power
> cable from both nodes at the same time), there seems to be no chance for
> drbd to do anything (such as fence the peer). Thus, after I boot up just
> the previous primary node, I see pacemaker trying to promote drbd to
> master, but it fails. If I run drbdadm status, it shows:
> 
> <drbd-status version="8.3.11" api="88">
> <resources config_file="/etc/drbd.conf">
> <resource minor="0" name="vksshared" cs="WFConnection" ro1="Secondary" 
> ro2="Unknown" ds1="Consistent" ds2="DUnknown" />
> </resources>
> </drbd-status>
> 
> I tried setting both of the wfc timeouts to 5 sec, but that did not work.
> 
> As you can see, the drbd service started, but the resource cannot be
> promoted because ds1="Consistent"; only "UpToDate" will work.
> Only when I boot up the other node, just after the 2 drbd instances
> connect, can drbd declare the disk status as "UpToDate".
> 
> Without booting up the other node, I have not found an automatic way, safe
> or not, to force it to consider itself "UpToDate".
> It seems drbd does not remember its previous running state (primary or
> slave), for some safety reason.
> 
> Do you have any ideas/comments on this? I looked into the docs but could not
> find any setting that can do this, even an unsafe one.
> If I upgrade drbd to the latest version, will it help?
> 
> 
> -----Original Message-----
> From: Kaloyan Kovachev [mailto:kkovachev at varna.net] 
> Sent: January-31-12 9:35 AM
> To: Digimer
> Cc: Xing, Steven; drbd-user at lists.linbit.com
> Subject: Re: [DRBD-user] question about start drbd on single node after a
> power outage
> 
> You were faster than me :)
> 
> On Tue, 31 Jan 2012 09:04:49 -0500, Digimer <linux at alteeve.com> wrote:
>> 
>> If you want to force the issue though, you can use 'wfc-timeout 300'
>> which will tell DRBD to wait up to 5 minutes for its peer. After that
>> time, it will consider itself primary. Please don't use this though until
>> you've exhausted all other ways of starting safely.
> 
> There are two (well documented) options in drbd.conf: wfc-timeout and
> degr-wfc-timeout. To avoid split-brain I set both to 0.
> 
> If you need to skip the waiting, you can do this manually from the console
> in case you start drbd standalone or before cman / pacemaker.
> 
> In my case it is exported via iSCSI (not as a cluster resource), so I have
> an additional wait loop for both nodes to become UpToDate for _all_
> configured resources before exporting any of them. 'No data' is better
> than 'broken data': yes, I have been bitten by the latter (luckily
> during the preproduction phase), and believe me, you don't want that on
> production nodes (unless you have static read-only data).
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
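
For reference, the two timeouts discussed in the quoted thread go in the
startup section of drbd.conf. A sketch with the values I mentioned, assuming
DRBD 8.3 syntax and the resource name from the status output above:

```
resource vksshared {
  startup {
    # 0 means wait for the peer without a time limit
    wfc-timeout      0;
    # used instead of wfc-timeout when the node was already running
    # degraded (without its peer) before the reboot
    degr-wfc-timeout 0;
  }
}
```

As far as I know, these values only bound how long the init script waits for
the peer at boot. They do not by themselves change the disk state or promote
the resource, which would be consistent with the 5 second setting not helping
in the case described above.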