[DRBD-user] Question about Linux-HA, stonith and data loss

Wed Jan 11 00:07:06 CET 2006

Lars,

Apologies for taking so long, and still not having read enough ...

On Tue, Dec 27, 2005 at 06:17:44PM +0100, Lars Ellenberg wrote:
> > Could I get you to elaborate on what Protocols A,B and C do in drbd-0.7
> > and in relation to connection loss ?
> 
> well.  basically all the same:
> 
> mark all writes that are not definitely acknowledged as written by the
> peer as out of sync in our bitmap, which basically schedules these
> blocks for resynchronization when we see our peer.
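
(To check I've understood the bookkeeping Lars describes, here is a toy
sketch of it. This is not drbd code: the class, the method names and the
in-flight set are all my own invention for illustration, and I'm assuming
that writes made while disconnected are marked in the same way.)

```python
class DirtyBitmap:
    """Toy model (hypothetical, not drbd's implementation): one flag per block,
    set when a write is not known to have been written by the peer."""

    def __init__(self, n_blocks):
        self.out_of_sync = [False] * n_blocks
        self.in_flight = set()  # writes sent to the peer but not yet acknowledged

    def write_sent(self, block):
        self.in_flight.add(block)

    def write_acked(self, block):
        self.in_flight.discard(block)

    def connection_lost(self):
        # Anything not definitely acknowledged is marked out of sync.
        for block in self.in_flight:
            self.out_of_sync[block] = True
        self.in_flight.clear()

    def local_write_while_disconnected(self, block):
        # Assumption: writes made with no peer in sight are marked as well.
        self.out_of_sync[block] = True

    def blocks_to_resync(self):
        # The blocks scheduled for resynchronization when the peer returns.
        return [i for i, dirty in enumerate(self.out_of_sync) if dirty]


bm = DirtyBitmap(8)
bm.write_sent(1)
bm.write_sent(2)
bm.write_acked(1)                      # block 1 is safely on the peer
bm.connection_lost()                   # block 2 was never acknowledged
bm.local_write_while_disconnected(5)
resync = bm.blocks_to_resync()
```

If that is right, the bitmap only tells you what to copy once the peer is
back. It says nothing about when the application was told the write
completed, which is the part I'm asking about.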

So, what do these protocols buy ?

I realise I was assuming a write-barrier somewhere, and I'm still 
imagining that I've simply used the overly-specific terminology
"block on write". Now I've started to think about write barriers,
the whole journalling thing becomes clearer.

So is there a write barrier there _somewhere_ ? 
or is it more of a rehearsal/accounting thing ?
or am I still missing something clever ?

And since it may help to have concrete examples, when dealing with
my sloppy wording, here's a couple:

see previous conversation in "3 simple questions for a nifty setup"
(sorry! painted myself into a corner in my mua and I'm too lazy 
to levitate out ;)

and here's another try at answering Christof's mail (not because I'm
trying to audition or anything !!! but because I noticed something I
didn't quite register properly before and I wondered what difference
it made)
==============
On Wed, Dec 14, 2005 at 06:59:29PM +0100, Christof Amelunxen wrote:
> Hi all,
> 
> we are currently implementing an informix dbms cluster using Linux-HA
> (1.2.3) and drbd (0.7.5) on SLES9 SP2. Everything is working perfectly
> well so far, thanks a lot for all the work that has been done.
> 
> I have a question about a special situation that may be an FAQ but still I
> didn't find any answers yet:
> 
> 1. NodeA (P) --- NodeB (S)   # everything ok
> 2. NodeA (P) - - NodeB (S)   # DRBD detects connection loss, goes WFC
> 3. NodeA (P) - - NodeB (S)   # Linux-HA detects split brain, A kills B

surely you mean B kills A, from what follows ?

I don't understand why you would configure a pair that could go split 
brain not to stonith the secondary node in preference, or is that 
another thing that went wrong ?

> 4.   /       - - NodeB (P)   # NodeB takes over, goes primary
> 
> There have been writes on NodeA between step 2 and 3. 

So what is happening exactly ?

I would worry more about writes "between 1 and 2", i.e. writes that haven't
reached the wire at the time A ceases to function (if indeed that is what
you're thinking of).

For that (simpler) case, the question is simply "Is it _ever_ safe to return 
a committed transaction at A".  I was kinda hoping the answer to that was
"yes, and you get that transparently if you use the same kind of write
barrier you should want to use for a local disk", but I'm revisiting that
question :)

> These are lost after
> Linux-HA has killed A and made B primary. 

but why ? why did you do that ? even so ...

> I know the best solution is to
> avoid this situation by any chance 

okay I missed this bit. you're a bit vague about "this".

> and we are using serial heartbeats, too, 

non-sequitur or am I on the wrong page of the book ?

> but what if it happens anyway?

Reading you literally then "yes, the writes are lost", but I'm seriously
worried that you're asking the wrong question ...

I mean, what did you think was going on? quantum entanglement ?

You make no statement about what happens next.

Working from my previous set of assumptions (see ongoing discussion):

I have seen a db run split brain over drbd and had to reconcile the
split halves and put them back together.  While I grant you the stonith
adds an extra dimension of fun, and if you simply invalidate without
a reconciliation your writes are really lost, my vague recollection is
that the system either a) will not do that by default or b) needs to be
manually told to do that.

Go back to what a transaction is.  If going split brain is a scenario
you need to consider, then ask about how that impacts your transactions.
A bank balance is a classic example: in a split brain you could lose the
all-important "have I reached my limit" property.
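
(A toy illustration of what I mean, nothing to do with drbd or informix
internals. The limit, the amounts and the function name are made up: each
half of a split brain checks the limit against its own copy of the
balance, both checks pass, and the reconciled result is over the limit.)

```python
LIMIT = 100  # hypothetical overdraft limit


def withdraw(balance, amount, limit=LIMIT):
    """Allow the withdrawal only if the balance stays within the limit."""
    if balance - amount < -limit:
        raise ValueError("limit reached")
    return balance - amount


shared = 0  # balance both nodes agreed on before the split

# Each node accepts a withdrawal against its own, now independent, copy.
node_a = withdraw(shared, 80)  # passes the check on A
node_b = withdraw(shared, 80)  # passes the check on B as well

# Reconciling by replaying both withdrawals against the shared balance.
total_withdrawn = (shared - node_a) + (shared - node_b)
reconciled = shared - total_withdrawn
limit_broken = reconciled < -LIMIT
```

Neither transaction was wrong on the node that committed it. It is the
property across transactions that breaks.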

I think I should stop there.

Good luck.

==============

After such a long and sometimes spiky-seeming email (even though it's
not meant badly ... No! don't make me go and write RTFM out long hand ....
arrrghhh .... google is my best friend ... bzzzt .... sorry dave ... )
I feel an extra apology would be cathartic.

Sorry.

Regards,
Paddy
-- 
Perl 6 will give you the big knob. -- Larry Wall