[DRBD-user] expected behaviour after primary crash and reconnect.

Peter Kruse pk at q-leap.com
Fri Apr 1 14:45:32 CEST 2005



Hi,

Bernd Schubert wrote:
> 
> 
> Well, the question is why it wants to sync in the wrong direction on 
> the first connection attempt. To my knowledge the sync direction is from the 
> node with the more recent data to the node with the outdated data.
> Is there by any chance something mounting the underlying partitions at boot 
> time?

No, definitely not.

> Here is also a rather improbable theory: after a server crash we updated the 
> BIOS of the mainboard, which set the time back by about 2 years. During the 
> sync from the failover node everything was synced back, not only the outdated 
> data. Is the hardware clock of the master node perhaps running in the 
> future at boot time? I know, it's just a silly idea, but who knows...

It could have been, but the messages I posted showed exactly the same
time.

> Today Shane also reported the same problem. Did you already try the timeout 
> value?
> 

I'm not using heartbeat, so at this stage I don't have a problem with
any cluster software, but I am glad not to be the only one to stumble
over this...
I had a closer look at the difference between "reboot" and calling
"/etc/init.d/drbd restart".  On "reboot" the drbd script is invoked
at "K08", which is pretty early.  The filesystems are still mounted
at that point, so drbd cannot exit.  When the network then goes down,
this is what happens:

drbd1: PingAck did not arrive in time.
d1: drbd1_receiver [9369]: cstate BrokenPipe --> Unconnected
cstate BrokenPe1: asender terminated
drbd1: drbd1_receiddrbd1: Connection lost.
drbd1: drbd1_receiver [9369]: cstate Unconnected --> WFConnection
drbd0: PingAck did not arterminated
drbd0: drbd0_receiver [9365]: cstate BrokenPipe --> Unconnected
drbd0: Connection lost.
etworkFailurereceiver [9365]: cstate Unconnected --> WFConnectinnected 
--> WF
drbd0: asender terminated
drbd0: drbd0_receiver [9365]: cstate NetworkFailure --> BrokenPipe
drbd0: short read expecting header on sock: r=-512
drbd0: worker on
done.

(Output is a little garbled - it came from a serial line.)
The other node never sees a clean exit that it could recognize.
My impression is that this is the reason for the wrong sync direction.
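
To check whether this is what happens on a given machine, one could look
for DRBD devices that are still mounted at the moment the drbd stop
script runs.  The following is only a minimal sketch in Python, not part
of drbd or its init script.  It assumes the devices are named
/dev/drbd0, /dev/drbd1, ... and that /proc/mounts is readable; the
function names are made up for illustration.

```python
import re

# Assumption: DRBD block devices appear as /dev/drbd<N> in /proc/mounts.
DRBD_DEVICE = re.compile(r"^/dev/drbd\d+$")


def mounted_drbd_devices(mounts_text):
    """Return (device, mount_point) pairs for DRBD devices found in
    /proc/mounts-style text (whitespace-separated fields per line)."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the device, field 1 is the mount point.
        if len(fields) >= 2 and DRBD_DEVICE.match(fields[0]):
            found.append((fields[0], fields[1]))
    return found


def report_mounted_drbd(path="/proc/mounts"):
    """Print every DRBD device that is still mounted; return the count."""
    with open(path) as mounts_file:
        still_mounted = mounted_drbd_devices(mounts_file.read())
    for device, mount_point in still_mounted:
        print(f"{device} is still mounted on {mount_point}")
    return len(still_mounted)
```

Calling report_mounted_drbd() from a hypothetical wrapper just before
the K08 script would show whether the filesystems are still mounted at
that point, which is the condition under which drbd cannot exit.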

Thanks for your help.

	Peter


