[DRBD-user] DRBD .7.24 - question for the gurus

Charles Riley criley at erad.com
Tue Mar 11 17:41:00 CET 2008


Hi,

I have a drbd cluster which has experienced the following chain of 
events (DRBD 0.7.24):

All network interfaces on the primary node were restarted, including 
the DRBD interface.
Heartbeat at this point lost connectivity to its ping nodes, and failed 
over to the secondary node.
During the failover to the secondary node, the secondary node's DRBD was 
set from Secondary/Unknown --> Primary/Unknown and all services were 
failed over.  So far so good.
However, the secondary node's drbd was stuck in WFConnection.  This went 
unnoticed.
The system ran for a period of time, and then was failed back to the 
primary node, causing us to jump back in time to the date of the initial 
failover as far as the filesystem was concerned.
When it was noticed that the filesystem was out of date, the data on it 
was restored from external sources.
Now I am brought into the middle of this, and find that the primary 
node (which now has good data) shows:
0: cs:StandAlone st:Primary/Unknown ld:Consistent
The secondary shows:
0: cs:WFConnection st:Secondary/Unknown ld:Consistent
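A small check along these lines might have caught the stuck WFConnection 
state before the failback. This is only a rough sketch, assuming status 
lines of exactly the form quoted above; parse_status and is_disconnected 
are made-up helper names, not part of DRBD or Heartbeat.

```python
import re

# Matches status lines of the form quoted above, e.g.
# "0: cs:StandAlone st:Primary/Unknown ld:Consistent"
# (assumption: the fields always appear in this order)
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"\s+st:(?P<local>\w+)/(?P<peer>\w+)"
    r"\s+ld:(?P<ld>\S+)"
)


def parse_status(line):
    """Return the fields of one status line as a dict, or None if it does not match."""
    m = STATUS_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["minor"] = int(fields["minor"])
    return fields


def is_disconnected(status):
    """Hypothetical rule: flag the two connection states seen in this incident."""
    return status["cs"] in {"StandAlone", "WFConnection"}


if __name__ == "__main__":
    for line in (
        "0: cs:StandAlone st:Primary/Unknown ld:Consistent",
        "0: cs:WFConnection st:Secondary/Unknown ld:Consistent",
    ):
        status = parse_status(line)
        if status and is_disconnected(status):
            print("device %d is not replicating (cs:%s, st:%s/%s)"
                  % (status["minor"], status["cs"], status["local"], status["peer"]))
```

Run from cron against the real status output, something like this would 
only tell you that replication has stopped; it does not say which node 
holds the good data.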

My question is, what is going to happen when I reestablish communication 
between primary and secondary?
Is drbd going to do the right thing and bring the secondary node's disk 
up to date?
I'd just do a "drbdadm -- --do-what-I-say primary all" on the primary, 
but I'm pretty sure that once the failover node's drbd starts 
communicating again, a sync is going to start.
I just want to be prepared (and coordinate downtime) if I am going to 
have to recover again from external sources.

Thanks guys.

Charles