[DRBD-user] Hypothetical Secondary/Primary hardware failure question

Varun Menghani varun.menghani at airtightnetworks.net
Tue Oct 11 11:47:00 CEST 2005


I have a question along similar lines.

I have two machines (Node1 and Node2) set up with heartbeat and DRBD.
Each machine has two interfaces, eth0 and eth1. Heartbeats are exchanged
over both interfaces, DRBD replicates over eth1, and services are
provided through eth0.

In this setup the DRBD devices initially work fine, and their status is
Primary/Secondary on Node1 and Secondary/Primary on Node2.

Now suppose the eth1 interface on the primary machine (Node1) fails. The
DRBD status on the two machines becomes Primary/Unknown and
Secondary/Unknown respectively. The nodes do not both become Primary,
because heartbeats are still being exchanged over eth0. However, data
written on the Primary (Node1) is no longer replicated to the Secondary
(Node2).

If I then shut down the Primary (Node1) for repairs, the former
Secondary (Node2) becomes Primary, but it does not have the data that
was written on the old Primary (Node1) in the window between the eth1
interface going down and Node1 being powered off.
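
The window described above can be pictured with a small toy model. This is only an illustrative sketch in Python: the function name, the timeline, and the block labels are all hypothetical and have nothing to do with DRBD's actual internals.

```python
# Toy model: every name and number here is hypothetical, for illustration only.
def writes_missing_on_peer(writes, link_down_at, primary_off_at):
    """Return the writes the primary accepted while the replication link was down.

    writes: list of (time, block) tuples accepted by the primary.
    """
    return [block for t, block in writes if link_down_at <= t < primary_off_at]


# Replication link (eth1) fails at t=4, the old primary is powered off at t=8.
timeline = [(1, "A"), (2, "B"), (5, "C"), (7, "D")]
missing = writes_missing_on_peer(timeline, link_down_at=4, primary_off_at=8)
print(missing)  # ['C', 'D']
```

In this model, blocks C and D exist only on the old Primary (Node1), which is exactly the data the new Primary (Node2) has never seen.
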
When I finish repairing Node1 and plug it back in, the status I get
is:

Node1:
drbd driver loaded OK; device status:
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at S1, 2005-08-16 20:37:52
 0: cs:WFConnection st:Secondary/Unknown ld:Consistent
    ns:13668 nr:0 dw:17312 dr:24005 al:21 bm:98 lo:0 pe:0 ua:0 ap:0


Node2:
drbd driver loaded OK; device status:
version: 0.7.11 (api:77/proto:74)
SVN Revision: 1807 build by root at S1, 2005-08-16 20:37:52
 0: cs:StandAlone st:Primary/Unknown ld:Consistent
    ns:0 nr:13656 dw:20036 dr:22373 al:54 bm:36 lo:0 pe:0 ua:0 ap:0
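
For comparing the two outputs above, a small parser can help. The following is a minimal sketch in Python, assuming the status text has the same layout as shown in this mail (a device line starting with the minor number, followed by an indented counter line). The function name `parse_drbd_status` is made up for this example and is not part of DRBD.

```python
import re

# Device and counter lines copied from the status output in this mail.
NODE1 = """\
 0: cs:WFConnection st:Secondary/Unknown ld:Consistent
    ns:13668 nr:0 dw:17312 dr:24005 al:21 bm:98 lo:0 pe:0 ua:0 ap:0
"""
NODE2 = """\
 0: cs:StandAlone st:Primary/Unknown ld:Consistent
    ns:0 nr:13656 dw:20036 dr:22373 al:54 bm:36 lo:0 pe:0 ua:0 ap:0
"""


def parse_drbd_status(text):
    """Parse status text into {minor_number: {field: value}}.

    Hypothetical helper: it only understands the layout shown above.
    """
    devices = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):\s+(.*)", line)
        if m:
            # Start of a new device entry, e.g. " 0: cs:... st:... ld:..."
            current = {}
            devices[int(m.group(1))] = current
            rest = m.group(2)
        elif current is not None and line.startswith(" "):
            # Indented continuation line holding the counters.
            rest = line
        else:
            continue
        for key, value in re.findall(r"(\w+):(\S+)", rest):
            current[key] = int(value) if value.isdigit() else value
    return devices


for name, text in (("Node1", NODE1), ("Node2", NODE2)):
    dev = parse_drbd_status(text)[0]
    local_role, peer_role = dev["st"].split("/")
    print(name, dev["cs"], local_role, peer_role, dev["ld"])
```

Run on the two outputs, this shows at a glance that Node1 is waiting for a connection as Secondary while Node2 is StandAlone as Primary, and that each node reports its peer as Unknown.
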


A reboot restores the state to Primary/Secondary and
Secondary/Primary.

My questions are:
a. Have I lost data in this case?
b. Is there any way to get back to a consistent state after plugging
Node1 in, without a reboot?

Regards,
Varun.


On Tue, 2005-10-11 at 14:41, Etienne van Tonder wrote:
> Suppose one of the machines in an active/passive setup fails and is shut
> down via heartbeat STONITH, and it then turns out that the machine is
> faulty and cannot be rebooted.
> 
> What is the correct procedure for bringing up the primary/secondary as a
> standalone so that operations can continue while the primary/secondary
> machine is being repaired?
> 
> Regards,
> 
> Etienne.
> 
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
-- 
Varun Menghani <varun.menghani at airtightnetworks.net>
Airtight Networks