[DRBD-user] Multiple Clusters

Dominique Chabord dominique.chabord at bluedjinn.com
Fri Jun 4 11:39:32 CEST 2004


Helmut Wollmersdorfer wrote:
 > Jason Gray wrote:
 >
 >> I'm looking at creating a multi-clustered array server network for our
 >> production environment.  Is it possible to have 5, 6, 7 ... n servers
 >> clustered together (kind of like a token ring) to act as redundant
 >> arrays for each other?
 >
 >
 > DRBD is a two node cluster.
 > You can have something like this:
 >
 >          Node1      Node2      Node3
 >          -----      -----      -----
 > drbd1    Primary    Secondary
 > drbd2    Secondary  Primary
 > drbd3               Primary    Secondary
 > drbd4    Secondary             Primary
 >
 >> So, instead of having a Primary array I would have 5, 6 or 7 "Primary"
 >> arrays that mirror each other across an isolated network (10.0.0.0 say).
 >
 >
 > Not each to each.
 >
 > You can build a "chain":
 >
 > Situation 1:
 >
 >          Node1      Node2      Node3
 >          -----      -----      -----
 > drbd1    Primary    Secondary  stopped
 >
 > Then switch to Situation 2:
 >
 >          Node1      Node2      Node3
 >          -----      -----      -----
 > drbd1    stopped    Primary    Secondary

I'll keep this message; you explain the situation very clearly.
 >
 > This sort of switchover will take time, because Node3 needs to sync with
 > Node2. In case of a full sync of e.g. 100 GB this will need _hours_.

Good point.
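To put a rough number on "hours", here is a back-of-the-envelope
estimate. The sync rates are my assumptions, not measured values (the
real rate depends on the network, the disks and the configured sync
rate), and the function name is only for illustration.

```python
def full_sync_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Estimate the duration of a full sync in hours.

    rate_mb_per_s is an assumed sustained sync rate, not a DRBD default.
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    size_mb = size_gb * 1024  # 1 GB taken as 1024 MB
    return size_mb / rate_mb_per_s / 3600


if __name__ == "__main__":
    # Assumed rates, chosen only to show the order of magnitude.
    for rate in (5, 10, 30):
        print(f"100 GB at {rate} MB/s: {full_sync_hours(100, rate):.1f} h")
```

At an assumed 10 MB/s, a 100 GB device needs close to three hours,
during which there is no up-to-date secondary.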
So, what would be your recommendation here?

I see three possible cases if Node1 fails:
- CASE1: I think I can repair Node1 and resync it with updates only
after the repair. Therefore I decide not to synchronize Node3 as a
secondary.
- CASE2: I think I cannot repair Node1, or it will require a full sync
anyway. Therefore I decide to synchronize Node3 as a secondary.
- CASE3: I don't know why Node1 is down. I start the sync to Node3 as
soon as possible, then I see what I can get from Node1. If I repair it
quickly, I stop syncing Node3 and resync Node1. The sync to Node3
during that period was then load for nothing.

Shaman-X's current implementation is CASE3: start the sync automatically
and change the secondary manually.
Maybe we should think about a human decision to implement CASE1.
We might also give Node1 a chance to auto-repair, something like:
if Node1 is not back within 10 minutes, or if Node1 is unstable and has
failed three times in the last 24 hours, etc., then we go automated and
sync a new secondary.
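
A minimal sketch of that rule, to make the proposal concrete. This is
not Shaman-X code; the function and constant names are hypothetical, and
I assume the list of failure times includes the current outage.

```python
from datetime import datetime, timedelta

# Thresholds taken from the proposal above; the names are hypothetical.
GRACE_PERIOD = timedelta(minutes=10)
MAX_FAILURES = 3
FAILURE_WINDOW = timedelta(hours=24)


def should_sync_new_secondary(now, down_since, failure_times):
    """Return True when a new secondary should be synced automatically.

    now           -- current time
    down_since    -- when the failed node went down, or None if it is up
    failure_times -- times of past failures of that node (assumed to
                     include the current outage)
    """
    if down_since is None:
        return False  # node is up, nothing to decide
    # The node counts as unstable if it failed too often recently.
    recent = [t for t in failure_times if now - t <= FAILURE_WINDOW]
    unstable = len(recent) >= MAX_FAILURES
    # Otherwise the node gets a grace period to come back on its own.
    timed_out = now - down_since >= GRACE_PERIOD
    return unstable or timed_out


if __name__ == "__main__":
    now = datetime(2004, 6, 4, 12, 0)
    down = now - timedelta(minutes=3)
    print(should_sync_new_secondary(now, down, [down]))
```

Until the function returns True, the operator stays in CASE1 and waits
for Node1; once it returns True, the sync to Node3 starts as in CASE3.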

Any opinions?

Dominique
 >
 > Helmut Wollmersdorfer
 >
 > _______________________________________________
 > drbd-user mailing list
 > drbd-user at lists.linbit.com
 > http://lists.linbit.com/mailman/listinfo/drbd-user
 >
 >



