[DRBD-user] stacked primaries-scenarios and drbd proxy size

Nils Stöckmann N.Stoeckmann at demetec.de
Mon Sep 17 18:43:13 CEST 2012


I'd like to ask for your opinion on whether the way I'm planning to
implement DRBD is possible, and whether it is a good idea.

My plan is to migrate our company's unclustered, "if one thing breaks
we're going to have some work to do" infrastructure to a cluster with
three primary nodes.

To accomplish this, I had the idea to build this:

MAIN SITE            ||          Small Office Site
A           B                    C
|           |                    |
RAID        RAID                 RAID
|             | 
|           DRBD2------VPN-------DRBD2
|             |                    |
LVM          LVM                  LVM

Nodes A and B shall be used for load balancing and shall be able to
dynamically switch tasks and active services.

Is this actually possible? The "three nodes" DRBD manual page doesn't
explicitly forbid multiple primaries, however it doesn't explicitly say
it's possible, either.

My idea is to create several GFS and a few OCFS volumes on LVM.
As long as I build the LVM on one of the DRBD2 devices, everything
should be all right (because DRBD2 is smaller than DRBD1, due to the
stacked internal metadata). Is that correct?

To reduce the number of single points of failure, is it a good idea to
have the stacked node assigned dynamically by Pacemaker? That way, if
node B fails, node A can still be active as a server at the main site
and take over node B's role of mirroring the data to the small office
site.

By the way: do you know whether there are predefined Pacemaker OCF
resource agents for this scenario, or is this easy to configure?

Are the stacked resources still read/write accessible to the user when
any one of the nodes is down (both with a fixed stacked node and with a
dynamically assigned stacked node)?
This question is of high importance, as every internet connection around
here fails every once in a while, not to mention the forced disconnect
every 24 hours.

As an alternative, does it make more sense to have one node (node A in
the example) be secondary and let Pacemaker assign it the stacked
primary role in case node B fails?
It's not a nice option, because we can't access the data in secondary
mode or use that node for load balancing, but at least we would have it
ready in case of a failure.

In case stacked primaries are impossible: does the A-B active/passive
replication have to be the upper or the lower layer?

A few numbers concerning our data volume and network line:

At the moment, we have the following data to be clustered:
450K Files
60K Folders
150GB Data

~100ms ping latency
The VPN is going to run on a 50Mbit/s down and 10Mbit/s up VDSL
connection at both sites, so at most 10Mbit/s will be available.

The ping latency is ~100ms at the moment, but I hope it will improve
as soon as we have the new line in about 3 months.
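
For a rough sense of scale, the 150GB and the 10Mbit/s uplink above
already fix a lower bound on the initial full sync. The sketch below is
only back-of-the-envelope arithmetic: it assumes decimal units
(1GB = 1000MB), that the whole uplink is usable, and no compression or
protocol overhead. The function name is made up for illustration.

```python
def initial_sync_hours(data_gb, uplink_mbit_s):
    """Hours needed to push data_gb through the uplink once.

    Assumes decimal units and a fully usable uplink (no overhead,
    no compression); real numbers will differ.
    """
    data_mbit = data_gb * 1000 * 8          # GB -> MB -> Mbit
    return data_mbit / uplink_mbit_s / 3600  # seconds -> hours

print(round(initial_sync_hours(150, 10), 1))  # 33.3
```

So under these assumptions the first sync over the VPN would take
well over a day.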

On the disks that hold our to-be-clustered data, I have measured the
following change rates. At the moment, a few people are ill or on
holiday, so multiply by a factor of up to 1.5 (safety margin included)
to get the busy state.

Data change rate during working hours,
data sum spread evenly over 1-hour intervals:
avg: 0.656MB/s, max: 1.9MB/s, dev: 0.41MB/s

Overall data change rate, measured in 1-minute intervals:
avg: 0.29MB/s, max: 35.0MB/s, dev: 1.24MB/s

Activity peaks usually last 2-15 minutes.

Number of write accesses during working hours,
number of actions spread evenly over 1-hour intervals:
avg: 54/s, max: 100/s, dev: 18/s

Overall number of write accesses, measured in 1-minute intervals:
avg: 35/s, max: 1450/s, dev: 65/s

At the moment clustering is not an option, because we have a DSL line
with 1Mbit/s up and 100Kbit/s down; however, we are supposed to get
50Mbit/s / 10Mbit/s VDSL.

Because the average data change rate * 1.5 comes close to the upload
bandwidth, and the peak change rate is far higher, I thought about using
DRBD Proxy, which buffers the peaks. Combined with compression, I think
I should be fine. What do you think?
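
To make that reasoning concrete, here is a small sketch using only the
figures from this mail. It assumes 1 MB/s = 8 Mbit/s, ignores
compression and protocol overhead, and (very pessimistically) treats the
1-minute maximum of 35MB/s as if it were sustained for a whole 2-15
minute peak. The function names are made up for illustration.

```python
# Figures from this mail; the unit conversion is an assumption.
UPLINK_MB_S = 10 / 8   # 10 Mbit/s uplink -> 1.25 MB/s
BUSY_FACTOR = 1.5      # correction for the "busy" state

def uplink_utilisation(avg_mb_s):
    """Fraction of the uplink used by the busy-state average change rate."""
    return avg_mb_s * BUSY_FACTOR / UPLINK_MB_S

def peak_backlog_mb(peak_mb_s, minutes):
    """Data (MB) that piles up if a peak is sustained for `minutes`."""
    surplus = max(0.0, peak_mb_s - UPLINK_MB_S)
    return surplus * minutes * 60

print(f"{uplink_utilisation(0.656):.0%}")   # 79%
print(f"{peak_backlog_mb(35.0, 2):.0f}")    # 4050
print(f"{peak_backlog_mb(35.0, 15):.0f}")   # 30375
```

Under these worst-case assumptions, the working-hours average already
uses roughly four fifths of the uplink, and a sustained peak would leave
a backlog of about 4GB to 30GB. Real peaks are presumably smaller, since
the 35MB/s is a single 1-minute maximum.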

In case I use DRBD Proxy, how much RAM do you suggest I plan for?
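
One way to frame the RAM question is to ask how long a given buffer can
absorb a peak before it is full. The sketch below assumes the buffer
simply holds whatever the 10Mbit/s (1.25MB/s) uplink cannot send, with
no compression. The buffer sizes are hypothetical examples, not
recommendations, and the function name is made up.

```python
def seconds_absorbed(buffer_mb, peak_mb_s, uplink_mb_s=1.25):
    """Seconds until a buffer of buffer_mb is full during a peak.

    Returns infinity if the uplink can keep up with the peak.
    """
    surplus = peak_mb_s - uplink_mb_s
    if surplus <= 0:
        return float("inf")
    return buffer_mb / surplus

# Hypothetical buffer sizes against the measured 35 MB/s 1-minute maximum.
for buffer_mb in (512, 1024, 4096):
    print(buffer_mb, "MB:", round(seconds_absorbed(buffer_mb, 35.0)), "s")
```

With these assumptions, 1GB of buffer covers about half a minute at
35MB/s, so the answer depends heavily on how long such bursts really
last.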

Any comments, suggestions, ideas and hints are greatly appreciated,
thanks in advance!


