[DRBD-user] Some queries

Lars Marowsky-Bree lmb at suse.de
Thu Dec 2 10:34:40 CET 2004

On 2004-12-02T16:51:17, Vic Berdin <vic at digi.com.ph> wrote:

> I can see that (almost) everything works automatically now. My configured
> partition gets mirrored on the secondary as I add/delete files on the
> primary node. I can see disk activity on both nodes (via the external HDD LED).
> And the data does get mirrored on the secondary if I do an actual inspection
> by mounting the secondary node's partition (of course after shutting down
> drbd ;o)).

Be careful. That's actually the wrong way to check what's going on on
the secondary: if you shut down drbd before mounting the device, the
mount will modify some bits on disk which won't get replicated. The
filesystems then diverge, which will cause problems later.

Yes, you _can_ mount a filesystem after shutting down drbd, but this is
only meant as a last resort if drbd breaks, and it will later require a
manual full replication.

The proper way to access a drbd device is to promote the node you want
to access it on to 'primary' status and then mount the drbd device.
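
For illustration, a minimal sketch of that sequence (the device /dev/nb0
and the mount point /data are only placeholders; on 0.6.x the heartbeat
datadisk script does essentially the same steps for you):

  # promote this node to primary so the device becomes writable
  # (0.6.x syntax; on 0.7.x you would use "drbdadm primary <resource>")
  drbdsetup /dev/nb0 primary

  # only then mount the replicated device
  mount /dev/nb0 /data

  # to give the resource back: unmount first, then demote again
  umount /data
  drbdsetup /dev/nb0 secondary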

> Now my problem is this: simulating a machine failure, I deliberately
> power down the primary machine. The secondary now inherits the resources
> abandoned by the primary: ldirectord loads and `datadisk start` gets
> executed by heartbeat - the secondary now becomes drbd's primary.

Correct so far.

> Now, upon turning the primary node back on, I notice that the /proc/drbd
> status on both nodes does not seem to detect the existence of the other:
> 
> ------------------
> On the (resource inherited) node:
> 0: cs:WFConnection st:Primary/Unknown ns:0 nr:0 dw:12 dr:35 pe:0 ua:0
> 
> On the newly restarted node:
> 0: cs:StandAlone st:Secondary/Unknown ns:0 nr:0 dw:0 dr:0 pe:0 ua:0
> ------------------

Is your network configuration correct? It can also take a couple of
seconds for drbd to reconnect, so maybe you should just wait a moment.

Is drbd itself part of your boot sequence? And you shouldn't have
configured 'load-only' or anything like that.
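
A few quick checks, as an illustration only (the init-script name,
runlevel tool and port number are assumptions on my part; take the peer
address and port from your own /etc/drbd.conf):

  # is the drbd init script enabled at boot? (SUSE/Red Hat style)
  chkconfig --list drbd

  # can the restarted node reach its peer on the replication port?
  ping <peer-ip>
  telnet <peer-ip> 7788      # 7788 is just an example port

  # then watch the connection state come back
  watch cat /proc/drbd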

> Connection/mirroring will only resume if I manually do a `drbd reconnect` on
> the newly restarted (secondary) node. And this action seems to perform a
> complete replication from the primary:

This is correct behaviour of drbd when the primary failed; you should
use 0.7.x if you want smart resyncs in all cases.

0.6.x can only 'smartly' resync if the secondary fails and the primary
stays up.
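
For illustration, this is roughly what you would see after the manual
reconnect (the init-script invocation is an assumption based on the
command you quoted; the connection states are the 0.6.x names):

  # on the restarted node, re-establish the connection
  /etc/init.d/drbd reconnect    # assuming the 0.6.x init script you used

  # watch /proc/drbd: after a failed primary, 0.6.x does a full sync
  # (cs:SyncingAll) rather than the quick sync (cs:SyncingQuick) you get
  # when only the secondary was down
  cat /proc/drbd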

> And oh, btw, I'm actually doing these implementation tests using a volatile
> file system for /var/lib/drbd. Thus, doing a complete reboot of a node
> cleans out its /var/lib/drbd when it restarts. I'm guessing there are also
> side effects to using a volatile /var/lib/drbd?

Yes, the side effect is that you are killing the generation counters,
and that may cause various bugs and data corruption. /var/lib/drbd must
not be volatile; that's a severe setup bug.
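
To illustrate the check (plain commands, nothing drbd-specific assumed
beyond the directory itself):

  # /var/lib/drbd must live on persistent storage, not tmpfs/ramdisk
  df -T /var/lib/drbd

  # the generation-counter files stored there have to survive a reboot
  ls -l /var/lib/drbd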


Sincerely,
    Lars Marowsky-Brée <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business



