[DRBD-user] DRBD and LVM Snapshot with 2 nodes configuration

Tue Apr 6 17:52:53 CEST 2004

/ 2004-04-06 16:19:37 +0200
\ Andreas Semt:
> Hello list,
> 
> I have a Heartbeat test configuration with two nodes (nodeA active,
> nodeB standby for warm failover) and DRBD for data synchronisation over
> GBit Ethernet. DRBD is on top of LVM (LVM is on top of Raid-5), because
> I want to use the LVM snapshot facility to make periodic backups of the
> block devices controlled by DRBD (data on /dev/nb0: mysql, ldap,
> apache). So, my questions (nodeA is primary for DRBD):
> 1.) What do I have to bear in mind when making a snapshot of the drbd
> device /dev/nb0 on nodeA? How would I do this without breaking drbd
> ("[...] never [...] access the underlying device directly" from Some
> Do's and Dont's, DRBD Article in Linux Mag)?

you could have set it up as LVM2 on top of DRBD instead.

Basically the problem is that you won't get a "clean" but only a
"consistent" snapshot if you have LVM below DRBD (consistent as in: what
you would find on disk after a crash, so the file system has to replay
its journal or be fsck'ed before use). LVM expects to be the topmost
layer right below the file system, and if DRBD (or any other stacking
block device driver, for that matter) is in between, LVM is not able to
contact the file system and tell it to flush its journals and metadata
before the snapshot is taken.

> 2.) Can I stop drbd (/dev/nb0) on nodeA (without the result that nodeB
> becomes primary for drbd), do the snapshot, save the snapshot data,
> destroy the snapshot again (snapshot data is already saved), put the
> saved and tar'ed snapshot data on a second drbd device (/dev/nb1), start
> drbd, so a sync between nodeA and nodeB will be started. The result is:
> nodeA syncs all the data on /dev/nb0 written during the snapshot (while
> drbd was down) and the data on /dev/nb1 (the saved and tar'ed snapshot
> file) with nodeB. nodeB has then the same data on /dev/nb0 (for mysql,
> ldap, apache) and on /dev/nb1 (the snapshot backup), too. This snapshot
> thing should run as Heartbeat service, later (if it makes sense).
> 
> Does this sound like a good way? Any hints or ideas are most welcome!

"stop drbd on primary without making the peer primary"
is
"take down that device, and all services depending on the data on it".

this is the easy way, and of course will give you a clean and consistent
view of the data, because the data won't be used by anyone but the
backup now...

so what should work is backup/snapshot during downtime:

	node-A: Primary; node-B: Secondary
node-A# stop_all_services && umount /dev/nb0 && drbd stop
	node-A: Secondary; node-B: Secondary
either node# lvcreate {snapshot}
either node# drbd start ; mount_that_device && start_all_services 
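
as a rough sketch only, the sequence above could be written down like
this. it just prints the steps in order and runs nothing. the volume
group, LV and snapshot names and the snapshot size are made-up
placeholders, stop_all_services / mount_that_device / start_all_services
are the same stand-ins as above, and "drbd stop" / "drbd start" stand for
whatever init script your DRBD version ships. the lvcreate options
(--snapshot, --size, --name) are the standard ones.

```shell
#!/bin/sh
# Dry-run sketch: prints the downtime-snapshot steps, executes nothing.
# All names below are hypothetical placeholders, adjust to your setup.
VG=${VG:-vg0}                # volume group holding DRBD's backing LV
LV=${LV:-drbd_backing}       # the LV that sits underneath /dev/nb0
SNAP=${SNAP:-nb0_snap}       # name for the snapshot LV
SNAP_SIZE=${SNAP_SIZE:-1G}   # room for blocks changed while the snapshot exists

snapshot_plan() {
    # 1. quiesce on the primary: nothing may write to /dev/nb0 any more
    echo "stop_all_services"
    echo "umount /dev/nb0"
    echo "drbd stop"
    # 2. snapshot the backing LV (either node, both hold the same data now)
    echo "lvcreate --snapshot --size $SNAP_SIZE --name $SNAP /dev/$VG/$LV"
    # 3. bring everything back up; the snapshot stays usable meanwhile
    echo "drbd start"
    echo "mount_that_device"
    echo "start_all_services"
}

snapshot_plan
```

the only point of the ordering is that lvcreate runs strictly between
"drbd stop" and "drbd start", i.e. while nobody writes to the device.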

then do anything you want to do with the snapshot, and delete the
snapshot after you are done with it.  you should be able to access the
drbd device as usual while accessing the snapshot.
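
for the "do anything you want, then delete it" part, a minimal sketch,
assuming the snapshot holds a file system you can mount read-only and
that a tar archive is the backup you want. device, mount point and
archive path are hypothetical. by default it only prints the commands;
set DRY_RUN=0 to really run them.

```shell
#!/bin/sh
# Sketch: archive the contents of an LVM snapshot, then drop the snapshot.
# Paths are hypothetical placeholders. DRY_RUN=1 (default) only prints.
DRY_RUN=${DRY_RUN:-1}

run() {
    # print the command in dry-run mode, execute it otherwise
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_snapshot() {
    snap_dev=$1   # e.g. /dev/vg0/nb0_snap
    mnt=$2        # existing, empty mount point
    archive=$3    # where the tarball goes (NOT on the snapshot itself)

    # stops at the first failing step, so a failed tar leaves the
    # snapshot mounted for inspection instead of removing it
    run mount -o ro "$snap_dev" "$mnt" &&
    run tar -czf "$archive" -C "$mnt" . &&
    run umount "$mnt" &&
    run lvremove -f "$snap_dev"
}

backup_snapshot /dev/vg0/nb0_snap /mnt/nb0_snap /backup/nb0.tar.gz
```

removing the snapshot promptly matters, because it costs write
performance on the origin LV and becomes invalid once it runs full.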

the difficult thing is: doing a clean and consistent snapshot while drbd
is online.

The failover/failback thing, and "just in between" doing the snapshot,
as you outline above, won't be such a good idea, because you probably
will get a consistent, but not a *clean* snapshot.

	Lars Ellenberg