[DRBD-user] truck based replication

Mon May 10 20:37:28 CEST 2010

On Mon, May 10, 2010 at 01:56:30PM -0400, Dan Barker wrote:
> >
> > Starting with old uuid, possibly partially dirty bitmap.
> >
> > [A] Generate new uuid, clean bitmap.
> >	The now clean bitmap will be re-dirtied with
> >	everything that is written from now on.
> > [B] Disk gets pulled, imaged, whatever.
> > [C] Second new uuid generated.
> >	(*NOT* clearing the bitmap this time)
> >	This marks the end of the imaging/disk pulling.
> >	Bitmap contains everything that has been redirtied since [A]
> > [D] bitmap continues to be updated.
> > [E] Connect of image (from [B]) to current:
> >	normal bitmap based resync.
> >	Everything that has been re-dirtied since [A] will be resynced.
> >
> > Or do I have to suspend Host A (at least the
> > application writing to that device) for that time?
> >
> > No.
> > If you can minimize write activity between [A] and [E],
> > you will have a minimal resync.
> 
> 
> Lars:
> 
> I read her question as "Must I suspend between [A] and [C]". I would think
> it would depend on the imaging process utilized.

No, it does not ;-)
That's the whole point.

Regardless of how you image,
as long as you start after [A],
you will always get an image that is (in general) inconsistent.
For each data block, it contains either

  some data where the corresponding bit is still clean.
     That block may still be updated until [E].
     If it is updated, the bit will be set,
        and the block will be synced after [E].
     If it is NOT updated until [E],
        the block will be skipped during that resync.

  or some data that has already been updated
  since you started the imaging process,
  which means the bit is already dirty.
     The block will be synced again after [E],
     even though it possibly was "good" already.
     Use checksum based resync if your bandwidth is low,
     so that data which already matches will not be transferred.
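
To make the two cases above concrete, here is a small toy model in
Python. It is only a sketch, not DRBD code: the device is a list of
integers, the bitmap is a list of booleans, and the names simulate()
and write() are made up for illustration. Comparing block contents
directly stands in for comparing checksums.

```python
import random


def write(source, bitmap, blk, value):
    """Application write: update the block and set its bitmap bit."""
    source[blk] = value
    bitmap[blk] = True


def simulate(num_blocks=32, late_writes=5, seed=1):
    rng = random.Random(seed)
    source = [rng.randrange(1000) for _ in range(num_blocks)]

    # [A] new uuid, clean bitmap
    bitmap = [False] * num_blocks

    # [B] image one block at a time while the application keeps writing
    image = []
    for blk in range(num_blocks):
        image.append(source[blk])
        write(source, bitmap, rng.randrange(num_blocks), rng.randrange(1000))

    # [C]/[D] imaging is done, bitmap is NOT cleared, writes continue
    for _ in range(late_writes):
        write(source, bitmap, rng.randrange(num_blocks), rng.randrange(1000))

    # Case 1: bit still clean, so the imaged block is already current
    assert all(image[b] == source[b]
               for b in range(num_blocks) if not bitmap[b])

    # Case 2: bit dirty; some of these blocks happen to be "good" already
    dirty = [b for b in range(num_blocks) if bitmap[b]]
    already_good = [b for b in dirty if image[b] == source[b]]

    # [E] bitmap based resync: copy only the dirty blocks
    for b in dirty:
        image[b] = source[b]

    return image == source, len(dirty), len(already_good)


if __name__ == "__main__":
    in_sync, dirty, already_good = simulate()
    print("in sync after resync:", in_sync)
    print("dirty blocks:", dirty, "of which already good:", already_good)
```

The "already good" count is the number of blocks a plain bitmap resync
would send even though the image already holds the right data. Those
are the blocks a checksum based resync could avoid transferring.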

> To my thinking, to have a zero-downtime truck based replication, you
> must have a secondary disk in the local datacenter, synchronized with
> the primary - and the secondary is disconnected, unattached, offlined
> and trucked to the remote site.

You can do so.
You can also use a local RAID 1, and pull a disk.
You can also use dd, and write an image, compressed, on tape, whatever.

As long as the image is started after [A], is complete,
AND contains the metadata as of just after [A],
it is a good base to do a bitmap resync into.
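
Why the "started after [A]" condition matters can be shown with the
same kind of toy model. Again this is only an illustrative Python
sketch with a made-up function name (run), and it models the bitmap
only, not the rest of the metadata: if a block is copied and then
rewritten before the bitmap is cleared at [A], that write is forgotten
and the later resync will skip the block.

```python
def run(start_imaging_after_a):
    """Return True if image == source after the bitmap based resync."""
    source = ["a0", "b0", "c0"]
    bitmap = [False, False, False]
    image = []

    if start_imaging_after_a:
        bitmap = [False] * 3      # [A] happens before imaging starts

    image.append(source[0])       # block 0 is copied into the image
    source[0] = "a1"              # application rewrites block 0 ...
    bitmap[0] = True              # ... and the bit is set

    if not start_imaging_after_a:
        bitmap = [False] * 3      # [A] too late: the write is forgotten

    image.extend(source[1:])      # the rest of the image is copied

    # [E] bitmap based resync: only dirty blocks are copied
    for blk in range(3):
        if bitmap[blk]:
            image[blk] = source[blk]

    return image == source


if __name__ == "__main__":
    print("imaging started after [A]: ", run(True))
    print("imaging started before [A]:", run(False))
```

In the first case the stale block 0 is repaired by the resync; in the
second case it silently stays stale, because nothing marks it dirty.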

> Unless she can image the disk instantaneously, or image a snapshot of it,
> then writes during the archival process (she didn't specify that process)
> would corrupt the image.

Of course the image is inconsistent, in general.
That does not mean it is "corrupt".

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
