[DRBD-user] Several questions

Wed Jun 30 22:12:51 CEST 2004

/ 2004-06-30 11:42:56 -0400
\ Jeff Tucker:
> Hi, guys. I had several questions about DRBD that should
> hopefully be easy to answer. For background, I'm bringing up a
> new system with a pair of identical servers. I intend to
> fail over a RAID-0 array created from 12 SCSI drives. The drives
> are configured as one big array and the entire thing will be
> replicated. I'd prefer to use the 2.6 kernel and I'm currently
> testing using 0.7 development releases of DRBD.
> 
> - Since I'm failing over the entire /dev/md1 array, do I need to
> specify a disk-size in my drbd.conf? I intend to put the
> metadata on the array as well. The filesystem will be added
> after the DRBD device is up (in other words, mkreiserfs
> /dev/nb0). So, I think I want the reiserfs filesystem to take up
> most of the md1 device and the metadata to take up the rest.
> Does this happen automatically if I just say to put the metadata
> on that same device and don't specify a disk-size?

yes.

on both nodes, in /etc/drbd.conf (two times, once in each host section):
	device /dev/nb0;
	disk /dev/md1;
	meta-disk internal;
then, still on both nodes:
	drbdadm up all

on the chosen node:
	drbdadm primary your-drbd-resource-name
	mkfs ...
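
for orientation, a minimal sketch of how these lines sit inside one
complete resource. the resource name r0, the host names alice and bob,
and the addresses are placeholders, not taken from your setup. the
keywords follow the 0.7 drbd.conf syntax as far as I know it (meta-disk
for the meta data location), so check them against the man page of the
release you actually run.

```
# sketch only: r0, alice, bob and the addresses are made-up placeholders
resource r0 {
	protocol C;
	on alice {
		device    /dev/nb0;
		disk      /dev/md1;       # the whole array, no size given
		address   10.0.0.1:7788;
		meta-disk internal;       # meta data lives on /dev/md1 itself
	}
	on bob {
		device    /dev/nb0;
		disk      /dev/md1;
		address   10.0.0.2:7788;
		meta-disk internal;
	}
}
```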

> - I rebooted the primary to test a failover. The system failed
> over to the secondary just fine. When the primary came up and
> the units started to sync, I got a kernel panic due to a
> Reiserfs journal-601 error that said it was trying to write past
> the end of the device. I meant to save the actual numbers, but
> it wasn't even close. It said the size was something like 400000
> and it was trying to write to 1500000. Could that be caused by a
> lack of the disk-size? I was running the 20040528 snapshot at
> the time but have since updated to the recent release candidate
> 1. In my testing, the system usually fails over fine and
> rebuilds. It was just this one time I saw an error.

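everything before -pre8 (which is the same as rc1) is likely to corrupt
data on resync. your 20040528 snapshot is older than that, so this is a
more likely cause than the missing disk-size.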

> - The drives being failed over are a RAID-0 array. If a drive
> fails, I'll be replacing it. This means that a bunch of data out
> of the middle of the array goes away. When I bring that machine
> back online, I can't just write the data that the primary has
> received while it has been offline, I need to write all that
> data plus everything that was on the now-replaced drive. 

> Will DRBD handle this automatically?

you can set on-io-error Detach; when the lower level device reports an
io error, drbd then detaches from it, and the next resync should cover
the full device, including that "bunch of data out of the middle".
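
a sketch of where that setting would go, assuming the 0.7 drbd.conf
layout with a disk section per resource. the resource name r0 is a
placeholder, and I write the handler in lowercase here because that is
how I know it from the 0.7 man page; verify the exact spelling for your
release.

```
# sketch only: r0 is a placeholder resource name
resource r0 {
	disk {
		on-io-error detach;   # drop the backing device on io error
	}
	# host sections as before
}
```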

> Will I need to force a full resync somehow?

sometimes you will still have to force a full sync yourself, e.g. if
_you_ know the disk failed but drbd has not noticed yet, and so cannot
know that you replaced the backing storage.

the command to do so is
 drbdadm invalidate resource-name
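
a small wrapper around that command, as a sketch. it is meant to run on
the node whose data should be thrown away (the one with the replaced
disk). the function name force_full_sync and the DRBDADM override are
made up for this example; only "drbdadm invalidate" is the real command.

```shell
#!/bin/sh
# sketch: force a full resync of one resource on this node.
# DRBDADM can be overridden for a dry run, e.g. DRBDADM=echo.
DRBDADM="${DRBDADM:-drbdadm}"

force_full_sync() {
    res="$1"
    if [ -z "$res" ]; then
        echo "usage: force_full_sync resource-name" >&2
        return 1
    fi
    # mark the local copy invalid so the peer resyncs the whole device
    "$DRBDADM" invalidate "$res"
}
```

usage would be "force_full_sync your-drbd-resource-name", after which
the resync progress should show up in /proc/drbd.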

> - Pretty much the same question, but involving the metadata. If
> I replace a drive that includes some or all of the DRBD
> metadata, will I still be able to bring up the device when the
> system is restarted? Will DRBD realize the metadata is missing
> or corrupted and rebuild it?

if on startup drbd does not find valid meta data,
it will require a full sync on the next start.

this is the same situation as when you set up the devices for the first time.

	lge