[DRBD-user] Metadata on software RAID1 load slowlyness problem

joe_p joep at limelightnetworks.com
Tue Dec 12 07:50:25 CET 2006



It appears that I also have this same problem with a Gentoo 2.6.17.14 kernel
and DRBD 0.7.22.

Today we moved the DRBD metadata to a single drive, and drbdsetup ran in a
couple of seconds compared to the 4-5 minutes it had been taking.  I applied
the patch below, tried the DRBD metadata on the md RAID again, and did not see
any difference in behavior: once again it took 4-5 minutes for drbdsetup to
complete.  Is there anything else that needs to be done?



Lars Ellenberg wrote:
> 
> / 2006-12-08 13:58:33 -0200
> \ Daniel van Ham Colchete:
>> Hi y'all,
>> 
>> I'm building my very first DRBD RAID. I've been successful and
>> everything is working great.
>> 
>> My current config is:
>> SERVER1:
>> sda 400GB + sdb 400GB = md Software RAID1
>> sdc 250GB
>> 
>> SERVER2:
>> sda 250GB + sdb 250GB = md Software RAID1
>> sdc 400GB
>> 
>> My problem is, when DRBD's metadata is allocated on an md partition
>> (internal or /dev/md5[0]), the load time is greater than 4 minutes.
>> By "load" I mean "drbdadm up drbd0". Because of that, I had to
>> recompile drbdadm, changing its timeout value from 120 to 600 seconds;
>> otherwise my distro wouldn't initialize both drbd devices.
>> 
>> Then, on the server where the metadata is stored on the plain sdc
>> disk, the load time is less than 2 seconds. Important: I'm using
>> exactly the same hardware on each side; all the 400GB hard drives
>> are exactly the same.
>> 
>> So, I tried moving both servers' metadata to plain SATA drives,
>> and everything loaded in less than 2 seconds.
>> 
>> I also tried storing the metadata in internal and in exclusive md
>> partitions, but the load times were the same: 4-5 minutes for the
>> 340GB partition and 2 minutes 5 seconds for the 194GB partition.
>> 
>> Looking through DRBD's source, I found that the delay is happening
>> in _drbd_md_sync_page_io() in the drbd_actlog.c file, most likely
>> at the wait_for_completion(&event); line.
>> 
>> I'm using the kernel 2.6.18 with DRBD 0.7.22. My kernel has the
>> BIO_RW_SYNC defined.
> 
> I think the md raid code strips off the BIO_RW_SYNC flag...
> That way you suffer the additional induced latency
> with every metadata update.
> 
> This is just a suggestion for raid1; a similar patch would be needed for
> the other raid levels. It should apply with small offsets to other kernel
> versions, too.
> 
> It may well be that the _re_assignment of bi_rw is actually
> the real bug: bio_clone has already set it, and even set some other
> flags there as well, which probably should not be stripped off either.
> Feel free to forward this to lkml.
> 
> diff -u drivers/md/raid1.c*
> --- /mnt/kernel-src/linux-2.6.19/drivers/md/raid1.c.orig	2006-12-11
> 10:06:17.661776243 +0100
> +++ /mnt/kernel-src/linux-2.6.19/drivers/md/raid1.c	2006-12-11
> 10:11:42.189647127 +0100
> @@ -776,6 +776,7 @@
>  	struct page **behind_pages = NULL;
>  	const int rw = bio_data_dir(bio);
>  	int do_barriers;
> +	int do_sync;
>  
>  	/*
>  	 * Register the new request and wait if the reconstruction
> @@ -891,6 +892,7 @@
>  	atomic_set(&r1_bio->behind_remaining, 0);
>  
>  	do_barriers = bio_barrier(bio);
> +	do_sync = bio_sync(bio);
>  	if (do_barriers)
>  		set_bit(R1BIO_Barrier, &r1_bio->state);
>  
> @@ -906,7 +908,7 @@
>  		mbio->bi_sector	= r1_bio->sector + conf->mirrors[i].rdev->data_offset;
>  		mbio->bi_bdev = conf->mirrors[i].rdev->bdev;
>  		mbio->bi_end_io	= raid1_end_write_request;
> -		mbio->bi_rw = WRITE | do_barriers;
> +		mbio->bi_rw = WRITE | do_barriers | do_sync;
>  		mbio->bi_private = r1_bio;
>  
>  		if (behind_pages) {
> 
> 
> -- 
> : Lars Ellenberg                            Tel +43-1-8178292-0  :
> : LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
> : Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
> 
> 

-- 
View this message in context: http://www.nabble.com/Metadata-on-software-RAID1-load-slowlyness-problem-tf2781419.html#a7828227



