[DRBD-user] Re: [DRBD-cvs] * Various code cleanups. * Removed some ...

Philipp Reisner philipp.reisner at linbit.com
Tue Feb 10 13:46:19 CET 2004


Am Montag, 9. Februar 2004 23:03 schrieb Lars Ellenberg:
> / 2004-02-09 11:36:59 +0100
>
> \ drbd-cvs at linbit.com:
> > DRBD CVS committal
> >
> > Author  : phil
> > Module  : drbd
> >
> > Dir     : drbd/drbd
> >
> >
> > Modified Files:
> >       Tag: rel-0_7-branch
> > 	drbd_compat_wrappers.h drbd_dsender.c drbd_fs.c drbd_int.h
> > 	drbd_main.c drbd_proc.c drbd_receiver.c
> >
> >
> > Log Message:
> > * Various code cleanups.
> > * Removed some FIXMEs etc.
> >
> > [Lars: Would be nice if you could have a look at these modifications.]
>
> ok.
>
> > ===================================================================
> > diff -u -3 -r1.28.2.70 -r1.28.2.71
> > --- drbd_fs.c	9 Feb 2004 09:45:02 -0000	1.28.2.70
> > +++ drbd_fs.c	9 Feb 2004 10:36:54 -0000	1.28.2.71
> > @@ -156,8 +156,7 @@
> >  			drbd_set_my_capacity(mdev,size<<1);
> >  			mdev->la_size = size;
> >  			INFO("size = %lu KB\n",size);
> > -		}
> > -		//#warning "FIXME else { error handling }"
> > +		} else ERR("BM resizing failed. Leaving size unchanged\n");
> >  	}
> >
> >  	return rv;
>
> well, this is no error handling, but ok ;)
>

This happens if vmalloc for the new bitmap fails, it is perfectly
ok to not resize in this case and stay with the old size.

> > @@ -508,10 +507,10 @@
> >
> > -  //#warning "FIXME actually must hold device_mutex!"
> > +	D_ASSERT(semaphore_is_locked(&mdev->device_mutex));
>
> not true for "receive_BecomeSec", which is the race I meant.
> (as well as in cleanup_module, which is uncritical...)

ad receive_BecomeSec: I will remove the "secondary_remote" command. 
It was a hack for drbd-0.6 and heartbeat integration. It is no longer
needed since drbd does not becomes primary by it self any more...

Will fix the other case.

>
> > @@ -739,6 +738,7 @@
> >  		drbd_determin_dev_size(mdev);
> >  		drbd_md_write(mdev); // Write mdev->la_size to disk.
> >  		//#warning "yet an other reason to serialize all state changes on a
> > rw_semaphore" +		// PRE: Please explain the issue.
> >  		if (mdev->cstate == Connected) drbd_send_param(mdev,0);
> >  		break;
>
> complete context is:
>
> 	case DRBD_IOCTL_SET_DISK_SIZE:
> 		if (mdev->cstate > Connected) {
> 			err = -EBUSY;
> 			break;
> 		}
> 		err=0;
> 		mdev->lo_usize = (unsigned long)arg;
> 		drbd_determin_dev_size(mdev);
> 		drbd_md_write(mdev); // Write mdev->la_size to disk.
> 		//#warning "yet an other reason to serialize all state changes on a
> rw_semaphore" // PRE: Please explain the issue.
> 		if (mdev->cstate == Connected) drbd_send_param(mdev,0);
> 		break;
>
> the thing is: just because you test for the state somewhere does
> not ensure that the state is still the same some codelines later
> when you do the action depending on the state.
> because we currently have no proper locking for the devices state.
> thus you theoretically cannot get a consistent view of the state.
>
> I propose a mechanism like:
>  struct Drbd_Conf {
> 	rw_semaphore device_lock;
> 	// which is now semaphore device_mutex,
> 	// and has been semaphore ioctl_mutex
> 	...
>  }
>
> Then everybody doing something with the device (and thus in need
> of a consistent view of the state) needs to get a read lock first.
> everybody who wants to *change* configuration or state of the device
> needs to get a write lock.
>
> If I only have a read lock, and need to change the state, I would
> need to drop my read lock, then aquire a write lock, and do my changes.
>
> So in most cases I'd probably just "drbd_queue" a state change callback
> for the worker.  Maybe in some cases, where I already have the
> write_lock, I'll do the cahnges directly.
>

patch ? 

So the issue here is that we do not want to resize while a 
resynchronisation process is running. So the race is:
If the user manages that resynchronisation starts while 
his call to drbdsetup /dev/ndb/x resize is between 

		if (mdev->cstate > Connected) {
and 
 		drbd_determin_dev_size(mdev);

he will get resync that does not include the newly allocated 
space. 


Yes it should be fixed. No it is not a showstopper for the next
prerelease.

>
> > ===================================================================
> > diff -u -3 -r1.73.2.126 -r1.73.2.127
> > --- drbd_main.c	9 Feb 2004 09:40:45 -0000	1.73.2.126
> > +++ drbd_main.c	9 Feb 2004 10:36:54 -0000	1.73.2.127
> > @@ -89,8 +89,7 @@
> >  #endif
> >  #define DEVICE_REQUEST drbd_do_request
> >
> > -//#warning "FIXME review the MODULE_* macros below"
> > -MODULE_AUTHOR("Philipp Reisner <philipp.reisner at gmx.at>");
> > +MODULE_AUTHOR("Philipp Reisner <phil at linbit.com>, Lars Ellenberg
> > <lars at linbit.com>");
>
> thx.
> how about the drbd.org email addresses then?
>
> >  MODULE_DESCRIPTION("drbd - Distributed Replicated Block Device v"
> > REL_VERSION); MODULE_LICENSE("GPL");
> >  MODULE_PARM(minor_count,"i");
> > @@ -133,12 +132,12 @@
> >  int errno;
> >
> >  /************************* The transfer log start */
> > -STATIC void tl_init(drbd_dev *mdev)
> > +STATIC int tl_init(drbd_dev *mdev)
> >  {
> >  	struct drbd_barrier *b;
> >
> >  	b=kmalloc(sizeof(struct drbd_barrier),GFP_KERNEL);
> > -	// FIXME no mem ;-)
> > +	if(!b) return 0;
>
> hm. does this fix it?

Yes it does. This function is only called from module_init...
So module loading will fail, it we can not get these 16 bytes.

-Philipp
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :




More information about the drbd-user mailing list