Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Am Montag, 9. Februar 2004 23:03 schrieb Lars Ellenberg:
> / 2004-02-09 11:36:59 +0100
>
> \ drbd-cvs at linbit.com:
> > DRBD CVS committal
> >
> > Author : phil
> > Module : drbd
> >
> > Dir : drbd/drbd
> >
> >
> > Modified Files:
> > Tag: rel-0_7-branch
> > drbd_compat_wrappers.h drbd_dsender.c drbd_fs.c drbd_int.h
> > drbd_main.c drbd_proc.c drbd_receiver.c
> >
> >
> > Log Message:
> > * Various code cleanups.
> > * Removed some FIXMEs etc.
> >
> > [Lars: Would be nice if you could have a look at these modifications.]
>
> ok.
>
> > ===================================================================
> > diff -u -3 -r1.28.2.70 -r1.28.2.71
> > --- drbd_fs.c 9 Feb 2004 09:45:02 -0000 1.28.2.70
> > +++ drbd_fs.c 9 Feb 2004 10:36:54 -0000 1.28.2.71
> > @@ -156,8 +156,7 @@
> > drbd_set_my_capacity(mdev,size<<1);
> > mdev->la_size = size;
> > INFO("size = %lu KB\n",size);
> > - }
> > - //#warning "FIXME else { error handling }"
> > + } else ERR("BM resizing failed. Leaving size unchanged\n");
> > }
> >
> > return rv;
>
> well, this is no error handling, but ok ;)
>
This happens if vmalloc for the new bitmap fails, it is perfectly
ok to not resize in this case and stay with the old size.
> > @@ -508,10 +507,10 @@
> >
> > - //#warning "FIXME actually must hold device_mutex!"
> > + D_ASSERT(semaphore_is_locked(&mdev->device_mutex));
>
> not true for "receive_BecomeSec", which is the race I meant.
> (as well as in cleanup_module, which is uncritical...)
ad receive_BecomeSec: I will remove the "secondary_remote" command.
It was a hack for drbd-0.6 and heartbeat integration. It is no longer
needed since drbd does not becomes primary by it self any more...
Will fix the other case.
>
> > @@ -739,6 +738,7 @@
> > drbd_determin_dev_size(mdev);
> > drbd_md_write(mdev); // Write mdev->la_size to disk.
> > //#warning "yet an other reason to serialize all state changes on a
> > rw_semaphore" + // PRE: Please explain the issue.
> > if (mdev->cstate == Connected) drbd_send_param(mdev,0);
> > break;
>
> complete context is:
>
> case DRBD_IOCTL_SET_DISK_SIZE:
> if (mdev->cstate > Connected) {
> err = -EBUSY;
> break;
> }
> err=0;
> mdev->lo_usize = (unsigned long)arg;
> drbd_determin_dev_size(mdev);
> drbd_md_write(mdev); // Write mdev->la_size to disk.
> //#warning "yet an other reason to serialize all state changes on a
> rw_semaphore" // PRE: Please explain the issue.
> if (mdev->cstate == Connected) drbd_send_param(mdev,0);
> break;
>
> the thing is: just because you test for the state somewhere does
> not ensure that the state is still the same some codelines later
> when you do the action depending on the state.
> because we currently have no proper locking for the devices state.
> thus you theoretically cannot get a consistent view of the state.
>
> I propose a mechanism like:
> struct Drbd_Conf {
> rw_semaphore device_lock;
> // which is now semaphore device_mutex,
> // and has been semaphore ioctl_mutex
> ...
> }
>
> Then everybody doing something with the device (and thus in need
> of a consistent view of the state) needs to get a read lock first.
> everybody who wants to *change* configuration or state of the device
> needs to get a write lock.
>
> If I only have a read lock, and need to change the state, I would
> need to drop my read lock, then aquire a write lock, and do my changes.
>
> So in most cases I'd probably just "drbd_queue" a state change callback
> for the worker. Maybe in some cases, where I already have the
> write_lock, I'll do the cahnges directly.
>
patch ?
So the issue here is that we do not want to resize while a
resynchronisation process is running. So the race is:
If the user manages that resynchronisation starts while
his call to drbdsetup /dev/ndb/x resize is between
if (mdev->cstate > Connected) {
and
drbd_determin_dev_size(mdev);
he will get resync that does not include the newly allocated
space.
Yes it should be fixed. No it is not a showstopper for the next
prerelease.
>
> > ===================================================================
> > diff -u -3 -r1.73.2.126 -r1.73.2.127
> > --- drbd_main.c 9 Feb 2004 09:40:45 -0000 1.73.2.126
> > +++ drbd_main.c 9 Feb 2004 10:36:54 -0000 1.73.2.127
> > @@ -89,8 +89,7 @@
> > #endif
> > #define DEVICE_REQUEST drbd_do_request
> >
> > -//#warning "FIXME review the MODULE_* macros below"
> > -MODULE_AUTHOR("Philipp Reisner <philipp.reisner at gmx.at>");
> > +MODULE_AUTHOR("Philipp Reisner <phil at linbit.com>, Lars Ellenberg
> > <lars at linbit.com>");
>
> thx.
> how about the drbd.org email addresses then?
>
> > MODULE_DESCRIPTION("drbd - Distributed Replicated Block Device v"
> > REL_VERSION); MODULE_LICENSE("GPL");
> > MODULE_PARM(minor_count,"i");
> > @@ -133,12 +132,12 @@
> > int errno;
> >
> > /************************* The transfer log start */
> > -STATIC void tl_init(drbd_dev *mdev)
> > +STATIC int tl_init(drbd_dev *mdev)
> > {
> > struct drbd_barrier *b;
> >
> > b=kmalloc(sizeof(struct drbd_barrier),GFP_KERNEL);
> > - // FIXME no mem ;-)
> > + if(!b) return 0;
>
> hm. does this fix it?
Yes it does. This function is only called from module_init...
So module loading will fail, it we can not get these 16 bytes.
-Philipp
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com :