[Drbd-dev] DRBD-8 -- can access null mdev->bc pointer if local diskis not attached

Graham, Simon Simon.Graham at stratus.com
Wed Sep 6 03:31:42 CEST 2006


Found another one - not sure what the right fix is; if we get an
incoming connection whilst diskless, we can end up calling
drbd_send_sizes which includes a call to drbd_queue_order_type() outside
of an inc_local or inc_md_only section.

Presumably this value should be referenced inside the inc_md_only()
section at the top of the routine and default to some value if we have
no local disk.

Simon

> -----Original Message-----
> From: drbd-dev-bounces at linbit.com [mailto:drbd-dev-bounces at linbit.com]
> On Behalf Of Graham, Simon
> Sent: Tuesday, September 05, 2006 2:39 PM
> To: drbd-dev at linbit.com
> Subject: [Drbd-dev] DRBD-8 -- can access null mdev->bc pointer if
local
> diskis not attached
> 
> I've seen crashes where the code attempts to dereference a NULL mdev-
> >bc
> when there is no local disk attached - my reading of the code is that
> the bc field should only be referenced after a successful call to
> inc_local(mdev) (or equivalent such as inc_md_only()) - this is done
in
> most cases, but I've found several places where it is not (see
> attached).
> 
> I can go ahead and submit patches based on calling inc_local() for
most
> of these but I wanted to check that this is indeed the correct
> solution.
> 
> Simon
> 
> === Locations I have found ===
> 
> 1. drbd_set_role, line 965:
> 
> 		if ( ( ( mdev->state.conn < Connected ||
> 			 mdev->state.pdsk <= Attaching ) &&
> 		       mdev->bc->md.uuid[Bitmap] == 0) || forced ) {
> <<<===
> 			drbd_uuid_new_current(mdev);
> 		}
> 
> 2. drbd_ioctl, case DRBD_IOCTL_SET_DISK_SIZE seems dodgy to me -- just
> sets the size in mdev->bc; presumably
>    an inc_local(mdev) should be done?
> 
> 3. after_state_ch - several ones here, mostly probably OK (for
example,
> if ns.disk >=Inconsistent
>    which shouldn't happen unless you have local backing store).
> However,
> there is one that is not -- line 988
>    when we lose contact with the peer's copy of the data:
> 
> 	/*  Lost contact to peer's copy of the data  */
> 	if ( os.pdsk > DUnknown && ns.pdsk <= DUnknown ) {
> 		if ( mdev->p_uuid ) {
> 			kfree(mdev->p_uuid);
> 			mdev->p_uuid = NULL;
> 		}
> 		if (ns.role == Primary && mdev->bc->md.uuid[Bitmap] == 0
> ) {
> 
> ^^^^^^^^^^^^^^^^^====>>this has caused crashes
> 			/* Only do it if we have not yet done it... */
> 			INFO("Creating new current UUID\n");
> 			drbd_uuid_new_current(mdev);
> 		}
> 
> 4. drbd_send_uuids is currently called from some places where
> inc_local() has not been done -- I propose
>    ensuring it is only called when it is safe.
> 
> 5. receive_sync_uuid() -- there's no apparent check here and it calls
> _drbd_uuid_set which dereferences mdev->bc.
>    Not sure what is appropriate here???
> 
> 6. drbd_end_req() line 109 -- tests mdev->bc->dc.on_io_error without
> checking - maybe we cant get here, but
>    it seems to me that if you are diskless locally and do a write
_and_
> the remote write fails you would end
>    up here...
> 
> 7. drbd_kick_lo() -- not clear that this cant be called sometimes when
> we are diskless locally but perhaps
>    I missed something.
> 
> 
> _______________________________________________
> drbd-dev mailing list
> drbd-dev at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-dev


More information about the drbd-dev mailing list