[Drbd-dev] DRBD-8 -- can access null mdev->bc pointer if local disk is not attached

Graham, Simon Simon.Graham at stratus.com
Tue Sep 5 20:38:59 CEST 2006


I've seen crashes where the code attempts to dereference a NULL mdev->bc
when there is no local disk attached - my reading of the code is that
the bc field should only be referenced after a successful call to
inc_local(mdev) (or equivalent such as inc_md_only()) - this is done in
most cases, but I've found several places where it is not (see
attached).

I can go ahead and submit patches based on calling inc_local() for most
of these but I wanted to check that this is indeed the correct solution.

Simon

=== Locations I have found ===

1. drbd_set_role, line 965:

		if ( ( ( mdev->state.conn < Connected ||
			 mdev->state.pdsk <= Attaching ) &&
		       mdev->bc->md.uuid[Bitmap] == 0) || forced ) {
<<<===
			drbd_uuid_new_current(mdev);
		}

2. drbd_ioctl, case DRBD_IOCTL_SET_DISK_SIZE seems dodgy to me -- just
sets the size in mdev->bc; presumably
   an inc_local(mdev) should be done?

3. after_state_ch - several ones here, mostly probably OK (for example,
if ns.disk >=Inconsistent
   which shouldn't happen unless you have local backing store). However,
there is one that is not -- line 988
   when we lose contact with the peer's copy of the data:

	/*  Lost contact to peer's copy of the data  */
	if ( os.pdsk > DUnknown && ns.pdsk <= DUnknown ) {
		if ( mdev->p_uuid ) {
			kfree(mdev->p_uuid);
			mdev->p_uuid = NULL;
		}
		if (ns.role == Primary && mdev->bc->md.uuid[Bitmap] == 0
) {
	
^^^^^^^^^^^^^^^^^====>>this has caused crashes
			/* Only do it if we have not yet done it... */
			INFO("Creating new current UUID\n");
			drbd_uuid_new_current(mdev);
		}

4. drbd_send_uuids is currently called from some places where
inc_local() has not been done -- I propose
   ensuring it is only called when it is safe.

5. receive_sync_uuid() -- there's no apparent check here and it calls
_drbd_uuid_set which dereferences mdev->bc.
   Not sure what is appropriate here???

6. drbd_end_req() line 109 -- tests mdev->bc->dc.on_io_error without
checking - maybe we cant get here, but
   it seems to me that if you are diskless locally and do a write _and_
the remote write fails you would end
   up here...

7. drbd_kick_lo() -- not clear that this cant be called sometimes when
we are diskless locally but perhaps
   I missed something.




More information about the drbd-dev mailing list