[Drbd-dev] DRBD-8 -- can access null mdev->bc pointer if local
diskis not attached
Graham, Simon
Simon.Graham at stratus.com
Wed Sep 6 03:31:42 CEST 2006
Found another one - not sure what the right fix is; if we get an
incoming connection whilst diskless, we can end up calling
drbd_send_sizes which includes a call to drbd_queue_order_type() outside
of an inc_local or inc_md_only section.
Presumably this value should be referenced inside the inc_md_only()
section at the top of the routine and default to some value if we have
no local disk.
Simon
> -----Original Message-----
> From: drbd-dev-bounces at linbit.com [mailto:drbd-dev-bounces at linbit.com]
> On Behalf Of Graham, Simon
> Sent: Tuesday, September 05, 2006 2:39 PM
> To: drbd-dev at linbit.com
> Subject: [Drbd-dev] DRBD-8 -- can access null mdev->bc pointer if
local
> diskis not attached
>
> I've seen crashes where the code attempts to dereference a NULL mdev-
> >bc
> when there is no local disk attached - my reading of the code is that
> the bc field should only be referenced after a successful call to
> inc_local(mdev) (or equivalent such as inc_md_only()) - this is done
in
> most cases, but I've found several places where it is not (see
> attached).
>
> I can go ahead and submit patches based on calling inc_local() for
most
> of these but I wanted to check that this is indeed the correct
> solution.
>
> Simon
>
> === Locations I have found ===
>
> 1. drbd_set_role, line 965:
>
> if ( ( ( mdev->state.conn < Connected ||
> mdev->state.pdsk <= Attaching ) &&
> mdev->bc->md.uuid[Bitmap] == 0) || forced ) {
> <<<===
> drbd_uuid_new_current(mdev);
> }
>
> 2. drbd_ioctl, case DRBD_IOCTL_SET_DISK_SIZE seems dodgy to me -- just
> sets the size in mdev->bc; presumably
> an inc_local(mdev) should be done?
>
> 3. after_state_ch - several ones here, mostly probably OK (for
example,
> if ns.disk >=Inconsistent
> which shouldn't happen unless you have local backing store).
> However,
> there is one that is not -- line 988
> when we lose contact with the peer's copy of the data:
>
> /* Lost contact to peer's copy of the data */
> if ( os.pdsk > DUnknown && ns.pdsk <= DUnknown ) {
> if ( mdev->p_uuid ) {
> kfree(mdev->p_uuid);
> mdev->p_uuid = NULL;
> }
> if (ns.role == Primary && mdev->bc->md.uuid[Bitmap] == 0
> ) {
>
> ^^^^^^^^^^^^^^^^^====>>this has caused crashes
> /* Only do it if we have not yet done it... */
> INFO("Creating new current UUID\n");
> drbd_uuid_new_current(mdev);
> }
>
> 4. drbd_send_uuids is currently called from some places where
> inc_local() has not been done -- I propose
> ensuring it is only called when it is safe.
>
> 5. receive_sync_uuid() -- there's no apparent check here and it calls
> _drbd_uuid_set which dereferences mdev->bc.
> Not sure what is appropriate here???
>
> 6. drbd_end_req() line 109 -- tests mdev->bc->dc.on_io_error without
> checking - maybe we cant get here, but
> it seems to me that if you are diskless locally and do a write
_and_
> the remote write fails you would end
> up here...
>
> 7. drbd_kick_lo() -- not clear that this cant be called sometimes when
> we are diskless locally but perhaps
> I missed something.
>
>
> _______________________________________________
> drbd-dev mailing list
> drbd-dev at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-dev
More information about the drbd-dev
mailing list