[DRBD-user] Re: Real fix for drbd-0.7.12

Jeff Fisher jeff at lfchosting.com
Wed Aug 31 20:50:47 CEST 2005


> Ok, here is the real fix!!! The patch is against drbd-0.7.12 plain.
> 
> This was really hard. 
> 
> The kernel's API has two variants of every bitop: the atomic
> ones, set_bit(), clear_bit(), test_bit() etc., and the non-atomic
> ones, __set_bit(), __clear_bit() etc.
> 
> The race condition:
>  CPU1 was in an IO completion handler and used __set_bit(SYNC_STARTED,..)
>  there. Non-atomic means: first, it fetched the word from memory...
>  ... CPU2 was exiting the _drbd_process_ee() function and cleared the bit with
>  clear_bit(PROCESS_EE_RUNNING), which is atomic: fetch, modify and write...
>  ... back on CPU1, we now do the modify and write.
> 
> So CPU1 sets the PROCESS_EE_RUNNING bit again, because it fetched
> the word before CPU2 did its atomic update.
> 
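To make the interleaving concrete, here is a minimal user-space sketch that replays the three steps in order on a single thread. It is not DRBD code: the bit numbers and the function names replay_lost_update() and check_replay() are made up for illustration, and plain C operators stand in for the kernel's __set_bit() and clear_bit().

```c
#include <assert.h>

/* Hypothetical bit numbers; the real values in drbd-0.7.12 are not given here. */
enum { SYNC_STARTED = 0, PROCESS_EE_RUNNING = 1 };

/* Replays the racy interleaving step by step and returns the final flags word. */
unsigned long replay_lost_update(void)
{
    /* Start state: _drbd_process_ee() is still running. */
    unsigned long flags = 1UL << PROCESS_EE_RUNNING;

    /* CPU1, non-atomic set of SYNC_STARTED, step 1: fetch the word. */
    unsigned long cpu1_copy = flags;

    /* CPU2, atomic clear of PROCESS_EE_RUNNING: fetch, modify and write at once. */
    flags &= ~(1UL << PROCESS_EE_RUNNING);

    /* CPU1, steps 2 and 3: modify the stale copy and write it back. */
    cpu1_copy |= 1UL << SYNC_STARTED;
    flags = cpu1_copy;

    return flags;
}

/* Self-check: the stale write-back has set PROCESS_EE_RUNNING again. */
void check_replay(void)
{
    unsigned long flags = replay_lost_update();
    assert(flags & (1UL << SYNC_STARTED));
    assert(flags & (1UL << PROCESS_EE_RUNNING));
}
```

CPU2's clear is lost because CPU1 writes back the whole word, including the old value of a bit it never meant to touch.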
> So I conclude that the rule is:
> 
>   If you use the atomic bitops on a word, you may never ever use the
>   non-atomic bitops on the same word anywhere in your code.
> 
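A rough user-space illustration of the rule, assuming C11 <stdatomic.h> as a stand-in for the kernel's atomic bitops. The helpers flag_set(), flag_clear() and flag_test() and the bit numbers are hypothetical names for this sketch, not part of DRBD or the kernel API. The point is only that every update to the shared word goes through an indivisible read-modify-write, so no CPU can write back a stale copy.

```c
#include <stdatomic.h>

/* Hypothetical bit numbers, chosen only for this sketch. */
enum { SYNC_STARTED = 0, PROCESS_EE_RUNNING = 1 };

/* Stand-in for an atomic set_bit(): one indivisible fetch-OR. */
static inline void flag_set(atomic_ulong *word, int nr)
{
    atomic_fetch_or(word, 1UL << nr);
}

/* Stand-in for an atomic clear_bit(): one indivisible fetch-AND. */
static inline void flag_clear(atomic_ulong *word, int nr)
{
    atomic_fetch_and(word, ~(1UL << nr));
}

/* Reads a single bit of the shared word. */
static inline int flag_test(atomic_ulong *word, int nr)
{
    return (int)((atomic_load(word) >> nr) & 1UL);
}

/* Performs both updates from the race with atomic operations only
 * and returns the final flags word. */
unsigned long both_updates_atomic(void)
{
    atomic_ulong flags;
    atomic_init(&flags, 1UL << PROCESS_EE_RUNNING);

    flag_set(&flags, SYNC_STARTED);         /* what the completion handler wants */
    flag_clear(&flags, PROCESS_EE_RUNNING); /* what _drbd_process_ee() wants on exit */

    return atomic_load(&flags);
}
```

This single-threaded call cannot show concurrency by itself; it only shows which kind of operation to use on a word that more than one CPU updates. Whatever order the two updates run in, the result is SYNC_STARTED set and PROCESS_EE_RUNNING clear.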
> 
> But it feels good to understand what was going on!

Good work. I'll test it out soon.

Jeff