[DRBD-user] NULL deref at drbd_submit_peer_request

Robert Altnoeder robert.altnoeder at linbit.com
Fri Feb 10 11:19:26 CET 2017

Hello everyone,

on 02/10/2017 02:58 AM, Jasmin J. wrote:
> Hi!
>>>> When running a kind of system test (detach/attach loop in high
>>>> system load),
>> "Don't do that, then." :-)
>> [wonders what real-world scenario that test is supposed to exercise]
> He does a TEST to find hidden bugs!
> In my practice as a Kernel developer for safety critical systems, I
> did such tests a lot for my drivers. You should be happy that someone
> does such kind of stress testing and don't make him look like a fool.
Granted, anything that could bring down a system, even if only in
theory rather than in practice, is a design or implementation error.
However, I guess what all of this boils down to is that there are tons
of such rather theoretical bugs all over the place in the Linux kernel
(and in most, if not all, general purpose operating systems), in both
design and implementation. There are so many of them that only the
important bugs - the ones that cause severe enough problems for a large
enough number of people - get fixed, while the less important ones
don't, simply because there aren't enough people available to fix them
all.

Or, as Lars put it:
> We have to prioritize somehow, though.
> And spending time on debugging something in a path which can easily be
> avoided (by simply not doing it; d'oh) won't get the high score.

That's why safety critical systems normally don't run COTS hardware and
software. The mere possibility of bugs in a driver, despite the
application of the strictest quality standards to reduce the risk of
introducing them, is the reason why safety critical systems are normally
designed to run even their drivers outside of kernel space (most
aerospace, medical, nuclear, military, etc. systems run microkernel
operating systems that isolate driver crashes and can normally recover
from them).

Anyway, for general purpose systems, people prefer lots of features,
faster development cycles, lots of drivers for new hardware, doing more
with less, and doing it in the cheapest way possible over safe, robust
and secure design and implementation. Otherwise, virtually all general
purpose OSs would have gone extinct a very long time ago. That's the
tradeoff that we have to deal with today.

So, to summarize:
- we still care about fixing bugs
- we do believe that correct implementation is important
- however, we'll have to postpone fixing some rarely occurring bugs due
  to (human) resource constraints
- nonetheless, our goal is still building software with above-average
  robustness (aka "high availability")
- but we are still in the ballpark of general purpose systems here
- for really safety critical tasks, please use systems specifically
  designed for the required level of safety

> BR,
> Jasmin

Best regards,
Robert Altnoeder
+43 1 817 82 92 0
robert.altnoeder at linbit.com

LINBIT <http://www.linbit.com/en/> | Keeping The Digital World Running
DRBD - Corosync - Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.