Hello everyone,

On 02/10/2017 02:58 AM, Jasmin J. wrote:
> Hi!
>
>>>> When running a kind of system test (detach/attach loop in high
>>>> system load),
>>
>> "Don't do that, then." :-)
>> [wonders what real-world scenario that test is supposed to exercise]
> He does a TEST to find hidden bugs!
> In my practice as a Kernel developer for safety critical systems, I
> did such tests a lot for my drivers. You should be happy that someone
> does such kind of stress testing and don't make him look like a fool.

Granted, anything that could bring down a system, even if only in theory
rather than in practice, is a design or implementation error.

However, I guess what all of this boils down to is that there are tons
of such rather theoretical bugs all over the place in the Linux kernel
(and actually in most if not all general purpose operating systems),
both design- and implementation-wise. There are so many of them that
the important bugs (the ones that cause severe enough problems for a
large enough number of people) get fixed and the less important ones
don't, simply because there isn't enough workforce available to fix
them all.

Or, as Lars put it:
> We have to prioritize somehow, though.
> And spending time on debugging something in a path which can easily be
> avoided (by simply not doing it; d'oh) won't get the high score.

That's why safety critical systems normally don't run COTS hardware and
software. The mere possibility of there being bugs in a driver, despite
the application of the strictest quality standards to reduce the risk of
introducing bugs, is the reason why safety critical systems are normally
designed to run even their drivers outside of kernel space (most
aerospace, medical, nuclear, military, etc. systems run microkernel
operating systems that isolate driver crashes and can normally recover
from them).

Anyway, for general purpose systems, people prefer having lots of
features, faster development cycles, lots of drivers for new hardware,
doing more with less and doing it in the cheapest way possible over
safe, robust and secure design and implementation. Otherwise, virtually
all general purpose OSs would have gone extinct a very long time ago.
That's the tradeoff that we have to deal with today.

So, to summarize:
- we still care about fixing bugs
- we do believe that correct implementation is important
- however, we'll have to postpone fixing some rarely occurring bugs due
  to (human) resource constraints
- nonetheless, our goal is still building software with above-average
  robustness (aka "high availability")
- but we are still in the ballpark of general purpose systems here
- for really safety critical tasks, please use systems specifically
  designed for the required level of safety

> BR,
> Jasmin

Best regards,
-- 
Robert Altnoeder
+43 1 817 82 92 0
robert.altnoeder at linbit.com

LINBIT | Keeping The Digital World Running
DRBD - Corosync - Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.