On 10/27/2017 02:57 PM, Mariusz Mazur wrote:
> By default drbdmanage during its normal operation will lock any
> lvm-related operations on a server somewhere between running the first
> 'drbdmanage init' and a year into production deployment. This is known
> and documented behavior.

No, you are wrong. Drbdmanage does not lock anything. LVM itself locks
up in certain situations. LVM is the component that suddenly changes
behavior, so obviously the design flaw is within LVM and not within
drbdmanage. Drbdmanage just does not compensate for LVM's shortcomings.

This is also the reason that the codebase of the product that will
replace drbdmanage in the future is already multiple times the size of
the current drbdmanage, although it is still in an early stage of its
development and just barely starting to do anything useful.

One cause for this increase in size is that even in the experimental
version that we have right now, the 55 lines of code that attempt LVM
volume creation are backed by about 2000 lines of error detection, error
correction and error reporting code.

This is only necessary because the environment that drbdmanage has to
deal with is extremely unreliable, full of inconsistencies, surprise
errors, undocumented behavior, undocumented and ambiguous exit codes of
external utilities, race conditions in the Linux kernel, especially the
udev component, and all other kinds of nasty behavior that anyone could
possibly think of.

Therefore, I would suggest to start correcting problems where they
actually occur instead of expecting everyone else to work around bad
design and implementation in the layers above.

That being said, your complaint should have been sent to the LVM
developers, not us.

> From brief contact with a linbit developer it seems to me that company
> policy is 'new users just need to know to read 5.4.1'. Preferably
> before 5.1 and 5.2 which actually show how to use init/add-node.

The entire culture in the Unix/Linux world is mostly that you are
expected to know what the system is doing and what you are doing, how to
configure everything, how to find out how to fix things if they break,
etc.

Personally, I do not think that a production system should work like
this, but for whatever reason, a long time ago, most of the industry has
chosen to use operating systems that were originally designed for
developers, technicians and scientists as the basis for their business
applications. The systems were not originally designed to just work out
of the box, and sometimes, it still shows. Actually, probably more often
than not.

> If I wanted to come up with a good way to leave new users thinking
> "should've just used gluster, like everybody else", I don't think I'd
> do a better job.

We could try to provide a product that deals with as many of the
potential problems as anyone can think of, but since someone obviously
has to do all the work, the question is: How much would you be willing
to pay for it?

> (Btw: are there any other 5.4.1s a new user should be aware of?)

Thousands probably, depending on the exact configuration.
- Thin provisioning might lock up your system if you run out of space.
- DRBD meta data may need to clean up slots if you have used them all
  and then replace a node with another node that has a different node
  ID.
- LVM may become extremely slow if it is not configured to ignore DRBD
  devices and there are lots of DRBD devices that cannot be opened, e.g.
  because there is another Primary.
- etc. ...

Apparently, most people have not ever hit the problem you describe. I
did not ever see it come up in my test environment. Some others have hit
other problems that you did not encounter.

br,
-- 
Robert Altnoeder
+43 1 817 82 92 0
robert.altnoeder at linbit.com

LINBIT | Keeping The Digital World Running
DRBD - Corosync - Pacemaker
f / t / in / g+

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
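
For reference, the point in the list above about configuring LVM to ignore DRBD devices can be sketched as an lvm.conf fragment. This is a minimal sketch and not taken from the message: the use of global_filter (rather than filter) and the /dev/drbd* naming pattern are assumptions, and a setup that deliberately places LVM physical volumes on top of DRBD devices needs a different filter.

```
# /etc/lvm/lvm.conf (fragment; illustrative, adjust to your device naming)
devices {
    # Reject every device node matching /dev/drbd* so that LVM scans do not
    # try to open DRBD devices, which can fail or stall when the resource is
    # Primary on another node. Assumes DRBD is layered on top of LVM, not
    # the other way around.
    global_filter = [ "r|^/dev/drbd.*|" ]
}
```

After a change like this, a scan such as `pvs` or `vgscan` should no longer list or probe the DRBD devices; whether that actually removes the slowdown depends on the configuration.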