Hello everybody!

I have a question regarding the exact semantics of the oos value in /proc/drbd.

The User's Guide (https://docs.linbit.com/doc/users-guide-84/ch-admin/) says:

"oos (out of sync). Amount of storage currently out of sync; in Kibibytes. Since 8.2.6."

After several unsettling events over the years, we have now started to do regular verify runs. We will announce our script as open source right here at some point in the future, but we want to clarify some details first.

Our script basically calls drbdadm verify on one resource at a time, because drbdadm verify all would kill the system for sure. After the verification run has completed, the script
- analyses the oos: value,
- if necessary, disconnects and reconnects the resource,
- starts verification of the next resource.

The script does not run as a daemon; it is simply called regularly via cron, on the node with the more important resources.

My main question is: Should the oos value always be 0? Does a non-zero oos value mean that there have been sync errors? Or does oos also include blocks that are currently being synced or are waiting to be synced? In the latter case, what would be a valid condition for disconnecting and reconnecting a resource after a verification run?

Also: Are there events that can cause a verification run to be aborted? One verification run on a huge resource (1.3 TB, HW RAID 5, dedicated GBit line) finished way too fast, so I think something must have aborted it, like, say,
- a buffer runs full -> automatic disconnect/reconnect -> verification aborted

If something along these lines is possible, is there a way to avoid or detect it? Maybe a kernel message we could grep for?

Thanks,
Regards,
Christoph

--
Christoph Lechleitner
Geschäftsführung
------------------------------------------------------------------------
ITEG IT-Engineers GmbH | Conradstr. 5, A-6020 Innsbruck
Mail: christoph.lechleitner at iteg.at | Web: http://www.iteg.at/
------------------------------------------------------------------------
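
To make the "analyses the oos: value" step concrete, here is a minimal sketch of how such a check might read the per-device oos counter. It is not the script mentioned above. It assumes the DRBD 8.4 /proc/drbd layout, where each device has a line starting with "<minor>: cs:..." followed by a counters line ending in "oos:<KiB>". The sample text and the names parse_oos and SAMPLE are made up for illustration.

```python
import re

# Illustrative sample in the DRBD 8.4 /proc/drbd layout; the numbers are invented.
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:120 nr:0 dw:120 dr:0 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4096
"""

# A device line starts with the minor number, followed by the connection state.
DEVICE_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")
# The out-of-sync counter (in KiB) sits on the counters line of the same device.
OOS_RE = re.compile(r"\boos:(\d+)")


def parse_oos(text):
    """Return {minor: {"cs": state, "oos_kib": int or None}} for each device."""
    devices = {}
    minor = None
    for line in text.splitlines():
        dev = DEVICE_RE.match(line)
        if dev:
            minor = int(dev.group(1))
            devices[minor] = {"cs": dev.group(2), "oos_kib": None}
            continue
        oos = OOS_RE.search(line)
        if oos and minor is not None:
            devices[minor]["oos_kib"] = int(oos.group(1))
    return devices


if __name__ == "__main__":
    # On a real node one would read /proc/drbd instead of SAMPLE.
    for minor, info in sorted(parse_oos(SAMPLE).items()):
        print(minor, info["cs"], info["oos_kib"])
```

Reading the connection state alongside oos is a deliberate choice here: whether a non-zero oos is alarming presumably depends on whether a resync or verify is still in progress, which is exactly the open question above.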