On 09/08/2010 03:49 PM, Lars Ellenberg wrote:
> On Thu, Sep 02, 2010 at 03:22:25PM +0200, Robert Verspuy wrote:
>>
>> On the database server we're using PostgreSQL.
>> PostgreSQL is ACID-compliant, so the data on disk should not be corrupt.
>> It could be possible that we lost some database insert/updates,
>> but that's a risk I'm willing to accept, looking at the small chance
>> that all power is lost.
> Excuse me, but WHAT?
>
> PostgreSQL is ACID compliant, IF AND ONLY IF the fsync/fdatasync and
> similar it issues are behaving as expected, i.e. data is on stable
> storage when PostgreSQL thinks it is.
>

Hmm. Yes, you are right.
I think I was a bit too quick to assume that everything would be fine.

I thought that no-disk-flushes would only stop DRBD from adding its own flushes after every IO, while it would still accept and pass on the flushes that come from the layer above the DRBD device.
But, as I understand it now, DRBD will not do any flushes when no-disk-flushes is set: not its own flushes, and also not the flush requests it gets from the layer above.

> If data only reaches stable storage at some point after PostgreSQL
> thinks it already was there, and most likely even in some random order,
> then no, ACID compliance is not met.

OK, together with your other mail, I think I understand it now.

So, I think there are two risks when using volatile caches with no-disk-barrier and no-disk-flushes and protocol C.

First -> single node failure: there can be a difference in what is actually on disk.
After recovery, let the crashed node be the secondary and run a verify as soon as possible.
If the now-primary node crashes before the verify is done, you must restore the database from a backup.

Second -> both nodes have a crash / power failure.
In that case, it's possible that both nodes have corrupt data.
Solution: restore a backup of the database.

So in any case (just like when running PostgreSQL on one server), your data loss is always limited to your last regular backup of the database.
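To make the fsync dependency discussed above concrete, here is a minimal Python sketch. It is not from the original thread, and the names `durable_write`, `measure_fsync_latency` and `fsync_probe.bin` are hypothetical. It writes a small block, calls `os.fsync()`, and records how long the flush takes. Note that `os.fsync()` only asks the layers below to flush; whether the data really is on stable storage afterwards depends on every layer (DRBD, controller, disk cache) honouring that request, which is exactly the point made above.

```python
import os
import time


def durable_write(path, payload):
    """Write payload to path, then request a flush to stable storage.

    Returns the number of seconds spent inside os.fsync().
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        # os.write() may write fewer bytes than requested, so loop.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        start = time.perf_counter()
        os.fsync(fd)  # a request only; lower layers must honour it
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed


def measure_fsync_latency(directory, block_size=4096, rounds=20):
    """Return a list of fsync durations (seconds) for small writes."""
    path = os.path.join(directory, "fsync_probe.bin")  # hypothetical name
    payload = b"\0" * block_size
    try:
        samples = [durable_write(path, payload) for _ in range(rounds)]
    finally:
        # Remove the probe file even if a write failed.
        if os.path.exists(path):
            os.remove(path)
    return samples
```

Calling `measure_fsync_latency()` on a directory that lives on the device under test, once with and once without the flush-related options, gives a rough per-write latency figure to compare. Very low numbers on a device with a volatile cache may simply mean the flush is being ignored.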

The reason I'm testing with no-disk-barrier and no-disk-flushes is the high latency (25 ms instead of the expected 1 or 2 ms) when writing small blocks of data.
(See also my e-mail from last week, asking for directions on where to start looking for the cause of the latency.)

> So no, if you run PostgreSQL on disks with volatile caches,
> and you unplug the power hard, you can expect data loss
> and possibly data corruption.
>
> Which is completely independent of DRBD.

True.

So when comparing:
PostgreSQL on one server, with its own disk flushes and volatile caches
against
PostgreSQL on two nodes with DRBD, with no-disk-barrier, no-disk-flushes and volatile caches,
then (looking at data loss / corruption) it's safer to run PostgreSQL on one server, because of the disk flushes.

That is, unless we find the cause of the huge latency and maybe a solution for it; then I can remove the no-disk-barrier and no-disk-flushes parameters.

With kind regards,

Robert Verspuy

--
*Exa-Omicron*
Patroonsweg 10
3892 DB Zeewolde
Tel.: 088-OMICRON (66 427 66)
http://www.exa-omicron.nl