You forgot to CC the list...

On 03/25/2013 10:22 AM, Stanislav German-Evtushenko wrote:
>> there is a certain risk of data not reaching the hard disk soundly.
> It seems so but what is a way to catch the reason?

Really not much you can do. Not sure if ZFS is viable and helpful.

>> What's your hard disk stack?
> RAID 1+0 on both nodes.

Not knowing what controllers Dell put into that r710, replacing them is
probably out of the question anyway. They *could* be to blame.

You could try disabling the write cache. Prepare for somewhat limited
performance.

Speaking of performance, that al_extents value is a mite small.

>> If you have the freedom, there are some things to try, e.g.
>> - not use ethernet bonding
> I can try but it will decrease bandwidth from 200MiB/s to 100MiB/s and
> disable redundancy.
>> - not use dual-primary
> I can't because I need online migration (I know it's dangerous to do
> migration now, because data can be not synced properly)
>> - use a smaller partition
> Why?

Just throwing some bones. If one of these helps (doubtful, but not
impossible), then you know where to debug further.
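
In case it helps, here is a rough sketch of what raising the activity log size could look like. It assumes DRBD 8.4 syntax, where the option is spelled al-extents and lives in the disk section (in 8.3 it sits in the syncer section instead). The resource name r0 and the value 3389 are placeholders for illustration, not taken from your configuration.

```
# Sketch only: r0 and 3389 are illustrative assumptions, not your real config.
resource r0 {
  disk {
    # Each activity log extent covers 4 MiB of the backing device,
    # so a larger value means fewer metadata updates under write load,
    # at the cost of a longer resync after a primary crash.
    al-extents 3389;
  }
}
```

After changing it on both nodes, something like "drbdadm adjust r0" should apply the new setting without a restart, but check the man page for the version you actually run.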