[DRBD-user] [Scst-devel] Re: vSphere MPIO with scst/drbd in dual primary mode. WAS: Re: R: Re: virt_dev->usn randomly generated!?

Vladislav Bolkhovitin vst at vlnb.net
Wed Mar 10 20:36:04 CET 2010


Pasi Kärkkäinen, on 03/10/2010 05:23 PM wrote:
> On Tue, Mar 09, 2010 at 11:24:12PM +0300, Vladislav Bolkhovitin wrote:
>> Zhen Xu, on 03/07/2010 02:23 AM wrote:
>>> I wonder how you can do NV_CACHE battery backed.  The NV_CACHE in SCST 
>>> is just the system memory/page cache.  It is not the cache on the RAID 
>>> card.  You could have UPS hooked to the server running SCST.  However, 
>>> if the system failed (memory, hardware, or just seg fault) and you had 
>>> to reboot without proper shutdown, the page cache will be lost.  It is a 
>>> very interesting setup that you wanted to do.  I am interested if you 
>>> have much success.
>> IMHO, a battery backed RAID card isn't fundamentally different from a 
>> system with a UPS and the ability to shut down cleanly on a power 
>> failure. Both have an internal power supply, a battery, system 
>> hardware, memory and software. The only difference is the number of 
>> components, and hence the probability of failure. I think a good 
>> system with good hardware and dual power supplies (power supplies are 
>> often the first component to fail) should, hardware wise, have a 
>> cache-losing failure probability in the same order of magnitude as the 
>> RAID card.
> I don't quite agree. 
> If you have a RAID controller with battery backed cache memory,
> and the server shuts down, crashes, or you need to restart it for any reason,
> the cache contents will still be there after the reboot and all the data 
> can be flushed to the disks from the battery backed memory.
> If you use the server's normal memory as a cache, you can't really do the same, can you? 

What would you need that for? Doesn't flushing the cache on 
shutdown/restart achieve the same thing? Also note, I wrote "hardware 
wise" ;). But even among software failures, only a few prevent the cache 
from being flushed in the normal way.

Overall, the danger of using write back system caching is very much 
overestimated, because:

1. ALL modern HDDs have at least 16MB of write back cache and it is well 
accepted and regularly used. A RAID with 10 such drives has 160MB(!) of 
write back cache.

2. Many (most?) modern applications are designed to work in write back 
caching environments, i.e. they know when to flush the cache to minimize 
or even avoid the damage caused by losing uncommitted data in the cache 
on a failure.

3. The Linux kernel provides a set of knobs in /proc/sys/vm with which 
you can limit the amount of uncommitted ("dirty") data in the cache.
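
To illustrate point 3, here is a minimal sketch that reads the current writeback limits. It assumes the usual dirty_* files under /proc/sys/vm (dirty_ratio, dirty_background_ratio, dirty_bytes, dirty_background_bytes, dirty_expire_centisecs, dirty_writeback_centisecs); the helper name read_dirty_knobs is made up for this example, and knobs missing on a given kernel are simply skipped.

```python
import os

# Writeback-related knob files expected under /proc/sys/vm.
DIRTY_KNOBS = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_bytes",
    "dirty_background_bytes",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_dirty_knobs(base="/proc/sys/vm"):
    """Return {knob: int} for every knob file under base that can be read and parsed."""
    values = {}
    for name in DIRTY_KNOBS:
        try:
            with open(os.path.join(base, name)) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Knob missing, unreadable, or not a plain integer: skip it.
            continue
    return values


if __name__ == "__main__":
    for name, value in sorted(read_dirty_knobs().items()):
        print(f"{name} = {value}")
```

Lowering dirty_ratio and dirty_background_ratio (or setting the *_bytes variants) reduces how much uncommitted data can accumulate, at the price of earlier and more frequent writeback. Changing them requires root, e.g. by writing to the same files or via sysctl.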

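As for point 2, "knowing when to flush" usually means calling fsync() at the points where the application needs its data to be committed. A rough sketch follows; the function name durable_write is invented for illustration, and whether fsync() also flushes the drive's own write back cache depends on the filesystem and barrier settings, which this example does not check.

```python
import os


def durable_write(path, data):
    """Write bytes to path and return only after fsync() has completed."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's userspace buffer into the kernel page cache
        os.fsync(f.fileno())  # ask the kernel to write this file's dirty pages to the device
    return len(data)
```

An application that commits this way loses, at worst, the data written since its last successful fsync(), regardless of how much dirty data the page cache is allowed to hold.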
