[DRBD-user] Raid controller cache module gives DRBD problems
mark at netexpo.nl
Fri Dec 14 10:24:44 CET 2007
> On Thu, Dec 13, 2007 at 02:00:05PM +0100, Mark Hunting wrote:
>> Hi all,
>> I have set up DRBD (0.7.21, Debian Etch) on two storage systems. Both
>> systems use an Areca Raid controller with 1GB cache memory. When I copy a
>> file to the DRBD device on the primary, the copying finishes with a
>> speed of about 400MB per second. That should be impossible: DRBD uses
>> a gigabit link to synchronize, so the copying should proceed at a rate of
>> 100 MB per second (drbd.conf: rate 100M).
>> So apparently the copying finishes before the data is transferred to the
>> secondary. Indeed I still see a lot of DRBD network traffic for some
>> seconds after the copy command finishes. This should be impossible with
>> DRBD, the copying should only finish when the data is both written on
>> the primary AND the secondary, right? When the primary server
>> unexpectedly goes down right after a file has been copied and the
>> secondary server takes over, the copied file is not there.
>> I think this has to do with the 1GB cache memory on the Areca cards,
>> when I transfer files that are much bigger than 1GB the copying indeed
>> finishes with a speed close to 100MB per second.
>> I don't understand this, what difference does it make for DRBD what is
>> underneath the DRBD device (/dev/drbd0)? It should not matter if there
>> is only one hard disk, a RAID system, a RAID system with cache memory or
>> whatever. But apparently it does matter in my case.
>> Is this a known problem, and how can I avoid it? Right now I can never
>> be sure my two storage systems are fully synchronized.
> man fsync
> man fdatasync
> man sync
> wikipedia page_cache
Yes, I understand that the page cache can speed things up a lot. But how
is it possible that when I copy a new file to the DRBD disk, the copying
is much faster than the DRBD link of 100MB/second? To be clear, by a
'new' file I mean a file that I have never used on the DRBD disk before.
So it cannot be in the cache of the secondary server. Therefore the
copying should go at a rate of 100MB/second, not faster as it does
now. Or am I missing something here?
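
One way to check whether the page cache on the primary explains these
numbers is to time the same write with and without an explicit fsync.
The sketch below is only an illustration, not a DRBD tool: the function
name timed_write and the target directory are made up for this example,
and it assumes the 400MB/second figure comes from writes that have
reached the kernel's page cache but not yet the block device.

```python
import os
import tempfile
import time


def timed_write(path, data, sync):
    """Write data to path and return the elapsed time in seconds.

    With sync=True, os.fsync() is called before the clock stops, so the
    measurement includes flushing the kernel page cache to the device.
    """
    start = time.monotonic()
    with open(path, "wb") as f:
        f.write(data)
        if sync:
            f.flush()             # push Python's userspace buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to flush the file to the device
    return time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical target: replace with a directory on the DRBD-backed mount.
    target_dir = tempfile.gettempdir()
    data = b"\0" * (16 * 1024 * 1024)  # 16 MiB test payload
    path = os.path.join(target_dir, "fsync_test.bin")
    for sync in (False, True):
        elapsed = max(timed_write(path, data, sync), 1e-9)
        print(f"sync={sync}: {len(data) / elapsed / 1e6:.0f} MB/s")
        os.remove(path)
```

If the run with sync=True drops to roughly the link speed while the run
with sync=False stays far above it, that would suggest the copy command
is returning before the data has left the primary's memory.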