[DRBD-user] [OT] rsync issues [Was Re: Read performance?]

David Masover ninja at slaphack.com
Sun Jun 3 23:26:10 CEST 2007


On Sunday 03 June 2007 10:18:30 Lars Ellenberg wrote:

> it just _technically_ makes no sense at all.

Would you like to explain why not?

> and that may be due to some misunderstandings about vocabulary used,
> and the technical implications those things have, and about what
> jobs the different components of a system have.

Given my understanding of the system, I expected a specific result from my dd
experiments. I got a different result.
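The kind of dd experiment I mean can be sketched like this (the device path
and scratch-file location are my own placeholders, not taken from the original
test; on a real cluster you would point DEV at the DRBD device, e.g.
DEV=/dev/drbd0):

```shell
# Minimal read-latency probe: time one sequential 1 MiB read.
# DEV defaults to a scratch file so the probe runs anywhere;
# that path is an assumption for illustration only.
DEV="${DEV:-/tmp/drbd_read_probe.img}"

# Create the scratch file if it does not already exist.
[ -e "$DEV" ] || dd if=/dev/zero of="$DEV" bs=1M count=8 2>/dev/null

start=$(date +%s%N)                      # GNU date, nanoseconds
dd if="$DEV" of=/dev/null bs=1M count=1 2>/dev/null
end=$(date +%s%N)

elapsed_ms=$(( (end - start) / 1000000 ))
# Caveat: without O_DIRECT the page cache makes repeat runs look
# fast; the interesting number is the first, cold read.
echo "read 1 MiB in ${elapsed_ms} ms"
```

On a healthy local disk this should report a few milliseconds; the complaint
in this thread is that the same read through a busy DRBD device can block for
minutes.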

The behavior I want would make DRBD over poor connectivity (high latency, low 
bandwidth) much more livable, so I'd like to think it is the correct 
behavior. If so, there is a bug in DRBD, either in design or implementation.

If not, I would like someone to explain to me why it is correct behavior for 
it to take five minutes to read one megabyte from a fully synced and up to 
date DRBD device.

> you don't need to know the implementation details of distributed cluster
> file systems and non-shared disks with shared disk semantics.
> that is what developers are for.

I can be a developer. I AM a developer, just not of DRBD.

> if you don't understand the technical details, don't suggest
> implementations, or try to judge how hard something would be,
> unless you want to be ridiculed.
>
> talk about things you _do_ understand.

I think I did, hence the dd statistics.

Alright, if it makes you feel better, here is the problem, with no 
implementation details: When the DRBD device is busy over a high-latency, 
low-bandwidth network (VPN), even when the local device is consistent and up 
to date, local reads can block for five or ten MINUTES at a time.

I know the local device is up to date, because "drbdadm dstate backup" tells 
me it is "UpToDate" on both machines. Also, allow-two-primaries is not 
enabled, and the config files (/etc/drbd.conf) are identical on both 
machines.
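For reference, "drbdadm dstate" prints the disk state as a "local/peer" pair,
so a script can confirm the local copy is current before trusting local reads.
A minimal sketch (the sample output string is hard-coded here so the snippet
runs anywhere; on a real node you would capture it with
`dstate=$(drbdadm dstate backup)` instead, and "backup" is this thread's
resource name):

```shell
# Sample drbdadm dstate output, format "localstate/peerstate".
# Hard-coded for illustration; replace with the real command on a node.
dstate="UpToDate/UpToDate"

# Strip everything from the first "/" to get the local disk state.
local_state=${dstate%%/*}

if [ "$local_state" = "UpToDate" ]; then
    echo "local disk UpToDate: reads should be servable locally"
else
    echo "local disk is $local_state: reads may need the peer"
fi
```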

I can see no reason whatsoever why, in this state, a read (which is only 
allowed on the primary machine) should ever take longer than it would take to 
read from the physical disk.

I am sure I should have used some other vocabulary here. Maybe "backing store" 
instead of "physical disk", for example.

But I don't see how I could be misunderstood here, so if you're going to 
criticize me now, please do it in a specific enough way that I can actually 
learn something from it.

Otherwise, can we move on to the actual problem, and leave behind the 
communication issues?

> maybe there is already something out there,
> that would even do something better than you thought of.

I've been looking for years. The closest we have to a good clustering 
filesystem on Linux is Coda/AFS, and the closest we have to a good network 
filesystem is NFS.

But nothing comes close to being as flexible and complete as I'd like; each of
these is only the best tool for one particular job.

DRBD is the closest we've got for simple live replication between only a few 
nodes (one or two).