[DRBD-user] Sync slow with drbd 0.7.10 and linux 2.6.11
Harry Edmon
harry at atmos.washington.edu
Mon Mar 14 18:51:15 CET 2005
I tried your suggestions. At first everything seemed to be faster.
Then I started up my normal processing on the primary node. The I/O got
slow. I then shut down drbd on the secondary/inconsistent node. When I
started it up again, almost all the I/O on the primary stopped. Here
are the messages I saw on the secondary:
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967295
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967294
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967293
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967292
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967291
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967290
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967289
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967288
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967287
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967286
drbd0: [drbd0_receiver/26463] sock_sendmsg time expired, ko = 4294967285 ...
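A side note on the ko values above: 4294967295 is 2^32 - 1, so the numbers look like a 32-bit unsigned counter that started at 0 and is being decremented once per timeout. That is my reading of the output, not something I have confirmed in the drbd source. A minimal Python sketch of the reinterpretation, under that assumption (the helper name `as_signed32` is only for illustration):

```python
def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    # Values with the top bit set represent negative numbers in two's complement.
    return value - (1 << 32) if value >= (1 << 31) else value


# First and last ko values from the log excerpt above.
ko_values = [4294967295, 4294967294, 4294967285]
signed_values = [as_signed32(v) for v in ko_values]
```

Read this way, the counter simply keeps counting down past zero with each expired send.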
When I stopped the secondary, the primary took off with normal I/O. The
primary logged nothing beyond the usual messages you would expect
from the secondary going up and down, except for:
drbd0: drbd0_receiver [4190]: cstate NetworkFailure --> BrokenPipe
drbd0: error receiving ReportBitMap, l: 4088!
drbd0: worker terminated
drbd0: drbd0_receiver [4190]: cstate BrokenPipe --> Unconnected
drbd0: Connection lost.
drbd0: drbd0_receiver [4190]: cstate Unconnected --> WFConnection
Philipp Reisner wrote:
>On Wednesday, 9 March 2005 at 23:10, Harry Edmon wrote:
>
>
>>I have two 2.6.11 systems with 3ware cards hooked up via drbd 0.7.10.
>>When I set up freshair2 (8000 3ware card) as the primary and sync it to
>>funnel1 (9000 3ware card), it runs at 20-30 MBytes/sec. However, when I
>>reverse this (funnel1->freshair2) the sync rate is 7 MBytes/sec. I have
>>tested the network and disk bandwidth by doing an rcp from funnel1 to
>>freshair2, and I get 27 MBytes/sec, which seems to eliminate the
>>network and the disk. This is with nothing else running on either
>>machine. All that is left appears to be drbd.
>>
>>Both units are dual Xeon systems, and I have tried them with
>>hyperthreading both on and off. Anyone have any ideas? I have attached the
>>drbd.conf file.
>>
>>
>
>It seems that the 2.6.11 IO-Scheduling code has some new surprises
>ready for DRBD. I need some time to understand these new issues
>completely.
>
>What you can do so far:
>
> max-buffers 4096
> max-epoch-size 1024
>
> __AND__
>
> you need to tune the nr_requests parameter of the backing device
> via sysfs. A value of 1024 gave me reasonable performance...
>
>-Philipp
>
>
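For anyone following along, here is a minimal sketch of the nr_requests change Philipp describes, assuming the usual sysfs layout of /sys/block/<device>/queue/nr_requests. The device name "sda" in the usage note is a placeholder for whatever disk backs the drbd device, and the function names are hypothetical. Writing the file requires root, and the setting does not survive a reboot.

```python
from pathlib import Path


def nr_requests_path(device: str, sysfs_root: str = "/sys") -> Path:
    """Build the sysfs path of the request-queue depth for a block device."""
    return Path(sysfs_root) / "block" / device / "queue" / "nr_requests"


def set_nr_requests(device: str, value: int = 1024, sysfs_root: str = "/sys") -> int:
    """Write a new nr_requests value and return the previous one."""
    path = nr_requests_path(device, sysfs_root)
    previous = int(path.read_text().strip())
    path.write_text(f"{value}\n")
    return previous
```

Calling `set_nr_requests("sda")` as root would raise the queue depth to 1024 and return the old value, so it can be restored later. The max-buffers and max-epoch-size settings are separate and go in drbd.conf.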