[DRBD-user] Massive starvation in diskless state doing direct IO reads

Lars Ellenberg lars.ellenberg at linbit.com
Thu Sep 2 17:15:40 CEST 2010


On Thu, Sep 02, 2010 at 04:40:20PM +0200, Roland Friedwagner wrote:
> Hello,
> 
> I encountered this doing performance and stability testing with
> iozone on a DRBD 2 node setup and stripped it down to this test case.
> It reproduces on two different Linux derivatives:
> 
>  - RHEL 5.5
>    Linux wu-wien.ac.at 2.6.18-194.11.3.el5 #1 SMP Mon Aug 23 15:51:38 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>  - GRML64 2010.04 (debian derived; 64bit)
>    Linux wu-wien.ac.at 2.6.33-grml64 #1 SMP PREEMPT Fri Apr 2 10:19:25 UTC 2010 x86_64 GNU/Linux
>  - GRML 2010.04 (32bit)
>    Linux wu-wien.ac.at 2.6.33-grml #1 SMP PREEMPT Fri Apr 2 10:16:25 UTC 2010 i686 GNU/Linux
> 
> 
> DRBD Version: 8.3.8.1
> 
> Reproduction Steps:
> -------------------
> 1. Load drbd module                                           # on both nodes
> 2. drbdadm up r0                                              # on both nodes
> 3. drbdadm primary r0                                         # on node 1
> 4. drbdadm detach r0                                          # on node 1
> => Diskstate of primary is now Diskless
> 5. dd if=/dev/drbd0 of=/dev/null iflag=direct bs=9M count=50  # on node 1
> 
> Result:   Transfer speed < 5 MByte/sec !!!
> Expected: Transfer speed > 80 MByte/sec (near link bandwidth between nodes)
> 
> The actual transfer speed depends on the ping-int parameter, because
> communication stops completely until a ping from the primary kicks it off again
> (see tcpdump log and netstat output).
> The threshold for direct I/O starvation equals the max-buffers parameter (default 8M).
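
To make that border concrete, here is a rough shell sketch of why bs=9M ends up above it. It assumes max-buffers is counted in 4 KiB pages with a default of 2048 (which matches the "default 8M" above); the function names are made up for illustration and are not part of DRBD.

```shell
#!/bin/sh
# Assumption: max-buffers is counted in pages of 4096 bytes, default 2048.
# Prints the max-buffers limit in bytes.
max_buffers_bytes() {
    pages="${1:-2048}"
    page_size="${2:-4096}"
    echo $(( pages * page_size ))
}

# Succeeds (exit status 0) if a request of $1 bytes is larger than
# max-buffers; $2 optionally overrides the page count.
exceeds_max_buffers() {
    req_bytes="$1"
    limit=$(max_buffers_bytes "${2:-2048}")
    [ "$req_bytes" -gt "$limit" ]
}

# bs=9M, as in the dd call from step 5 of the reproduction
if exceeds_max_buffers $(( 9 * 1024 * 1024 )); then
    echo "9M request is larger than max-buffers"
fi
```

With a read size at or below the limit the check fails, which would match the report that smaller direct reads are not affected.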

Uh.
Nice one ;-)

> # cat /proc/drbd
> version: 8.3.8.1 (api:88/proto:86-94)
> GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by root@grml, 2010-09-02 14:00:17
>  0: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate C r----
>     ns:0 nr:240896 dw:0 dr:732 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Starvation probably happens on the Secondary.
It should vanish if you
 echo 1 > /sys/module/drbd/parameters/disable_sendpage

As long as you have _occasional_ write requests while you do the
sequential read, you should still be fine, too.
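
If it helps, here is a minimal sketch of that toggle as a shell function, to be run on the node where the starvation is suspected. The function name is made up, and the optional path argument exists only so it can be tried against a scratch file first; on a real node it needs root and a loaded drbd module.

```shell
#!/bin/sh
# Sketch: write 1 to the disable_sendpage module parameter and print
# whatever the file reports back afterwards.
# $1 (optional) overrides the sysfs path, e.g. for a dry run on a temp file.
set_disable_sendpage() {
    param="${1:-/sys/module/drbd/parameters/disable_sendpage}"
    if [ ! -w "$param" ]; then
        echo "cannot write $param (drbd module loaded? running as root?)" >&2
        return 1
    fi
    echo 1 > "$param"
    cat "$param"
}
```

Call set_disable_sendpage without arguments, then repeat the dd read from node 1 and compare the transfer speed.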

Can you confirm this so far?

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


