[DRBD-user] Massive starvation in diskless state doing direct IO reads

Lars Ellenberg lars.ellenberg at linbit.com
Fri Sep 3 15:35:44 CEST 2010

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On Fri, Sep 03, 2010 at 03:17:23PM +0200, Roland Friedwagner wrote:
> On Thursday, 02 September 2010, Lars Ellenberg wrote:
> >
> > Starvation probably happens on the Secondary.
> > It should vanish if you
> >  echo 1 > /sys/module/drbd/parameters/disable_sendpage
> >
> 
> Starvation is gone with the sendpage feature disabled!
> 
> [secondary]# cat /sys/module/drbd/parameters/disable_sendpage
> Y
> [diskless primary]# dd if=/dev/drbd0 of=/dev/null iflag=direct bs=9M count=50
> 50+0 records in
> 50+0 records out
> 471859200 bytes (472 MB) copied, 4.91869 seconds, 95.9 MB/s
> 
> I did not see any performance degradation with sendpage disabled
> on the 1Gbit link between Primary and Secondary (works like a charm ;-).
> 
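To make that setting persist across module reloads, passing it as a
module parameter should work as well (untested sketch; the file name
is only an example):

  # e.g. /etc/modprobe.d/drbd.conf (location varies by distribution)
  options drbd disable_sendpage=1
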
> Does using sendpage make a difference when it comes to >= 10Gbit links?
> 
> 
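Short answer on sendpage: it saves one copy per page on the send path,
which tends to matter more the faster the link gets. The flip side is
that TCP then holds a reference on the page until the peer ACKs the
data, which is exactly what keeps pages on net_ee and leads to the
starvation above. Roughly, the two send paths look like this (a
simplified sketch, not the actual drbd code; the function names are
made up):

#include <linux/highmem.h>	/* kmap()/kunmap() */
#include <linux/net.h>		/* kernel_sendmsg()/kernel_sendpage() */
#include <linux/socket.h>	/* struct msghdr, MSG_NOSIGNAL */
#include <linux/uio.h>		/* struct kvec */

/* copy path: the page content is copied into socket buffers right
 * away, so the sender may reuse the page as soon as we return */
static int send_page_by_copy(struct socket *sock, struct page *page,
			     int offset, size_t size)
{
	struct msghdr msg = { .msg_flags = MSG_NOSIGNAL };
	struct kvec iov = {
		.iov_base = kmap(page) + offset,
		.iov_len  = size,
	};
	int rv = kernel_sendmsg(sock, &msg, &iov, 1, size);
	kunmap(page);
	return rv;
}

/* zero-copy path: TCP just takes a reference on the page, which
 * stays in use until the data is ACKed by the peer */
static int send_page_zero_copy(struct socket *sock, struct page *page,
			       int offset, size_t size)
{
	return kernel_sendpage(sock, page, offset, size, MSG_NOSIGNAL);
}
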
> > As long as you have _occasional_ write requests while you do the
> > sequential read, you should still be fine, too.
> >
> 
> Doing writes during big direct IO reads does _not_ fix it:
> 
> I run this in background to produce about 10 writes/sec:
> 
>   while :; do date --iso-8601=sec; dd if=/dev/zero of=/dev/drbd0 bs=1k count=1 conv=fsync 2> /dev/null; sleep .1; done
> 
> But the direct IO reads still get stuck (and the writes get stuck, too):

Now that you mention it, that makes sense, considering where the hang
occurs (the receiver is stuck trying to allocate buffer pages for the
read, so it cannot pick up the incoming writes either...).

Please try the patch below, and see how much it improves the situation.
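
For context, the allocation loop around the changed line looks roughly
like this with the patch applied (a simplified sketch from memory, not
the verbatim code; try_alloc_from_pool() is a made-up placeholder for
the actual pool logic in drbd_pp_alloc()):

	DEFINE_WAIT(wait);
	struct page *page = NULL;

	for (;;) {
		prepare_to_wait(&drbd_pp_wait, &wait, TASK_INTERRUPTIBLE);

		page = try_alloc_from_pool(mdev, number);
		if (page)
			break;

		if (signal_pending(current))
			break;	/* interrupted; caller sees page == NULL */

		/* plain schedule() sleeps until the next wake_up(&drbd_pp_wait),
		 * which normally comes from process_done_ee().  If all pages sit
		 * on net_ee (still referenced by tcp), that can be a full
		 * ping-int away; schedule_timeout() bounds the wait to ~100ms. */
		schedule_timeout(HZ / 10);
	}
	finish_wait(&drbd_pp_wait, &wait);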

diff --git a/drbd/drbd_receiver.c b/drbd/drbd_receiver.c
index 33e0541..a60ebfa 100644
--- a/drbd/drbd_receiver.c
+++ b/drbd/drbd_receiver.c
@@ -299,7 +299,11 @@ STATIC struct page *drbd_pp_alloc(struct drbd_conf *mdev, unsigned number, bool
 			break;
 		}
 
-		schedule();
+		/* If everything is on the net_ee (still referenced by tcp),
+		 * there won't be a wake_up() until the next process_done_ee
+		 * which may be ping-int (typically 10 seconds) away.
+		 * Retry ourselves a bit faster. */
+		schedule_timeout(HZ/10);
 	}
 	finish_wait(&drbd_pp_wait, &wait);
 

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


