[DRBD-user] drbd 7 + xfs + 2.6.7

Lars Marowsky-Bree lmb at suse.de
Fri Jul 23 22:02:23 CEST 2004


On 2004-07-23T19:31:41,
   Florin Cazacu <florinc at reecemarketing.com> said:

>    With the patch you posted, these are the errors I get:
> 
> Jul 23 18:12:30 dell1 kernel: drbd0: _drbd_send_page: (page_count(page) < 1) in /usr/local/src/drbd-0.7.0/drbd/drbd_main.c:895
> Jul 23 18:12:30 dell1 kernel: drbd0: someone wants to send a free page!

Right, I just spoke to the XFS guys at OLS.

This is in fact to be expected; XFS is the special kid on the block once
more. The XFS metadata lives in slab pages, for which page_count() is
always 0, so the check for PageSlab() should in fact come before the
page_count() check. However, even PG_slab is not always set on slab pages.

XFS basically assumes it has full control over its pages.

I'd suggest starting a discussion on LKML about whether this is sane or
not. I'd claim it's broken; what's being passed through the block layer
should be sane. As lge understands all this better than I do, I'd suggest
he volunteers ;-)

However, for the time being, I'd suggest changing lge's patch as
attached: basically, fall back to the slow path, mostly silently, even
for page_count() == 0.

I've also increased the logging interval (from five minutes to one
hour); for XFS, this is going to be a really regular occurrence.

Could you please test this? Thanks a lot!


Sincerely,
    Lars Marowsky-Brée <lmb at suse.de>

-- 
High Availability & Clustering	    \ ever tried. ever failed. no matter.
SUSE Labs, Research and Development | try again. fail again. fail better.
SUSE LINUX AG - A Novell company    \ 	-- Samuel Beckett

-------------- next part --------------
Index: drbd_main.c
===================================================================
--- drbd_main.c	(revision 1452)
+++ drbd_main.c	(working copy)
@@ -903,10 +903,10 @@
 
 	/* report statistics, every 4096 calls,
 	 * if we had at least one fallback,
-	 * but at most once every five minutes */
+	 * but at most once every hour */
 	if ( (++total & 0xfffUL) == 0 ) {
 		unsigned long now = jiffies;
-		if (fallback && time_before(last_rep+300*HZ, now)) {
+		if (fallback && time_before(last_rep+3600*HZ, now)) {
 			last_rep = now;
 			INFO("sendpage fallback/total: %lu/%lu\n",
 			                          fallback, total);
@@ -917,21 +917,11 @@
 	mdev->send_task=current;
 	spin_unlock(&mdev->send_task_lock);
 
-	/* PARANOIA. if this ever triggers,
-	 * something in the layers above us is really kaputt */
-	ERR_IF (page_count(page) < 1) {
-		ERR("someone wants to send a free page!\n");
-		dump_stack();
-		++fallback;
-		sent =  _drbd_no_send_page(mdev, page, offset, size);
-		if (likely(sent > 0)) len -= sent;
-		goto out;
-	}
-
-	if (PageSlab(page)) {
-		/* probably xfs. fall back to sendmsg instead of sendpage.
-		 * FIXME
-		 * we should rather understand and fix the real problem...
+	if ((page_count(page) < 1) 
+		|| PageSlab(page)) {
+		/* XFS meta- & log-data is in slab pages, which have a
+		 * page_count of 0 and/or have PageSlab() set...
+		 * FIXME: This is a workaround.
 		 */
 		++fallback;
 		sent = _drbd_no_send_page(mdev, page, offset, size);

