From: William Lee Irwin III <wli@holomorphy.com>
To: Alan Cox <alan@redhat.com>,
	"Salyzyn, Mark" <mark_salyzyn@adaptec.com>,
	y@redhat.com, Clay Haapala <chaapala@cisco.com>,
	James Bottomley <James.Bottomley@steeleye.com>,
	Christoph Hellwig <hch@infradead.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	SCSI Mailing List <linux-scsi@vger.kernel.org>
Cc: akpm@osdl.org
Subject: Re: PATCH: Further aacraid work
Date: Fri, 18 Jun 2004 08:05:18 -0700
Message-ID: <20040618150518.GB1863@holomorphy.com>
In-Reply-To: <20040617204828.GC1495@holomorphy.com>

On Thu, Jun 17, 2004 at 04:38:42PM -0400, Alan Cox wrote:
>> What do the stats look like with the patch Andrew Morton (I think) posted
>> to reverse the page order from the allocator ?

On Thu, Jun 17, 2004 at 01:48:28PM -0700, William Lee Irwin III wrote:
> Say, could you guys try this? jejb seemed to get decent results with it.

Proper changelog this time, and comments, too. Adaptec et al, please
verify this resolves the issues you've been having.
Someone say _something_.

---

Based on Arjan van de Ven's idea, with guidance and testing from
James Bottomley.

The physical ordering of pages delivered to the IO subsystem is
strongly related to the order in which fragments are subdivided from
larger blocks of memory tracked by the page allocator. Consider a
single MAX_ORDER block of memory in isolation acted on by a sequence of
order 0 allocations in an otherwise empty buddy system. Subdividing
the block beginning at the highest addresses will yield all the pages
of the block in reverse, and subdividing the block beginning at the
lowest addresses will yield all the pages of the block in physical
address order. Empirical tests demonstrate this ordering is preserved,
and that changing the order of subdivision so that the lowest page is
split off first resolves the sglist merging difficulties encountered by
driver authors at Adaptec and others in James Bottomley's testing.
James found that before this patch, there were 40 merges out of about
32K segments.  Afterward, there were 24007 merges out of 19513 segments,
for a merge rate of about 55%. Merges of 128 segments, the maximum
allowed, were observed afterward, where beforehand they never occurred.
It also improves dbench on my workstation and works fine there.

Signed-off-by: William Lee Irwin III <wli@holomorphy.com>


diff -prauN linux-2.6.7/mm/page_alloc.c linux-2.6.7/mm/page_alloc.c 
--- linux-2.6.7/mm/page_alloc.c	Sat Jun 12 20:52:26 2004
+++ linux-2.6.7/mm/page_alloc.c	Fri Jun 18 07:45:05 2004
@@ -290,6 +290,20 @@
 #define MARK_USED(index, order, area) \
 	__change_bit((index) >> (1+(order)), (area)->map)
 
+/*
+ * The order of subdivision here is critical for the IO subsystem.
+ * Please do not alter this order without good reasons and regression
+ * testing. Specifically, as large blocks of memory are subdivided,
+ * the order in which smaller blocks are delivered depends on the order
+ * they're subdivided in this function. This is the primary factor
+ * influencing the order in which pages are delivered to the IO
+ * subsystem according to empirical testing, and this is also justified
+ * by considering the behavior of a buddy system containing a single
+ * large block of memory acted on by a series of small allocations.
+ * This behavior is a critical factor in sglist merging's success.
+ *
+ * -- wli
+ */
 static inline struct page *
 expand(struct zone *zone, struct page *page,
 	 unsigned long index, int low, int high, struct free_area *area)
@@ -297,14 +311,12 @@
 	unsigned long size = 1 << high;
 
 	while (high > low) {
-		BUG_ON(bad_range(zone, page));
 		area--;
 		high--;
 		size >>= 1;
-		list_add(&page->lru, &area->free_list);
-		MARK_USED(index, high, area);
-		index += size;
-		page += size;
+		BUG_ON(bad_range(zone, &page[size]));
+		list_add(&page[size].lru, &area->free_list);
+		MARK_USED(index + size, high, area);
 	}
 	return page;
 }

