All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Lord <liml@rtr.ca>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	jens.axboe@oracle.com, lkml@rtr.ca, matthew@wil.cx,
	linux-ide@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-mm@kvack.org, mel@csn.ul.ie
Subject: [PATCH] fix page_alloc for larger I/O segments
Date: Thu, 13 Dec 2007 19:40:09 -0500	[thread overview]
Message-ID: <4761D0E9.4010701@rtr.ca> (raw)
In-Reply-To: <4761CE88.9070406@rtr.ca>

Mark Lord wrote:
> Mark Lord wrote:
>> Mark Lord wrote:
>>> Mark Lord wrote:
>>>> Andrew Morton wrote:
>>>>> On Thu, 13 Dec 2007 17:15:06 -0500
>>>>> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>>>>>
>>>>>> On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
>>>>>>> On Thu, 13 Dec 2007 21:09:59 +0100
>>>>>>> Jens Axboe <jens.axboe@oracle.com> wrote:
>>>>>>>
>>>>>>>> OK, it's a vm issue,
>>>>>>> cc linux-mm and probable culprit.
>>>>>>>
>>>>>>>>  I have tens of thousand "backward" pages after a
>>>>>>>> boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
>>>>>>>> reverse. So it looks like that bug got reintroduced.
>>>>>>> Bill Irwin fixed this a couple of years back: changed the page 
>>>>>>> allocator so
>>>>>>> that it mostly hands out pages in ascending physical-address order.
>>>>>>>
>>>>>>> I guess we broke that, quite possibly in Mel's page allocator 
>>>>>>> rework.
>>>>>>>
>>>>>>> It would help if you could provide us with a simple recipe for
>>>>>>> demonstrating this problem, please.
>>>>>> The simple way seems to be to malloc a large area, touch every 
>>>>>> page and
>>>>>> then look at the physical pages assigned ... they now mostly seem 
>>>>>> to be
>>>>>> descending in physical address.
>>>>>>
>>>>>
>>>>> OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...
>>>> ..
>>>>
>>>> I'm actually running the treadmill right now (have been for many 
>>>> hours, actually,
>>>> to bisect it to a specific commit.
>>>>
>>>> Thought I was almost done, and then noticed that git-bisect doesn't 
>>>> keep
>>>> the Makefile VERSION lines the same, so I was actually running the 
>>>> wrong
>>>> kernel after the first few times.. duh.
>>>>
>>>> Wrote a script to fix it now.
>>> ..
>>>
>>> Well, that was a waste of three hours.
>> ..
>>
>> Ahh.. it seems to be sensitive to one/both of these:
>>
>> CONFIG_HIGHMEM64G=y with 4GB RAM:  not so bad, frequently does 20KB - 
>> 48KB segments.
>> CONFIG_HIGHMEM4G=y  with 2GB RAM:  very severe, rarely does more than 
>> 8KB segments.
>> CONFIG_HIGHMEM4G=y  with 3GB RAM:  very severe, rarely does more than 
>> 8KB segments.
>>
>> So if you want to reproduce this on a large memory machine, use 
>> "mem=2GB" for starters.
> ..
> 
> Here's the commit that causes the regression:
> 
> 535131e6925b4a95f321148ad7293f496e0e58d7 Choose pages from the per-cpu 
> list based on migration type
> 

And here is a patch that seems to fix it for me here:

* * * *

Fix page allocator to give better change of larger contiguous segments (again).

Signed-off-by: Mark Lord <mlord@pobox.com
---


--- old/mm/page_alloc.c.orig	2007-12-13 19:25:15.000000000 -0500
+++ linux-2.6/mm/page_alloc.c	2007-12-13 19:35:50.000000000 -0500
@@ -954,7 +954,7 @@
 				goto failed;
 		}
 		/* Find a page of the appropriate migrate type */
-		list_for_each_entry(page, &pcp->list, lru) {
+		list_for_each_entry_reverse(page, &pcp->list, lru) {
 			if (page_private(page) == migratetype) {
 				list_del(&page->lru);
 				pcp->count--;

WARNING: multiple messages have this Message-ID (diff)
From: Mark Lord <liml@rtr.ca>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	jens.axboe@oracle.com, lkml@rtr.ca, matthew@wil.cx,
	linux-ide@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-mm@kvack.org, mel@csn.ul.ie
Subject: [PATCH] fix page_alloc for larger I/O segments
Date: Thu, 13 Dec 2007 19:40:09 -0500	[thread overview]
Message-ID: <4761D0E9.4010701@rtr.ca> (raw)
In-Reply-To: <4761CE88.9070406@rtr.ca>

Mark Lord wrote:
> Mark Lord wrote:
>> Mark Lord wrote:
>>> Mark Lord wrote:
>>>> Andrew Morton wrote:
>>>>> On Thu, 13 Dec 2007 17:15:06 -0500
>>>>> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>>>>>
>>>>>> On Thu, 2007-12-13 at 14:02 -0800, Andrew Morton wrote:
>>>>>>> On Thu, 13 Dec 2007 21:09:59 +0100
>>>>>>> Jens Axboe <jens.axboe@oracle.com> wrote:
>>>>>>>
>>>>>>>> OK, it's a vm issue,
>>>>>>> cc linux-mm and probable culprit.
>>>>>>>
>>>>>>>>  I have tens of thousand "backward" pages after a
>>>>>>>> boot - IOW, bvec->bv_page is the page before bvprv->bv_page, not
>>>>>>>> reverse. So it looks like that bug got reintroduced.
>>>>>>> Bill Irwin fixed this a couple of years back: changed the page 
>>>>>>> allocator so
>>>>>>> that it mostly hands out pages in ascending physical-address order.
>>>>>>>
>>>>>>> I guess we broke that, quite possibly in Mel's page allocator 
>>>>>>> rework.
>>>>>>>
>>>>>>> It would help if you could provide us with a simple recipe for
>>>>>>> demonstrating this problem, please.
>>>>>> The simple way seems to be to malloc a large area, touch every 
>>>>>> page and
>>>>>> then look at the physical pages assigned ... they now mostly seem 
>>>>>> to be
>>>>>> descending in physical address.
>>>>>>
>>>>>
>>>>> OIC.  -mm's /proc/pid/pagemap can be used to get the pfn's...
>>>> ..
>>>>
>>>> I'm actually running the treadmill right now (have been for many 
>>>> hours, actually,
>>>> to bisect it to a specific commit.
>>>>
>>>> Thought I was almost done, and then noticed that git-bisect doesn't 
>>>> keep
>>>> the Makefile VERSION lines the same, so I was actually running the 
>>>> wrong
>>>> kernel after the first few times.. duh.
>>>>
>>>> Wrote a script to fix it now.
>>> ..
>>>
>>> Well, that was a waste of three hours.
>> ..
>>
>> Ahh.. it seems to be sensitive to one/both of these:
>>
>> CONFIG_HIGHMEM64G=y with 4GB RAM:  not so bad, frequently does 20KB - 
>> 48KB segments.
>> CONFIG_HIGHMEM4G=y  with 2GB RAM:  very severe, rarely does more than 
>> 8KB segments.
>> CONFIG_HIGHMEM4G=y  with 3GB RAM:  very severe, rarely does more than 
>> 8KB segments.
>>
>> So if you want to reproduce this on a large memory machine, use 
>> "mem=2GB" for starters.
> ..
> 
> Here's the commit that causes the regression:
> 
> 535131e6925b4a95f321148ad7293f496e0e58d7 Choose pages from the per-cpu 
> list based on migration type
> 

And here is a patch that seems to fix it for me here:

* * * *

Fix page allocator to give better change of larger contiguous segments (again).

Signed-off-by: Mark Lord <mlord@pobox.com
---


--- old/mm/page_alloc.c.orig	2007-12-13 19:25:15.000000000 -0500
+++ linux-2.6/mm/page_alloc.c	2007-12-13 19:35:50.000000000 -0500
@@ -954,7 +954,7 @@
 				goto failed;
 		}
 		/* Find a page of the appropriate migrate type */
-		list_for_each_entry(page, &pcp->list, lru) {
+		list_for_each_entry_reverse(page, &pcp->list, lru) {
 			if (page_private(page) == migratetype) {
 				list_del(&page->lru);
 				pcp->count--;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2007-12-14  0:40 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-13 18:36 QUEUE_FLAG_CLUSTER: not working in 2.6.24 ? Mark Lord
2007-12-13 18:37 ` Mark Lord
2007-12-13 18:42   ` Matthew Wilcox
2007-12-13 18:46     ` James Bottomley
2007-12-13 18:48   ` Mark Lord
2007-12-13 18:53     ` Matthew Wilcox
2007-12-13 19:03       ` Mark Lord
2007-12-13 19:26         ` Jens Axboe
2007-12-13 19:30           ` Mark Lord
2007-12-13 19:32             ` Mark Lord
2007-12-13 19:39               ` Jens Axboe
2007-12-13 19:42                 ` Mark Lord
2007-12-13 19:53                   ` Jens Axboe
2007-12-13 19:59                     ` Mark Lord
2007-12-13 20:05                       ` Jens Axboe
2007-12-13 20:02                     ` Jens Axboe
2007-12-13 20:06                       ` Mark Lord
2007-12-13 20:09                         ` Jens Axboe
2007-12-13 20:14                           ` Mark Lord
2007-12-13 20:18                             ` Mark Lord
2007-12-13 20:21                             ` Jens Axboe
2007-12-13 22:02                           ` Andrew Morton
2007-12-13 22:02                             ` Andrew Morton
2007-12-13 22:15                             ` James Bottomley
2007-12-13 22:15                               ` James Bottomley
2007-12-13 22:29                               ` Andrew Morton
2007-12-13 22:29                                 ` Andrew Morton
2007-12-13 22:33                                 ` Mark Lord
2007-12-13 22:33                                   ` Mark Lord
2007-12-13 23:13                                   ` Mark Lord
2007-12-13 23:13                                     ` Mark Lord
2007-12-14  0:05                                     ` Mark Lord
2007-12-14  0:05                                       ` Mark Lord
2007-12-14  0:30                                       ` Mark Lord
2007-12-14  0:30                                         ` Mark Lord
2007-12-14  0:37                                         ` Andrew Morton
2007-12-14  0:37                                           ` Andrew Morton
2007-12-14  0:42                                           ` Mark Lord
2007-12-14  0:42                                             ` Mark Lord
2007-12-14  0:46                                             ` [PATCH] fix page_alloc for larger I/O segments (improved) Mark Lord
2007-12-14  0:46                                               ` Mark Lord
2007-12-14  0:57                                               ` James Bottomley
2007-12-14  0:57                                                 ` James Bottomley
2007-12-14  1:11                                                 ` Andrew Morton
2007-12-14  1:11                                                   ` Andrew Morton
2007-12-14  2:23                                                   ` Mark Lord
2007-12-14  2:23                                                     ` Mark Lord
2007-12-14  2:23                                                     ` Mark Lord
2007-12-14 17:42                                               ` Mel Gorman
2007-12-14 17:42                                                 ` Mel Gorman
2007-12-14 18:07                                                 ` Mark Lord
2007-12-14 18:07                                                   ` Mark Lord
2007-12-16 21:56                                                   ` Mel Gorman
2007-12-16 21:56                                                     ` Mel Gorman
2007-12-14 18:13                                                 ` Matthew Wilcox
2007-12-14 18:13                                                   ` Matthew Wilcox
2007-12-14 18:30                                                   ` Mark Lord
2007-12-14 18:30                                                     ` Mark Lord
2007-12-20 22:37                                                   ` Matthew Wilcox
2007-12-20 22:37                                                     ` Matthew Wilcox
2007-12-14  0:47                                             ` QUEUE_FLAG_CLUSTER: not working in 2.6.24 ? Mark Lord
2007-12-14  0:47                                               ` Mark Lord
2007-12-14 11:50                                           ` Mel Gorman
2007-12-14 11:50                                             ` Mel Gorman
2007-12-14 13:57                                             ` Mark Lord
2007-12-14 13:57                                               ` Mark Lord
2007-12-14  0:40                                         ` Mark Lord [this message]
2007-12-14  0:40                                           ` [PATCH] fix page_alloc for larger I/O segments Mark Lord
2007-12-14  1:03                                           ` Andrew Morton
2007-12-14  1:03                                             ` Andrew Morton
2007-12-14  4:00                                             ` Matthew Wilcox
2007-12-14  4:00                                               ` Matthew Wilcox
2007-12-15  1:09                                 ` QUEUE_FLAG_CLUSTER: not working in 2.6.24 ? Mel Gorman
2007-12-15  1:09                                   ` Mel Gorman
2007-12-15  2:02                                   ` Andrew Morton
2007-12-15  2:02                                     ` Andrew Morton
2007-12-15  5:55                                     ` Matt Mackall
2007-12-15  5:55                                       ` Matt Mackall
2007-12-16 21:55                                     ` Mel Gorman
2007-12-16 21:55                                       ` Mel Gorman
2007-12-17 19:24                                       ` Randy Dunlap
2007-12-17 19:24                                         ` Randy Dunlap
2007-12-18  2:42                                         ` Matt Mackall
2007-12-18  2:42                                           ` Matt Mackall
2007-12-13 22:17                             ` Jens Axboe
2007-12-13 22:17                               ` Jens Axboe
2007-12-13 22:02                           ` VM allocates pages in reverse order again Matthew Wilcox
2007-12-13 22:02                             ` Matthew Wilcox
2007-12-13 19:37             ` QUEUE_FLAG_CLUSTER: not working in 2.6.24 ? Jens Axboe
2007-12-13 19:53           ` Mark Lord

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4761D0E9.4010701@rtr.ca \
    --to=liml@rtr.ca \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=akpm@linux-foundation.org \
    --cc=jens.axboe@oracle.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=lkml@rtr.ca \
    --cc=matthew@wil.cx \
    --cc=mel@csn.ul.ie \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.