From: Mel Gorman <mel@csn.ul.ie>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Dave Chinner <david@fromorbit.com>,
Chris Mason <chris.mason@oracle.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 3/4] mm: introduce free_pages_bulk
Date: Thu, 15 Apr 2010 14:46:48 +0100 [thread overview]
Message-ID: <20100415134648.GF10966@csn.ul.ie> (raw)
In-Reply-To: <20100415192412.D1AA.A69D9226@jp.fujitsu.com>
On Thu, Apr 15, 2010 at 07:24:53PM +0900, KOSAKI Motohiro wrote:
> Now, vmscan is using __pagevec_free() for batch freeing. but
> pagevec consume slightly lots stack (sizeof(long)*8), and x86_64
> stack is very strictly limited.
>
> Then, now we are planning to use page->lru list instead pagevec
> for reducing stack. and introduce new helper function.
>
> This is similar to __pagevec_free(), but receive list instead
> pagevec. and this don't use pcp cache. it is good characteristics
> for vmscan.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
> include/linux/gfp.h | 1 +
> mm/page_alloc.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 45 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 4c6d413..dbcac56 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -332,6 +332,7 @@ extern void free_hot_cold_page(struct page *page, int cold);
> #define __free_page(page) __free_pages((page), 0)
> #define free_page(addr) free_pages((addr),0)
>
> +void free_pages_bulk(struct zone *zone, struct list_head *list);
> void page_alloc_init(void);
> void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> void drain_all_pages(void);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ba9aea7..1f68832 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2049,6 +2049,50 @@ void free_pages(unsigned long addr, unsigned int order)
>
> EXPORT_SYMBOL(free_pages);
>
> +/*
> + * Frees a number of pages from the list
> + * Assumes all pages on list are in same zone and order==0.
> + *
> + * This is similar to __pagevec_free(), but receive list instead pagevec.
> + * and this don't use pcp cache. it is good characteristics for vmscan.
> + */
> +void free_pages_bulk(struct zone *zone, struct list_head *list)
> +{
> + unsigned long flags;
> + struct page *page;
> + struct page *page2;
> + int nr_pages = 0;
> +
> + list_for_each_entry_safe(page, page2, list, lru) {
> + int wasMlocked = __TestClearPageMlocked(page);
> +
> + if (free_pages_prepare(page, 0)) {
> + /* Make orphan the corrupted page. */
> + list_del(&page->lru);
> + continue;
> + }
> + if (unlikely(wasMlocked)) {
> + local_irq_save(flags);
> + free_page_mlock(page);
> + local_irq_restore(flags);
> + }
You could clear this under the zone->lock below before calling
__free_one_page. It'd avoid a large number of IRQ enables and disables which
are a problem on some CPUs (P4 and Itanium both blow in this regard according
to PeterZ).
> + nr_pages++;
> + }
> +
> + spin_lock_irqsave(&zone->lock, flags);
> + __count_vm_events(PGFREE, nr_pages);
> + zone->all_unreclaimable = 0;
> + zone->pages_scanned = 0;
> + __mod_zone_page_state(zone, NR_FREE_PAGES, nr_pages);
> +
> + list_for_each_entry_safe(page, page2, list, lru) {
> + /* have to delete it as __free_one_page list manipulates */
> + list_del(&page->lru);
> + __free_one_page(page, zone, 0, page_private(page));
> + }
This has the effect of bypassing the per-cpu lists as well as making the
zone lock hotter. The cache hotness of the data within the page is
probably not a factor but the cache hotness of the stuct page is.
The zone lock getting hotter is a greater problem. Large amounts of page
reclaim or dumping of page cache will now contend on the zone lock where
as previously it would have dumped into the per-cpu lists (potentially
but not necessarily avoiding the zone lock).
While there might be a stack saving in the next patch, there would appear
to be definite performance implications in taking this patch.
Functionally, I see no problem but I'd put this sort of patch on the
very long finger until the performance aspects of it could be examined.
> + spin_unlock_irqrestore(&zone->lock, flags);
> +}
> +
> /**
> * alloc_pages_exact - allocate an exact number physically-contiguous pages.
> * @size: the number of bytes to allocate
> --
> 1.6.5.2
>
>
>
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2010-04-15 13:46 UTC|newest]
Thread overview: 116+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-13 0:17 [PATCH] mm: disallow direct reclaim page writeback Dave Chinner
2010-04-13 8:31 ` KOSAKI Motohiro
2010-04-13 10:29 ` Dave Chinner
2010-04-13 11:39 ` KOSAKI Motohiro
2010-04-13 14:36 ` Dave Chinner
2010-04-14 3:12 ` Dave Chinner
2010-04-14 6:52 ` KOSAKI Motohiro
2010-04-15 1:56 ` Dave Chinner
2010-04-14 6:52 ` KOSAKI Motohiro
2010-04-14 7:36 ` Dave Chinner
2010-04-13 9:58 ` Mel Gorman
2010-04-13 11:19 ` Dave Chinner
2010-04-13 19:34 ` Mel Gorman
2010-04-13 20:20 ` Chris Mason
2010-04-14 1:40 ` Dave Chinner
2010-04-14 4:59 ` KAMEZAWA Hiroyuki
2010-04-14 5:41 ` Dave Chinner
2010-04-14 5:54 ` KOSAKI Motohiro
2010-04-14 6:13 ` Minchan Kim
2010-04-14 7:19 ` Minchan Kim
2010-04-14 9:42 ` KAMEZAWA Hiroyuki
2010-04-14 10:01 ` Minchan Kim
2010-04-14 10:07 ` Mel Gorman
2010-04-14 10:16 ` Minchan Kim
2010-04-14 7:06 ` Dave Chinner
2010-04-14 6:52 ` KOSAKI Motohiro
2010-04-14 7:28 ` Dave Chinner
2010-04-14 8:51 ` Mel Gorman
2010-04-15 1:34 ` Dave Chinner
2010-04-15 4:09 ` KOSAKI Motohiro
2010-04-15 4:11 ` [PATCH 1/4] vmscan: delegate pageout io to flusher thread if current is kswapd KOSAKI Motohiro
2010-04-15 8:05 ` Suleiman Souhlal
2010-04-15 8:17 ` KOSAKI Motohiro
2010-04-15 8:26 ` KOSAKI Motohiro
2010-04-15 10:30 ` Johannes Weiner
2010-04-15 17:24 ` Suleiman Souhlal
2010-04-20 2:56 ` Ying Han
2010-04-15 9:32 ` Dave Chinner
2010-04-15 9:41 ` KOSAKI Motohiro
2010-04-15 17:27 ` Suleiman Souhlal
2010-04-15 23:33 ` Dave Chinner
2010-04-15 23:41 ` Suleiman Souhlal
2010-04-16 9:50 ` Alan Cox
2010-04-17 3:06 ` Dave Chinner
2010-04-15 8:18 ` KOSAKI Motohiro
2010-04-15 10:31 ` Mel Gorman
2010-04-15 11:26 ` KOSAKI Motohiro
2010-04-15 4:13 ` [PATCH 2/4] vmscan: kill prev_priority completely KOSAKI Motohiro
2010-04-15 4:14 ` [PATCH 3/4] vmscan: move priority variable into scan_control KOSAKI Motohiro
2010-04-15 4:15 ` [PATCH 4/4] vmscan: delegate page cleaning io to flusher thread if VM pressure is low KOSAKI Motohiro
2010-04-15 4:35 ` [PATCH] mm: disallow direct reclaim page writeback KOSAKI Motohiro
2010-04-15 6:32 ` Dave Chinner
2010-04-15 6:44 ` KOSAKI Motohiro
2010-04-15 6:58 ` Dave Chinner
2010-04-15 6:20 ` Dave Chinner
2010-04-15 6:35 ` KOSAKI Motohiro
2010-04-15 8:54 ` Dave Chinner
2010-04-15 10:21 ` KOSAKI Motohiro
2010-04-15 10:23 ` [PATCH 1/4] vmscan: simplify shrink_inactive_list() KOSAKI Motohiro
2010-04-15 13:15 ` Mel Gorman
2010-04-15 15:01 ` Andi Kleen
2010-04-15 15:44 ` Mel Gorman
2010-04-15 16:54 ` Andi Kleen
2010-04-15 23:40 ` Dave Chinner
2010-04-16 7:13 ` Andi Kleen
2010-04-16 14:57 ` Mel Gorman
2010-04-17 2:37 ` Dave Chinner
2010-04-16 14:55 ` Mel Gorman
2010-04-15 18:22 ` Valdis.Kletnieks
2010-04-16 9:39 ` Mel Gorman
2010-04-15 10:24 ` [PATCH 2/4] [cleanup] mm: introduce free_pages_prepare KOSAKI Motohiro
2010-04-15 13:33 ` Mel Gorman
2010-04-15 10:24 ` [PATCH 3/4] mm: introduce free_pages_bulk KOSAKI Motohiro
2010-04-15 13:46 ` Mel Gorman [this message]
2010-04-15 10:26 ` [PATCH 4/4] vmscan: replace the pagevec in shrink_inactive_list() with list KOSAKI Motohiro
2010-04-15 10:28 ` [PATCH] mm: disallow direct reclaim page writeback Mel Gorman
2010-04-15 13:42 ` Chris Mason
2010-04-15 17:50 ` tytso
2010-04-16 15:05 ` Mel Gorman
[not found] ` <20100416150510.GL19264@csn.ul.ie>
2010-04-19 15:15 ` Mel Gorman
[not found] ` <20100419151511.GV19264@csn.ul.ie>
2010-04-19 17:38 ` Chris Mason
2010-04-16 4:14 ` Dave Chinner
2010-04-16 15:14 ` Mel Gorman
2010-04-18 0:32 ` Andrew Morton
2010-04-18 19:05 ` Christoph Hellwig
2010-04-18 16:31 ` Andrew Morton
2010-04-18 19:35 ` Christoph Hellwig
2010-04-18 19:11 ` Sorin Faibish
2010-04-18 19:10 ` Sorin Faibish
2010-04-18 21:30 ` James Bottomley
2010-04-18 23:34 ` Sorin Faibish
2010-04-19 3:08 ` tytso
2010-04-19 0:35 ` Dave Chinner
2010-04-19 0:49 ` Arjan van de Ven
2010-04-19 1:08 ` Dave Chinner
2010-04-19 4:32 ` Arjan van de Ven
2010-04-19 15:20 ` Mel Gorman
2010-04-23 1:06 ` Dave Chinner
2010-04-23 10:50 ` Mel Gorman
2010-04-15 14:57 ` Andi Kleen
2010-04-15 2:37 ` Johannes Weiner
2010-04-15 2:43 ` KOSAKI Motohiro
2010-04-16 23:56 ` Johannes Weiner
2010-04-14 6:52 ` KOSAKI Motohiro
2010-04-14 10:06 ` Andi Kleen
2010-04-14 11:20 ` Chris Mason
2010-04-14 12:15 ` Andi Kleen
2010-04-14 12:32 ` Alan Cox
2010-04-14 12:34 ` Andi Kleen
2010-04-14 13:23 ` Mel Gorman
2010-04-14 14:07 ` Chris Mason
2010-04-14 0:24 ` Minchan Kim
2010-04-14 4:44 ` Dave Chinner
2010-04-14 7:54 ` Minchan Kim
2010-04-16 1:13 ` KAMEZAWA Hiroyuki
2010-04-16 4:18 ` KAMEZAWA Hiroyuki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100415134648.GF10966@csn.ul.ie \
--to=mel@csn.ul.ie \
--cc=chris.mason@oracle.com \
--cc=david@fromorbit.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).