linux-mm.kvack.org archive mirror
From: Minchan Kim <minchan.kim@gmail.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
	David Rientjes <rientjes@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 10/11] Direct compact when a high-order allocation fails
Date: Wed, 24 Mar 2010 20:59:45 +0900
Message-ID: <28c262361003240459m7d981203nea98df5196812b6c@mail.gmail.com>
In-Reply-To: <20100324111159.GD21147@csn.ul.ie>

On Wed, Mar 24, 2010 at 8:11 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> On Wed, Mar 24, 2010 at 08:10:40AM +0900, Minchan Kim wrote:
>> Hi, Mel.
>>
>> On Tue, Mar 23, 2010 at 9:25 PM, Mel Gorman <mel@csn.ul.ie> wrote:
>> > Ordinarily when a high-order allocation fails, direct reclaim is entered to
>> > free pages to satisfy the allocation.  With this patch, it is determined if
>> > an allocation failed due to external fragmentation instead of low memory
>> > and if so, the calling process will compact until a suitable page is
>> > freed. Compaction by moving pages in memory is considerably cheaper than
>> > paging out to disk and works where there are locked pages or no swap. If
>> > compaction fails to free a page of a suitable size, then reclaim will
>> > still occur.
>> >
>> > Direct compaction returns as soon as possible. As each block is compacted,
>> > it is checked if a suitable page has been freed and if so, it returns.
>> >
>> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
>> > Acked-by: Rik van Riel <riel@redhat.com>
>> > ---
>> >  include/linux/compaction.h |   16 +++++-
>> >  include/linux/vmstat.h     |    1 +
>> >  mm/compaction.c            |  118 ++++++++++++++++++++++++++++++++++++++++++++
>> >  mm/page_alloc.c            |   26 ++++++++++
>> >  mm/vmstat.c                |   15 +++++-
>> >  5 files changed, 172 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/include/linux/compaction.h b/include/linux/compaction.h
>> > index c94890b..b851428 100644
>> > --- a/include/linux/compaction.h
>> > +++ b/include/linux/compaction.h
>> > @@ -1,14 +1,26 @@
>> >  #ifndef _LINUX_COMPACTION_H
>> >  #define _LINUX_COMPACTION_H
>> >
>> > -/* Return values for compact_zone() */
>> > +/* Return values for compact_zone() and try_to_compact_pages() */
>> >  #define COMPACT_INCOMPLETE     0
>> > -#define COMPACT_COMPLETE       1
>> > +#define COMPACT_PARTIAL                1
>> > +#define COMPACT_COMPLETE       2
>> >
>> >  #ifdef CONFIG_COMPACTION
>> >  extern int sysctl_compact_memory;
>> >  extern int sysctl_compaction_handler(struct ctl_table *table, int write,
>> >                        void __user *buffer, size_t *length, loff_t *ppos);
>> > +
>> > +extern int fragmentation_index(struct zone *zone, unsigned int order);
>> > +extern unsigned long try_to_compact_pages(struct zonelist *zonelist,
>> > +                       int order, gfp_t gfp_mask, nodemask_t *mask);
>> > +#else
>> > +static inline unsigned long try_to_compact_pages(struct zonelist *zonelist,
>> > +                       int order, gfp_t gfp_mask, nodemask_t *nodemask)
>> > +{
>> > +       return COMPACT_INCOMPLETE;
>> > +}
>> > +
>> >  #endif /* CONFIG_COMPACTION */
>> >
>> >  #if defined(CONFIG_COMPACTION) && defined(CONFIG_SYSFS) && defined(CONFIG_NUMA)
>> > diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
>> > index 56e4b44..b4b4d34 100644
>> > --- a/include/linux/vmstat.h
>> > +++ b/include/linux/vmstat.h
>> > @@ -44,6 +44,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
>> >                KSWAPD_SKIP_CONGESTION_WAIT,
>> >                PAGEOUTRUN, ALLOCSTALL, PGROTATED,
>> >                COMPACTBLOCKS, COMPACTPAGES, COMPACTPAGEFAILED,
>> > +               COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS,
>> >  #ifdef CONFIG_HUGETLB_PAGE
>> >                HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
>> >  #endif
>> > diff --git a/mm/compaction.c b/mm/compaction.c
>> > index 8df6e3d..6688700 100644
>> > --- a/mm/compaction.c
>> > +++ b/mm/compaction.c
>> > @@ -34,6 +34,8 @@ struct compact_control {
>> >        unsigned long nr_anon;
>> >        unsigned long nr_file;
>> >
>> > +       unsigned int order;             /* order a direct compactor needs */
>> > +       int migratetype;                /* MOVABLE, RECLAIMABLE etc */
>> >        struct zone *zone;
>> >  };
>> >
>> > @@ -301,10 +303,31 @@ static void update_nr_listpages(struct compact_control *cc)
>> >  static inline int compact_finished(struct zone *zone,
>> >                                                struct compact_control *cc)
>> >  {
>> > +       unsigned int order;
>> > +       unsigned long watermark = low_wmark_pages(zone) + (1 << cc->order);
>> > +
>> >        /* Compaction run completes if the migrate and free scanner meet */
>> >        if (cc->free_pfn <= cc->migrate_pfn)
>> >                return COMPACT_COMPLETE;
>> >
>> > +       /* Compaction run is not finished if the watermark is not met */
>> > +       if (!zone_watermark_ok(zone, cc->order, watermark, 0, 0))
>> > +               return COMPACT_INCOMPLETE;
>> > +
>> > +       if (cc->order == -1)
>> > +               return COMPACT_INCOMPLETE;
>> > +
>> > +       /* Direct compactor: Is a suitable page free? */
>> > +       for (order = cc->order; order < MAX_ORDER; order++) {
>> > +               /* Job done if page is free of the right migratetype */
>> > +               if (!list_empty(&zone->free_area[order].free_list[cc->migratetype]))
>> > +                       return COMPACT_PARTIAL;
>> > +
>> > +               /* Job done if allocation would set block type */
>> > +               if (order >= pageblock_order && zone->free_area[order].nr_free)
>> > +                       return COMPACT_PARTIAL;
>> > +       }
>> > +
>> >        return COMPACT_INCOMPLETE;
>> >  }
>> >
>> > @@ -348,6 +371,101 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
>> >        return ret;
>> >  }
>> >
>> > +static inline unsigned long compact_zone_order(struct zone *zone,
>> > +                                               int order, gfp_t gfp_mask)
>> > +{
>> > +       struct compact_control cc = {
>> > +               .nr_freepages = 0,
>> > +               .nr_migratepages = 0,
>> > +               .order = order,
>> > +               .migratetype = allocflags_to_migratetype(gfp_mask),
>> > +               .zone = zone,
>> > +       };
>> > +       INIT_LIST_HEAD(&cc.freepages);
>> > +       INIT_LIST_HEAD(&cc.migratepages);
>> > +
>> > +       return compact_zone(zone, &cc);
>> > +}
>> > +
>> > +/**
>> > + * try_to_compact_pages - Direct compact to satisfy a high-order allocation
>> > + * @zonelist: The zonelist used for the current allocation
>> > + * @order: The order of the current allocation
>> > + * @gfp_mask: The GFP mask of the current allocation
>> > + * @nodemask: The allowed nodes to allocate from
>> > + *
>> > + * This is the main entry point for direct page compaction.
>> > + */
>> > +unsigned long try_to_compact_pages(struct zonelist *zonelist,
>> > +                       int order, gfp_t gfp_mask, nodemask_t *nodemask)
>> > +{
>> > +       enum zone_type high_zoneidx = gfp_zone(gfp_mask);
>> > +       int may_enter_fs = gfp_mask & __GFP_FS;
>> > +       int may_perform_io = gfp_mask & __GFP_IO;
>> > +       unsigned long watermark;
>> > +       struct zoneref *z;
>> > +       struct zone *zone;
>> > +       int rc = COMPACT_INCOMPLETE;
>> > +
>> > +       /* Check whether it is worth even starting compaction */
>> > +       if (order == 0 || !may_enter_fs || !may_perform_io)
>> > +               return rc;
>> > +
>> > +       /*
>> > +        * We will not stall if the necessary conditions are not met for
>> > +        * migration but direct reclaim seems to account stalls similarly
>> > +        */
>>
>> I can't understand this comment.
>> In the case of direct reclaim, the long time spent in shrink_zones is
>> simply a stall from the point of view of the allocation caller.
>> So "Allocation is stalled" makes sense to me.
>>
>> But "Compaction is stalled" doesn't make sense to me.
>
> I considered a "stall" to be when the allocator is doing work that is not
> allocation-related such as page reclaim or in this case - memory compaction.

I agree.

>
>> How about "COMPACTION_DIRECT" like "PGSCAN_DIRECT"?
>
> PGSCAN_DIRECT is a page-based counter of the number of pages scanned. The
> similar naming but very different meaning could be confusing to someone not
> familiar with the counters. The event being counted here is the number of
> times compaction happened just like ALLOCSTALL counts the number of times
> direct reclaim happened.

You're right. I just wanted a name that implies direct compaction,
because I believe we will implement background compaction, too.
Such a name would be more straightforward, I think. :-)
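
To make the distinction between the two kinds of counter concrete, here is
a minimal userspace sketch. It is not kernel code: every demo_* name is
hypothetical and only models the behaviour described above, where an
ALLOCSTALL-style counter goes up by one per direct-compaction attempt and
a PGSCAN_DIRECT-style counter goes up by the number of pages scanned.

```c
/* Hypothetical userspace stand-ins for vm event counters (not kernel API). */
enum demo_event { DEMO_COMPACTSTALL, DEMO_PGSCAN_DIRECT, DEMO_NR_EVENTS };

static unsigned long demo_events[DEMO_NR_EVENTS];

/* Event-style accounting: add exactly one per occurrence. */
void demo_count_event(enum demo_event e)
{
	demo_events[e]++;
}

/* Page-style accounting: add the number of pages handled. */
void demo_count_events(enum demo_event e, unsigned long delta)
{
	demo_events[e] += delta;
}

/*
 * Model of one direct-compaction attempt. Returns 0 if the attempt is
 * skipped before any work is done, 1 if it was counted as a stall.
 */
int demo_direct_compact(int order)
{
	if (order == 0)
		return 0;	/* not worth starting, nothing is counted */

	demo_count_event(DEMO_COMPACTSTALL);
	return 1;
}

/* Model of one direct-reclaim scan over nr_pages pages. */
void demo_direct_scan(unsigned long nr_pages)
{
	demo_count_events(DEMO_PGSCAN_DIRECT, nr_pages);
}
```

Calling demo_direct_compact() with orders 0, 4 and 9 and then
demo_direct_scan(32) twice leaves the compaction counter at 2 and the
scan counter at 64, so the two numbers cannot be compared directly.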

>
> How about COMPACTSTALL like ALLOCSTALL? :/

I don't have a strong objection any more, if you insist on it.

>> I think it's straightforward.
>> Naming is important since it becomes part of the ABI.
>>
>> > +       count_vm_event(COMPACTSTALL);
>> > +
>>
>>
>>
>>
>>
>> --
>> Kind regards,
>> Minchan Kim
>>
>
> --
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab
>



-- 
Kind regards,
Minchan Kim

