From: Minchan Kim <minchan@kernel.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Richard Davies <richard@arachsys.com>,
Shaohua Li <shli@kernel.org>, Rik van Riel <riel@redhat.com>,
Avi Kivity <avi@redhat.com>, QEMU-devel <qemu-devel@nongnu.org>,
KVM <kvm@vger.kernel.org>, Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 6/9] mm: compaction: Acquire the zone->lock as late as possible
Date: Tue, 25 Sep 2012 16:36:54 +0900
Message-ID: <20120925073654.GN13234@bbox>
In-Reply-To: <20120924085238.GY11266@suse.de>
On Mon, Sep 24, 2012 at 09:52:38AM +0100, Mel Gorman wrote:
> On Fri, Sep 21, 2012 at 02:35:57PM -0700, Andrew Morton wrote:
> > On Fri, 21 Sep 2012 11:46:20 +0100
> > Mel Gorman <mgorman@suse.de> wrote:
> >
> > > Compaction's free scanner acquires the zone->lock when checking for PageBuddy
> > > pages and isolating them. It does this even if there are no PageBuddy pages
> > > in the range.
> > >
> > > This patch defers acquiring the zone lock for as long as possible. In the
> > > event there are no free pages in the pageblock then the lock will not be
> > > acquired at all which reduces contention on zone->lock.
> > >
> > > ...
> > >
> > > --- a/mm/compaction.c
> > > +++ b/mm/compaction.c
> > > @@ -93,6 +93,28 @@ static inline bool compact_trylock_irqsave(spinlock_t *lock,
> > > return compact_checklock_irqsave(lock, flags, false, cc);
> > > }
> > >
> > > +/* Returns true if the page is within a block suitable for migration to */
> > > +static bool suitable_migration_target(struct page *page)
> > > +{
> > > +
> >
> > stray newline
> >
>
> Fixed.
>
> > > + int migratetype = get_pageblock_migratetype(page);
> > > +
> > > + /* Don't interfere with memory hot-remove or the min_free_kbytes blocks */
> > > + if (migratetype == MIGRATE_ISOLATE || migratetype == MIGRATE_RESERVE)
> > > + return false;
> > > +
> > > + /* If the page is a large free page, then allow migration */
> > > + if (PageBuddy(page) && page_order(page) >= pageblock_order)
> > > + return true;
> > > +
> > > + /* If the block is MIGRATE_MOVABLE or MIGRATE_CMA, allow migration */
> > > + if (migrate_async_suitable(migratetype))
> > > + return true;
> > > +
> > > + /* Otherwise skip the block */
> > > + return false;
> > > +}
> > > +
> > >
> > > ...
> > >
> > > @@ -168,23 +193,38 @@ static unsigned long isolate_freepages_block(unsigned long blockpfn,
> > > int isolated, i;
> > > struct page *page = cursor;
> > >
> > > - if (!pfn_valid_within(blockpfn)) {
> > > - if (strict)
> > > - return 0;
> > > - continue;
> > > - }
> > > + if (!pfn_valid_within(blockpfn))
> > > + goto strict_check;
> > > nr_scanned++;
> > >
> > > - if (!PageBuddy(page)) {
> > > - if (strict)
> > > - return 0;
> > > - continue;
> > > - }
> > > + if (!PageBuddy(page))
> > > + goto strict_check;
> > > +
> > > + /*
> > > + * The zone lock must be held to isolate freepages. This
> > > + * unfortunately this is a very coarse lock and can be
> >
> > this this
> >
>
> Fixed.
>
> > > + * heavily contended if there are parallel allocations
> > > + * or parallel compactions. For async compaction do not
> > > + * spin on the lock and we acquire the lock as late as
> > > + * possible.
> > > + */
> > > + locked = compact_checklock_irqsave(&cc->zone->lock, &flags,
> > > + locked, cc);
> > > + if (!locked)
> > > + break;
> > > +
> > > + /* Recheck this is a suitable migration target under lock */
> > > + if (!strict && !suitable_migration_target(page))
> > > + break;
> > > +
> > > + /* Recheck this is a buddy page under lock */
> > > + if (!PageBuddy(page))
> > > + goto strict_check;
> > >
> > > /* Found a free page, break it into order-0 pages */
> > > isolated = split_free_page(page);
> > > if (!isolated && strict)
> > > - return 0;
> > > + goto strict_check;
> > > total_isolated += isolated;
> > > for (i = 0; i < isolated; i++) {
> > > list_add(&page->lru, freelist);
> > > @@ -196,9 +236,23 @@ static unsigned long isolate_freepages_block(unsigned long blockpfn,
> > > blockpfn += isolated - 1;
> > > cursor += isolated - 1;
> > > }
> > > +
> > > + continue;
> > > +
> > > +strict_check:
> > > + /* Abort isolation if the caller requested strict isolation */
> > > + if (strict) {
> > > + total_isolated = 0;
> > > + goto out;
> > > + }
> > > }
> > >
> > > trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated);
> > > +
> > > +out:
> > > + if (locked)
> > > + spin_unlock_irqrestore(&cc->zone->lock, flags);
> > > +
> > > return total_isolated;
> > > }
> >
> > A few things about this function.
> >
> > Would it be cleaner if we did
> >
> > if (!strict) {
> > if (!suitable_migration_target(page))
> > break;
> > } else {
> > if (!PageBuddy(page))
> > goto strict_check;
> > }
> >
> > and then remove the test of `strict' from strict_check (which then
> > should be renamed)?
> >
>
> I was not able to implement what you suggested without breakage.
> However, I can do something very similar if "strict" mode does not bail
> out ASAP and instead double-checks at the end that everything was
> isolated correctly. This does mean that CMA's failure case is slightly
> more expensive, but that really is a corner case and I think it's
> acceptable if the code is easier to follow as a result.
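>
> As a sketch of that approach (illustrative only; "nr_strict_required" is a
> hypothetical counter of how many pages the caller expected to isolate, not
> a name taken from the patch):
>
>	/*
>	 * If strict isolation was requested by CMA, check at the end of
>	 * the scan that everything was isolated. If anything failed,
>	 * return 0 so the caller treats the whole range as a failure.
>	 */
>	if (strict && nr_strict_required > total_isolated)
>		total_isolated = 0;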
>
> > Which perhaps means that the whole `strict_check' block can go away:
> >
> > if (!strict) {
> > if (!suitable_migration_target(page))
> > break;
> > } else {
> > if (!PageBuddy(page)) {
> > total_isolated = 0;
> > goto out;
> > }
> >
> > Have a think about it? The function is a little straggly.
> >
>
> Proposal below.
>
> > Secondly, is it correct/desirable to skip the (now misnamed
> > `trace_mm_compaction_isolate_freepages') tracepoint generation if we
> > bailed out of the loop? The fact that we entered
> > isolate_freepages_block() but failed to isolate anything is data which
> > people might be interested in?
> >
>
> You're right, it may be interesting for someone debugging CMA to know that
> nr_scanned != nr_isolated at the time of allocation failure.
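>
> A sketch of one way to keep that data (placement is illustrative, not
> necessarily the final form): emit the tracepoint on the bail-out path as
> well, by moving it after the out label:
>
> out:
>	trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated);
>
>	if (locked)
>		spin_unlock_irqrestore(&cc->zone->lock, flags);
>
>	return total_isolated;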
>
> > Thirdly, that (existing) comment "This assumes the block is valid" is
> > either too vague or wrong. If it's valid, why do we check
> > pfn_valid_within()?
> >
>
> Comment is stale and should be removed.
>
> > > @@ -218,13 +272,18 @@ static unsigned long isolate_freepages_block(unsigned long blockpfn,
> > > unsigned long
> > > isolate_freepages_range(unsigned long start_pfn, unsigned long end_pfn)
> > > {
> > > - unsigned long isolated, pfn, block_end_pfn, flags;
> > > + unsigned long isolated, pfn, block_end_pfn;
> > > struct zone *zone = NULL;
> > > LIST_HEAD(freelist);
> > > + struct compact_control cc;
> > >
> > > if (pfn_valid(start_pfn))
> > > zone = page_zone(pfn_to_page(start_pfn));
> > >
> > > + /* cc needed for isolate_freepages_block to acquire zone->lock */
> > > + cc.zone = zone;
> > > + cc.sync = true;
> >
> > We initialise two of cc's fields, leave the other 12 fields containing
> > random garbage, then start using it. I see no bug here, but...
> >
>
> I get your point. At the very least we should initialise it with zeros.
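>
> For illustration (a sketch, not the patch itself), a C99 designated
> initialiser does exactly that: every field not named explicitly is
> implicitly zeroed:
>
>	struct compact_control cc = {
>		.zone = zone,	/* needed by isolate_freepages_block() */
>		.sync = true,	/* sync mode: may spin on zone->lock */
>	};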
>
> How about this?
>
> ---8<---
> mm: compaction: Iron out isolate_freepages_block() and isolate_freepages_range()
>
> Andrew pointed out that isolate_freepages_block() is "straggly" and
> isolate_freepages_range() is making assumptions about how compact_control is
> used, which is delicate. This patch straightens isolate_freepages_block()
> and makes it fly straight, and initialises compact_control to zeros in
> isolate_freepages_range(). The code should be easier to follow and
> is functionally equivalent. The CMA failure path is now a little more
> expensive, but that is a marginal corner case.
>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Minchan Kim <minchan@kernel.org>
--
Kind regards,
Minchan Kim
Thread overview: 42+ messages
2012-09-21 10:46 [PATCH 0/9] Reduce compaction scanning and lock contention Mel Gorman
2012-09-21 10:46 ` [PATCH 1/9] Revert "mm: compaction: check lock contention first before taking lock" Mel Gorman
2012-09-21 17:46 ` Rafael Aquini
2012-09-21 10:46 ` [PATCH 2/9] Revert "mm-compaction-abort-compaction-loop-if-lock-is-contended-or-run-too-long-fix" Mel Gorman
2012-09-21 17:47 ` Rafael Aquini
2012-09-21 10:46 ` [PATCH 3/9] Revert "mm: compaction: abort compaction loop if lock is contended or run too long" Mel Gorman
2012-09-21 17:48 ` Rafael Aquini
2012-09-21 10:46 ` [PATCH 4/9] mm: compaction: Abort compaction loop if lock is contended or run too long Mel Gorman
2012-09-21 17:50 ` Rafael Aquini
2012-09-21 21:31 ` Andrew Morton
2012-09-25 7:34 ` Minchan Kim
2012-09-21 10:46 ` [PATCH 5/9] mm: compaction: Acquire the zone->lru_lock as late as possible Mel Gorman
2012-09-21 17:51 ` Rafael Aquini
2012-09-25 7:05 ` Minchan Kim
2012-09-25 7:51 ` Mel Gorman
2012-09-25 8:13 ` Minchan Kim
2012-09-25 21:39 ` Andrew Morton
2012-09-26 0:23 ` Minchan Kim
2012-09-26 10:17 ` Mel Gorman
2012-09-21 10:46 ` [PATCH 6/9] mm: compaction: Acquire the zone->lock " Mel Gorman
2012-09-21 17:52 ` Rafael Aquini
2012-09-21 21:35 ` Andrew Morton
2012-09-24 8:52 ` Mel Gorman
2012-09-25 7:36 ` Minchan Kim [this message]
2012-09-25 7:35 ` Minchan Kim
2012-09-21 10:46 ` [PATCH 7/9] Revert "mm: have order > 0 compaction start off where it left" Mel Gorman
2012-09-21 17:52 ` Rafael Aquini
2012-09-25 7:37 ` Minchan Kim
2012-09-21 10:46 ` [PATCH 8/9] mm: compaction: Cache if a pageblock was scanned and no pages were isolated Mel Gorman
2012-09-21 17:53 ` Rafael Aquini
2012-09-21 21:36 ` Andrew Morton
2012-09-24 9:39 ` Mel Gorman
2012-09-24 21:26 ` Andrew Morton
2012-09-25 9:12 ` Mel Gorman
2012-09-25 20:03 ` Andrew Morton
2012-09-27 12:06 ` [PATCH] mm: compaction: cache if a pageblock was scanned and no pages were isolated -fix2 Mel Gorman
2012-09-27 13:12 ` [PATCH 8/9] mm: compaction: Cache if a pageblock was scanned and no pages were isolated Mel Gorman
2012-09-26 0:49 ` Minchan Kim
2012-09-27 12:14 ` Mel Gorman
2012-09-21 10:46 ` [PATCH 9/9] mm: compaction: Restart compaction from near where it left off Mel Gorman
2012-09-21 17:54 ` Rafael Aquini
2012-09-21 13:51 ` [PATCH 0/9] Reduce compaction scanning and lock contention Rik van Riel