public inbox for linux-kernel@vger.kernel.org
From: Vlastimil Babka <vbabka@suse.cz>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	linux-kernel@vger.kernel.org, linux-mm@vger.kernel.org,
	Minchan Kim <minchan@kernel.org>,
	Michal Nazarewicz <mina86@mina86.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
	Mel Gorman <mgorman@suse.de>,
	Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Subject: Re: [PATCH v5 14/14] mm, compaction: try to capture the just-created high-order freepage
Date: Tue, 29 Jul 2014 17:34:37 +0200	[thread overview]
Message-ID: <53D7BF0D.5050404@suse.cz> (raw)
In-Reply-To: <20140729073456.GC1610@js1304-P5Q-DELUXE>

On 07/29/2014 09:34 AM, Joonsoo Kim wrote:
> I don't look at it in detail, but, it looks really duplicated and hard
> to maintain. From my experience, this is really error-prone. Please
> think of freepage counting bugs reported by my recent patchset.
> Freepage counting handles counting at different places for performance reason
> and finally bugs are there. IMHO, making common function and using it
> is better than this approach even if we touch the fastpath.

OK, so the opposite opinion to Minchan's :)

> Could you separate this patch to this patchset?
> I think that this patch doesn't get much reviewed from other developers
> unlike other patches.

Yeah I will.

>> @@ -570,6 +572,14 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>   	unsigned long flags;
>>   	bool locked = false;
>>   	struct page *page = NULL, *valid_page = NULL;
>> +	unsigned long capture_pfn = 0;   /* current candidate for capturing */
>> +	unsigned long next_capture_pfn = 0; /* next candidate for capturing */
>> +
>> +	if (cc->order > 0 && cc->order <= pageblock_order && capture) {
>> +		/* This may be outside the zone, but we check that later */
>> +		capture_pfn = low_pfn & ~((1UL << cc->order) - 1);
>> +		next_capture_pfn = ALIGN(low_pfn + 1, (1UL << cc->order));
>> +	}
>
> Instead of inserting capture logic to common code (compaction and
> CMA), could you add it only to compaction code such as
> isolate_migratepages(). Capture logic needs too many hooks as you see
> on below snippets. And it makes code so much complicated.

Could do it in isolate_migratepages() for whole pageblocks only (as 
David's patch did), but that restricts the usefulness. Or maybe do it 
fine grained by calling isolate_migratepages_block() multiple times. But 
the overhead of multiple calls would probably suck even more for 
lower-order compactions. For CMA the added overhead is basically only 
checks for next_capture_pfn that will be always false, so predictable. 
And mostly just in branches where isolation is failing, which is not the 
CMA's "fast path" I guess?

But I see you're talking about "complicated", not overhead. Well it's 4 
hunks inside the isolate_migratepages_block() for loop. I don't think 
it's *that* bad, thanks to how the function was cleaned up by the 
previous patches.
Hmm but you made me realize I could make it nicer by doing a "goto 
isolation_fail" which would handle the next_capture_pfn update at a 
single place.

>> +static bool compact_capture_page(struct compact_control *cc)
>> +{
>> +	struct page *page = *cc->capture_page;
>> +	int cpu;
>> +
>> +	if (!page)
>> +		return false;
>> +
>> +	/* Unsafe check if it's worth to try acquiring the zone->lock at all */
>> +	if (PageBuddy(page) && page_order_unsafe(page) >= cc->order)
>> +		goto try_capture;
>> +
>> +	/*
>> +	 * There's a good chance that we have just put free pages on this CPU's
>> +	 * lru cache and pcplists after the page migrations. Drain them to
>> +	 * allow merging.
>> +	 */
>> +	cpu = get_cpu();
>> +	lru_add_drain_cpu(cpu);
>> +	drain_local_pages(NULL);
>> +	put_cpu();
>
> Just for curiosity.
>
> If lru_add_drain_cpu() is cheap enough to capture high order page, why
> __alloc_pages_direct_compact() doesn't call it before
> get_page_from_freelist()?

No idea. I guess it wasn't noticed at the time that page migration uses 
putback_lru_page() on the page that was freed, which puts it into the 
lru_add cache, only to be freed. I think it would be better to free the 
page immediately in this case, and use lru_add cache only for pages that 
will really go to lru.

Heck, it could be even better to tell page migration to skip pcplists as 
well, to avoid drain_local_pages. Often you migrate because you want to 
use the original page for something. NUMA balancing migrations are 
different, I guess.

>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1489,9 +1489,11 @@ static int __isolate_free_page(struct page *page, unsigned int order)
>>   {
>>   	unsigned long watermark;
>>   	struct zone *zone;
>> +	struct free_area *area;
>>   	int mt;
>> +	unsigned int freepage_order = page_order(page);
>>
>> -	BUG_ON(!PageBuddy(page));
>> +	VM_BUG_ON_PAGE((!PageBuddy(page) || freepage_order < order), page);
>>
>>   	zone = page_zone(page);
>>   	mt = get_pageblock_migratetype(page);
>> @@ -1506,9 +1508,12 @@ static int __isolate_free_page(struct page *page, unsigned int order)
>>   	}
>>
>
> In __isolate_free_page(), we check zone_watermark_ok() with order 0.
> But normal allocation logic would check zone_watermark_ok() with requested
> order. Your capture logic uses __isolate_free_page() and it would
> affect compaction success rate significantly. And it means that
> capture logic allocates high order page on page allocator
> too aggressively compared to other component such as normal high order

It's either that, or the extra lru drain that makes the difference. But 
the "aggressiveness" would in fact mean better accuracy. Watermark 
checking may be inaccurate. Especially when memory is close to the 
watermark and there is only a single high-order page that would satisfy 
the allocation.

> allocation. Could you test this patch again after changing order for
> zone_watermark_ok() in __isolate_free_page()?

I can do that. If that makes capture significantly worse, it just 
highlights the watermark checking inaccuracy.

> Thanks.
>

