From: Michal Hocko <mhocko@kernel.org>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@linux.com>,
Vlastimil Babka <vbabka@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Linux-MM <linux-mm@kvack.org>,
Linux-Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm: page_alloc: High-order per-cpu page allocator v5
Date: Fri, 2 Dec 2016 09:21:08 +0100
Message-ID: <20161202082108.GB6830@dhcp22.suse.cz>
In-Reply-To: <20161202060346.GA21434@js1304-P5Q-DELUXE>

On Fri 02-12-16 15:03:46, Joonsoo Kim wrote:
[...]
> > o pcp accounting during free is now confined to free_pcppages_bulk as it's
> > impossible for the caller to know exactly how many pages were freed.
> > Due to the high-order caches, the number of pages drained for a request
> > is no longer precise.
> >
> > o The high watermark for per-cpu pages is increased to reduce the probability
> > that a single refill causes a drain on the next free.
[...]
> I guess that this patch would cause the following problems.
>
> 1. If pcp->batch is too small, high-order pages will not be freed
> easily and will survive longer. Think about the following situation.
>
> Batch count: 7
> MIGRATE_UNMOVABLE -> MIGRATE_MOVABLE -> MIGRATE_RECLAIMABLE -> order 1
> -> order 2...
>
> free count: 1 + 1 + 1 + 2 + 4 = 9
> so order 3 would not be freed.

I guess the second paragraph above in the changelog tries to clarify
that...

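To make the arithmetic in point 1 concrete, here is a minimal userspace
model of a round-robin drain. It is a sketch under stated assumptions, not
the patch's free_pcppages_bulk: it assumes six lists visited in the order
given in the quoted example (three order-0 migratetype lists, then orders
1 to 3), frees one entry per list per pass, and stops once the batch is
covered. The names drain_round_robin, list_order and NR_LISTS are
hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Assumed layout: 3 order-0 migratetype lists, then one list per order 1..3. */
#define NR_LISTS 6

/* Order of the entries held by each list, in round-robin visiting order. */
static const int list_order[NR_LISTS] = { 0, 0, 0, 1, 2, 3 };

/*
 * Free one entry per non-empty list, round-robin, until at least `batch`
 * base pages have been freed or every list is empty.
 *
 * nr_entries[i]     - entries currently cached on list i (updated in place)
 * freed_per_list[i] - incremented for every entry freed from list i
 *
 * Returns the total number of base pages freed.
 */
static int drain_round_robin(int batch, int nr_entries[NR_LISTS],
                             int freed_per_list[NR_LISTS])
{
    int remaining = batch;
    int total = 0;
    int empty_streak = 0;
    size_t i = 0;

    while (remaining > 0 && empty_streak < NR_LISTS) {
        if (nr_entries[i] > 0) {
            /* One entry of order N accounts for 1 << N base pages. */
            int pages = 1 << list_order[i];

            nr_entries[i]--;
            freed_per_list[i]++;
            remaining -= pages;
            total += pages;
            empty_streak = 0;
        } else {
            empty_streak++;
        }
        i = (i + 1) % NR_LISTS;
    }
    return total;
}
```

Calling drain_round_robin(7, ...) with every list non-empty frees
1 + 1 + 1 + 2 + 4 = 9 base pages in this model, and the order-3 list is
never reached because the batch is already exhausted after order 2.
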
> 2. Also, it seems that this logic penalizes high-order pages. Freeing one
> high-order page means freeing 1 << order pages rather than just
> one page. This logic round-robins to choose the target page, so the
> number of pages freed will differ by order.

Yes this is indeed possible. The first paragraph above mentions this
problem.

> I think that it
> makes some sense because high-order pages are less important to cache
> in the pcp than lower-order ones, but I'd like to know whether it is
> intended. If intended, it deserves a comment.
>
> 3. I guess that order-0 file/anon page alloc/free is dominant in many
> workloads. If that happens, it negates the effect of the high-order
> cache in the pcp, since cached high-order pages would also be freed to
> the buddy allocator when a burst of order-0 frees happens.

Yes, this is true and I was wondering the same, but I believe this can be
enhanced later on. E.g. we can check the order when crossing the pcp->high
mark and only free the given order portion of the batch. I just wouldn't
over-optimize at this stage.
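
One possible reading of that idea, as a minimal userspace sketch: when a
free pushes the pcp count to the high mark, drain only entries of the
order that was just freed, so a burst of order-0 frees does not evict
cached high-order pages. This is an assumption about what such an
enhancement could look like, not code from the patch. struct pcp_model,
pcp_free_order_aware and MAX_PCP_ORDER are hypothetical names, and the
model ignores migratetypes by keeping one list per order.

```c
/* Assumed highest order cached on the per-cpu lists in this model. */
#define MAX_PCP_ORDER 3

/* Simplified stand-in for the per-cpu page state; counts are in base pages. */
struct pcp_model {
    int high;                              /* drain threshold */
    int batch;                             /* base pages to drain at once */
    int count;                             /* base pages currently cached */
    int nr_entries[MAX_PCP_ORDER + 1];     /* cached entries per order */
};

/*
 * Cache one freed entry of `order`. If that brings the cached count to the
 * high mark, drain up to `batch` base pages, but only from the list of the
 * same order; lists of other orders are left alone.
 *
 * Returns the number of base pages drained (0 if below the high mark).
 */
static int pcp_free_order_aware(struct pcp_model *pcp, int order)
{
    int pages = 1 << order;
    int drained = 0;

    pcp->nr_entries[order]++;
    pcp->count += pages;

    if (pcp->count < pcp->high)
        return 0;

    while (drained < pcp->batch && pcp->nr_entries[order] > 0) {
        pcp->nr_entries[order]--;
        pcp->count -= pages;
        drained += pages;
    }
    return drained;
}
```

In this model, an order-0 free that crosses the high mark drains only
order-0 entries, so any cached order-2 entry stays on its list. The cost
is that a single order may not hold enough pages to cover the whole
batch, which a real implementation would have to handle.
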
--
Michal Hocko
SUSE Labs