From: Mel Gorman <mgorman@techsingularity.net>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, Kaiyang Zhao <kaiyang2@cs.cmu.edu>,
Vlastimil Babka <vbabka@suse.cz>,
David Rientjes <rientjes@google.com>,
linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [RFC PATCH 05/26] mm: page_alloc: per-migratetype pcplist for THPs
Date: Fri, 28 Apr 2023 11:29:36 +0100
Message-ID: <20230428102936.7qsaskyjkpiyapgq@techsingularity.net>
In-Reply-To: <20230421150648.GB320347@cmpxchg.org>

On Fri, Apr 21, 2023 at 11:06:48AM -0400, Johannes Weiner wrote:
> On Fri, Apr 21, 2023 at 01:47:44PM +0100, Mel Gorman wrote:
> > On Tue, Apr 18, 2023 at 03:12:52PM -0400, Johannes Weiner wrote:
> > > Right now, there is only one pcplist for THP allocations. However,
> > > while most THPs are movable, the huge zero page is not. This means a
> > > movable THP allocation can grab an unmovable block from the pcplist,
> > > and a subsequent THP split, partial free, and reallocation of the
> > > remainder will mix movable and unmovable pages in the block.
> > >
> > > While this isn't a huge source of block pollution in practice, it
> > > happens often enough to trigger debug warnings fairly quickly under
> > > load. In the interest of tightening up pageblock hygiene, make the THP
> > > pcplists fully migratetype-aware, just like the lower order ones.
> > >
> > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> >
> > Split out :P
> >
> > Take special care of this one because, while I didn't check this, I
> > suspect it'll push the PCP structure size into the next cache line and
> > increase overhead.
> >
> > The changelog makes it unclear why exactly this happens or why the
> > patch fixes it.
>
> Before this, I'd see warnings from the last patch in the series about
> received migratetype not matching requested mt.
>
> The way it happens is that the zero page gets freed and the unmovable
> block put on the pcplist. A regular THP allocation is subsequently
> served from an unmovable block.
>
> Mental note, I think this can happen the other way around too: a
> regular THP on the pcp being served to a MIGRATE_UNMOVABLE zero
> THP. It's not supposed to, but it looks like there is a bug in the
> code that's meant to prevent that from happening in rmqueue():
>
> 	if (likely(pcp_allowed_order(order))) {
> 		/*
> 		 * MIGRATE_MOVABLE pcplist could have the pages on CMA area and
> 		 * we need to skip it when CMA area isn't allowed.
> 		 */
> 		if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
> 				migratetype != MIGRATE_MOVABLE) {
> 			page = rmqueue_pcplist(preferred_zone, zone, order,
> 					migratetype, alloc_flags);
> 			if (likely(page))
> 				goto out;
> 		}
> 	}
>
> Surely that last condition should be migratetype == MIGRATE_MOVABLE?
>

It should be. It would have been missed for ages because exposing it needs
a test case on a machine configuration that requires CMA for functional
correctness and also uses THP, which is an unlikely combination.
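
To see which requests bypass the pcplist under each spelling of the last
condition, here is a minimal standalone sketch. It is not kernel code: the
enum and both helper names are made up for illustration, and plain booleans
stand in for IS_ENABLED(CONFIG_CMA) and (alloc_flags & ALLOC_CMA).

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's migratetype values; illustrative only. */
enum mt { MT_UNMOVABLE, MT_MOVABLE, MT_RECLAIMABLE };

/* Condition as quoted from rmqueue(): the last term uses "!=". */
static bool use_pcp_as_quoted(bool cma_enabled, bool alloc_cma, enum mt mt)
{
	return !cma_enabled || alloc_cma || mt != MT_MOVABLE;
}

/* Variant raised in the thread: the last term uses "==". */
static bool use_pcp_as_proposed(bool cma_enabled, bool alloc_cma, enum mt mt)
{
	return !cma_enabled || alloc_cma || mt == MT_MOVABLE;
}
```

With CMA enabled and ALLOC_CMA clear, the quoted form skips the pcplist only
for MT_MOVABLE requests, while the "==" form skips it for every other
migratetype. In all other configurations both forms use the pcplist.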
> > The huge zero page strips GFP_MOVABLE (so unmovable)
> > but at allocation time, it doesn't really matter what the movable type
> > is because it's a full pageblock. It doesn't appear to be a hazard until
> > the split happens. Assuming that's the case, it should be ok to always
> > set the pageblock movable for THP allocations regardless of GFP flags at
> > allocation time or else set the pageblock MOVABLE at THP split (always
> > MOVABLE at allocation time makes more sense).
>
> The regular allocator compaction skips over compound pages anyway, so
> the migratetype should indeed not matter there.
>
> The bigger issue is CMA. alloc_contig_range() will try to move THPs to
> free a larger range. We have to be careful not to place an unmovable
> zero THP into a CMA region. That means we can not play games with MT -
> we really do have to physically keep unmovable and movable THPs apart.
>
Fair point.

> Another option would be not to use pcp for the zero THP. It's cached
> anyway in the caller. But it would add branches to the THP alloc and
> free fast paths (pcp_allowed_order() also checking migratetype).

And this is probably the most straightforward option. The intent behind
caching some THPs on PCP was to help faulting of large mappings of normal
THPs and to reduce contention on the zone lock a little. The zero THP is
somewhat special because it should not be allocated at high frequency.
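
As a rough illustration of that option, the sketch below shows a
pcp_allowed_order()-style check that also looks at the migratetype. It is an
assumption-laden standalone model, not a patch: pcp_allowed_sketch() and the
enum are hypothetical names, COSTLY_ORDER mirrors PAGE_ALLOC_COSTLY_ORDER (3),
and THP_ORDER is assumed to be 9 (2M THPs with 4K base pages).

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for kernel constants; the values are assumptions. */
#define COSTLY_ORDER	3	/* mirrors PAGE_ALLOC_COSTLY_ORDER */
#define THP_ORDER	9	/* 2M THP with 4K base pages */

enum thp_mt { THP_MT_UNMOVABLE, THP_MT_MOVABLE, THP_MT_RECLAIMABLE };

/*
 * Hypothetical variant of pcp_allowed_order(): low orders always use the
 * pcplists, THP-sized requests only when they are movable, so the
 * unmovable huge zero page would go straight to the buddy allocator.
 */
static bool pcp_allowed_sketch(unsigned int order, enum thp_mt mt)
{
	if (order <= COSTLY_ORDER)
		return true;
	if (order == THP_ORDER)
		return mt == THP_MT_MOVABLE;	/* the extra fast-path branch */
	return false;
}
```

The extra comparison on the THP path is the added branch mentioned in the
quoted text. Whether its cost is measurable would have to be tested.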
--
Mel Gorman
SUSE Labs