All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Matthew Brost" <matthew.brost@intel.com>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v6 1/2] mm: Introduce opportunistic_compaction concept to vmscan and shrinkers
Date: Wed, 17 Jun 2026 03:38:22 +0000	[thread overview]
Message-ID: <20260617033823.5A6221F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260617032218.1165929-2-matthew.brost@intel.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Concurrent order-0 wakers cause opportunistic high-order kswapd reclaims to become destructive, defeating the patch's purpose.
- [High] The opportunistic compaction hint decays to UNSET on retries and premature wakeups, silently converting opportunistic reclaim into destructive reclaim.
- [Medium] The `order` field referenced in `struct shrink_control`'s documentation is missing from the struct, rendering the API incomplete.
--

commit 458389aa250d63553aa4e2e1f9e0c39d7fbae43e
Author: Matthew Brost <matthew.brost@intel.com>
Subject: mm: Introduce opportunistic_compaction concept to vmscan and shrinkers

This commit introduces an opportunistic compaction hint for kswapd and
shrinkers. It aims to prevent aggressive reclaim of working sets during
fragmentation-sensitive high-order allocations. Shrinkers can use this
hint to skip destructive work when reclaim is unlikely to produce a
contiguous high-order block.

> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 1a00be90d93a9..5f3e8dc981296 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -37,6 +37,26 @@ struct shrink_control {
>  	/* current node being shrunk (for NUMA aware shrinkers) */
>  	int nid;
>  
> +	/*
> +	 * Opportunistic compaction hint.
> +	 *
> +	 * Set by the reclaim path to tell shrinkers that this pass is
> +	 * driven by an order > 0 allocation that the caller is willing to
> +	 * have fail (e.g., __GFP_NORETRY / __GFP_RETRY_MAYFAIL without
> +	 * __GFP_NOFAIL). Such allocations only really benefit from
> +	 * shrinking when doing so frees up a contiguous, high-order block;
> +	 * thrashing working sets in the hope of producing one is typically
> +	 * counter-productive.
> +	 *
> +	 * Shrinkers that can produce naturally-aligned high-order folios
> +	 * (see shrink_control::order) should treat this as a hint to skip
> +	 * costly work that is unlikely to help compaction (for example,
> +	 * evicting hot/working-set pages just to free single pages).
> +	 *
> +	 * Only meaningful when @order > 0; ignored otherwise.
> +	 */
> +	bool opportunistic_compaction;

[Severity: Medium]
The comment references a shrink_control::order field, but does this field
actually exist in struct shrink_control?

Without passing the target allocation order from scan_control through
shrink_slab(), it seems shrinker implementations won't be able to determine
the requested block size to conditionally skip work.

> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bd1b1aa125819..c40bc3f9ddd43 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c

[ ... ]

> @@ -7385,6 +7428,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
>  
>  			if (READ_ONCE(pgdat->kswapd_order) < reclaim_order)
>  				WRITE_ONCE(pgdat->kswapd_order, reclaim_order);
> +
> +			if (kswapd_opportunistic_compaction ==
> +			    KSWAPD_NO_OPPORTUNISTIC_COMPACTION)
> +				atomic_set(&pgdat->kswapd_opportunistic_compaction,
> +					   KSWAPD_NO_OPPORTUNISTIC_COMPACTION);
>  		}
>  
>  		finish_wait(&pgdat->kswapd_wait, &wait);

[Severity: High]
Does this conditional block inadvertently allow the opportunistic hint to
decay on retries?

In kswapd(), the hint is extracted and reset to
KSWAPD_UNSET_OPPORTUNISTIC_COMPACTION before calling balance_pgdat().

If the first reclaim pass fails and kswapd goes back to sleep, this code
only restores the hint if it was KSWAPD_NO_OPPORTUNISTIC_COMPACTION. 
If it was KSWAPD_OPPORTUNISTIC_COMPACTION, the state decays to UNSET.

On the subsequent retry, sc_opportunistic_compaction() will evaluate
UNSET as false, meaning a multi-pass reclaim or premature wakeup will
cause kswapd to resume full destructive reclaim for the high-order
allocation.

[ ... ]

> @@ -7537,6 +7594,28 @@ void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, int order,
>  	if (READ_ONCE(pgdat->kswapd_order) < order)
>  		WRITE_ONCE(pgdat->kswapd_order, order);
>  
> +	/*
> +	 * Fold this waker into the per-pgdat opportunistic-compaction hint
> +	 * that kswapd will pick up at the start of its next run.
> +	 *
> +	 * The state is sticky in the "NO" direction: once any waker in this
> +	 * batch is order-0 or a non-failable high-order allocation, the hint
> +	 * stays cleared until kswapd consumes it. Only when every waker so
> +	 * far is a failable high-order allocation do we set
> +	 * KSWAPD_OPPORTUNISTIC_COMPACTION, asking shrinkers to skip work
> +	 * that won't realistically help compaction.
> +	 */
> +	if (atomic_read(&pgdat->kswapd_opportunistic_compaction) !=
> +	    KSWAPD_NO_OPPORTUNISTIC_COMPACTION) {
> +		if (!order || !gfp_opportunistic_compaction(gfp_flags))
> +			atomic_set(&pgdat->kswapd_opportunistic_compaction,
> +				   KSWAPD_NO_OPPORTUNISTIC_COMPACTION);
> +		else if (order && gfp_opportunistic_compaction(gfp_flags))
> +			atomic_cmpxchg(&pgdat->kswapd_opportunistic_compaction,
> +				       KSWAPD_UNSET_OPPORTUNISTIC_COMPACTION,
> +				       KSWAPD_OPPORTUNISTIC_COMPACTION);
> +	}

[Severity: High]
Can concurrent order-0 wakers cause opportunistic high-order kswapd
reclaims to become destructive here?

Since pgdat->kswapd_order tracks the maximum requested allocation order
while the hint independently aggregates with a sticky-NO approach, what
happens if an opportunistic high-order allocation (e.g., order 9) and a
normal order-0 allocation concurrently wake kswapd?

It appears the aggregated state becomes order=9 with
KSWAPD_NO_OPPORTUNISTIC_COMPACTION. This would cause kswapd to execute
balance_pgdat() for order 9 with the non-opportunistic hint, meaning
shrinkers receive opportunistic_compaction = false and aggressively destroy
the working set to satisfy the order-9 allocation.

Because order-0 allocations (like page cache faults) are common, does this
guarantee the opportunistic compaction heuristic will be routinely bypassed
under normal memory pressure?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260617032218.1165929-1-matthew.brost@intel.com?part=1

  reply	other threads:[~2026-06-17  3:38 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-17  3:22 [PATCH v6 0/2] mm, drm/xe: Avoid reclaim/eviction loops under fragmentation Matthew Brost
2026-06-17  3:22 ` [PATCH v6 1/2] mm: Introduce opportunistic_compaction concept to vmscan and shrinkers Matthew Brost
2026-06-17  3:38   ` sashiko-bot [this message]
2026-06-17  3:22 ` [PATCH v6 2/2] drm/xe: Make use of shrink_control::opportunistic_compaction hint Matthew Brost
2026-06-17  3:28   ` sashiko-bot
2026-06-17  4:37 ` ✓ CI.KUnit: success for mm, drm/xe: Avoid reclaim/eviction loops under fragmentation Patchwork
2026-06-17  5:16 ` ✓ Xe.CI.BAT: " Patchwork
2026-06-17 10:55 ` ✓ Xe.CI.FULL: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260617033823.5A6221F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=matthew.brost@intel.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.