From: Matthew Brost <matthew.brost@intel.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
<intel-xe@lists.freedesktop.org>,
<dri-devel@lists.freedesktop.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH v2 1/5] mm: Introduce zone_appears_fragmented()
Date: Thu, 30 Apr 2026 09:34:48 -0700
Message-ID: <afOEqOZVWi/cNf9g@gsse-cloud1.jf.intel.com>
In-Reply-To: <f4324084f3808b9a42b87ec27bef7fe920dde9ea.camel@linux.intel.com>

On Thu, Apr 30, 2026 at 09:47:37AM +0200, Thomas Hellström wrote:
> On Wed, 2026-04-29 at 19:47 -0700, Matthew Brost wrote:
> > On Fri, Apr 24, 2026 at 09:26:18AM +0200, David Hildenbrand (Arm)
> > wrote:
> > > On 4/24/26 09:05, Thomas Hellström wrote:
> > > > On Thu, 2026-04-23 at 15:21 -0700, Matthew Brost wrote:
> > > > > On Thu, Apr 23, 2026 at 12:08:36PM -0700, Matthew Brost wrote:
> > > > > >
> > > > > > If the order were included in shrink_control, there is about
> > > > > > a 95%
> > > > > > certain that this change would allow TTM / Xe to break the
> > > > > > problematic
> > > > > > kswapd feedback loop. This may also better express the intent
> > > > > > of
> > > > > > the
> > > > > > problem we are trying to fix here.
> > > > > >
> > > > > > For reference, the cover letter [1] details the problem.
> > > > > >
> > > > > > Any guidance from the core MM folks would be
> > > > > > appreciated—would
> > > > > > adding
> > > > > > the order to shrink_control be an acceptable solution?
> > > > > >
> > > > > > Matt
> > > > > >
> > > > > > [1] https://patchwork.freedesktop.org/series/165330/
> > > > > >
> > > > >
> > > > > It doesn't look like __GFP_NORETRY, __GFP_RETRY_MAYFAIL,
> > > > > __GFP_NOFAIL
> > > > > make it to the sc->gfp_mask flags from the caller and get into
> > > > > kswapd
> > > > > loop...
> > > >
> > > > Perhaps that's because they mostly (only?) make sense from direct
> > > > reclaim? Looks like the trace is from kswapd.
> > >
> > > kswap obtains the desired order through pgdat->kswapd_order, as a
> > > hint from
> > > allocation code (wakeup_kswapd). The order can be easily merged
> > > (just use the max)
> > >
> >
> > Yes.
> >
> > My current thinking is wire the order into shrink_control as that is
> > quite straight forward + only call this helper + short circuit
> > shrinker
> > on higher orders.
> >
> > > We do have the gfp_flags there, but merging them from different
> > > wakeups is a bit
> > > more tricky (and when to reset?).
> > >
> > > Assume we have one urgent request for order-0 and one non-urgent
> > > (noretry,nofail, ...) request for order-9, we'd have to figure out
> > > a way how to
> > > represent that. Gets more complicated for more orders.
> > >
> > > Of course, we could have some kind of array, and try to store some
> > > "priority"
> > > per order. But I assume plumbing that into the rest of kswapd might
> > > not be that
> > > easy.
> >
> > Yes, this seems non-trivial. I was also on a call with Google today
> > discussing what Android (client Linux) would like from shrinking, and
> > my
> > initial feeling is that we will need to do some surgery to the
> > shrinker
> > core and GPU shrinkers to make all of this work well over the next
> > year
> > or so.
> >
> > So again, I think starting with wiring order into shrink_control and
> > this helper is a good place to start, as it fixes an immediate issue.
> >
> > Let me know if that seems like a reasonable direction.
>
> +1 for wiring order into shrink_control, and possibly also the priority
> as mentioned in an earlier email.
>

Let me look at how the priority field is used as well.

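To make the "wire order into shrink_control and short-circuit on higher orders" idea concrete, here is a minimal userspace sketch. It is not taken from the patch series: struct mock_shrink_control, merge_wakeup_order(), mock_shrinker_should_skip(), mock_scan_objects() and MOCK_SHRINK_STOP are all hypothetical names, and the zone_fragmented flag stands in for whatever zone_appears_fragmented() would report. The order merge simply takes the max across wakeups, as suggested earlier in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "do not call this shrinker again this pass". */
#define MOCK_SHRINK_STOP (~0UL)

/*
 * Hypothetical stand-in for struct shrink_control. Only the fields needed
 * for this illustration are modelled; "order" is the assumed new field.
 */
struct mock_shrink_control {
	unsigned int order;        /* allocation order that triggered reclaim */
	unsigned long nr_to_scan;  /* how many objects the core asks us to scan */
};

/* Merge order hints from several wakeups by keeping the largest one. */
static unsigned int merge_wakeup_order(unsigned int cur, unsigned int requested)
{
	return requested > cur ? requested : cur;
}

/*
 * Assumed policy: if reclaim was triggered by a higher-order allocation and
 * the zone looks fragmented rather than short on memory, evicting driver
 * pages is unlikely to help, so the shrinker backs off.
 */
static bool mock_shrinker_should_skip(const struct mock_shrink_control *sc,
				      bool zone_fragmented)
{
	return sc->order > 0 && zone_fragmented;
}

/* Returns the number of objects "freed", or MOCK_SHRINK_STOP when skipping. */
static unsigned long mock_scan_objects(const struct mock_shrink_control *sc,
				       bool zone_fragmented,
				       unsigned long nr_reclaimable)
{
	if (mock_shrinker_should_skip(sc, zone_fragmented))
		return MOCK_SHRINK_STOP;

	/* Pretend every scanned object is freed, capped by what is available. */
	return sc->nr_to_scan < nr_reclaimable ? sc->nr_to_scan : nr_reclaimable;
}
```

With this shape, an order-0 request still reclaims normally even when the zone looks fragmented, while an order-9 request in the same state makes the shrinker return early. How priority and GFP flags would feed into the skip decision is left open here.
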
> However for cgroups-aware shrinkers, The number of free memory in a
> zone might not be an indication of fragmentation-triggered reclaim at
> all, it could be the result of the cgroup hitting its memory limits.
>

I agree that for cgroups what is in place here is not sufficient. Based
on Google's feedback, everything in Android user space is assigned a
cgroup, so we will quickly need a cgroup story.

> So I think if we can solve this with a combination of GFP flags,
> plumbed-through order and plumbed-through priority, that would be
> ideal.

That is an idea. The other thing that came up is that the TTM LRU doesn't
understand the relative hotness of its pages compared to other shrinkers'
LRUs (e.g., core pages), so our TTM shrinker may be evicting hot GPU pages
while cold non-GPU pages could be evicted instead, which would create less
stress on the system. Perhaps priority / GFP flags will help here?

Matt

>
> Thanks,
> Thomas
>
> >
> > Matt
> >
> > >
> > >
> > > --
> > > Cheers,
> > >
> > > David
2026-04-23 5:56 [PATCH v2 0/5] mm, drm/ttm, drm/xe: Avoid reclaim/eviction loops under fragmentation Matthew Brost
2026-04-23 5:56 ` [PATCH v2 1/5] mm: Introduce zone_appears_fragmented() Matthew Brost
2026-04-23 6:04 ` Balbir Singh
2026-04-23 6:16 ` Matthew Brost
2026-04-23 6:27 ` Matthew Brost
2026-04-23 10:27 ` David Hildenbrand (Arm)
2026-04-23 11:27 ` Thomas Hellström
2026-04-23 19:08 ` Matthew Brost
2026-04-23 22:21 ` Matthew Brost
2026-04-24 7:05 ` Thomas Hellström
2026-04-24 7:26 ` David Hildenbrand (Arm)
2026-04-30 2:47 ` Matthew Brost
2026-04-30 7:47 ` Thomas Hellström
2026-04-30 16:34 ` Matthew Brost [this message]
2026-04-30 19:59 ` Thomas Hellström
2026-04-30 17:06 ` Vlastimil Babka (SUSE)
2026-04-28 9:51 ` Andi Shyti
2026-04-28 10:05 ` Andi Shyti
2026-04-30 2:34 ` Matthew Brost
2026-04-30 2:37 ` Matthew Brost