From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Chen Ridong <chenridong@huaweicloud.com>, intel-xe@lists.freedesktop.org
Cc: "Natalie Vock" <natalie.vock@gmx.de>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Tejun Heo" <tj@kernel.org>, "Michal Koutný" <mkoutny@suse.com>,
cgroups@vger.kernel.org, "Huang Rui" <ray.huang@amd.com>,
"Matthew Brost" <matthew.brost@intel.com>,
"Matthew Auld" <matthew.auld@intel.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"Simona Vetter" <simona@ffwll.ch>,
"David Airlie" <airlied@gmail.com>,
"Christian König" <christian.koenig@amd.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/5] cgroup/dmem: Add reclaim callback for lowering max below current usage
Date: Tue, 28 Apr 2026 09:02:26 +0200 [thread overview]
Message-ID: <4ab1f9215e06ebd730fcff6aeba06c753b105667.camel@linux.intel.com> (raw)
In-Reply-To: <220aadd9-0d92-44ce-8a70-bd30030defa9@huaweicloud.com>
On Sat, 2026-04-25 at 14:42 +0800, Chen Ridong wrote:
>
>
> On 2026/3/27 16:15, Thomas Hellström wrote:
> > Add an optional reclaim callback to struct dmem_cgroup_region. When
> > dmem.max is set below current usage, invoke the callback to evict
> > memory and retry setting the limit rather than failing immediately.
> > Signal interruptions propagate back to the write() caller.
> >
> > RFC:
> > Due to us updating the max limit _after_ the usage has been
> > sufficiently lowered, this should be prone to failures if there are
> > aggressive allocators running in parallel to the reclaim.
> > So can we somehow enforce the new limit while the eviction is
> > happening?
> >
> > Assisted-by: GitHub Copilot:claude-sonnet-4.6
> > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > ---
> >  include/linux/cgroup_dmem.h | 11 +++++
> >  kernel/cgroup/dmem.c        | 94 +++++++++++++++++++++++++++++++++----
> >  2 files changed, 96 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
> > index dd4869f1d736..61520a431740 100644
> > --- a/include/linux/cgroup_dmem.h
> > +++ b/include/linux/cgroup_dmem.h
> > @@ -26,6 +26,10 @@ bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
> >  				      bool ignore_low, bool *ret_hit_low);
> >  
> >  void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool);
> > +void dmem_cgroup_region_set_reclaim(struct dmem_cgroup_region *region,
> > +				    int (*reclaim)(struct dmem_cgroup_pool_state *pool,
> > +						   u64 target_bytes, void *priv),
> > +				    void *priv);
> > #else
> > static inline __printf(2,3) struct dmem_cgroup_region *
> > dmem_cgroup_register_region(u64 size, const char *name_fmt, ...)
> > @@ -62,5 +66,12 @@ bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
> >  static inline void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool)
> >  { }
> >
> > +static inline void
> > +dmem_cgroup_region_set_reclaim(struct dmem_cgroup_region *region,
> > +			       int (*reclaim)(struct dmem_cgroup_pool_state *pool,
> > +					      u64 target_bytes, void *priv),
> > +			       void *priv)
> > +{ }
> > +
> > #endif
> > #endif /* _CGROUP_DMEM_H */
> > diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
> > index 3e6d4c0b26a1..f993fb058b74 100644
> > --- a/kernel/cgroup/dmem.c
> > +++ b/kernel/cgroup/dmem.c
> > @@ -51,6 +51,18 @@ struct dmem_cgroup_region {
> >  	 * No new pools should be added to the region afterwards.
> >  	 */
> >  	bool unregistered;
> > +
> > +	/**
> > +	 * @reclaim: Optional callback invoked when dmem.max is set below the
> > +	 * current usage of a pool. The driver should attempt to free at least
> > +	 * @target_bytes from @pool. May be called multiple times if usage
> > +	 * remains above the limit after returning.
> > +	 */
> > +	int (*reclaim)(struct dmem_cgroup_pool_state *pool, u64 target_bytes,
> > +		       void *priv);
> > +
> > +	/** @reclaim_priv: Private data passed to @reclaim. */
> > +	void *reclaim_priv;
> >  };
> >
> > struct dmemcg_state {
> > @@ -145,23 +157,59 @@ static void free_cg_pool(struct dmem_cgroup_pool_state *pool)
> > }
> >
> >  static int
> > -set_resource_min(struct dmem_cgroup_pool_state *pool, u64 val)
> > +set_resource_min(struct dmem_cgroup_pool_state *pool, u64 val,
> > +		 struct dmem_cgroup_region *region)
> >  {
> >  	page_counter_set_min(&pool->cnt, val);
> >  	return 0;
> >  }
> > 
> >  static int
> > -set_resource_low(struct dmem_cgroup_pool_state *pool, u64 val)
> > +set_resource_low(struct dmem_cgroup_pool_state *pool, u64 val,
> > +		 struct dmem_cgroup_region *region)
> >  {
> >  	page_counter_set_low(&pool->cnt, val);
> >  	return 0;
> >  }
> > 
> >  static int
> > -set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val)
> > +set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val,
>
> Though we are discussing how to set the maximum, renaming 'val' to
> 'max' would improve readability in the next version.
Since all the set_resource_xxx() functions use @val, that'd be an
unrelated change. Possibly in a follow-up patch?
Thanks,
Thomas
Thread overview: 20+ messages
2026-03-27 8:15 [PATCH 0/5] Add reclaim to the dmem cgroup controller Thomas Hellström
2026-03-27 8:15 ` [PATCH 1/5] cgroup/dmem: Return error when setting max below current usage Thomas Hellström
2026-03-27 8:15 ` [PATCH 2/5] cgroup/dmem: Add reclaim callback for lowering " Thomas Hellström
2026-04-22 8:31 ` Maarten Lankhorst
2026-04-22 8:42 ` Thomas Hellström
2026-04-22 9:50 ` Maarten Lankhorst
2026-04-22 10:20 ` Thomas Hellström
2026-04-22 10:29 ` Maarten Lankhorst
2026-04-22 10:36 ` Thomas Hellström
2026-04-24 16:33 ` Michal Koutný
2026-04-25 6:42 ` Chen Ridong
2026-04-28 7:02 ` Thomas Hellström [this message]
2026-03-27 8:15 ` [PATCH 3/5] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
2026-03-27 8:15 ` [PATCH 4/5] drm/xe: Wire up dmem cgroup reclaim for VRAM manager Thomas Hellström
2026-03-27 8:16 ` [PATCH 5/5] drm/amdgpu: " Thomas Hellström
2026-03-27 8:26 ` ✗ CI.checkpatch: warning for Add reclaim to the dmem cgroup controller Patchwork
2026-03-27 8:27 ` ✓ CI.KUnit: success " Patchwork
2026-03-27 8:33 ` [PATCH 0/5] " Thomas Hellström
2026-03-27 9:01 ` ✓ Xe.CI.BAT: success for " Patchwork
2026-03-27 23:55 ` ✓ Xe.CI.FULL: " Patchwork