All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org, mkoutny@suse.com,
	hannes@cmpxchg.org, tj@kernel.org
Subject: Re: [PATCH v5 6/6] drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager
Date: Thu, 11 Jun 2026 14:35:26 +0000	[thread overview]
Message-ID: <20260611143527.6EB471F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260611142242.2529-7-thomas.hellstrom@linux.intel.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] The newly registered dmem cgroup region is never assigned to `mgr->cg_region`, causing a memory leak and leaving dangling reclaim callbacks that lead to a Use-After-Free.
- [High] Calling `dmem_cgroup_unregister_region()` before `ttm_resource_manager_set_used(man, false)` creates a race window where concurrent allocations can dereference a freed cgroup region.
--

commit 02aedad5c82131466a2fb9b6bba851f053487de8
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager

Register the VRAM manager with the dmem cgroup reclaim infrastructure
so that lowering dmem.max below current VRAM usage triggers TTM
eviction rather than failing with -EBUSY.

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 08f05c3aed1d5..ee98b963e84a0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -933,12 +938,15 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>  	if (err)
>  		return err;
>  
> -	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram",
> -					      &(struct dmem_cgroup_init){
> -						.size = adev->gmc.real_vram_size,
> -					      });
> -	if (IS_ERR(man->cg))
> -		return PTR_ERR(man->cg);
> +	cg = dmem_cgroup_register_region(&(struct dmem_cgroup_init){
> +					     .size = adev->gmc.real_vram_size,
> +					     .ops = &amdgpu_vram_mgr_dmem_ops,
> +					     .reclaim_priv = man,
> +					 }, "vram");
> +	if (IS_ERR(cg))
> +		return PTR_ERR(cg);
> +
> +	ttm_resource_manager_set_dmem_region(man, cg);

[Severity: Critical]
Is cg missing an assignment to mgr->cg_region here?

Because mgr->cg_region is zero-initialized and never updated,
amdgpu_vram_mgr_fini() will later pass NULL to dmem_cgroup_unregister_region().
This causes the cgroup region to leak and remain on the global list.

If cgroup reclaim is triggered after the driver unbinds and the man pointer
is freed, will the callback execute and dereference reclaim_priv, resulting
in a use-after-free?

>  	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
>  	ttm_resource_manager_set_used(man, true);

[ ... ]

> @@ -960,6 +968,15 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev)
>  	int ret;
>  	struct amdgpu_vram_reservation *rsv, *temp;
>  
> +	/*
> +	 * Drain any in-flight dmem cgroup reclaim callbacks and remove the
> +	 * region from the global list before tearing down the manager.
> +	 * This must happen first so no reclaim callback can access the
> +	 * manager after this point.
> +	 */
> +	dmem_cgroup_unregister_region(mgr->cg_region);
> +	mgr->cg_region = NULL;
> +
>  	ttm_resource_manager_set_used(man, false);

[Severity: High]
Does unregistering the cgroup region before calling
ttm_resource_manager_set_used(man, false) create a race condition with
concurrent BO allocations?

Assuming mgr->cg_region was properly set, dmem_cgroup_unregister_region()
will remove the region from global lists and drop its references. Meanwhile,
the manager remains active (man->use_type == true).

Because amdgpu_vram_mgr_fini() runs during driver unbind where concurrent
ioctls may still be executing, could a concurrent BO allocation pass the check
in ttm_bo_mem_space() and call dmem_cgroup_try_charge(man->cg, ...)?

This would access the unregistered region and cause a use-after-free. Should
ttm_resource_manager_set_used(man, false) be executed before unregistering
the region?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260611142242.2529-1-thomas.hellstrom@linux.intel.com?part=6

  reply	other threads:[~2026-06-11 14:35 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-11 14:22 [PATCH v5 0/6] [PATCH v5 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
2026-06-11 14:22 ` [PATCH v5 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
2026-06-11 14:37   ` sashiko-bot
2026-06-11 17:27     ` Thomas Hellström
2026-06-11 14:22 ` [PATCH v5 2/6] cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization Thomas Hellström
2026-06-11 14:22 ` [PATCH v5 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage Thomas Hellström
2026-06-11 14:40   ` sashiko-bot
2026-06-11 17:28     ` Thomas Hellström
2026-06-11 14:22 ` [PATCH v5 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
2026-06-11 14:22 ` [PATCH v5 5/6] drm/xe: Wire up dmem cgroup reclaim for VRAM manager Thomas Hellström
2026-06-11 14:22 ` [PATCH v5 6/6] drm/amdgpu: " Thomas Hellström
2026-06-11 14:35   ` sashiko-bot [this message]
2026-06-11 14:49 ` ✗ CI.checkpatch: warning for Add reclaim to the dmem cgroup controller (rev5) Patchwork
2026-06-11 14:51 ` ✓ CI.KUnit: success " Patchwork
2026-06-11 16:44 ` ✓ Xe.CI.BAT: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260611143527.6EB471F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hannes@cmpxchg.org \
    --cc=mkoutny@suse.com \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=thomas.hellstrom@linux.intel.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.