dri-devel Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: tj@kernel.org, mkoutny@suse.com, dri-devel@lists.freedesktop.org,
	hannes@cmpxchg.org
Subject: Re: [PATCH v7 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage
Date: Fri, 03 Jul 2026 13:24:09 +0000	[thread overview]
Message-ID: <20260703132410.001CF1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260703130541.2686-4-thomas.hellstrom@linux.intel.com>

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Holding `srcu_read_lock` across a potentially unbounded reclaim loop causes `synchronize_srcu()` to hang indefinitely during region unregistration.
--

commit a6f9976e310be028f3b0c1a4f382c0d26ce5f305
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Subject: cgroup/dmem: Add reclaim callback for lowering max below current usage

This commit adds an optional reclaim callback to struct dmem_cgroup_region 
that is invoked when dmem.max is lowered. It introduces an SRCU domain to
protect reclaim callbacks against concurrent region unregistrations.

> diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
> index 373ad04d289d5..93bc9813903c8 100644
> --- a/kernel/cgroup/dmem.c
> +++ b/kernel/cgroup/dmem.c

[ ... ]

> @@ -145,21 +161,52 @@ static void free_cg_pool(struct dmem_cgroup_pool_state *pool)
>  }
>  
>  static void
> -set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val)
> +set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val, bool nonblock)
>  {
> -	page_counter_set_max(&pool->cnt, val);
> +	struct dmem_cgroup_region *region = pool->region;
> +	unsigned long limit = (unsigned long)val;
> +
> +	/* Apply the new limit immediately so concurrent allocations are throttled. */
> +	xchg(&pool->cnt.max, limit);
> +
> +	if (nonblock)
> +		return;
> +
> +	int srcu_idx = srcu_read_lock(&dmemcg_srcu);
> +
> +	if (!READ_ONCE(region->unregistered) && region->ops && region->ops->reclaim) {
> +		for (int retries = DMEM_MAX_RECLAIM_RETRIES; ; ) {
> +			u64 usage = page_counter_read(&pool->cnt);
> +			int ret;
> +
> +			if (usage <= limit)
> +				break;
> +
> +			if (signal_pending(current))
> +				break;
> +
> +			ret = region->ops->reclaim(pool, usage - limit, region->reclaim_priv);
> +
> +			/* -ENOSPC means no progress; other errors are fatal. */
> +			if (ret && (ret != -ENOSPC || !retries--))
> +				break;
> +
> +			cond_resched();
> +		}

[Severity: High]
Can this loop cause synchronize_srcu() in dmem_cgroup_unregister_region() to
hang indefinitely?

If the driver's reclaim() callback repeatedly returns 0 (indicating some
progress) but concurrent allocations keep usage above the limit, this loop
could run for a very long time.

Because srcu_read_lock() is held outside the loop, a concurrent call to
dmem_cgroup_unregister_region() would block in synchronize_srcu() waiting
for this reader to finish.

Would it be safer to scope the SRCU read lock inside the loop iteration?
This would allow synchronize_srcu() to proceed and the loop could safely abort
when it sees region->unregistered become true.

> +	}
> +	srcu_read_unlock(&dmemcg_srcu, srcu_idx);
>  }

[ ... ]

> @@ -488,9 +539,11 @@ void dmem_cgroup_unregister_region(struct dmem_cgroup_region *region)
>  	 * no new pools should be added to the dead region
>  	 * by get_cg_pool_unlocked.
>  	 */
> -	region->unregistered = true;
> +	WRITE_ONCE(region->unregistered, true);
>  	spin_unlock(&dmemcg_lock);
>  
> +	synchronize_srcu(&dmemcg_srcu);
> +
>  	kref_put(&region->ref, dmemcg_free_region);
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260703130541.2686-1-thomas.hellstrom@linux.intel.com?part=3

  reply	other threads:[~2026-07-03 13:24 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-03 13:05 [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thomas Hellström
2026-07-03 13:05 ` [PATCH v7 1/6] drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init() Thomas Hellström
2026-07-03 13:08   ` Christian König
2026-07-03 13:11     ` Thomas Hellström
2026-07-03 13:26   ` sashiko-bot
2026-07-03 13:05 ` [PATCH v7 2/6] cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization Thomas Hellström
2026-07-03 13:05 ` [PATCH v7 3/6] cgroup/dmem: Add reclaim callback for lowering max below current usage Thomas Hellström
2026-07-03 13:24   ` sashiko-bot [this message]
2026-07-03 13:05 ` [PATCH v7 4/6] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller Thomas Hellström
2026-07-03 13:19   ` sashiko-bot
2026-07-03 13:05 ` [PATCH v7 5/6] drm/xe: Wire up dmem cgroup reclaim for VRAM manager Thomas Hellström
2026-07-03 13:05 ` [PATCH v7 6/6] drm/amdgpu: " Thomas Hellström
2026-07-03 13:25   ` sashiko-bot
2026-07-03 14:37 ` [PATCH v7 0/6] Add reclaim to the dmem cgroup controller Thadeu Lima de Souza Cascardo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260703132410.001CF1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hannes@cmpxchg.org \
    --cc=mkoutny@suse.com \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=thomas.hellstrom@linux.intel.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox