public inbox for cgroups@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection
@ 2025-01-14 15:39 Friedrich Vock
  2025-01-14 15:58 ` Michal Koutný
  2025-01-27 15:27 ` [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre Friedrich Vock
  0 siblings, 2 replies; 14+ messages in thread
From: Friedrich Vock @ 2025-01-14 15:39 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Simona Vetter,
	David Airlie
  Cc: Maarten Lankhorst, Maxime Ripard, dri-devel, cgroups

If the current css doesn't contain any pool that is a descendant of
the "pool" (i.e. when found_descendant == false), then "pool" will
point to some unrelated pool. If the current css has a child, we'll
overwrite parent_pool with this unrelated pool on the next iteration.

Fix this by overwriting "pool" only if it actually is a descendant of
parent_pool, and setting it to NULL otherwise. Also, skip traversing
subtrees if pool == NULL to avoid overwriting parent_pool (and because
it's pointless).

Fixes: b168ed458 ("kernel/cgroup: Add "dmem" memory accounting cgroup")
Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
---
 kernel/cgroup/dmem.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 52736ef0ccf25..10d37df5d50f6 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -222,8 +222,7 @@ dmem_cgroup_calculate_protection(struct dmem_cgroup_pool_state *limit_pool,
 	struct page_counter *climit;
 	struct cgroup_subsys_state *css, *next_css;
 	struct dmemcg_state *dmemcg_iter;
-	struct dmem_cgroup_pool_state *pool, *parent_pool;
-	bool found_descendant;
+	struct dmem_cgroup_pool_state *pool, *candidate_pool, *parent_pool;

 	climit = &limit_pool->cnt;

@@ -241,7 +240,13 @@ dmem_cgroup_calculate_protection(struct dmem_cgroup_pool_state *limit_pool,
 	 */
 	while (pool != test_pool) {
 		next_css = css_next_child(NULL, css);
-		if (next_css) {
+		/*
+		 * pool is NULL when the current css does not contain a
+		 * pool of the type we're interested in. In that case, it's
+		 * impossible that any child css contains a relevant pool, so
+		 * skip the subtree entirely and move on to the next sibling.
+		 */
+		if (next_css && pool) {
 			parent_pool = pool;
 		} else {
 			while (css != &limit_pool->cs->css) {
@@ -260,16 +265,16 @@ dmem_cgroup_calculate_protection(struct dmem_cgroup_pool_state *limit_pool,
 		}
 		css = next_css;

-		found_descendant = false;
 		dmemcg_iter = container_of(css, struct dmemcg_state, css);

-		list_for_each_entry_rcu(pool, &dmemcg_iter->pools, css_node) {
-			if (pool_parent(pool) == parent_pool) {
-				found_descendant = true;
+		pool = NULL;
+		list_for_each_entry_rcu(candidate_pool, &dmemcg_iter->pools, css_node) {
+			if (pool_parent(candidate_pool) == parent_pool) {
+				pool = candidate_pool;
 				break;
 			}
 		}
-		if (!found_descendant)
+		if (!pool)
 			continue;

 		page_counter_calculate_protection(
--
2.48.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection
  2025-01-14 15:39 [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection Friedrich Vock
@ 2025-01-14 15:58 ` Michal Koutný
  2025-01-16  8:20   ` Friedrich Vock
  2025-01-27 15:27 ` [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre Friedrich Vock
  1 sibling, 1 reply; 14+ messages in thread
From: Michal Koutný @ 2025-01-14 15:58 UTC (permalink / raw)
  To: Friedrich Vock
  Cc: Tejun Heo, Johannes Weiner, Simona Vetter, David Airlie,
	Maarten Lankhorst, Maxime Ripard, dri-devel, cgroups

[-- Attachment #1: Type: text/plain, Size: 640 bytes --]

On Tue, Jan 14, 2025 at 04:39:12PM +0100, Friedrich Vock <friedrich.vock@gmx.de> wrote:
> If the current css doesn't contain any pool that is a descendant of
> the "pool" (i.e. when found_descendant == false), then "pool" will
> point to some unrelated pool. If the current css has a child, we'll
> overwrite parent_pool with this unrelated pool on the next iteration.

Could this be verified with more idiomatic way with
cgroup_is_descendant()? (The predicate could be used between pools [1]
if they pin respective cgroups).

Thanks,
Michal

[1] https://lore.kernel.org/all/uj6railxyazpu6ocl2ygo6lw4lavbsgg26oq57pxxqe5uzxw42@fhnqvq3tia6n/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection
  2025-01-14 15:58 ` Michal Koutný
@ 2025-01-16  8:20   ` Friedrich Vock
  2025-01-17 17:29     ` Michal Koutný
  0 siblings, 1 reply; 14+ messages in thread
From: Friedrich Vock @ 2025-01-16  8:20 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Tejun Heo, Johannes Weiner, Simona Vetter, David Airlie,
	Maarten Lankhorst, Maxime Ripard, dri-devel, cgroups

Hi,

On 14.01.25 16:58, Michal Koutný wrote:
> On Tue, Jan 14, 2025 at 04:39:12PM +0100, Friedrich Vock <friedrich.vock@gmx.de> wrote:
>> If the current css doesn't contain any pool that is a descendant of
>> the "pool" (i.e. when found_descendant == false), then "pool" will
>> point to some unrelated pool. If the current css has a child, we'll
>> overwrite parent_pool with this unrelated pool on the next iteration.
>
> Could this be verified with more idiomatic way with
> cgroup_is_descendant()? (The predicate could be used between pools [1]
> if they pin respective cgroups).

I'm not sure if I'm missing something, but I don't think
cgroup_is_descendant is really related to this issue. Each css can
contain some amount of "pools" representing memory from different
devices (e.g. with the current DRM implementation, one pool corresponds
to VRAM of a specific GPU). These pools are allocated on-demand, so if a
cgroup has not made any allocations for a specific device, there will be
no pool corresponding to that device's memory. Pools have a hierarchy of
their own (that is, for a given cgroup's pool corresponding to some
device, the "parent pool" refers to the parent cgroup's pool
corresponding to the same device).

In dmem_cgroup_calculate_protection, we're trying to update the
protection values for the entire pool hierarchy between
limit_pool/test_pool (with the end goal of having accurate
effective-protection values for test_pool). Since pools only store
parent pointers to establish that hierarchy, to find child pools given
only the parent pool, we iterate over the pools of all child cgroups and
check if the parent pointer matches with our current "parent pool" pointer.

The bug happens when some cgroup doesn't have any pool in the hierarchy
we're iterating over (that is, we iterate over all pools but don't find
any pool whose parent matches our current "parent pool" pointer). The
cgroup itself is part of the (cgroup) hierarchy, so the result of
cgroup_is_descendant is obviously true - but because of how we allocate
pools on-demand, it's still possible there is no pool that is part of
the (pool) hierarchy we're iterating over.

Thanks,
Friedrich

>
> Thanks,
> Michal
>
> [1] https://lore.kernel.org/all/uj6railxyazpu6ocl2ygo6lw4lavbsgg26oq57pxxqe5uzxw42@fhnqvq3tia6n/


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection
  2025-01-16  8:20   ` Friedrich Vock
@ 2025-01-17 17:29     ` Michal Koutný
  2025-01-17 19:02       ` Friedrich Vock
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Koutný @ 2025-01-17 17:29 UTC (permalink / raw)
  To: Friedrich Vock
  Cc: Tejun Heo, Johannes Weiner, Simona Vetter, David Airlie,
	Maarten Lankhorst, Maxime Ripard, dri-devel, cgroups

[-- Attachment #1: Type: text/plain, Size: 1887 bytes --]

On Thu, Jan 16, 2025 at 09:20:08AM +0100, Friedrich Vock <friedrich.vock@gmx.de> wrote:
> These pools are allocated on-demand, so if a
> cgroup has not made any allocations for a specific device, there will be
> no pool corresponding to that device's memory.

Here I understand.

> Pools have a hierarchy of their own (that is, for a given cgroup's
> pool corresponding to some device, the "parent pool" refers to the
> parent cgroup's pool corresponding to the same device).
> 
> In dmem_cgroup_calculate_protection, we're trying to update the
> protection values for the entire pool hierarchy between
> limit_pool/test_pool (with the end goal of having accurate
> effective-protection values for test_pool).

If you check and bail out at start:
	if (!cgroup_is_descendant(test_pool->cs->css.cgroup, limit_pool->cs->css.cgroup))
		return;
...

> Since pools only store parent pointers to establish that hierarchy, to
> find child pools given only the parent pool, we iterate over the pools
> of all child cgroups and check if the parent pointer matches with our
> current "parent pool" pointer.
 
> The bug happens when some cgroup doesn't have any pool in the hierarchy
> we're iterating over (that is, we iterate over all pools but don't find
> any pool whose parent matches our current "parent pool" pointer).

...then the initial check ensures, you always find a pool that is
a descendant of limit_pool (at least the test_pool).
And there are pools for whole path between limit_pool and test_pool, or
am I mistaken here?

> The cgroup itself is part of the (cgroup) hierarchy, so the result of
> cgroup_is_descendant is obviously true - but because of how we
> allocate pools on-demand, it's still possible there is no pool that is
> part of the (pool) hierarchy we're iterating over.

Can there be a pool without cgroup?

Thanks,
Michal

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection
  2025-01-17 17:29     ` Michal Koutný
@ 2025-01-17 19:02       ` Friedrich Vock
  2025-01-24  9:56         ` Michal Koutný
  0 siblings, 1 reply; 14+ messages in thread
From: Friedrich Vock @ 2025-01-17 19:02 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Tejun Heo, Johannes Weiner, Simona Vetter, David Airlie,
	Maarten Lankhorst, Maxime Ripard, dri-devel, cgroups

On 17.01.25 18:29, Michal Koutný wrote:
> On Thu, Jan 16, 2025 at 09:20:08AM +0100, Friedrich Vock <friedrich.vock@gmx.de> wrote:
>> These pools are allocated on-demand, so if a
>> cgroup has not made any allocations for a specific device, there will be
>> no pool corresponding to that device's memory.
>
> Here I understand.
>
>> Pools have a hierarchy of their own (that is, for a given cgroup's
>> pool corresponding to some device, the "parent pool" refers to the
>> parent cgroup's pool corresponding to the same device).
>>
>> In dmem_cgroup_calculate_protection, we're trying to update the
>> protection values for the entire pool hierarchy between
>> limit_pool/test_pool (with the end goal of having accurate
>> effective-protection values for test_pool).
>
> If you check and bail out at start:
> 	if (!cgroup_is_descendant(test_pool->cs->css.cgroup, limit_pool->cs->css.cgroup))
> 		return;
> ...
>
>> Since pools only store parent pointers to establish that hierarchy, to
>> find child pools given only the parent pool, we iterate over the pools
>> of all child cgroups and check if the parent pointer matches with our
>> current "parent pool" pointer.
>
>> The bug happens when some cgroup doesn't have any pool in the hierarchy
>> we're iterating over (that is, we iterate over all pools but don't find
>> any pool whose parent matches our current "parent pool" pointer).
>
> ...then the initial check ensures, you always find a pool that is
> a descendant of limit_pool (at least the test_pool).
> And there are pools for whole path between limit_pool and test_pool, or
> am I mistaken here?

Yeah, there are pools for the whole path between limit_pool and
test_pool, but the issue is that we traverse the entire tree of cgroups,
and we don't always stay on the path between limit_pool and test_pool
(because we're iterating from the top down, and we don't know what the
path is in that direction - so we just traverse the whole tree until we
find test_pool).

This means that we'll sometimes end up straying off-path - and there are
no guarantees for which pools are present in the cgroups we visit there.
These cgroups are the potentially problematic ones where the issue can
happen.

Ideally we could always stay on the path between limit_pool and
test_pool, but this is hardly possible because we can only follow parent
links (so bottom-up traversal) but for accurate protection calculation
we need to traverse the path top-down.

>
>> The cgroup itself is part of the (cgroup) hierarchy, so the result of
>> cgroup_is_descendant is obviously true - but because of how we
>> allocate pools on-demand, it's still possible there is no pool that is
>> part of the (pool) hierarchy we're iterating over.
>
> Can there be a pool without cgroup?

No, each pool is always associated with exactly one cgroup. What I was
talking about was the case where a parent cgroup has pools A and B, but
its child cgroup only has a pool for A. In that case, the child cgroup
has no pool that is part of B's pool hierarchy.

Thanks,
Friedrich

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection
  2025-01-17 19:02       ` Friedrich Vock
@ 2025-01-24  9:56         ` Michal Koutný
  2025-01-26 16:38           ` Friedrich Vock
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Koutný @ 2025-01-24  9:56 UTC (permalink / raw)
  To: Friedrich Vock
  Cc: Tejun Heo, Johannes Weiner, Simona Vetter, David Airlie,
	Maarten Lankhorst, Maxime Ripard, dri-devel, cgroups

[-- Attachment #1: Type: text/plain, Size: 1652 bytes --]

On Fri, Jan 17, 2025 at 08:02:55PM +0100, Friedrich Vock <friedrich.vock@gmx.de> wrote:
> Yeah, there are pools for the whole path between limit_pool and
> test_pool, but the issue is that we traverse the entire tree of cgroups,
> and we don't always stay on the path between limit_pool and test_pool
> (because we're iterating from the top down, and we don't know what the
> path is in that direction - so we just traverse the whole tree until we
> find test_pool).
> 
> This means that we'll sometimes end up straying off-path - and there are
> no guarantees for which pools are present in the cgroups we visit there.
> These cgroups are the potentially problematic ones where the issue can
> happen.
> 
> Ideally we could always stay on the path between limit_pool and
> test_pool, but this is hardly possible because we can only follow parent
> links (so bottom-up traversal) but for accurate protection calculation
> we need to traverse the path top-down.

Aha, thanks for bearing with me.

	css_foreach_descendant_pre(css, limit_pool->cs->css) {
		dmemcg_iter = container_of(css, struct dmemcg_state, css);

		struct dmem_cgroup_pool_state *found_pool = NULL;
		list_for_each_entry_rcu(pool, &dmemcg_iter->pools, css_node) {
			if (pool->region == limit_pool->region) {
				found_pool = pool
				break;
			}
		}
		if (!found_pool)
			continue;

		page_counter_calculate_protection(
			climit, &found->cnt, true);
	}

Here I use (IMO) more idiomatic css_foreach_descendant_pre() instead and
I use the predicate based on ->region (correct?) to match pool's
devices.

Would that work as intended?

Michal

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection
  2025-01-24  9:56         ` Michal Koutný
@ 2025-01-26 16:38           ` Friedrich Vock
  0 siblings, 0 replies; 14+ messages in thread
From: Friedrich Vock @ 2025-01-26 16:38 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Tejun Heo, Johannes Weiner, Simona Vetter, David Airlie,
	Maarten Lankhorst, Maxime Ripard, dri-devel, cgroups

On 24.01.25 10:56, Michal Koutný wrote:
> On Fri, Jan 17, 2025 at 08:02:55PM +0100, Friedrich Vock <friedrich.vock@gmx.de> wrote:
>> Yeah, there are pools for the whole path between limit_pool and
>> test_pool, but the issue is that we traverse the entire tree of cgroups,
>> and we don't always stay on the path between limit_pool and test_pool
>> (because we're iterating from the top down, and we don't know what the
>> path is in that direction - so we just traverse the whole tree until we
>> find test_pool).
>>
>> This means that we'll sometimes end up straying off-path - and there are
>> no guarantees for which pools are present in the cgroups we visit there.
>> These cgroups are the potentially problematic ones where the issue can
>> happen.
>>
>> Ideally we could always stay on the path between limit_pool and
>> test_pool, but this is hardly possible because we can only follow parent
>> links (so bottom-up traversal) but for accurate protection calculation
>> we need to traverse the path top-down.
>
> Aha, thanks for bearing with me.
>
> 	css_foreach_descendant_pre(css, limit_pool->cs->css) {
> 		dmemcg_iter = container_of(css, struct dmemcg_state, css);
>
> 		struct dmem_cgroup_pool_state *found_pool = NULL;
> 		list_for_each_entry_rcu(pool, &dmemcg_iter->pools, css_node) {
> 			if (pool->region == limit_pool->region) {
> 				found_pool = pool
> 				break;
> 			}
> 		}
> 		if (!found_pool)
> 			continue;
>
> 		page_counter_calculate_protection(
> 			climit, &found->cnt, true);
> 	}
>
> Here I use (IMO) more idiomatic css_foreach_descendant_pre() instead and
> I use the predicate based on ->region (correct?) to match pool's
> devices.

Good catch with ->region! That works well indeed, I think it might not
have been a thing back when I wrote this tree traversal :>

I've applied this snippet (with a few minor edits) and taken it for a
test run too - it appears to work nicely in practice as well. I'll send
out a v2 with this approach tomorrow.

Thanks,
Friedrich

>
> Would that work as intended?
>
> Michal


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre
  2025-01-14 15:39 [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection Friedrich Vock
  2025-01-14 15:58 ` Michal Koutný
@ 2025-01-27 15:27 ` Friedrich Vock
  2025-02-12  8:40   ` Maarten Lankhorst
  2025-02-18 14:55   ` Maarten Lankhorst
  1 sibling, 2 replies; 14+ messages in thread
From: Friedrich Vock @ 2025-01-27 15:27 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Simona Vetter,
	David Airlie
  Cc: Maarten Lankhorst, Maxime Ripard, dri-devel, cgroups,
	Friedrich Vock

The current implementation has a bug: If the current css doesn't
contain any pool that is a descendant of the "pool" (i.e. when
found_descendant == false), then "pool" will point to some unrelated
pool. If the current css has a child, we'll overwrite parent_pool with
this unrelated pool on the next iteration.

Since we can just check whether a pool refers to the same region to
determine whether or not it's related, all the additional pool tracking
is unnecessary, so just switch to using css_for_each_descendant_pre for
traversal.

Fixes: b168ed458 ("kernel/cgroup: Add "dmem" memory accounting cgroup")
Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
---

v2 (Michal): Switch to the more idiomatic css_for_each_descendant_pre
instead of fixing the open-coded version

---
 kernel/cgroup/dmem.c | 50 ++++++++++----------------------------------
 1 file changed, 11 insertions(+), 39 deletions(-)

diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 52736ef0ccf2..77d9bb1c147f 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -220,60 +220,32 @@ dmem_cgroup_calculate_protection(struct dmem_cgroup_pool_state *limit_pool,
 				 struct dmem_cgroup_pool_state *test_pool)
 {
 	struct page_counter *climit;
-	struct cgroup_subsys_state *css, *next_css;
+	struct cgroup_subsys_state *css;
 	struct dmemcg_state *dmemcg_iter;
-	struct dmem_cgroup_pool_state *pool, *parent_pool;
-	bool found_descendant;
+	struct dmem_cgroup_pool_state *pool, *found_pool;

 	climit = &limit_pool->cnt;

 	rcu_read_lock();
-	parent_pool = pool = limit_pool;
-	css = &limit_pool->cs->css;

-	/*
-	 * This logic is roughly equivalent to css_foreach_descendant_pre,
-	 * except we also track the parent pool to find out which pool we need
-	 * to calculate protection values for.
-	 *
-	 * We can stop the traversal once we find test_pool among the
-	 * descendants since we don't really care about any others.
-	 */
-	while (pool != test_pool) {
-		next_css = css_next_child(NULL, css);
-		if (next_css) {
-			parent_pool = pool;
-		} else {
-			while (css != &limit_pool->cs->css) {
-				next_css = css_next_child(css, css->parent);
-				if (next_css)
-					break;
-				css = css->parent;
-				parent_pool = pool_parent(parent_pool);
-			}
-			/*
-			 * We can only hit this when test_pool is not a
-			 * descendant of limit_pool.
-			 */
-			if (WARN_ON_ONCE(css == &limit_pool->cs->css))
-				break;
-		}
-		css = next_css;
-
-		found_descendant = false;
+	css_for_each_descendant_pre(css, &limit_pool->cs->css) {
 		dmemcg_iter = container_of(css, struct dmemcg_state, css);
+		found_pool = NULL;

 		list_for_each_entry_rcu(pool, &dmemcg_iter->pools, css_node) {
-			if (pool_parent(pool) == parent_pool) {
-				found_descendant = true;
+			if (pool->region == limit_pool->region) {
+				found_pool = pool;
 				break;
 			}
 		}
-		if (!found_descendant)
+		if (!found_pool)
 			continue;

 		page_counter_calculate_protection(
-			climit, &pool->cnt, true);
+			climit, &found_pool->cnt, true);
+
+		if (found_pool == test_pool)
+			break;
 	}
 	rcu_read_unlock();
 }
--
2.48.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre
  2025-01-27 15:27 ` [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre Friedrich Vock
@ 2025-02-12  8:40   ` Maarten Lankhorst
  2025-02-18 14:55   ` Maarten Lankhorst
  1 sibling, 0 replies; 14+ messages in thread
From: Maarten Lankhorst @ 2025-02-12  8:40 UTC (permalink / raw)
  To: Friedrich Vock, Tejun Heo, Johannes Weiner, Michal Koutný,
	Simona Vetter, David Airlie
  Cc: Maxime Ripard, dri-devel, cgroups

Hey,

I was hoping someone else would look at it, but it seems not.

This patch appears to work on my system, it would be helpful if I could 
get the exact sequence failing to write a reproducer, but I don't want 
to hold up a bugfix because of it.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Should I send this through the drm-misc-fixes tree?

Cheers,
~Maarten

On 2025-01-27 16:27, Friedrich Vock wrote:
> The current implementation has a bug: If the current css doesn't
> contain any pool that is a descendant of the "pool" (i.e. when
> found_descendant == false), then "pool" will point to some unrelated
> pool. If the current css has a child, we'll overwrite parent_pool with
> this unrelated pool on the next iteration.
> 
> Since we can just check whether a pool refers to the same region to
> determine whether or not it's related, all the additional pool tracking
> is unnecessary, so just switch to using css_for_each_descendant_pre for
> traversal.
> 
> Fixes: b168ed458 ("kernel/cgroup: Add "dmem" memory accounting cgroup")
> Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
> ---
> 
> v2 (Michal): Switch to the more idiomatic css_for_each_descendant_pre
> instead of fixing the open-coded version
> 
> ---
>   kernel/cgroup/dmem.c | 50 ++++++++++----------------------------------
>   1 file changed, 11 insertions(+), 39 deletions(-)
> 
> diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
> index 52736ef0ccf2..77d9bb1c147f 100644
> --- a/kernel/cgroup/dmem.c
> +++ b/kernel/cgroup/dmem.c
> @@ -220,60 +220,32 @@ dmem_cgroup_calculate_protection(struct dmem_cgroup_pool_state *limit_pool,
>   				 struct dmem_cgroup_pool_state *test_pool)
>   {
>   	struct page_counter *climit;
> -	struct cgroup_subsys_state *css, *next_css;
> +	struct cgroup_subsys_state *css;
>   	struct dmemcg_state *dmemcg_iter;
> -	struct dmem_cgroup_pool_state *pool, *parent_pool;
> -	bool found_descendant;
> +	struct dmem_cgroup_pool_state *pool, *found_pool;
> 
>   	climit = &limit_pool->cnt;
> 
>   	rcu_read_lock();
> -	parent_pool = pool = limit_pool;
> -	css = &limit_pool->cs->css;
> 
> -	/*
> -	 * This logic is roughly equivalent to css_foreach_descendant_pre,
> -	 * except we also track the parent pool to find out which pool we need
> -	 * to calculate protection values for.
> -	 *
> -	 * We can stop the traversal once we find test_pool among the
> -	 * descendants since we don't really care about any others.
> -	 */
> -	while (pool != test_pool) {
> -		next_css = css_next_child(NULL, css);
> -		if (next_css) {
> -			parent_pool = pool;
> -		} else {
> -			while (css != &limit_pool->cs->css) {
> -				next_css = css_next_child(css, css->parent);
> -				if (next_css)
> -					break;
> -				css = css->parent;
> -				parent_pool = pool_parent(parent_pool);
> -			}
> -			/*
> -			 * We can only hit this when test_pool is not a
> -			 * descendant of limit_pool.
> -			 */
> -			if (WARN_ON_ONCE(css == &limit_pool->cs->css))
> -				break;
> -		}
> -		css = next_css;
> -
> -		found_descendant = false;
> +	css_for_each_descendant_pre(css, &limit_pool->cs->css) {
>   		dmemcg_iter = container_of(css, struct dmemcg_state, css);
> +		found_pool = NULL;
> 
>   		list_for_each_entry_rcu(pool, &dmemcg_iter->pools, css_node) {
> -			if (pool_parent(pool) == parent_pool) {
> -				found_descendant = true;
> +			if (pool->region == limit_pool->region) {
> +				found_pool = pool;
>   				break;
>   			}
>   		}
> -		if (!found_descendant)
> +		if (!found_pool)
>   			continue;
> 
>   		page_counter_calculate_protection(
> -			climit, &pool->cnt, true);
> +			climit, &found_pool->cnt, true);
> +
> +		if (found_pool == test_pool)
> +			break;
>   	}
>   	rcu_read_unlock();
>   }
> --
> 2.48.0
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre
  2025-01-27 15:27 ` [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre Friedrich Vock
  2025-02-12  8:40   ` Maarten Lankhorst
@ 2025-02-18 14:55   ` Maarten Lankhorst
  2025-02-18 18:39     ` Tejun Heo
  1 sibling, 1 reply; 14+ messages in thread
From: Maarten Lankhorst @ 2025-02-18 14:55 UTC (permalink / raw)
  To: Friedrich Vock, Tejun Heo, Johannes Weiner, Michal Koutný,
	Simona Vetter, David Airlie
  Cc: Maxime Ripard, dri-devel, cgroups

Hey,

Should this fix go through the cgroup tree?

Cheers,
~Maarten

On 2025-01-27 16:27, Friedrich Vock wrote:
> The current implementation has a bug: If the current css doesn't
> contain any pool that is a descendant of the "pool" (i.e. when
> found_descendant == false), then "pool" will point to some unrelated
> pool. If the current css has a child, we'll overwrite parent_pool with
> this unrelated pool on the next iteration.
> 
> Since we can just check whether a pool refers to the same region to
> determine whether or not it's related, all the additional pool tracking
> is unnecessary, so just switch to using css_for_each_descendant_pre for
> traversal.
> 
> Fixes: b168ed458 ("kernel/cgroup: Add "dmem" memory accounting cgroup")
> Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
> ---
> 
> v2 (Michal): Switch to the more idiomatic css_for_each_descendant_pre
> instead of fixing the open-coded version
> 
> ---
>   kernel/cgroup/dmem.c | 50 ++++++++++----------------------------------
>   1 file changed, 11 insertions(+), 39 deletions(-)
> 
> diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
> index 52736ef0ccf2..77d9bb1c147f 100644
> --- a/kernel/cgroup/dmem.c
> +++ b/kernel/cgroup/dmem.c
> @@ -220,60 +220,32 @@ dmem_cgroup_calculate_protection(struct dmem_cgroup_pool_state *limit_pool,
>   				 struct dmem_cgroup_pool_state *test_pool)
>   {
>   	struct page_counter *climit;
> -	struct cgroup_subsys_state *css, *next_css;
> +	struct cgroup_subsys_state *css;
>   	struct dmemcg_state *dmemcg_iter;
> -	struct dmem_cgroup_pool_state *pool, *parent_pool;
> -	bool found_descendant;
> +	struct dmem_cgroup_pool_state *pool, *found_pool;
> 
>   	climit = &limit_pool->cnt;
> 
>   	rcu_read_lock();
> -	parent_pool = pool = limit_pool;
> -	css = &limit_pool->cs->css;
> 
> -	/*
> -	 * This logic is roughly equivalent to css_foreach_descendant_pre,
> -	 * except we also track the parent pool to find out which pool we need
> -	 * to calculate protection values for.
> -	 *
> -	 * We can stop the traversal once we find test_pool among the
> -	 * descendants since we don't really care about any others.
> -	 */
> -	while (pool != test_pool) {
> -		next_css = css_next_child(NULL, css);
> -		if (next_css) {
> -			parent_pool = pool;
> -		} else {
> -			while (css != &limit_pool->cs->css) {
> -				next_css = css_next_child(css, css->parent);
> -				if (next_css)
> -					break;
> -				css = css->parent;
> -				parent_pool = pool_parent(parent_pool);
> -			}
> -			/*
> -			 * We can only hit this when test_pool is not a
> -			 * descendant of limit_pool.
> -			 */
> -			if (WARN_ON_ONCE(css == &limit_pool->cs->css))
> -				break;
> -		}
> -		css = next_css;
> -
> -		found_descendant = false;
> +	css_for_each_descendant_pre(css, &limit_pool->cs->css) {
>   		dmemcg_iter = container_of(css, struct dmemcg_state, css);
> +		found_pool = NULL;
> 
>   		list_for_each_entry_rcu(pool, &dmemcg_iter->pools, css_node) {
> -			if (pool_parent(pool) == parent_pool) {
> -				found_descendant = true;
> +			if (pool->region == limit_pool->region) {
> +				found_pool = pool;
>   				break;
>   			}
>   		}
> -		if (!found_descendant)
> +		if (!found_pool)
>   			continue;
> 
>   		page_counter_calculate_protection(
> -			climit, &pool->cnt, true);
> +			climit, &found_pool->cnt, true);
> +
> +		if (found_pool == test_pool)
> +			break;
>   	}
>   	rcu_read_unlock();
>   }
> --
> 2.48.0
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre
  2025-02-18 14:55   ` Maarten Lankhorst
@ 2025-02-18 18:39     ` Tejun Heo
  2025-02-19  8:53       ` Maarten Lankhorst
  2025-02-19  9:08       ` Maxime Ripard
  0 siblings, 2 replies; 14+ messages in thread
From: Tejun Heo @ 2025-02-18 18:39 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: Friedrich Vock, Johannes Weiner, Michal Koutný,
	Simona Vetter, David Airlie, Maxime Ripard, dri-devel, cgroups

Hello,

On Tue, Feb 18, 2025 at 03:55:43PM +0100, Maarten Lankhorst wrote:
> Should this fix go through the cgroup tree?

I haven't been routing any dmem patches. Might as well stick to drm tree?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre
  2025-02-18 18:39     ` Tejun Heo
@ 2025-02-19  8:53       ` Maarten Lankhorst
  2025-02-19  9:08       ` Maxime Ripard
  1 sibling, 0 replies; 14+ messages in thread
From: Maarten Lankhorst @ 2025-02-19  8:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Friedrich Vock, Johannes Weiner, Michal Koutný,
	Simona Vetter, David Airlie, Maxime Ripard, dri-devel, cgroups

Hey,

On 2025-02-18 19:39, Tejun Heo wrote:
> Hello,
> 
> On Tue, Feb 18, 2025 at 03:55:43PM +0100, Maarten Lankhorst wrote:
>> Should this fix go through the cgroup tree?
> 
> I haven't been routing any dmem patches. Might as well stick to drm tree?

Thanks, I've pushed the fix to drm-misc-fixes. It should likely enter 
v6.14-rc4, otherwise rc5. :)

Cheers,
~Maarten

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre
  2025-02-18 18:39     ` Tejun Heo
  2025-02-19  8:53       ` Maarten Lankhorst
@ 2025-02-19  9:08       ` Maxime Ripard
  2025-02-19 13:26         ` Simona Vetter
  1 sibling, 1 reply; 14+ messages in thread
From: Maxime Ripard @ 2025-02-19  9:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Maarten Lankhorst, Friedrich Vock, Johannes Weiner,
	Michal Koutný, Simona Vetter, David Airlie, dri-devel,
	cgroups

[-- Attachment #1: Type: text/plain, Size: 465 bytes --]

On Tue, Feb 18, 2025 at 08:39:58AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Tue, Feb 18, 2025 at 03:55:43PM +0100, Maarten Lankhorst wrote:
> > Should this fix go through the cgroup tree?
> 
> I haven't been routing any dmem patches. Might as well stick to drm tree?

We merged the dmem cgroup through drm because we also had driver
changes, but going forward, as far as I'm concerned, it's "your" thing,
and it really shouldn't go through drm

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre
  2025-02-19  9:08       ` Maxime Ripard
@ 2025-02-19 13:26         ` Simona Vetter
  0 siblings, 0 replies; 14+ messages in thread
From: Simona Vetter @ 2025-02-19 13:26 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Tejun Heo, Maarten Lankhorst, Friedrich Vock, Johannes Weiner,
	Michal Koutný, Simona Vetter, David Airlie, dri-devel,
	cgroups

On Wed, Feb 19, 2025 at 10:08:57AM +0100, Maxime Ripard wrote:
> On Tue, Feb 18, 2025 at 08:39:58AM -1000, Tejun Heo wrote:
> > Hello,
> > 
> > On Tue, Feb 18, 2025 at 03:55:43PM +0100, Maarten Lankhorst wrote:
> > > Should this fix go through the cgroup tree?
> > 
> > I haven't been routing any dmem patches. Might as well stick to drm tree?
> 
> We merged the dmem cgroup through drm because we also had driver
> changes, but going forward, as far as I'm concerned, it's "your" thing,
> and it really shouldn't go through drm

I guess we could also route it through drm-misc. Either way I think we
need a MAINTAINERS entry for dmem so that dri-devel gets cc'ed. And then
make a decision on which git repo should be the standard path. I think
either is fine, but at least for now it looks like most interactions are
between dmem and drm, and not between dmem and cgroups at large. And in
any case we can just ack patches for going the other path on a
case-by-case basis.

Cheers, Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-02-19 13:27 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-14 15:39 [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection Friedrich Vock
2025-01-14 15:58 ` Michal Koutný
2025-01-16  8:20   ` Friedrich Vock
2025-01-17 17:29     ` Michal Koutný
2025-01-17 19:02       ` Friedrich Vock
2025-01-24  9:56         ` Michal Koutný
2025-01-26 16:38           ` Friedrich Vock
2025-01-27 15:27 ` [PATCH v2] cgroup/dmem: Don't open-code css_for_each_descendant_pre Friedrich Vock
2025-02-12  8:40   ` Maarten Lankhorst
2025-02-18 14:55   ` Maarten Lankhorst
2025-02-18 18:39     ` Tejun Heo
2025-02-19  8:53       ` Maarten Lankhorst
2025-02-19  9:08       ` Maxime Ripard
2025-02-19 13:26         ` Simona Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox