From: "Michal Koutný" <mkoutny@suse.com>
To: Waiman Long <longman@redhat.com>
Cc: Chen Ridong <chenridong@huaweicloud.com>,
	Tejun Heo <tj@kernel.org>,  Johannes Weiner <hannes@cmpxchg.org>,
	Peter Zijlstra <peterz@infradead.org>,
	cgroups@vger.kernel.org,  linux-kernel@vger.kernel.org,
	Aaron Tomlin <atomlin@atomlin.com>,
	 Guopeng Zhang <guopeng.zhang@linux.dev>
Subject: Re: [PATCH-next v5 6/6] cgroup/cpuset: Support multiple source/destination cpusets for cpuset_*attach()
Date: Wed, 24 Jun 2026 17:45:27 +0200
Message-ID: <ajutWBoJqkhktkvX@localhost.localdomain>
In-Reply-To: <20260602023203.248077-7-longman@redhat.com>

Hello Waiman.

On Mon, Jun 01, 2026 at 10:32:03PM -0400, Waiman Long <longman@redhat.com> wrote:
> This problem is less an issue when enabling the cpuset controller as all
> the newly created child cpusets will have exactly the same set of CPUs
> and memory nodes except when deadline tasks are involved in migration
> as the deadline task accounting data can be off.
> 
> It can be more problematic when the cpuset controller is disabled as
> their set of CPUs and memory nodes may differ from their parent or with
> the moving of multi-threaded process from different threaded cgroups.

Generalizing, this can be an issue for any threaded controller that
somehow relies on the _difference_ between old and new thread
membership.

So I checked some: pids and perf_event look alright (no
diff-dependency), but I noticed that the very same issue is tackled in
sched_change_group()/scx_cgroup_move_task() and that there is already a
member inside task_struct allocated for this state tracking:
  task_struct::scx::cgrp_moving_from
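
To illustrate the pattern only, here is a minimal userspace sketch of
per-task "moving from" tracking. Everything in it is a hypothetical
stand-in (struct group, struct task, prepare_move(), finish_move(),
cancel_move()), not kernel code; it merely assumes a controller that
keeps a per-group counter and needs each task's old group to fix it up:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; all names here are hypothetical. */
struct group {
	int nr_tasks;			/* per-group state that depends on membership */
};

struct task {
	struct group *grp;		/* current membership */
	struct group *moving_from;	/* non-NULL only while a move is in flight */
};

/* Roughly the "can_attach" step: remember each task's own source group. */
static void prepare_move(struct task *t)
{
	t->moving_from = t->grp;
}

/* Roughly the "attach" step: update state from the recorded source. */
static void finish_move(struct task *t, struct group *dst)
{
	struct group *src = t->moving_from;

	if (src && src != dst) {
		src->nr_tasks--;
		dst->nr_tasks++;
	}
	t->grp = dst;
	t->moving_from = NULL;
}

/* Roughly the "cancel_attach" step: drop the record, change nothing else. */
static void cancel_move(struct task *t)
{
	t->moving_from = NULL;
}
```

Because the source is stored per task, threads coming from different
groups in one migration are handled without any per-migration list.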

> Fix that by tracking the set of source (old) and destination cpusets
> in singly linked lists and iterating them all to properly update the
> internal data. Also keep the current cs and oldcs variables up-to-date
> with the css and task iterators.

So there would be more than a single use for something conceptually
like:

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 004e6d56a499a..740c02f220c75 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1326,6 +1326,9 @@ struct task_struct {
 #ifdef CONFIG_PREEMPT_RT
        struct llist_node               cg_dead_lnode;
 #endif /* CONFIG_PREEMPT_RT */
+#ifdef CONFIG_CGROUPS_MOVING_FROM
+       struct cgroup                   *cgrp_moving_from;
+#endif
 #endif /* CONFIG_CGROUPS */
 #ifdef CONFIG_X86_CPU_RESCTRL
        u32                             closid;
diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
index 1a3af2ea2a794..5b63afe83f333 100644
--- a/include/linux/sched/ext.h
+++ b/include/linux/sched/ext.h
@@ -240,9 +240,6 @@ struct sched_ext_entity {
        bool                    disallow;       /* reject switching into SCX */
 
        /* cold fields */
-#ifdef CONFIG_EXT_GROUP_SCHED
-       struct cgroup           *cgrp_moving_from;
-#endif
        struct list_head        tasks_node;
 };
 
diff --git a/init/Kconfig b/init/Kconfig
index 2937c4d308aec..d7e7d4477f862 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1186,6 +1186,7 @@ config EXT_GROUP_SCHED
        depends on SCHED_CLASS_EXT && CGROUP_SCHED
        select GROUP_SCHED_WEIGHT
        select GROUP_SCHED_BANDWIDTH
+       select CGROUPS_MOVING_FROM
        default y
 
 endif #CGROUP_SCHED
@@ -1288,6 +1289,7 @@ config CPUSETS
        depends on SMP
        select UNION_FIND
        select CPU_ISOLATION
+       select CGROUPS_MOVING_FROM
        help
          This option will let you create and manage CPUSETs which
          allow dynamically partitioning a system into sets of CPUs and

I think this could simplify the before-after state tracking generally,
WDYT?

Michal
