From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Marco Crivellari <marco.crivellari@suse.com>,
	Michal Hocko <mhocko@suse.com>,
	Muchun Song <muchun.song@linux.dev>,
	Peter Zijlstra <peterz@infradead.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>, Tejun Heo <tj@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vlastimil Babka <vbabka@suse.cz>,
	Waiman Long <longman@redhat.com>,
	cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 15/33] sched/isolation: Flush memcg workqueues on cpuset isolated partition change
Date: Fri, 29 Aug 2025 17:47:56 +0200
Message-ID: <20250829154814.47015-16-frederic@kernel.org>
In-Reply-To: <20250829154814.47015-1-frederic@kernel.org>

The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. To
synchronize against the memcg workqueue and make sure that no
asynchronous draining is still pending or executing on a newly
isolated CPU, the housekeeping subsystem must flush the memcg
workqueues.

However, the memcg work items can't be flushed easily since they are
queued to the main per-CPU workqueue pool.

Solve this by creating a memcg-specific workqueue, and provide and use
the appropriate flushing API.

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 include/linux/memcontrol.h |  4 ++++
 kernel/sched/isolation.c   |  2 ++
 kernel/sched/sched.h       |  1 +
 mm/memcontrol.c            | 12 +++++++++++-
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 785173aa0739..8b23ff000473 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1048,6 +1048,8 @@ static inline u64 cgroup_id_from_mm(struct mm_struct *mm)
 	return id;
 }
 
+void mem_cgroup_flush_workqueue(void);
+
 extern int mem_cgroup_init(void);
 #else /* CONFIG_MEMCG */
 
@@ -1453,6 +1455,8 @@ static inline u64 cgroup_id_from_mm(struct mm_struct *mm)
 	return 0;
 }
 
+static inline void mem_cgroup_flush_workqueue(void) { }
+
 static inline int mem_cgroup_init(void) { return 0; }
 #endif /* CONFIG_MEMCG */
 
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 48f3b6b20604..e85f402b103a 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -124,6 +124,8 @@ int housekeeping_update(struct cpumask *mask, enum hk_type type)
 
 	synchronize_rcu();
 
+	mem_cgroup_flush_workqueue();
+
 	kfree(old);
 
 	return 0;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index d3512138027b..1dad1ac7fc61 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -44,6 +44,7 @@
 #include <linux/lockdep_api.h>
 #include <linux/lockdep.h>
 #include <linux/memblock.h>
+#include <linux/memcontrol.h>
 #include <linux/minmax.h>
 #include <linux/mm.h>
 #include <linux/module.h>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2649d6c09160..1aa2dfa32ccd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -95,6 +95,8 @@ static bool cgroup_memory_nokmem __ro_after_init;
 /* BPF memory accounting disabled? */
 static bool cgroup_memory_nobpf __ro_after_init;
 
+static struct workqueue_struct *memcg_wq __ro_after_init;
+
 static struct kmem_cache *memcg_cachep;
 static struct kmem_cache *memcg_pn_cachep;
 
@@ -1974,7 +1976,7 @@ static void schedule_drain_work(int cpu, struct work_struct *work)
 {
 	guard(rcu)();
 	if (!cpu_is_isolated(cpu))
-		schedule_work_on(cpu, work);
+		queue_work_on(cpu, memcg_wq, work);
 }
 
 /*
@@ -5071,6 +5073,11 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 	refill_stock(memcg, nr_pages);
 }
 
+void mem_cgroup_flush_workqueue(void)
+{
+	flush_workqueue(memcg_wq);
+}
+
 static int __init cgroup_memory(char *s)
 {
 	char *token;
@@ -5113,6 +5120,9 @@ int __init mem_cgroup_init(void)
 	cpuhp_setup_state_nocalls(CPUHP_MM_MEMCQ_DEAD, "mm/memctrl:dead", NULL,
 				  memcg_hotplug_cpu_dead);
 
+	memcg_wq = alloc_workqueue("memcg", 0, 0);
+	WARN_ON(!memcg_wq);
+
 	for_each_possible_cpu(cpu) {
 		INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work,
 			  drain_local_memcg_stock);
-- 
2.51.0
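
For readers less familiar with why a dedicated workqueue makes the flush
tractable, here is a minimal userspace sketch of the idea. It is a toy
model and not kernel code: every toy_* name is hypothetical, and unlike
the kernel's flush_workqueue(), which waits for worker threads to finish
the queued items, this model simply runs the pending items inline. What
it is meant to show is only that flushing one queue touches the work
queued on that queue and nothing else.

```c
/* Userspace toy model of a dedicated workqueue; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_WQ_MAX 16

typedef void (*toy_work_fn)(void *arg);

struct toy_work {
	toy_work_fn fn;
	void *arg;
};

struct toy_wq {
	struct toy_work pending[TOY_WQ_MAX];
	size_t head;	/* next item to run */
	size_t tail;	/* next free slot */
};

/* Queue one item on the given queue; returns false when the queue is full. */
static bool toy_queue_work(struct toy_wq *wq, toy_work_fn fn, void *arg)
{
	assert(wq && fn);
	if (wq->tail == TOY_WQ_MAX)
		return false;
	wq->pending[wq->tail].fn = fn;
	wq->pending[wq->tail].arg = arg;
	wq->tail++;
	return true;
}

/*
 * Run every item pending on this queue (and only this queue), then
 * return how many items ran. The real flush_workqueue() waits for
 * workers instead of executing the items itself.
 */
static size_t toy_flush_workqueue(struct toy_wq *wq)
{
	size_t ran = 0;

	while (wq->head < wq->tail) {
		struct toy_work *w = &wq->pending[wq->head++];

		w->fn(w->arg);
		ran++;
	}
	wq->head = wq->tail = 0;
	return ran;
}

/* Stand-in for a per-CPU stock drain: empties the counter it is given. */
static void toy_drain_stock(void *arg)
{
	int *stock = arg;

	*stock = 0;
}
```

In this model, queueing toy_drain_stock() on one struct toy_wq and an
unrelated item on another, then flushing only the first, runs the drain
and leaves the unrelated item pending. That mirrors the motivation
above: with a queue of its own, memcg can be flushed without having to
wait on everything else that shares the main per-CPU pool.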

