linux-mm.kvack.org archive mirror
* [PATCH 00/33 v3] cpuset/isolation: Honour kthreads preferred affinity
@ 2025-10-13 20:31 Frederic Weisbecker
  2025-10-13 20:31 ` [PATCH 01/33] PCI: Prepare to protect against concurrent isolated cpuset change Frederic Weisbecker
                   ` (32 more replies)
  0 siblings, 33 replies; 50+ messages in thread
From: Frederic Weisbecker @ 2025-10-13 20:31 UTC
  To: LKML
  Cc: Frederic Weisbecker, David S . Miller, Danilo Krummrich,
	Johannes Weiner, Catalin Marinas, Rafael J . Wysocki, Ingo Molnar,
	Jens Axboe, linux-block, cgroups, Michal Koutny, Shakeel Butt,
	Simon Horman, Waiman Long, Phil Auld, linux-pci, Muchun Song,
	Peter Zijlstra, Eric Dumazet, Thomas Gleixner, Vlastimil Babka,
	Greg Kroah-Hartman, Marco Crivellari, Will Deacon, Roman Gushchin,
	Michal Hocko, Lai Jiangshan, linux-mm, Gabriele Monaco,
	Andrew Morton, Tejun Heo, Bjorn Helgaas, Paolo Abeni, netdev,
	linux-arm-kernel, Jakub Kicinski

Hi,

The kthread code was enhanced lately to provide an infrastructure which
manages the preferred affinity of unbound kthreads (node or custom
cpumask) against housekeeping constraints and CPU hotplug events.

One crucial missing piece is cpuset: when an isolated partition is
created, deleted, or its CPUs updated, all the unbound kthreads in the
top cpuset are made affine to _all_ the non-isolated CPUs, possibly
breaking their preferred affinity along the way.

Solve this by moving the kthreads affinity update out of cpuset and
into the consolidated kthread affinity code, so that preferred
affinities are honoured.
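
To illustrate what honouring a preferred affinity means here, below is a
minimal userspace sketch, not kernel code. CPU sets are modelled as 64-bit
masks, and the names cpu_mask_t and kthread_effective_affinity() are made up
for illustration. It assumes the rule that an unbound kthread runs on the
intersection of its preferred mask and the housekeeping mask, and falls back
to the whole housekeeping mask when that intersection is empty.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: bit N set means CPU N is in the set. */
typedef uint64_t cpu_mask_t;

/*
 * Hypothetical helper: compute where an unbound kthread may run.
 *
 * preferred:    CPUs the kthread would like to use (node or custom mask)
 * housekeeping: non-isolated CPUs, must not be empty
 */
static cpu_mask_t kthread_effective_affinity(cpu_mask_t preferred,
					     cpu_mask_t housekeeping)
{
	cpu_mask_t effective;

	/* This model requires at least one housekeeping CPU. */
	assert(housekeeping != 0);

	effective = preferred & housekeeping;

	/* Every preferred CPU is isolated: fall back to all housekeeping CPUs. */
	if (effective == 0)
		effective = housekeeping;

	return effective;
}
```

With CPUs 0-7, a kthread preferring CPUs 4-7 (0xF0) and an isolated
partition covering CPUs 6-7 (housekeeping 0x3F), this model yields CPUs 4-5
(0x30), whereas a blanket "all non-isolated CPUs" update would spread the
kthread over CPUs 0-5.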

The dispatch of the new cpumasks to workqueues and kthreads is performed
by housekeeping, as per Tejun's nice suggestion.

As a welcome side effect, HK_TYPE_DOMAIN then integrates both the set
from isolcpus= and cpuset isolated partitions. Housekeeping cpumasks are
now modifiable with specific synchronization. This is a big step toward
making nohz_full= also mutable through cpuset in the future.

Changes since v2:

* Keep static key (peterz)

* Handle PCI work flush

* Comment why RCU is held until PCI work is queued (Waiman)

* Add new tags

* Add CONFIG_LOCKDEP ifdeffery (Waiman)

* Rename workqueue_unbound_exclude_cpumask() to workqueue_unbound_housekeeping_update()
  and invert the parameter (Waiman)
  
* Fix a few changelogs that used to mention that HK_TYPE_KERNEL_NOISE
  must depend on HK_TYPE_DOMAIN. It's strongly advised but not mandatory (Waiman)
  
* Cherry-pick latest version of "cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping"
  (Waiman and Gabriele)

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	kthread/core-v3

HEAD: 4ba707cdced479592e9f461e1944b7fa6f75910f

Thanks,
	Frederic
---

Frederic Weisbecker (32):
      PCI: Prepare to protect against concurrent isolated cpuset change
      cpu: Revert "cpu/hotplug: Prevent self deadlock on CPU hot-unplug"
      memcg: Prepare to protect against concurrent isolated cpuset change
      mm: vmstat: Prepare to protect against concurrent isolated cpuset change
      sched/isolation: Save boot defined domain flags
      cpuset: Convert boot_hk_cpus to use HK_TYPE_DOMAIN_BOOT
      driver core: cpu: Convert /sys/devices/system/cpu/isolated to use HK_TYPE_DOMAIN_BOOT
      net: Keep ignoring isolated cpuset change
      block: Protect against concurrent isolated cpuset change
      cpu: Provide lockdep check for CPU hotplug lock write-held
      cpuset: Provide lockdep check for cpuset lock held
      sched/isolation: Convert housekeeping cpumasks to rcu pointers
      cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset
      sched/isolation: Flush memcg workqueues on cpuset isolated partition change
      sched/isolation: Flush vmstat workqueues on cpuset isolated partition change
      PCI: Flush PCI probe workqueue on cpuset isolated partition change
      cpuset: Propagate cpuset isolation update to workqueue through housekeeping
      cpuset: Remove cpuset_cpu_is_isolated()
      sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated()
      PCI: Remove superfluous HK_TYPE_WQ check
      kthread: Refine naming of affinity related fields
      kthread: Include unbound kthreads in the managed affinity list
      kthread: Include kthreadd to the managed affinity list
      kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management
      sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN
      sched/arm64: Move fallback task cpumask to HK_TYPE_DOMAIN
      kthread: Honour kthreads preferred affinity after cpuset changes
      kthread: Comment on the purpose and placement of kthread_affine_node() call
      kthread: Add API to update preferred affinity on kthread runtime
      kthread: Document kthread_affine_preferred()
      genirq: Correctly handle preferred kthreads affinity
      doc: Add housekeeping documentation

Gabriele Monaco (1):
      cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping

 Documentation/cpu_isolation/housekeeping.rst | 111 +++++++++++++++
 arch/arm64/kernel/cpufeature.c               |  18 ++-
 block/blk-mq.c                               |   6 +-
 drivers/base/cpu.c                           |   2 +-
 drivers/pci/pci-driver.c                     |  71 +++++++---
 include/linux/cpu.h                          |   4 +
 include/linux/cpuhplock.h                    |   1 +
 include/linux/cpuset.h                       |   8 +-
 include/linux/kthread.h                      |   2 +
 include/linux/memcontrol.h                   |   4 +
 include/linux/mmu_context.h                  |   2 +-
 include/linux/pci.h                          |   3 +
 include/linux/percpu-rwsem.h                 |   1 +
 include/linux/sched/isolation.h              |   7 +-
 include/linux/vmstat.h                       |   2 +
 include/linux/workqueue.h                    |   2 +-
 init/Kconfig                                 |   1 +
 kernel/cgroup/cpuset.c                       | 134 +++++++++++++-----
 kernel/cpu.c                                 |  42 +++---
 kernel/irq/manage.c                          |  47 ++++---
 kernel/kthread.c                             | 195 +++++++++++++++++++--------
 kernel/sched/isolation.c                     | 137 +++++++++++++++----
 kernel/sched/sched.h                         |   4 +
 kernel/workqueue.c                           |  17 ++-
 mm/memcontrol.c                              |  25 +++-
 mm/vmstat.c                                  |  15 ++-
 net/core/net-sysfs.c                         |   2 +-
 27 files changed, 647 insertions(+), 216 deletions(-)




Thread overview: 50+ messages
2025-10-13 20:31 [PATCH 00/33 v3] cpuset/isolation: Honour kthreads preferred affinity Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 01/33] PCI: Prepare to protect against concurrent isolated cpuset change Frederic Weisbecker
2025-10-14 20:53   ` Bjorn Helgaas
2025-10-13 20:31 ` [PATCH 02/33] cpu: Revert "cpu/hotplug: Prevent self deadlock on CPU hot-unplug" Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 03/33] memcg: Prepare to protect against concurrent isolated cpuset change Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 04/33] mm: vmstat: " Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 05/33] sched/isolation: Save boot defined domain flags Frederic Weisbecker
2025-10-23 15:45   ` Valentin Schneider
2025-10-13 20:31 ` [PATCH 06/33] cpuset: Convert boot_hk_cpus to use HK_TYPE_DOMAIN_BOOT Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 07/33] driver core: cpu: Convert /sys/devices/system/cpu/isolated " Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 08/33] net: Keep ignoring isolated cpuset change Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 09/33] block: Protect against concurrent " Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 10/33] cpu: Provide lockdep check for CPU hotplug lock write-held Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 11/33] cpuset: Provide lockdep check for cpuset lock held Frederic Weisbecker
2025-10-14 13:29   ` Chen Ridong
2025-10-13 20:31 ` [PATCH 12/33] sched/isolation: Convert housekeeping cpumasks to rcu pointers Frederic Weisbecker
2025-10-21  1:46   ` Chen Ridong
2025-10-21  1:57     ` Chen Ridong
2025-10-21  4:03     ` Waiman Long
2025-10-21  3:49   ` Waiman Long
2025-10-13 20:31 ` [PATCH 13/33] cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset Frederic Weisbecker
2025-10-21  4:10   ` Waiman Long
2025-10-22  1:36     ` Chen Ridong
2025-10-21 13:39   ` Waiman Long
2025-10-13 20:31 ` [PATCH 14/33] sched/isolation: Flush memcg workqueues on cpuset isolated partition change Frederic Weisbecker
2025-10-21 19:16   ` Waiman Long
2025-10-21 19:28     ` Waiman Long
2025-10-13 20:31 ` [PATCH 15/33] sched/isolation: Flush vmstat " Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 16/33] PCI: Flush PCI probe workqueue " Frederic Weisbecker
2025-10-14 20:50   ` Bjorn Helgaas
2025-10-13 20:31 ` [PATCH 17/33] cpuset: Propagate cpuset isolation update to workqueue through housekeeping Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 18/33] cpuset: Remove cpuset_cpu_is_isolated() Frederic Weisbecker
2025-10-29 18:05   ` Waiman Long
2025-10-13 20:31 ` [PATCH 19/33] sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated() Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 20/33] PCI: Remove superfluous HK_TYPE_WQ check Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 21/33] kthread: Refine naming of affinity related fields Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 22/33] kthread: Include unbound kthreads in the managed affinity list Frederic Weisbecker
2025-10-21 22:42   ` Waiman Long
2025-10-13 20:31 ` [PATCH 23/33] kthread: Include kthreadd to " Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 24/33] kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 25/33] sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 26/33] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 27/33] sched/arm64: Move fallback task cpumask to HK_TYPE_DOMAIN Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 28/33] kthread: Honour kthreads preferred affinity after cpuset changes Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 29/33] kthread: Comment on the purpose and placement of kthread_affine_node() call Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 30/33] kthread: Add API to update preferred affinity on kthread runtime Frederic Weisbecker
2025-10-14 12:35   ` Simon Horman
2025-10-13 20:31 ` [PATCH 31/33] kthread: Document kthread_affine_preferred() Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 32/33] genirq: Correctly handle preferred kthreads affinity Frederic Weisbecker
2025-10-13 20:31 ` [PATCH 33/33] doc: Add housekeeping documentation Frederic Weisbecker
