* [RFC PATCH 2/6] workqueue: Decouple HK_FLAG_WQ and HK_FLAG_DOMAIN cpumask fetch
[not found] ` <20210714135420.69624-1-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2021-07-14 13:54 ` Frederic Weisbecker
2021-07-14 13:54 ` [RFC PATCH 4/6] sched/isolation: Split domain housekeeping mask from the rest Frederic Weisbecker
` (3 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2021-07-14 13:54 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, Juri Lelli,
Alex Belits, Nitesh Lal, Thomas Gleixner, Nicolas Saenz,
Christoph Lameter, Marcelo Tosatti, Zefan Li,
cgroups-u79uwXL29TY76Z2rM5mHXA
To prepare for supporting each feature of the housekeeping cpumask
through cpuset, get ready for HK_FLAG_DOMAIN to move to its own cpumask.
This will make it possible to modify the set passed through the
"isolcpus=" kernel boot parameter at runtime.
Signed-off-by: Frederic Weisbecker <frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
Cc: Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Marcelo Tosatti <mtosatti-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Nitesh Lal <nilal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Nicolas Saenz <nsaenzju-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Christoph Lameter <cl-LoxgEY9JZOazQB+pC5nmwQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Zefan Li <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>
Cc: Alex Belits <abelits-eYqpPyKDWXRBDgjK7y7TUQ@public.gmane.org>
---
kernel/workqueue.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 50142fc08902..d29c5b61a333 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5938,13 +5938,13 @@ static void __init wq_numa_init(void)
void __init workqueue_init_early(void)
{
int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL };
- int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
int i, cpu;
BUILD_BUG_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
- cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(hk_flags));
+ cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(HK_FLAG_WQ));
+ cpumask_and(wq_unbound_cpumask, wq_unbound_cpumask, housekeeping_cpumask(HK_FLAG_DOMAIN));
pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC);
--
2.25.1
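The split fetch this patch introduces can be sketched outside the kernel. Below is a minimal model, with cpumasks reduced to plain 64-bit bitmaps; all names here are hypothetical stand-ins, not kernel APIs:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the mask computation in workqueue_init_early() after this
 * patch. Before: one combined housekeeping_cpumask(HK_FLAG_DOMAIN |
 * HK_FLAG_WQ) lookup. After: each flag's mask is fetched on its own and
 * the results are intersected, which keeps working once HK_FLAG_DOMAIN
 * moves to a separate cpumask in a later patch of the series.
 */
static uint64_t wq_unbound_mask(uint64_t hk_wq, uint64_t hk_domain)
{
	/* cpumask_copy() followed by cpumask_and() in the real code */
	return hk_wq & hk_domain;
}
```

With 8 CPUs, "isolcpus=7" (domain mask 0x7f) and no "nohz_full=" (wq mask 0xff), the unbound workqueue mask comes out as 0x7f, i.e. CPU 7 excluded.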
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [RFC PATCH 4/6] sched/isolation: Split domain housekeeping mask from the rest
[not found] ` <20210714135420.69624-1-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2021-07-14 13:54 ` [RFC PATCH 2/6] workqueue: " Frederic Weisbecker
@ 2021-07-14 13:54 ` Frederic Weisbecker
2021-07-14 13:54 ` [RFC PATCH 5/6] sched/isolation: Make HK_FLAG_DOMAIN mutable Frederic Weisbecker
` (2 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2021-07-14 13:54 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, Juri Lelli,
Alex Belits, Nitesh Lal, Thomas Gleixner, Nicolas Saenz,
Christoph Lameter, Marcelo Tosatti, Zefan Li,
cgroups-u79uwXL29TY76Z2rM5mHXA
To prepare for supporting each feature of the housekeeping cpumask
through cpuset, move HK_FLAG_DOMAIN to its own cpumask. This will make
it possible to modify the set passed through the "isolcpus=" kernel boot
parameter at runtime.
Signed-off-by: Frederic Weisbecker <frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
Cc: Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Marcelo Tosatti <mtosatti-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Nitesh Lal <nilal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Nicolas Saenz <nsaenzju-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Christoph Lameter <cl-LoxgEY9JZOazQB+pC5nmwQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Zefan Li <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>
Cc: Alex Belits <abelits-eYqpPyKDWXRBDgjK7y7TUQ@public.gmane.org>
---
kernel/sched/isolation.c | 54 +++++++++++++++++++++++++++++++++-------
1 file changed, 45 insertions(+), 9 deletions(-)
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 7f06eaf12818..c2bdf7e6dc39 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -12,6 +12,7 @@
DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
EXPORT_SYMBOL_GPL(housekeeping_overridden);
static cpumask_var_t housekeeping_mask;
+static cpumask_var_t hk_domain_mask;
static unsigned int housekeeping_flags;
bool housekeeping_enabled(enum hk_flags flags)
@@ -26,7 +27,7 @@ int housekeeping_any_cpu(enum hk_flags flags)
if (static_branch_unlikely(&housekeeping_overridden)) {
if (housekeeping_flags & flags) {
- cpu = sched_numa_find_closest(housekeeping_mask, smp_processor_id());
+ cpu = sched_numa_find_closest(housekeeping_cpumask(flags), smp_processor_id());
if (cpu < nr_cpu_ids)
return cpu;
@@ -39,9 +40,13 @@ EXPORT_SYMBOL_GPL(housekeeping_any_cpu);
const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
{
- if (static_branch_unlikely(&housekeeping_overridden))
+ if (static_branch_unlikely(&housekeeping_overridden)) {
+ WARN_ON_ONCE((flags & HK_FLAG_DOMAIN) && (flags & ~HK_FLAG_DOMAIN));
+ if (housekeeping_flags & HK_FLAG_DOMAIN)
+ return hk_domain_mask;
if (housekeeping_flags & flags)
return housekeeping_mask;
+ }
return cpu_possible_mask;
}
EXPORT_SYMBOL_GPL(housekeeping_cpumask);
@@ -50,7 +55,7 @@ void housekeeping_affine(struct task_struct *t, enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overridden))
if (housekeeping_flags & flags)
- set_cpus_allowed_ptr(t, housekeeping_mask);
+ set_cpus_allowed_ptr(t, housekeeping_cpumask(flags));
}
EXPORT_SYMBOL_GPL(housekeeping_affine);
@@ -58,11 +63,13 @@ bool housekeeping_test_cpu(int cpu, enum hk_flags flags)
{
if (static_branch_unlikely(&housekeeping_overridden))
if (housekeeping_flags & flags)
- return cpumask_test_cpu(cpu, housekeeping_mask);
+ return cpumask_test_cpu(cpu, housekeeping_cpumask(flags));
return true;
}
EXPORT_SYMBOL_GPL(housekeeping_test_cpu);
+
+
void __init housekeeping_init(void)
{
if (!housekeeping_flags)
@@ -91,28 +98,57 @@ static int __init housekeeping_setup(char *str, enum hk_flags flags)
alloc_bootmem_cpumask_var(&tmp);
if (!housekeeping_flags) {
- alloc_bootmem_cpumask_var(&housekeeping_mask);
- cpumask_andnot(housekeeping_mask,
- cpu_possible_mask, non_housekeeping_mask);
+ if (flags & ~HK_FLAG_DOMAIN) {
+ alloc_bootmem_cpumask_var(&housekeeping_mask);
+ cpumask_andnot(housekeeping_mask,
+ cpu_possible_mask, non_housekeeping_mask);
+ }
+
+ if (flags & HK_FLAG_DOMAIN) {
+ alloc_bootmem_cpumask_var(&hk_domain_mask);
+ cpumask_andnot(hk_domain_mask,
+ cpu_possible_mask, non_housekeeping_mask);
+ }
cpumask_andnot(tmp, cpu_present_mask, non_housekeeping_mask);
if (cpumask_empty(tmp)) {
pr_warn("Housekeeping: must include one present CPU, "
"using boot CPU:%d\n", smp_processor_id());
- __cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
+ if (flags & ~HK_FLAG_DOMAIN)
+ __cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
+ if (flags & HK_FLAG_DOMAIN)
+ __cpumask_set_cpu(smp_processor_id(), hk_domain_mask);
__cpumask_clear_cpu(smp_processor_id(), non_housekeeping_mask);
}
} else {
+ struct cpumask *prev;
+
cpumask_andnot(tmp, cpu_present_mask, non_housekeeping_mask);
if (cpumask_empty(tmp))
__cpumask_clear_cpu(smp_processor_id(), non_housekeeping_mask);
cpumask_andnot(tmp, cpu_possible_mask, non_housekeeping_mask);
- if (!cpumask_equal(tmp, housekeeping_mask)) {
+
+ if (housekeeping_flags == HK_FLAG_DOMAIN)
+ prev = hk_domain_mask;
+ else
+ prev = housekeeping_mask;
+
+ if (!cpumask_equal(tmp, prev)) {
pr_warn("Housekeeping: nohz_full= must match isolcpus=\n");
free_bootmem_cpumask_var(tmp);
free_bootmem_cpumask_var(non_housekeeping_mask);
return 0;
}
+
+ if ((housekeeping_flags & HK_FLAG_DOMAIN) && (flags & ~HK_FLAG_DOMAIN)) {
+ alloc_bootmem_cpumask_var(&housekeeping_mask);
+ cpumask_copy(housekeeping_mask, hk_domain_mask);
+ }
+
+ if ((housekeeping_flags & ~HK_FLAG_DOMAIN) && (flags & HK_FLAG_DOMAIN)) {
+ alloc_bootmem_cpumask_var(&hk_domain_mask);
+ cpumask_copy(hk_domain_mask, housekeeping_mask);
+ }
}
free_bootmem_cpumask_var(tmp);
--
2.25.1
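The lookup logic after the split can be modeled as below. This is a simplified userspace sketch (all `_m` identifiers are hypothetical): unlike the actual patch, it only consults the domain mask when the caller asks for HK_FLAG_DOMAIN, leaning on the patch's WARN_ON_ONCE() rule that callers must not mix HK_FLAG_DOMAIN with other flags:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values; not the kernel's enum hk_flags. */
#define HK_FLAG_DOMAIN_M	(1u << 0)
#define HK_FLAG_WQ_M		(1u << 1)

static uint64_t housekeeping_mask_m;	/* mask for non-domain features */
static uint64_t hk_domain_mask_m;	/* new, domain-only mask */
static unsigned int housekeeping_flags_m;
static const uint64_t cpu_possible_m = 0xffULL;	/* pretend 8 CPUs */

static uint64_t housekeeping_cpumask_m(unsigned int flags)
{
	if (housekeeping_flags_m) {	/* housekeeping_overridden */
		/* Patch 4: HK_FLAG_DOMAIN now resolves to its own mask */
		if ((housekeeping_flags_m & HK_FLAG_DOMAIN_M) &&
		    (flags & HK_FLAG_DOMAIN_M))
			return hk_domain_mask_m;
		if (housekeeping_flags_m & flags)
			return housekeeping_mask_m;
	}
	return cpu_possible_m;		/* no isolation: all CPUs */
}
```

With only "isolcpus=" set, a domain query returns the dedicated mask while a workqueue query falls through to all possible CPUs.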
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [RFC PATCH 5/6] sched/isolation: Make HK_FLAG_DOMAIN mutable
[not found] ` <20210714135420.69624-1-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2021-07-14 13:54 ` [RFC PATCH 2/6] workqueue: " Frederic Weisbecker
2021-07-14 13:54 ` [RFC PATCH 4/6] sched/isolation: Split domain housekeeping mask from the rest Frederic Weisbecker
@ 2021-07-14 13:54 ` Frederic Weisbecker
2021-07-21 14:28 ` Vincent Donnefort
2021-07-14 13:54 ` [RFC PATCH 6/6] cpuset: Add cpuset.isolation_mask file Frederic Weisbecker
2021-07-16 18:02 ` [RFC PATCH 0/6] cpuset: Allow to modify isolcpus through cpuset Waiman Long
4 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2021-07-14 13:54 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, Juri Lelli,
Alex Belits, Nitesh Lal, Thomas Gleixner, Nicolas Saenz,
Christoph Lameter, Marcelo Tosatti, Zefan Li,
cgroups-u79uwXL29TY76Z2rM5mHXA
In order to prepare for supporting "isolcpus=" changes through cpuset,
provide an API to modify the "isolcpus=" cpumask passed on boot.
TODO:
* Propagate the change to all interested subsystems (workqueue, net, pci)
* Make sure we can't concurrently change this cpumask (assert cpuset_rwsem
is held).
Signed-off-by: Frederic Weisbecker <frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
Cc: Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Marcelo Tosatti <mtosatti-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Nitesh Lal <nilal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Nicolas Saenz <nsaenzju-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Christoph Lameter <cl-LoxgEY9JZOazQB+pC5nmwQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Zefan Li <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>
Cc: Alex Belits <abelits-eYqpPyKDWXRBDgjK7y7TUQ@public.gmane.org>
---
include/linux/sched/isolation.h | 4 ++++
kernel/sched/isolation.c | 19 +++++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index cc9f393e2a70..a5960cb357d2 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -21,6 +21,7 @@ enum hk_flags {
DECLARE_STATIC_KEY_FALSE(housekeeping_overridden);
extern int housekeeping_any_cpu(enum hk_flags flags);
extern const struct cpumask *housekeeping_cpumask(enum hk_flags flags);
+extern void housekeeping_cpumask_set(struct cpumask *mask, enum hk_flags flags);
extern bool housekeeping_enabled(enum hk_flags flags);
extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
@@ -38,6 +39,9 @@ static inline const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
return cpu_possible_mask;
}
+static inline void housekeeping_cpumask_set(struct cpumask *mask,
+ enum hk_flags flags) { }
+
static inline bool housekeeping_enabled(enum hk_flags flags)
{
return false;
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index c2bdf7e6dc39..c071433059cf 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -68,7 +68,26 @@ bool housekeeping_test_cpu(int cpu, enum hk_flags flags)
}
EXPORT_SYMBOL_GPL(housekeeping_test_cpu);
+// Only support HK_FLAG_DOMAIN for now
+// TODO: propagate the changes through all interested subsystems:
+// workqueues, net, pci; ...
+void housekeeping_cpumask_set(struct cpumask *mask, enum hk_flags flags)
+{
+ /* Only HK_FLAG_DOMAIN change supported for now */
+ if (WARN_ON_ONCE(flags != HK_FLAG_DOMAIN))
+ return;
+ if (!static_key_enabled(&housekeeping_overridden.key)) {
+ if (cpumask_equal(mask, cpu_possible_mask))
+ return;
+ if (WARN_ON_ONCE(!alloc_cpumask_var(&hk_domain_mask, GFP_KERNEL)))
+ return;
+ cpumask_copy(hk_domain_mask, mask);
+ static_branch_enable(&housekeeping_overridden);
+ } else {
+ cpumask_copy(hk_domain_mask, mask);
+ }
+}
void __init housekeeping_init(void)
{
--
2.25.1
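The enable-on-first-write behaviour of housekeeping_cpumask_set() can be sketched as a userspace model, assuming an 8-CPU system and replacing the static key with a plain bool (all `_m` names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool overridden_m;		/* stands in for the static key */
static uint64_t hk_domain_mask_m;
static const uint64_t cpu_possible_m = 0xffULL;	/* pretend 8 CPUs */

/* Model of housekeeping_cpumask_set(mask, HK_FLAG_DOMAIN) */
static void housekeeping_cpumask_set_m(uint64_t mask)
{
	if (!overridden_m) {
		/* No isolation yet: allowing all CPUs is a no-op */
		if (mask == cpu_possible_m)
			return;
		hk_domain_mask_m = mask;	/* alloc_cpumask_var() + copy */
		overridden_m = true;		/* static_branch_enable() */
	} else {
		hk_domain_mask_m = mask;	/* mask already allocated */
	}
}
```

Note the model deliberately ignores locking, which is exactly where the review comment in this thread finds a problem (static_branch_enable() taking cpus_read_lock() under cpuset_write_u64()).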
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [RFC PATCH 5/6] sched/isolation: Make HK_FLAG_DOMAIN mutable
2021-07-14 13:54 ` [RFC PATCH 5/6] sched/isolation: Make HK_FLAG_DOMAIN mutable Frederic Weisbecker
@ 2021-07-21 14:28 ` Vincent Donnefort
0 siblings, 0 replies; 18+ messages in thread
From: Vincent Donnefort @ 2021-07-21 14:28 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Tejun Heo, Peter Zijlstra, Juri Lelli, Alex Belits,
Nitesh Lal, Thomas Gleixner, Nicolas Saenz, Christoph Lameter,
Marcelo Tosatti, Zefan Li, cgroups
Hi Frederic,
[...]
>
> +// Only support HK_FLAG_DOMAIN for now
> +// TODO: propagate the changes through all interested subsystems:
> +// workqueues, net, pci; ...
> +void housekeeping_cpumask_set(struct cpumask *mask, enum hk_flags flags)
> +{
> + /* Only HK_FLAG_DOMAIN change supported for now */
> + if (WARN_ON_ONCE(flags != HK_FLAG_DOMAIN))
> + return;
>
> + if (!static_key_enabled(&housekeeping_overridden.key)) {
> + if (cpumask_equal(mask, cpu_possible_mask))
> + return;
> + if (WARN_ON_ONCE(!alloc_cpumask_var(&hk_domain_mask, GFP_KERNEL)))
> + return;
> + cpumask_copy(hk_domain_mask, mask);
> + static_branch_enable(&housekeeping_overridden);
I get a warning here. static_branch_enable() is trying to take cpus_read_lock().
But the same lock is already taken by cpuset_write_u64().
Also, shouldn't it set HK_FLAG_DOMAIN in housekeeping_flags to enable
housekeeping if the kernel booted without isolcpus= ?
--
Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* [RFC PATCH 6/6] cpuset: Add cpuset.isolation_mask file
[not found] ` <20210714135420.69624-1-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (2 preceding siblings ...)
2021-07-14 13:54 ` [RFC PATCH 5/6] sched/isolation: Make HK_FLAG_DOMAIN mutable Frederic Weisbecker
@ 2021-07-14 13:54 ` Frederic Weisbecker
2021-07-14 16:31 ` Marcelo Tosatti
[not found] ` <20210714135420.69624-7-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2021-07-16 18:02 ` [RFC PATCH 0/6] cpuset: Allow to modify isolcpus through cpuset Waiman Long
4 siblings, 2 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2021-07-14 13:54 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Tejun Heo, Peter Zijlstra, Juri Lelli,
Alex Belits, Nitesh Lal, Thomas Gleixner, Nicolas Saenz,
Christoph Lameter, Marcelo Tosatti, Zefan Li,
cgroups-u79uwXL29TY76Z2rM5mHXA
Add a new cpuset.isolation_mask file in order to be able to modify the
housekeeping cpumask for each individual isolation feature at runtime.
In the future this will include nohz_full, unbound timers,
unbound workqueues, unbound kthreads, managed irqs, etc...
Start with supporting domain exclusion and CPUs passed through
"isolcpus=".
The cpuset.isolation_mask defaults to 0. Setting it to 1 will exclude
the given cpuset from the domains (they will be attached to NULL domain).
As long as a CPU is part of any cpuset with cpuset.isolation_mask set to
1, it will remain isolated even if it overlaps with another cpuset that
has cpuset.isolation_mask set to 0. The same applies to parent and
subdirectories.
If a cpuset's CPUs are a subset of "isolcpus=", the isolation is
automatically mapped onto it and cpuset.isolation_mask is set to 1.
This subset is then cleared from the initial "isolcpus=" mask. The user
is then free to set cpuset.isolation_mask back to 0 in order to revert
the effect of "isolcpus=".
Here is an example of use where CPU 7 has been isolated on boot and
gets re-attached to domains later through cpuset:
$ cat /proc/cmdline
isolcpus=7
$ cd /sys/fs/cgroup/cpuset
$ mkdir cpu7
$ cd cpu7
$ cat cpuset.cpus
0-7
$ cat cpuset.isolation_mask
0
$ ls /sys/kernel/debug/domains/cpu7 # empty because isolcpus=7
$ echo 7 > cpuset.cpus
$ cat cpuset.isolation_mask # isolcpus subset automatically mapped
1
$ echo 0 > cpuset.isolation_mask
$ ls /sys/kernel/debug/domains/cpu7/
domain0 domain1
CHECKME: Should we have individual cpuset.isolation.$feature files for
each isolation feature instead of a single mask file?
CHECKME: The scheduler is unhappy when _every_ CPU is isolated
Signed-off-by: Frederic Weisbecker <frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
Cc: Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Marcelo Tosatti <mtosatti-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Nitesh Lal <nilal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Nicolas Saenz <nsaenzju-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Christoph Lameter <cl-LoxgEY9JZOazQB+pC5nmwQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Zefan Li <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>
Cc: Alex Belits <abelits-eYqpPyKDWXRBDgjK7y7TUQ@public.gmane.org>
---
kernel/cgroup/cpuset.c | 111 +++++++++++++++++++++++++++++++++++++++--
1 file changed, 107 insertions(+), 4 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index adb5190c4429..ecb63be04408 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -82,6 +82,7 @@ struct cpuset {
struct cgroup_subsys_state css;
unsigned long flags; /* "unsigned long" so bitops work */
+ unsigned long isol_flags;
/*
* On default hierarchy:
@@ -258,6 +259,17 @@ static inline int is_spread_slab(const struct cpuset *cs)
return test_bit(CS_SPREAD_SLAB, &cs->flags);
}
+/* bits in struct cpuset flags field */
+typedef enum {
+ CS_ISOL_DOMAIN,
+ CS_ISOL_MAX
+} isol_flagbits_t;
+
+static inline int is_isol_domain(const struct cpuset *cs)
+{
+ return test_bit(CS_ISOL_DOMAIN, &cs->isol_flags);
+}
+
static inline int is_partition_root(const struct cpuset *cs)
{
return cs->partition_root_state > 0;
@@ -269,6 +281,13 @@ static struct cpuset top_cpuset = {
.partition_root_state = PRS_ENABLED,
};
+/*
+ * CPUs passed through "isolcpus=" on boot, waiting to be mounted
+ * as soon as we meet a cpuset directory whose cpus_allowed is a
+ * subset of "isolcpus="
+ */
+static cpumask_var_t unmounted_isolcpus_mask;
+
/**
* cpuset_for_each_child - traverse online children of a cpuset
* @child_cs: loop cursor pointing to the current child
@@ -681,6 +700,39 @@ static inline int nr_cpusets(void)
return static_key_count(&cpusets_enabled_key.key) + 1;
}
+static int update_domain_housekeeping_mask(void)
+{
+ struct cpuset *cp; /* top-down scan of cpusets */
+ struct cgroup_subsys_state *pos_css;
+ cpumask_var_t domain_mask;
+
+ if (!zalloc_cpumask_var(&domain_mask, GFP_KERNEL))
+ return -ENOMEM;
+
+ cpumask_andnot(domain_mask, cpu_possible_mask, unmounted_isolcpus_mask);
+
+ rcu_read_lock();
+ cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) {
+ if (is_isol_domain(cp))
+ cpumask_andnot(domain_mask, domain_mask, cp->cpus_allowed);
+
+ if (cpumask_subset(cp->cpus_allowed, unmounted_isolcpus_mask)) {
+ unsigned long flags;
+ cpumask_andnot(unmounted_isolcpus_mask, unmounted_isolcpus_mask,
+ cp->cpus_allowed);
+ spin_lock_irqsave(&callback_lock, flags);
+ cp->isol_flags |= BIT(CS_ISOL_DOMAIN);
+ spin_unlock_irqrestore(&callback_lock, flags);
+ }
+ }
+ rcu_read_unlock();
+
+ housekeeping_cpumask_set(domain_mask, HK_FLAG_DOMAIN);
+ free_cpumask_var(domain_mask);
+
+ return 0;
+}
+
/*
* generate_sched_domains()
*
@@ -741,6 +793,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
struct cpuset **csa; /* array of all cpuset ptrs */
int csn; /* how many cpuset ptrs in csa so far */
int i, j, k; /* indices for partition finding loops */
+ int err;
cpumask_var_t *doms; /* resulting partition; i.e. sched domains */
struct sched_domain_attr *dattr; /* attributes for custom domains */
int ndoms = 0; /* number of sched domains in result */
@@ -752,6 +805,10 @@ static int generate_sched_domains(cpumask_var_t **domains,
dattr = NULL;
csa = NULL;
+ err = update_domain_housekeeping_mask();
+ if (err < 0)
+ pr_err("Can't update housekeeping cpumask\n");
+
/* Special case for the 99% of systems with one, full, sched domain */
if (root_load_balance && !top_cpuset.nr_subparts_cpus) {
ndoms = 1;
@@ -1449,7 +1506,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp)
* root as well.
*/
if (!cpumask_empty(cp->cpus_allowed) &&
- is_sched_load_balance(cp) &&
+ (is_sched_load_balance(cp) || is_isol_domain(cs)) &&
(!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) ||
is_partition_root(cp)))
need_rebuild_sched_domains = true;
@@ -1935,6 +1992,30 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
return err;
}
+/*
+ * update_isol_flags - read a 0 or a 1 in a file and update associated isol flag
+ * mask: the new mask value to apply (see isol_flagbits_t)
+ * cs: the cpuset to update
+ *
+ * Call with cpuset_mutex held.
+ */
+static int update_isol_flags(struct cpuset *cs, u64 mask)
+{
+ unsigned long old_mask = cs->isol_flags;
+
+ if (mask & ~(BIT_ULL(CS_ISOL_MAX) - 1))
+ return -EINVAL;
+
+ spin_lock_irq(&callback_lock);
+ cs->isol_flags = (unsigned long)mask;
+ spin_unlock_irq(&callback_lock);
+
+ if (mask ^ old_mask)
+ rebuild_sched_domains_locked();
+
+ return 0;
+}
+
/*
* update_prstate - update partititon_root_state
* cs: the cpuset to update
@@ -2273,6 +2354,9 @@ typedef enum {
FILE_MEMORY_PRESSURE,
FILE_SPREAD_PAGE,
FILE_SPREAD_SLAB,
+//CHECKME: should we have individual cpuset.isolation.$feature files
+//instead of a mask of features in a single file?
+ FILE_ISOLATION_MASK,
} cpuset_filetype_t;
static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
@@ -2314,6 +2398,9 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
case FILE_SPREAD_SLAB:
retval = update_flag(CS_SPREAD_SLAB, cs, val);
break;
+ case FILE_ISOLATION_MASK:
+ retval = update_isol_flags(cs, val);
+ break;
default:
retval = -EINVAL;
break;
@@ -2481,6 +2568,8 @@ static u64 cpuset_read_u64(struct cgroup_subsys_state *css, struct cftype *cft)
return is_spread_page(cs);
case FILE_SPREAD_SLAB:
return is_spread_slab(cs);
+ case FILE_ISOLATION_MASK:
+ return cs->isol_flags;
default:
BUG();
}
@@ -2658,6 +2747,13 @@ static struct cftype legacy_files[] = {
.private = FILE_MEMORY_PRESSURE_ENABLED,
},
+ {
+ .name = "isolation_mask",
+ .read_u64 = cpuset_read_u64,
+ .write_u64 = cpuset_write_u64,
+ .private = FILE_ISOLATION_MASK,
+ },
+
{ } /* terminate */
};
@@ -2834,9 +2930,12 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
if (is_partition_root(cs))
update_prstate(cs, 0);
- if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
- is_sched_load_balance(cs))
- update_flag(CS_SCHED_LOAD_BALANCE, cs, 0);
+ if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) {
+ if (is_sched_load_balance(cs))
+ update_flag(CS_SCHED_LOAD_BALANCE, cs, 0);
+ if (is_isol_domain(cs))
+ update_isol_flags(cs, cs->isol_flags & ~BIT(CS_ISOL_DOMAIN));
+ }
if (cs->use_parent_ecpus) {
struct cpuset *parent = parent_cs(cs);
@@ -2873,6 +2972,9 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css)
top_cpuset.mems_allowed = top_cpuset.effective_mems;
}
+ cpumask_andnot(unmounted_isolcpus_mask, cpu_possible_mask,
+ housekeeping_cpumask(HK_FLAG_DOMAIN));
+
spin_unlock_irq(&callback_lock);
percpu_up_write(&cpuset_rwsem);
}
@@ -2932,6 +3034,7 @@ int __init cpuset_init(void)
top_cpuset.relax_domain_level = -1;
BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
+ BUG_ON(!alloc_cpumask_var(&unmounted_isolcpus_mask, GFP_KERNEL));
return 0;
}
--
2.25.1
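The way the domain housekeeping mask is derived in this patch (a CPU stays out of the domains if it sits in *any* cpuset with isolation_mask=1, or is still in the unmapped "isolcpus=" remainder) can be sketched as a pure function over bitmasks; all names below are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of update_domain_housekeeping_mask(): start from all possible
 * CPUs, subtract the not-yet-mapped "isolcpus=" CPUs, then subtract the
 * CPUs of every cpuset whose isolation_mask is 1. A CPU overlapping a
 * non-isolated cpuset still stays isolated, because only the isolated
 * cpusets are subtracted.
 */
static uint64_t domain_mask_m(uint64_t possible, uint64_t unmounted_isolcpus,
			      const uint64_t *isolated_cpusets, size_t n)
{
	uint64_t mask = possible & ~unmounted_isolcpus;
	size_t i;

	for (i = 0; i < n; i++)
		mask &= ~isolated_cpusets[i];
	return mask;
}
```

For 8 CPUs with "isolcpus=7" still unmapped (0x80) and one isolated cpuset holding CPU 3 (0x08), the schedulable domain mask is 0x77: CPUs 3 and 7 stay out of the domains, whatever other cpusets CPU 3 also belongs to.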
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [RFC PATCH 6/6] cpuset: Add cpuset.isolation_mask file
2021-07-14 13:54 ` [RFC PATCH 6/6] cpuset: Add cpuset.isolation_mask file Frederic Weisbecker
@ 2021-07-14 16:31 ` Marcelo Tosatti
[not found] ` <20210714163157.GA140679-ZB2g03Rrq1XR7s880joybQ@public.gmane.org>
[not found] ` <20210714135420.69624-7-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
1 sibling, 1 reply; 18+ messages in thread
From: Marcelo Tosatti @ 2021-07-14 16:31 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Tejun Heo, Peter Zijlstra, Juri Lelli, Alex Belits,
Nitesh Lal, Thomas Gleixner, Nicolas Saenz, Christoph Lameter,
Zefan Li, cgroups
On Wed, Jul 14, 2021 at 03:54:20PM +0200, Frederic Weisbecker wrote:
> Add a new cpuset.isolation_mask file in order to be able to modify the
> housekeeping cpumask for each individual isolation feature at runtime.
> In the future this will include nohz_full, unbound timers,
> unbound workqueues, unbound kthreads, managed irqs, etc...
>
> Start with supporting domain exclusion and CPUs passed through
> "isolcpus=".
Is it possible to just return -ENOTSUPPORTED for the features
whose support is not present?
> The cpuset.isolation_mask defaults to 0. Setting it to 1 will exclude
> the given cpuset from the domains (they will be attached to NULL domain).
> As long as a CPU is part of any cpuset with cpuset.isolation_mask set to
> 1, it will remain isolated even if it overlaps with another cpuset that
> has cpuset.isolation_mask set to 0. The same applies to parent and
> subdirectories.
>
> If a cpuset's CPUs are a subset of "isolcpus=", the isolation is
> automatically mapped onto it and cpuset.isolation_mask is set to 1.
> This subset is then cleared from the initial "isolcpus=" mask. The user
> is then free to set cpuset.isolation_mask back to 0 in order to revert
> the effect of "isolcpus=".
>
> Here is an example of use where CPU 7 has been isolated on boot and
> gets re-attached to domains later through cpuset:
>
> $ cat /proc/cmdline
> isolcpus=7
> $ cd /sys/fs/cgroup/cpuset
> $ mkdir cpu7
> $ cd cpu7
> $ cat cpuset.cpus
> 0-7
> $ cat cpuset.isolation_mask
> 0
> $ ls /sys/kernel/debug/domains/cpu7 # empty because isolcpus=7
> $ echo 7 > cpuset.cpus
> $ cat cpuset.isolation_mask # isolcpus subset automatically mapped
> 1
> $ echo 0 > cpuset.isolation_mask
> $ ls /sys/kernel/debug/domains/cpu7/
> domain0 domain1
>
> CHECKME: Should we have individual cpuset.isolation.$feature files for
> each isolation feature instead of a single mask file?
Yes, I guess that is useful, for example due to the -ENOTSUPPORTED
comment above.
Guarantees on updates
=====================
Perhaps start with a document with:
When the write to the cpumask file returns, what are the guarantees?
For example, for kthreads the guarantee would be that any kernel thread
created from that point on starts with the new mask. Therefore userspace
should respect the order:
1) Change kthread mask.
2) Move threads.
Updates to interface
====================
Also, thinking about updates to the interface (which today is one
cpumask per isolation feature) might be useful. What can happen:
1) A new isolation feature is added and its name added to the interface.
Userspace must support the new filename. If it is not there, then that's
an old kernel without support for it.
2) If an isolation feature is removed, a file will be gone. What should
the behaviour be there? Remove the file? (userspace should probably
ignore the failure in that case?) (then feature names should not be
reused, as that could confuse #1 above).
Or maybe have a versioned scheme?
>
> CHECKME: The scheduler is unhappy when _every_ CPU is isolated
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Nitesh Lal <nilal@redhat.com>
> Cc: Nicolas Saenz <nsaenzju@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Christoph Lameter <cl@gentwo.de>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Zefan Li <lizefan.x@bytedance.com>
> Cc: Alex Belits <abelits@marvell.com>
> ---
> kernel/cgroup/cpuset.c | 111 +++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 107 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index adb5190c4429..ecb63be04408 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -82,6 +82,7 @@ struct cpuset {
> struct cgroup_subsys_state css;
>
> unsigned long flags; /* "unsigned long" so bitops work */
> + unsigned long isol_flags;
>
> /*
> * On default hierarchy:
> @@ -258,6 +259,17 @@ static inline int is_spread_slab(const struct cpuset *cs)
> return test_bit(CS_SPREAD_SLAB, &cs->flags);
> }
>
> +/* bits in struct cpuset flags field */
> +typedef enum {
> + CS_ISOL_DOMAIN,
> + CS_ISOL_MAX
> +} isol_flagbits_t;
> +
> +static inline int is_isol_domain(const struct cpuset *cs)
> +{
> + return test_bit(CS_ISOL_DOMAIN, &cs->isol_flags);
> +}
> +
> static inline int is_partition_root(const struct cpuset *cs)
> {
> return cs->partition_root_state > 0;
> @@ -269,6 +281,13 @@ static struct cpuset top_cpuset = {
> .partition_root_state = PRS_ENABLED,
> };
>
> +/*
> + * CPUs passed through "isolcpus=" on boot, waiting to be mounted
> + * as soon as we meet a cpuset directory whose cpus_allowed is a
> + * subset of "isolcpus="
> + */
> +static cpumask_var_t unmounted_isolcpus_mask;
> +
> /**
> * cpuset_for_each_child - traverse online children of a cpuset
> * @child_cs: loop cursor pointing to the current child
> @@ -681,6 +700,39 @@ static inline int nr_cpusets(void)
> return static_key_count(&cpusets_enabled_key.key) + 1;
> }
>
> +static int update_domain_housekeeping_mask(void)
> +{
> + struct cpuset *cp; /* top-down scan of cpusets */
> + struct cgroup_subsys_state *pos_css;
> + cpumask_var_t domain_mask;
> +
> + if (!zalloc_cpumask_var(&domain_mask, GFP_KERNEL))
> + return -ENOMEM;
> +
> + cpumask_andnot(domain_mask, cpu_possible_mask, unmounted_isolcpus_mask);
> +
> + rcu_read_lock();
> + cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) {
> + if (is_isol_domain(cp))
> + cpumask_andnot(domain_mask, domain_mask, cp->cpus_allowed);
> +
> + if (cpumask_subset(cp->cpus_allowed, unmounted_isolcpus_mask)) {
> + unsigned long flags;
> + cpumask_andnot(unmounted_isolcpus_mask, unmounted_isolcpus_mask,
> + cp->cpus_allowed);
> + spin_lock_irqsave(&callback_lock, flags);
> + cp->isol_flags |= BIT(CS_ISOL_DOMAIN);
> + spin_unlock_irqrestore(&callback_lock, flags);
> + }
> + }
> + rcu_read_unlock();
> +
> + housekeeping_cpumask_set(domain_mask, HK_FLAG_DOMAIN);
> + free_cpumask_var(domain_mask);
> +
> + return 0;
> +}
> +
> /*
> * generate_sched_domains()
> *
> @@ -741,6 +793,7 @@ static int generate_sched_domains(cpumask_var_t **domains,
> struct cpuset **csa; /* array of all cpuset ptrs */
> int csn; /* how many cpuset ptrs in csa so far */
> int i, j, k; /* indices for partition finding loops */
> + int err;
> cpumask_var_t *doms; /* resulting partition; i.e. sched domains */
> struct sched_domain_attr *dattr; /* attributes for custom domains */
> int ndoms = 0; /* number of sched domains in result */
> @@ -752,6 +805,10 @@ static int generate_sched_domains(cpumask_var_t **domains,
> dattr = NULL;
> csa = NULL;
>
> + err = update_domain_housekeeping_mask();
> + if (err < 0)
> + pr_err("Can't update housekeeping cpumask\n");
> +
> /* Special case for the 99% of systems with one, full, sched domain */
> if (root_load_balance && !top_cpuset.nr_subparts_cpus) {
> ndoms = 1;
> @@ -1449,7 +1506,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp)
> * root as well.
> */
> if (!cpumask_empty(cp->cpus_allowed) &&
> - is_sched_load_balance(cp) &&
> + (is_sched_load_balance(cp) || is_isol_domain(cs)) &&
> (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) ||
> is_partition_root(cp)))
> need_rebuild_sched_domains = true;
> @@ -1935,6 +1992,30 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
> return err;
> }
>
> +/*
> + * update_isol_flags - update the isolation flags of a cpuset
> + * mask: the new mask value to apply (see isol_flagbits_t)
> + * cs: the cpuset to update
> + *
> + * Call with cpuset_mutex held.
> + */
> +static int update_isol_flags(struct cpuset *cs, u64 mask)
> +{
> + unsigned long old_mask = cs->isol_flags;
> +
> + if (mask & ~(BIT_ULL(CS_ISOL_MAX) - 1))
> + return -EINVAL;
> +
> + spin_lock_irq(&callback_lock);
> + cs->isol_flags = (unsigned long)mask;
> + spin_unlock_irq(&callback_lock);
> +
> + if (mask ^ old_mask)
> + rebuild_sched_domains_locked();
> +
> + return 0;
> +}
> +
> /*
> * update_prstate - update partititon_root_state
> * cs: the cpuset to update
> @@ -2273,6 +2354,9 @@ typedef enum {
> FILE_MEMORY_PRESSURE,
> FILE_SPREAD_PAGE,
> FILE_SPREAD_SLAB,
> +//CHECKME: should we have individual cpuset.isolation.$feature files
> +//instead of a mask of features in a single file?
> + FILE_ISOLATION_MASK,
> } cpuset_filetype_t;
>
> static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
> @@ -2314,6 +2398,9 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
> case FILE_SPREAD_SLAB:
> retval = update_flag(CS_SPREAD_SLAB, cs, val);
> break;
> + case FILE_ISOLATION_MASK:
> + retval = update_isol_flags(cs, val);
> + break;
> default:
> retval = -EINVAL;
> break;
> @@ -2481,6 +2568,8 @@ static u64 cpuset_read_u64(struct cgroup_subsys_state *css, struct cftype *cft)
> return is_spread_page(cs);
> case FILE_SPREAD_SLAB:
> return is_spread_slab(cs);
> + case FILE_ISOLATION_MASK:
> + return cs->isol_flags;
> default:
> BUG();
> }
> @@ -2658,6 +2747,13 @@ static struct cftype legacy_files[] = {
> .private = FILE_MEMORY_PRESSURE_ENABLED,
> },
>
> + {
> + .name = "isolation_mask",
> + .read_u64 = cpuset_read_u64,
> + .write_u64 = cpuset_write_u64,
> + .private = FILE_ISOLATION_MASK,
> + },
> +
> { } /* terminate */
> };
>
> @@ -2834,9 +2930,12 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
> if (is_partition_root(cs))
> update_prstate(cs, 0);
>
> - if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
> - is_sched_load_balance(cs))
> - update_flag(CS_SCHED_LOAD_BALANCE, cs, 0);
> + if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) {
> + if (is_sched_load_balance(cs))
> + update_flag(CS_SCHED_LOAD_BALANCE, cs, 0);
> + if (is_isol_domain(cs))
> + update_isol_flags(cs, cs->isol_flags & ~BIT(CS_ISOL_DOMAIN));
> + }
>
> if (cs->use_parent_ecpus) {
> struct cpuset *parent = parent_cs(cs);
> @@ -2873,6 +2972,9 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css)
> top_cpuset.mems_allowed = top_cpuset.effective_mems;
> }
>
> + cpumask_andnot(unmounted_isolcpus_mask, cpu_possible_mask,
> + housekeeping_cpumask(HK_FLAG_DOMAIN));
> +
> spin_unlock_irq(&callback_lock);
> percpu_up_write(&cpuset_rwsem);
> }
> @@ -2932,6 +3034,7 @@ int __init cpuset_init(void)
> top_cpuset.relax_domain_level = -1;
>
> BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
> + BUG_ON(!alloc_cpumask_var(&unmounted_isolcpus_mask, GFP_KERNEL));
>
> return 0;
> }
> --
> 2.25.1
>
>
^ permalink raw reply	[flat|nested] 18+ messages in thread

[parent not found: <20210714135420.69624-7-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [RFC PATCH 6/6] cpuset: Add cpuset.isolation_mask file
[not found] ` <20210714135420.69624-7-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2021-07-14 16:52 ` Peter Zijlstra
[not found] ` <YO8WWxWBmNuI0iUW-Nxj+rRp3nVydTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2021-07-14 16:52 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Tejun Heo, Juri Lelli, Alex Belits, Nitesh Lal,
Thomas Gleixner, Nicolas Saenz, Christoph Lameter,
Marcelo Tosatti, Zefan Li, cgroups-u79uwXL29TY76Z2rM5mHXA
On Wed, Jul 14, 2021 at 03:54:20PM +0200, Frederic Weisbecker wrote:
> Add a new cpuset.isolation_mask file to allow modifying the housekeeping
> cpumask for each individual isolation feature at runtime.
> In the future this will include nohz_full, unbound timers,
> unbound workqueues, unbound kthreads, managed irqs, etc...
>
> Start with supporting domain exclusion and CPUs passed through
> "isolcpus=".
>
> cpuset.isolation_mask defaults to 0. Setting it to 1 excludes the given
> cpuset's CPUs from the scheduler domains (they are attached to the NULL
> domain). As long as a CPU belongs to any cpuset with
> cpuset.isolation_mask set to 1, it remains isolated, even if it overlaps
> with another cpuset whose cpuset.isolation_mask is 0. The same applies
> to parent and child directories.
>
> If a cpuset's CPUs are a subset of "isolcpus=", the boot-time isolation
> is automatically mapped onto it and cpuset.isolation_mask is set to 1.
> That subset is then cleared from the initial "isolcpus=" mask. The user
> is free to override cpuset.isolation_mask back to 0 in order to revert
> the effect of "isolcpus=".
>
> Here is an example where CPU 7 has been isolated at boot and gets
> re-attached to the domains later through cpuset:
>
> $ cat /proc/cmdline
> isolcpus=7
> $ cd /sys/fs/cgroup/cpuset
> $ mkdir cpu7
> $ cd cpu7
> $ cat cpuset.cpus
> 0-7
> $ cat cpuset.isolation_mask
> 0
> $ ls /sys/kernel/debug/domains/cpu7 # empty because isolcpus=7
> $ echo 7 > cpuset.cpus
> $ cat cpuset.isolation_mask # isolcpus subset automatically mapped
> 1
> $ echo 0 > cpuset.isolation_mask
> $ ls /sys/kernel/debug/domains/cpu7/
> domain0 domain1
>
cpusets already has a means to create partitions; why are you creating
something else?
* Re: [RFC PATCH 0/6] cpuset: Allow to modify isolcpus through cpuset
[not found] ` <20210714135420.69624-1-frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (3 preceding siblings ...)
2021-07-14 13:54 ` [RFC PATCH 6/6] cpuset: Add cpuset.isolation_mask file Frederic Weisbecker
@ 2021-07-16 18:02 ` Waiman Long
[not found] ` <8ea7a78f-948e-75e8-1c4f-59b349c858f6-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
4 siblings, 1 reply; 18+ messages in thread
From: Waiman Long @ 2021-07-16 18:02 UTC (permalink / raw)
To: Frederic Weisbecker, LKML
Cc: Tejun Heo, Peter Zijlstra, Juri Lelli, Alex Belits, Nitesh Lal,
Thomas Gleixner, Nicolas Saenz, Christoph Lameter,
Marcelo Tosatti, Zefan Li, cgroups-u79uwXL29TY76Z2rM5mHXA
On 7/14/21 9:54 AM, Frederic Weisbecker wrote:
> The fact that "isolcpus=" behaviour can't be modified at runtime is an
> eternal source of discussion and debate, pitting a useful feature
> against a terrible interface.
>
> I've long tried to figure out a proper way to control this at runtime
> using cpusets, which isn't easy: a single boot-time cpumask is difficult
> to map to a hierarchy of cpusets that can even overlap.
I have a cpuset patch that allows disabling load balancing in a
cgroup-v2 setting:
https://lore.kernel.org/lkml/20210621184924.27493-1-longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org/
The idea of a cpuset partition is that there is no overlap of CPUs
between different partitions, so there is no confusion about whether a
CPU is load-balanced or not.
>
> The idea here is to map the boot-set isolation behaviour to any cpuset
> directory whose cpumask is a subset of "isolcpus=". I'll let you browse
> the last patch for the details.
>
> Note this is still WIP and half-baked, but I figured it's important to
> validate the interface early.
Using different cpumasks for different isolation properties is the easy
part. The hard part is making the different subsystems change their
behavior as the isolation masks change dynamically at run time.
Currently, they check the housekeeping cpumask only at boot time or when
certain events happen.
Cheers,
Longman