* [PATCH] cpufreq/schedutil: Only bind threads if needed
@ 2024-09-12 13:53 Christian Loehle
2024-09-12 15:41 ` Rafael J. Wysocki
2024-09-12 16:58 ` Christian Loehle
0 siblings, 2 replies; 9+ messages in thread
From: Christian Loehle @ 2024-09-12 13:53 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org, linux-pm, Rafael J. Wysocki
Cc: Qais Yousef, Juri Lelli, Ingo Molnar, Viresh Kumar,
Dietmar Eggemann, Pierre Gondois
Remove the unconditional binding of sugov kthreads to the affected CPUs
if the cpufreq driver indicates that updates can happen from any CPU.
This allows userspace to set affinities to either save power (waking up
bigger CPUs on HMP can be expensive) or increasing performance (by
letting the utilized CPUs run without preemption of the sugov kthread).
Without this patch the behavior of sugov threads will basically be a
boot-time dice roll on which CPU of the PD has to handle all the
cpufreq updates. With the recent decreases of update filtering these
two basic problems become more and more apparent:
1. The wake_cpu might be idle and we are waking it up from another
CPU just for the cpufreq update. Apart from wasting power, the exit
latency of it's idle state might be longer than the sugov threads
running time, essentially delaying the cpufreq update unnecessarily.
2. We are preempting either the requesting or another busy CPU of the
PD, while the update could be done from a CPU that we deem less
important and pay the price of an IPI and two context-switches.
The change is essentially not setting PF_NO_SETAFFINITY on
dvfs_possible_from_any_cpu, no behavior change if userspace doesn't
touch affinities.
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
kernel/sched/cpufreq_schedutil.c | 6 +++++-
kernel/sched/syscalls.c | 3 +++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 43111a515a28..466fb79e0b81 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -683,7 +683,11 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
}
sg_policy->thread = thread;
- kthread_bind_mask(thread, policy->related_cpus);
+ if (policy->dvfs_possible_from_any_cpu)
+ set_cpus_allowed_ptr(thread, policy->related_cpus);
+ else
+ kthread_bind_mask(thread, policy->related_cpus);
+
init_irq_work(&sg_policy->irq_work, sugov_irq_work);
mutex_init(&sg_policy->work_lock);
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index c62acf509b74..7d4a4edfcfb9 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -1159,6 +1159,9 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
if (!task_has_dl_policy(p) || !dl_bandwidth_enabled())
return 0;
+ if (dl_entity_is_special(&p->dl))
+ return 0;
+
/*
* Since bandwidth control happens on root_domain basis,
* if admission test is enabled, we only admit -deadline
--
2.34.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] cpufreq/schedutil: Only bind threads if needed
2024-09-12 13:53 [PATCH] cpufreq/schedutil: Only bind threads if needed Christian Loehle
@ 2024-09-12 15:41 ` Rafael J. Wysocki
2024-09-12 16:01 ` Christian Loehle
2024-10-01 10:00 ` Viresh Kumar
2024-09-12 16:58 ` Christian Loehle
1 sibling, 2 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2024-09-12 15:41 UTC (permalink / raw)
To: Christian Loehle
Cc: linux-kernel@vger.kernel.org, linux-pm, Rafael J. Wysocki,
Qais Yousef, Juri Lelli, Ingo Molnar, Viresh Kumar,
Dietmar Eggemann, Pierre Gondois
On Thu, Sep 12, 2024 at 3:53 PM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> Remove the unconditional binding of sugov kthreads to the affected CPUs
> if the cpufreq driver indicates that updates can happen from any CPU.
> This allows userspace to set affinities to either save power (waking up
> bigger CPUs on HMP can be expensive) or increasing performance (by
> letting the utilized CPUs run without preemption of the sugov kthread).
>
> Without this patch the behavior of sugov threads will basically be a
> boot-time dice roll on which CPU of the PD has to handle all the
> cpufreq updates. With the recent decreases of update filtering these
> two basic problems become more and more apparent:
> 1. The wake_cpu might be idle and we are waking it up from another
> CPU just for the cpufreq update. Apart from wasting power, the exit
> latency of it's idle state might be longer than the sugov threads
> running time, essentially delaying the cpufreq update unnecessarily.
> 2. We are preempting either the requesting or another busy CPU of the
> PD, while the update could be done from a CPU that we deem less
> important and pay the price of an IPI and two context-switches.
>
> The change is essentially not setting PF_NO_SETAFFINITY on
> dvfs_possible_from_any_cpu, no behavior change if userspace doesn't
> touch affinities.
I'd like to hear from Viresh on this.
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> ---
> kernel/sched/cpufreq_schedutil.c | 6 +++++-
> kernel/sched/syscalls.c | 3 +++
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 43111a515a28..466fb79e0b81 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -683,7 +683,11 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
> }
>
> sg_policy->thread = thread;
> - kthread_bind_mask(thread, policy->related_cpus);
> + if (policy->dvfs_possible_from_any_cpu)
> + set_cpus_allowed_ptr(thread, policy->related_cpus);
> + else
> + kthread_bind_mask(thread, policy->related_cpus);
> +
> init_irq_work(&sg_policy->irq_work, sugov_irq_work);
> mutex_init(&sg_policy->work_lock);
>
> diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
> index c62acf509b74..7d4a4edfcfb9 100644
> --- a/kernel/sched/syscalls.c
> +++ b/kernel/sched/syscalls.c
> @@ -1159,6 +1159,9 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
> if (!task_has_dl_policy(p) || !dl_bandwidth_enabled())
> return 0;
>
> + if (dl_entity_is_special(&p->dl))
> + return 0;
> +
Care to explain this particular piece?
> /*
> * Since bandwidth control happens on root_domain basis,
> * if admission test is enabled, we only admit -deadline
> --
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] cpufreq/schedutil: Only bind threads if needed
2024-09-12 15:41 ` Rafael J. Wysocki
@ 2024-09-12 16:01 ` Christian Loehle
2024-09-25 8:14 ` Juri Lelli
2024-10-01 10:00 ` Viresh Kumar
1 sibling, 1 reply; 9+ messages in thread
From: Christian Loehle @ 2024-09-12 16:01 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: linux-kernel@vger.kernel.org, linux-pm, Qais Yousef, Juri Lelli,
Ingo Molnar, Viresh Kumar, Dietmar Eggemann, Pierre Gondois
On 9/12/24 16:41, Rafael J. Wysocki wrote:
> On Thu, Sep 12, 2024 at 3:53 PM Christian Loehle
> <christian.loehle@arm.com> wrote:
>>
>> Remove the unconditional binding of sugov kthreads to the affected CPUs
>> if the cpufreq driver indicates that updates can happen from any CPU.
>> This allows userspace to set affinities to either save power (waking up
>> bigger CPUs on HMP can be expensive) or increasing performance (by
>> letting the utilized CPUs run without preemption of the sugov kthread).
>>
>> Without this patch the behavior of sugov threads will basically be a
>> boot-time dice roll on which CPU of the PD has to handle all the
>> cpufreq updates. With the recent decreases of update filtering these
>> two basic problems become more and more apparent:
>> 1. The wake_cpu might be idle and we are waking it up from another
>> CPU just for the cpufreq update. Apart from wasting power, the exit
>> latency of it's idle state might be longer than the sugov threads
>> running time, essentially delaying the cpufreq update unnecessarily.
>> 2. We are preempting either the requesting or another busy CPU of the
>> PD, while the update could be done from a CPU that we deem less
>> important and pay the price of an IPI and two context-switches.
>>
>> The change is essentially not setting PF_NO_SETAFFINITY on
>> dvfs_possible_from_any_cpu, no behavior change if userspace doesn't
>> touch affinities.
>
> I'd like to hear from Viresh on this.
>
>> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
>> ---
>> kernel/sched/cpufreq_schedutil.c | 6 +++++-
>> kernel/sched/syscalls.c | 3 +++
>> 2 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index 43111a515a28..466fb79e0b81 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -683,7 +683,11 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
>> }
>>
>> sg_policy->thread = thread;
>> - kthread_bind_mask(thread, policy->related_cpus);
>> + if (policy->dvfs_possible_from_any_cpu)
>> + set_cpus_allowed_ptr(thread, policy->related_cpus);
>> + else
>> + kthread_bind_mask(thread, policy->related_cpus);
>> +
>> init_irq_work(&sg_policy->irq_work, sugov_irq_work);
>> mutex_init(&sg_policy->work_lock);
>>
>> diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
>> index c62acf509b74..7d4a4edfcfb9 100644
>> --- a/kernel/sched/syscalls.c
>> +++ b/kernel/sched/syscalls.c
>> @@ -1159,6 +1159,9 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
>> if (!task_has_dl_policy(p) || !dl_bandwidth_enabled())
>> return 0;
>>
>> + if (dl_entity_is_special(&p->dl))
>> + return 0;
>> +
>
> Care to explain this particular piece?
Looks suspicious but the truncated comment below explains it:
/*
* Since bandwidth control happens on root_domain basis,
* if admission test is enabled, we only admit -deadline
* tasks allowed to run on all the CPUs in the task's
* root_domain.
*/
So that would only allow setting it to all CPUs for the relevant
platforms unfortunately.
That should be fine though since the sugov task is pretty much
a dummy in terms of bandwidth / admission control internally, so
no harm done to not enforce this when userspace wants to set
affinities.
...Unless Juri disagrees.
>
>> /*
>> * Since bandwidth control happens on root_domain basis,
>> * if admission test is enabled, we only admit -deadline
>> --
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] cpufreq/schedutil: Only bind threads if needed
2024-09-12 13:53 [PATCH] cpufreq/schedutil: Only bind threads if needed Christian Loehle
2024-09-12 15:41 ` Rafael J. Wysocki
@ 2024-09-12 16:58 ` Christian Loehle
1 sibling, 0 replies; 9+ messages in thread
From: Christian Loehle @ 2024-09-12 16:58 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org, linux-pm, Rafael J. Wysocki
Cc: Qais Yousef, Juri Lelli, Ingo Molnar, Viresh Kumar,
Dietmar Eggemann, Pierre Gondois
On 9/12/24 14:53, Christian Loehle wrote:
> Remove the unconditional binding of sugov kthreads to the affected CPUs
> if the cpufreq driver indicates that updates can happen from any CPU.
> This allows userspace to set affinities to either save power (waking up
> bigger CPUs on HMP can be expensive) or increasing performance (by
> letting the utilized CPUs run without preemption of the sugov kthread).
>
> Without this patch the behavior of sugov threads will basically be a
> boot-time dice roll on which CPU of the PD has to handle all the
> cpufreq updates. With the recent decreases of update filtering these
> two basic problems become more and more apparent:
> 1. The wake_cpu might be idle and we are waking it up from another
> CPU just for the cpufreq update. Apart from wasting power, the exit
> latency of it's idle state might be longer than the sugov threads
> running time, essentially delaying the cpufreq update unnecessarily.
> 2. We are preempting either the requesting or another busy CPU of the
> PD, while the update could be done from a CPU that we deem less
> important and pay the price of an IPI and two context-switches.
>
> The change is essentially not setting PF_NO_SETAFFINITY on
> dvfs_possible_from_any_cpu, no behavior change if userspace doesn't
> touch affinities.
>
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
I'll add some numbers to illustrate, although the example might not be
particularly realistic.
The classic fio workload will trigger cpufreq update very often, so
I used that on the Pixel6, affinity set to CPU7 (bitmask 80)(Big PD is
[6,7]).
Without this patch we have either all sugov enqueues on CPU6 or CPU7,
depending on where the first CPU frequency request (since boot) was
issued from (the deadline select_task_rq is rather simple, so it
will just wake_cpu if that is still valid, which here it always is).
I set different affinities for the sugov:6 worker and annotate
IOPS (throughput) and power (mW average), the test is for 30s each.
cpumask IOPS
80 7477
888.3 mW {'CPU-Big': 742.5668631414397, 'CPU-Mid': 11.919500003712095, 'CPU-Little': 133.79554163073317}
40 7378
942.8 mW {'CPU-Big': 797.4037245094225, 'CPU-Mid': 12.440698878099667, 'CPU-Little': 132.98899390286172}
f 7469
873.7 mW {'CPU-Big': 718.2933574826287, 'CPU-Mid': 11.89176754939742, 'CPU-Little': 143.55634276873246}
2 7501
872.8 mW {'CPU-Big': 718.6036614909397, 'CPU-Mid': 11.711731623773632, 'CPU-Little': 142.4830173663132}
1 7392
859.5 mW {'CPU-Big': 704.926078017567, 'CPU-Mid': 12.196892652625284, 'CPU-Little': 142.4231989686622}
Throughput is somewhat comparable for all, anyway the frequency only
bounces from capacity 1024 to 512 because of the instability of
iowait boost.
For 40 (CPU6) we see significantly more power usage, as the sugov
will prevent CPU6 from power down, as it gets woken up by the sugov
worker often.
f,2 and 1 affinities have slightly higher power coming from the
littles (as expected), but significantly less power from the bigs,
since they don't have the double context-switch on preemption and
the actual sugov CPU cycles.
1 has by far the least power, as it's always in WFI anyway, it
handles the IO interrupts from fio on CPU7.
Granted that is a somewhat worst-case scenario just to illustrate
the problem.
I also have a patch for preferring wake_cpu = smp_processor_id() for
the sugov worker, which is somewhat adjacent to this, the numbers
above make a case for that even without touching affinities.
I thought this might be less controversial for now.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] cpufreq/schedutil: Only bind threads if needed
2024-09-12 16:01 ` Christian Loehle
@ 2024-09-25 8:14 ` Juri Lelli
2024-09-25 9:36 ` Christian Loehle
0 siblings, 1 reply; 9+ messages in thread
From: Juri Lelli @ 2024-09-25 8:14 UTC (permalink / raw)
To: Christian Loehle
Cc: Rafael J. Wysocki, linux-kernel@vger.kernel.org, linux-pm,
Qais Yousef, Ingo Molnar, Viresh Kumar, Dietmar Eggemann,
Pierre Gondois
Hi,
On 12/09/24 17:01, Christian Loehle wrote:
> On 9/12/24 16:41, Rafael J. Wysocki wrote:
> > On Thu, Sep 12, 2024 at 3:53 PM Christian Loehle
...
> >> diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
> >> index c62acf509b74..7d4a4edfcfb9 100644
> >> --- a/kernel/sched/syscalls.c
> >> +++ b/kernel/sched/syscalls.c
> >> @@ -1159,6 +1159,9 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
> >> if (!task_has_dl_policy(p) || !dl_bandwidth_enabled())
> >> return 0;
> >>
> >> + if (dl_entity_is_special(&p->dl))
> >> + return 0;
> >> +
> >
> > Care to explain this particular piece?
>
> Looks suspicious but the truncated comment below explains it:
> /*
> * Since bandwidth control happens on root_domain basis,
> * if admission test is enabled, we only admit -deadline
> * tasks allowed to run on all the CPUs in the task's
> * root_domain.
> */
> So that would only allow setting it to all CPUs for the relevant
> platforms unfortunately.
>
> That should be fine though since the sugov task is pretty much
> a dummy in terms of bandwidth / admission control internally, so
> no harm done to not enforce this when userspace wants to set
> affinities.
> ...Unless Juri disagrees.
Nope, I agree. :)
Wonder if we should put a comment along the lines of what you said above
right above the new conditions (so that people will not need to wonder
about it in the future). But not a strict requirement for me.
Thanks! (and apologies for the delay in replying)
Juri
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] cpufreq/schedutil: Only bind threads if needed
2024-09-25 8:14 ` Juri Lelli
@ 2024-09-25 9:36 ` Christian Loehle
0 siblings, 0 replies; 9+ messages in thread
From: Christian Loehle @ 2024-09-25 9:36 UTC (permalink / raw)
To: Juri Lelli
Cc: Rafael J. Wysocki, linux-kernel@vger.kernel.org, linux-pm,
Qais Yousef, Ingo Molnar, Viresh Kumar, Dietmar Eggemann,
Pierre Gondois
On 9/25/24 09:14, Juri Lelli wrote:
> Hi,
>
> On 12/09/24 17:01, Christian Loehle wrote:
>> On 9/12/24 16:41, Rafael J. Wysocki wrote:
>>> On Thu, Sep 12, 2024 at 3:53 PM Christian Loehle
>
> ...
>
>>>> diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
>>>> index c62acf509b74..7d4a4edfcfb9 100644
>>>> --- a/kernel/sched/syscalls.c
>>>> +++ b/kernel/sched/syscalls.c
>>>> @@ -1159,6 +1159,9 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
>>>> if (!task_has_dl_policy(p) || !dl_bandwidth_enabled())
>>>> return 0;
>>>>
>>>> + if (dl_entity_is_special(&p->dl))
>>>> + return 0;
>>>> +
>>>
>>> Care to explain this particular piece?
>>
>> Looks suspicious but the truncated comment below explains it:
>> /*
>> * Since bandwidth control happens on root_domain basis,
>> * if admission test is enabled, we only admit -deadline
>> * tasks allowed to run on all the CPUs in the task's
>> * root_domain.
>> */
>> So that would only allow setting it to all CPUs for the relevant
>> platforms unfortunately.
>>
>> That should be fine though since the sugov task is pretty much
>> a dummy in terms of bandwidth / admission control internally, so
>> no harm done to not enforce this when userspace wants to set
>> affinities.
>> ...Unless Juri disagrees.
>
> Nope, I agree. :)
>
> Wonder if we should put a comment along the lines of what you said above
> right above the new conditions (so that people will not need to wonder
> about it in the future). But not a strict requirement for me.
>
> Thanks! (and apologies for the delay in replying)
> Juri
Thank you Juri for taking a look. I agree with the comment, will
do a v2.
Regards,
Christian
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] cpufreq/schedutil: Only bind threads if needed
2024-09-12 15:41 ` Rafael J. Wysocki
2024-09-12 16:01 ` Christian Loehle
@ 2024-10-01 10:00 ` Viresh Kumar
1 sibling, 0 replies; 9+ messages in thread
From: Viresh Kumar @ 2024-10-01 10:00 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Christian Loehle, linux-kernel@vger.kernel.org, linux-pm,
Qais Yousef, Juri Lelli, Ingo Molnar, Dietmar Eggemann,
Pierre Gondois
On 12-09-24, 17:41, Rafael J. Wysocki wrote:
> On Thu, Sep 12, 2024 at 3:53 PM Christian Loehle
> <christian.loehle@arm.com> wrote:
> >
> > Remove the unconditional binding of sugov kthreads to the affected CPUs
> > if the cpufreq driver indicates that updates can happen from any CPU.
> > This allows userspace to set affinities to either save power (waking up
> > bigger CPUs on HMP can be expensive) or increasing performance (by
> > letting the utilized CPUs run without preemption of the sugov kthread).
> >
> > Without this patch the behavior of sugov threads will basically be a
> > boot-time dice roll on which CPU of the PD has to handle all the
> > cpufreq updates. With the recent decreases of update filtering these
> > two basic problems become more and more apparent:
> > 1. The wake_cpu might be idle and we are waking it up from another
> > CPU just for the cpufreq update. Apart from wasting power, the exit
> > latency of it's idle state might be longer than the sugov threads
> > running time, essentially delaying the cpufreq update unnecessarily.
> > 2. We are preempting either the requesting or another busy CPU of the
> > PD, while the update could be done from a CPU that we deem less
> > important and pay the price of an IPI and two context-switches.
> >
> > The change is essentially not setting PF_NO_SETAFFINITY on
> > dvfs_possible_from_any_cpu, no behavior change if userspace doesn't
> > touch affinities.
>
> I'd like to hear from Viresh on this.
Looks good to me.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
--
viresh
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] cpufreq/schedutil: Only bind threads if needed
@ 2025-01-20 10:09 Christian Loehle
2025-01-23 20:10 ` Rafael J. Wysocki
0 siblings, 1 reply; 9+ messages in thread
From: Christian Loehle @ 2025-01-20 10:09 UTC (permalink / raw)
To: Rafael J. Wysocki, Peter Zijlstra, Ingo Molnar,
linux-kernel@vger.kernel.org
Cc: Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
Mel Gorman, Valentin Schneider, linux-pm
Remove the unconditional binding of sugov kthreads to the affected CPUs
if the cpufreq driver indicates that updates can happen from any CPU.
This allows userspace to set affinities to either save power (waking up
bigger CPUs on HMP can be expensive) or increasing performance (by
letting the utilized CPUs run without preemption of the sugov kthread).
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
---
- RESEND: pick up tags
- v2: Add comment for the dl_task_check_affinity return (Juri)
v2: https://lore.kernel.org/lkml/a4a70646-98a4-4b85-955e-62d66ba68927@arm.com/
v1: https://lore.kernel.org/lkml/480f2140-ea59-4e1d-a68d-18cbcec10941@arm.com/
kernel/sched/cpufreq_schedutil.c | 6 +++++-
kernel/sched/syscalls.c | 7 +++++++
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 28c77904ea74..a81444501158 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -691,7 +691,11 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
}
sg_policy->thread = thread;
- kthread_bind_mask(thread, policy->related_cpus);
+ if (policy->dvfs_possible_from_any_cpu)
+ set_cpus_allowed_ptr(thread, policy->related_cpus);
+ else
+ kthread_bind_mask(thread, policy->related_cpus);
+
init_irq_work(&sg_policy->irq_work, sugov_irq_work);
mutex_init(&sg_policy->work_lock);
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index ff0e5ab4e37c..8230358d2b90 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -1139,6 +1139,13 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
if (!task_has_dl_policy(p) || !dl_bandwidth_enabled())
return 0;
+ /*
+ * The special/sugov task isn't part of regular bandwidth/admission
+ * control so let userspace change affinities.
+ */
+ if (dl_entity_is_special(&p->dl))
+ return 0;
+
/*
* Since bandwidth control happens on root_domain basis,
* if admission test is enabled, we only admit -deadline
--
2.34.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] cpufreq/schedutil: Only bind threads if needed
2025-01-20 10:09 Christian Loehle
@ 2025-01-23 20:10 ` Rafael J. Wysocki
0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-01-23 20:10 UTC (permalink / raw)
To: Christian Loehle
Cc: Rafael J. Wysocki, Peter Zijlstra, Ingo Molnar,
linux-kernel@vger.kernel.org, Vincent Guittot, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
linux-pm
On Mon, Jan 20, 2025 at 11:10 AM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> Remove the unconditional binding of sugov kthreads to the affected CPUs
> if the cpufreq driver indicates that updates can happen from any CPU.
> This allows userspace to set affinities to either save power (waking up
> bigger CPUs on HMP can be expensive) or increasing performance (by
> letting the utilized CPUs run without preemption of the sugov kthread).
>
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
> Acked-by: Juri Lelli <juri.lelli@redhat.com>
> ---
> - RESEND: pick up tags
> - v2: Add comment for the dl_task_check_affinity return (Juri)
> v2: https://lore.kernel.org/lkml/a4a70646-98a4-4b85-955e-62d66ba68927@arm.com/
> v1: https://lore.kernel.org/lkml/480f2140-ea59-4e1d-a68d-18cbcec10941@arm.com/
>
> kernel/sched/cpufreq_schedutil.c | 6 +++++-
> kernel/sched/syscalls.c | 7 +++++++
> 2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 28c77904ea74..a81444501158 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -691,7 +691,11 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
> }
>
> sg_policy->thread = thread;
> - kthread_bind_mask(thread, policy->related_cpus);
> + if (policy->dvfs_possible_from_any_cpu)
> + set_cpus_allowed_ptr(thread, policy->related_cpus);
> + else
> + kthread_bind_mask(thread, policy->related_cpus);
> +
> init_irq_work(&sg_policy->irq_work, sugov_irq_work);
> mutex_init(&sg_policy->work_lock);
>
> diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
> index ff0e5ab4e37c..8230358d2b90 100644
> --- a/kernel/sched/syscalls.c
> +++ b/kernel/sched/syscalls.c
> @@ -1139,6 +1139,13 @@ int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
> if (!task_has_dl_policy(p) || !dl_bandwidth_enabled())
> return 0;
>
> + /*
> + * The special/sugov task isn't part of regular bandwidth/admission
> + * control so let userspace change affinities.
> + */
> + if (dl_entity_is_special(&p->dl))
> + return 0;
> +
> /*
> * Since bandwidth control happens on root_domain basis,
> * if admission test is enabled, we only admit -deadline
> --
Applied as 6.14-rc material, thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-01-23 20:10 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-12 13:53 [PATCH] cpufreq/schedutil: Only bind threads if needed Christian Loehle
2024-09-12 15:41 ` Rafael J. Wysocki
2024-09-12 16:01 ` Christian Loehle
2024-09-25 8:14 ` Juri Lelli
2024-09-25 9:36 ` Christian Loehle
2024-10-01 10:00 ` Viresh Kumar
2024-09-12 16:58 ` Christian Loehle
-- strict thread matches above, loose matches on Subject: below --
2025-01-20 10:09 Christian Loehle
2025-01-23 20:10 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox