* [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
@ 2024-07-23 18:10 Waiman Long
2024-07-24 10:30 ` Uladzislau Rezki
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Waiman Long @ 2024-07-23 18:10 UTC
To: Paul E. McKenney, Frederic Weisbecker, Neeraj Upadhyay,
Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang
Cc: rcu, linux-kernel, Vratislav Bendel, Waiman Long
It was discovered that isolated CPUs could sometimes be disturbed by
kworkers processing kfree_rcu() work items, causing higher than expected
latency. This is because the RCU core uses "system_wq", which doesn't have
the WQ_UNBOUND flag, to handle all its work items. Fix this violation of
latency limits by using "system_unbound_wq" in the RCU core instead.
This will ensure that those work items will not be run on CPUs marked
as isolated.

Besides the WQ_UNBOUND flag, the other major difference between system_wq
and system_unbound_wq is their max_active count. The system_unbound_wq
has a max_active of WQ_MAX_ACTIVE (512) while system_wq's max_active
is WQ_DFL_ACTIVE (256), which is half of WQ_MAX_ACTIVE.
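As a rough illustration of the max_active relationship described above, here is a minimal userspace sketch rather than kernel code. The two constants mirror the values quoted in this message; the helper effective_max_active() and its rule (0 selects the default, other values are clamped to 1..WQ_MAX_ACTIVE) are assumptions made for the example.

```c
/* Values as quoted in the commit message above. */
#define WQ_MAX_ACTIVE 512
#define WQ_DFL_ACTIVE (WQ_MAX_ACTIVE / 2)

/*
 * Hypothetical helper: models how a requested max_active could be
 * resolved. 0 means "use the default"; any other value is clamped
 * to the range [1, WQ_MAX_ACTIVE].
 */
static int effective_max_active(int requested)
{
	if (requested == 0)
		return WQ_DFL_ACTIVE;
	if (requested < 1)
		return 1;
	if (requested > WQ_MAX_ACTIVE)
		return WQ_MAX_ACTIVE;
	return requested;
}
```

Under these assumptions, a workqueue created with a max_active of 0 ends up at 256, and one created with WQ_MAX_ACTIVE ends up at 512.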
Reported-by: Vratislav Bendel <vbendel@redhat.com>
Closes: https://issues.redhat.com/browse/RHEL-50220
Signed-off-by: Waiman Long <longman@redhat.com>
---
kernel/rcu/tasks.h | 4 ++--
kernel/rcu/tree.c | 8 ++++----
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e641cc681901..494aa9513d0b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3539,10 +3539,10 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
 	if (delayed_work_pending(&krcp->monitor_work)) {
 		delay_left = krcp->monitor_work.timer.expires - jiffies;
 		if (delay < delay_left)
-			mod_delayed_work(system_wq, &krcp->monitor_work, delay);
+			mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
 		return;
 	}
-	queue_delayed_work(system_wq, &krcp->monitor_work, delay);
+	queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
 }
 
 static void
@@ -3634,7 +3634,7 @@ static void kfree_rcu_monitor(struct work_struct *work)
 			// be that the work is in the pending state when
 			// channels have been detached following by each
 			// other.
-			queue_rcu_work(system_wq, &krwp->rcu_work);
+			queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
 		}
 	}
 
@@ -3704,7 +3704,7 @@ run_page_cache_worker(struct kfree_rcu_cpu *krcp)
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
 			!atomic_xchg(&krcp->work_in_progress, 1)) {
 		if (atomic_read(&krcp->backoff_page_cache_fill)) {
-			queue_delayed_work(system_wq,
+			queue_delayed_work(system_unbound_wq,
 				&krcp->page_cache_work,
 					msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
 		} else {
--
2.43.5
* Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
2024-07-23 18:10 [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs Waiman Long
@ 2024-07-24 10:30 ` Uladzislau Rezki
2024-07-24 13:30 ` Breno Leitao
2024-07-25 15:35 ` Neeraj Upadhyay
2 siblings, 0 replies; 9+ messages in thread
From: Uladzislau Rezki @ 2024-07-24 10:30 UTC
To: Waiman Long
Cc: Paul E. McKenney, Frederic Weisbecker, Neeraj Upadhyay,
Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
linux-kernel, Vratislav Bendel
On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
> It was discovered that isolated CPUs could sometimes be disturbed by
> kworkers processing kfree_rcu() works causing higher than expected
> latency. It is because the RCU core uses "system_wq" which doesn't have
> the WQ_UNBOUND flag to handle all its work items. Fix this violation of
> latency limits by using "system_unbound_wq" in the RCU core instead.
> This will ensure that those work items will not be run on CPUs marked
> as isolated.
>
> Beside the WQ_UNBOUND flag, the other major difference between system_wq
> and system_unbound_wq is their max_active count. The system_unbound_wq
> has a max_active of WQ_MAX_ACTIVE (512) while system_wq's max_active
> is WQ_DFL_ACTIVE (256) which is half of WQ_MAX_ACTIVE.
>
> Reported-by: Vratislav Bendel <vbendel@redhat.com>
> Closes: https://issues.redhat.com/browse/RHEL-50220
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/rcu/tasks.h | 4 ++--
> kernel/rcu/tree.c | 8 ++++----
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index e641cc681901..494aa9513d0b 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3539,10 +3539,10 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
> if (delayed_work_pending(&krcp->monitor_work)) {
> delay_left = krcp->monitor_work.timer.expires - jiffies;
> if (delay < delay_left)
> - mod_delayed_work(system_wq, &krcp->monitor_work, delay);
> + mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
> return;
> }
> - queue_delayed_work(system_wq, &krcp->monitor_work, delay);
> + queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
> }
>
> static void
> @@ -3634,7 +3634,7 @@ static void kfree_rcu_monitor(struct work_struct *work)
> // be that the work is in the pending state when
> // channels have been detached following by each
> // other.
> - queue_rcu_work(system_wq, &krwp->rcu_work);
> + queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
> }
> }
>
> @@ -3704,7 +3704,7 @@ run_page_cache_worker(struct kfree_rcu_cpu *krcp)
> if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
> !atomic_xchg(&krcp->work_in_progress, 1)) {
> if (atomic_read(&krcp->backoff_page_cache_fill)) {
> - queue_delayed_work(system_wq,
> + queue_delayed_work(system_unbound_wq,
> &krcp->page_cache_work,
> msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
> } else {
> --
> 2.43.5
>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Thanks!
--
Uladzislau Rezki
* Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
2024-07-23 18:10 [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs Waiman Long
2024-07-24 10:30 ` Uladzislau Rezki
@ 2024-07-24 13:30 ` Breno Leitao
2024-07-24 14:23 ` Uladzislau Rezki
2024-07-29 3:06 ` Waiman Long
2024-07-25 15:35 ` Neeraj Upadhyay
2 siblings, 2 replies; 9+ messages in thread
From: Breno Leitao @ 2024-07-24 13:30 UTC
To: Waiman Long
Cc: Paul E. McKenney, Frederic Weisbecker, Neeraj Upadhyay,
Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
linux-kernel, Vratislav Bendel
On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
> It was discovered that isolated CPUs could sometimes be disturbed by
> kworkers processing kfree_rcu() works causing higher than expected
> latency. It is because the RCU core uses "system_wq" which doesn't have
> the WQ_UNBOUND flag to handle all its work items. Fix this violation of
> latency limits by using "system_unbound_wq" in the RCU core instead.
> This will ensure that those work items will not be run on CPUs marked
> as isolated.
>
> Beside the WQ_UNBOUND flag, the other major difference between system_wq
> and system_unbound_wq is their max_active count. The system_unbound_wq
> has a max_active of WQ_MAX_ACTIVE (512) while system_wq's max_active
> is WQ_DFL_ACTIVE (256) which is half of WQ_MAX_ACTIVE.
>
> Reported-by: Vratislav Bendel <vbendel@redhat.com>
I saw this problem a while ago and reported it to the list:

https://lore.kernel.org/all/Zp906X7VJGNKl5fW@gmail.com/

I've just applied this patch and run my workload for 2 hours without
hitting this issue. Thanks for solving it.

Tested-by: Breno Leitao <leitao@debian.org>
* Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
2024-07-24 13:30 ` Breno Leitao
@ 2024-07-24 14:23 ` Uladzislau Rezki
2024-07-29 3:06 ` Waiman Long
1 sibling, 0 replies; 9+ messages in thread
From: Uladzislau Rezki @ 2024-07-24 14:23 UTC
To: Breno Leitao
Cc: Waiman Long, Paul E. McKenney, Frederic Weisbecker,
Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
Lai Jiangshan, Zqiang, rcu, linux-kernel, Vratislav Bendel
On Wed, Jul 24, 2024 at 06:30:29AM -0700, Breno Leitao wrote:
> On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
> > It was discovered that isolated CPUs could sometimes be disturbed by
> > kworkers processing kfree_rcu() works causing higher than expected
> > latency. It is because the RCU core uses "system_wq" which doesn't have
> > the WQ_UNBOUND flag to handle all its work items. Fix this violation of
> > latency limits by using "system_unbound_wq" in the RCU core instead.
> > This will ensure that those work items will not be run on CPUs marked
> > as isolated.
> >
> > Beside the WQ_UNBOUND flag, the other major difference between system_wq
> > and system_unbound_wq is their max_active count. The system_unbound_wq
> > has a max_active of WQ_MAX_ACTIVE (512) while system_wq's max_active
> > is WQ_DFL_ACTIVE (256) which is half of WQ_MAX_ACTIVE.
> >
> > Reported-by: Vratislav Bendel <vbendel@redhat.com>
>
> I've seen this problem a while ago and reported to the list:
>
> https://lore.kernel.org/all/Zp906X7VJGNKl5fW@gmail.com/
>
> I've just applied this test, and run my workload for 2 hours without
> hitting this issue. Thanks for solving it.
>
> Tested-by: Breno Leitao <leitao@debian.org>
>
Thank you for testing this! I saw your recent email about that :)
--
Uladzislau Rezki
* Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
2024-07-23 18:10 [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs Waiman Long
2024-07-24 10:30 ` Uladzislau Rezki
2024-07-24 13:30 ` Breno Leitao
@ 2024-07-25 15:35 ` Neeraj Upadhyay
2024-07-25 17:02 ` Waiman Long
2 siblings, 1 reply; 9+ messages in thread
From: Neeraj Upadhyay @ 2024-07-25 15:35 UTC
To: Waiman Long
Cc: Paul E. McKenney, Frederic Weisbecker, Neeraj Upadhyay,
Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
linux-kernel, Vratislav Bendel
On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
> It was discovered that isolated CPUs could sometimes be disturbed by
> kworkers processing kfree_rcu() works causing higher than expected
> latency. It is because the RCU core uses "system_wq" which doesn't have
> the WQ_UNBOUND flag to handle all its work items. Fix this violation of
> latency limits by using "system_unbound_wq" in the RCU core instead.
> This will ensure that those work items will not be run on CPUs marked
> as isolated.
>
An alternative approach here, in case we want to keep per-CPU worker
pools, could be to define a wq with the WQ_CPU_INTENSIVE flag. Are there
cases where a WQ_CPU_INTENSIVE wq won't be sufficient for the problem
this patch is fixing?
- Neeraj
> Beside the WQ_UNBOUND flag, the other major difference between system_wq
> and system_unbound_wq is their max_active count. The system_unbound_wq
> has a max_active of WQ_MAX_ACTIVE (512) while system_wq's max_active
> is WQ_DFL_ACTIVE (256) which is half of WQ_MAX_ACTIVE.
>
> Reported-by: Vratislav Bendel <vbendel@redhat.com>
> Closes: https://issues.redhat.com/browse/RHEL-50220
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> kernel/rcu/tasks.h | 4 ++--
> kernel/rcu/tree.c | 8 ++++----
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index e641cc681901..494aa9513d0b 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3539,10 +3539,10 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
> if (delayed_work_pending(&krcp->monitor_work)) {
> delay_left = krcp->monitor_work.timer.expires - jiffies;
> if (delay < delay_left)
> - mod_delayed_work(system_wq, &krcp->monitor_work, delay);
> + mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
> return;
> }
> - queue_delayed_work(system_wq, &krcp->monitor_work, delay);
> + queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
> }
>
> static void
> @@ -3634,7 +3634,7 @@ static void kfree_rcu_monitor(struct work_struct *work)
> // be that the work is in the pending state when
> // channels have been detached following by each
> // other.
> - queue_rcu_work(system_wq, &krwp->rcu_work);
> + queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
> }
> }
>
> @@ -3704,7 +3704,7 @@ run_page_cache_worker(struct kfree_rcu_cpu *krcp)
> if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
> !atomic_xchg(&krcp->work_in_progress, 1)) {
> if (atomic_read(&krcp->backoff_page_cache_fill)) {
> - queue_delayed_work(system_wq,
> + queue_delayed_work(system_unbound_wq,
> &krcp->page_cache_work,
> msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
> } else {
> --
> 2.43.5
>
* Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
2024-07-25 15:35 ` Neeraj Upadhyay
@ 2024-07-25 17:02 ` Waiman Long
2024-07-25 19:33 ` Neeraj Upadhyay
0 siblings, 1 reply; 9+ messages in thread
From: Waiman Long @ 2024-07-25 17:02 UTC
To: Neeraj Upadhyay
Cc: Paul E. McKenney, Frederic Weisbecker, Joel Fernandes,
Josh Triplett, Boqun Feng, Uladzislau Rezki, Steven Rostedt,
Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu, linux-kernel,
Vratislav Bendel
On 7/25/24 11:35, Neeraj Upadhyay wrote:
> On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
>> It was discovered that isolated CPUs could sometimes be disturbed by
>> kworkers processing kfree_rcu() works causing higher than expected
>> latency. It is because the RCU core uses "system_wq" which doesn't have
>> the WQ_UNBOUND flag to handle all its work items. Fix this violation of
>> latency limits by using "system_unbound_wq" in the RCU core instead.
>> This will ensure that those work items will not be run on CPUs marked
>> as isolated.
>>
> Alternative approach here could be, in case we want to keep per CPU worker
> pools, define a wq with WQ_CPU_INTENSIVE flag. Are there cases where
> WQ_CPU_INTENSIVE wq won't be sufficient for the problem this patch
> is fixing?
What exactly will we gain by defining a WQ_CPU_INTENSIVE workqueue? Or
what will we lose by using system_unbound_wq? All the calls that are
modified to use system_unbound_wq are using WORK_CPU_UNBOUND as their
cpu. IOW, they don't care which CPUs are used to run the work items.
The only downside I can see is the possible loss of some cache locality.
In fact, WQ_CPU_INTENSIVE can be considered a subset of WQ_UNBOUND. A
WQ_UNBOUND workqueue will avoid using isolated CPUs, but not a
WQ_CPU_INTENSIVE workqueue.
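To make the distinction concrete, here is a small userspace model, not kernel code, of where a work item queued without a CPU preference ends up. The isolated-CPU mask, the enum, and model_target_cpu() are all hypothetical names for illustration, and the real scheduler is free to pick any allowed CPU for an unbound worker. A WQ_CPU_INTENSIVE workqueue falls under the per-CPU case here, since it still uses the per-CPU worker pools.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Hypothetical layout: CPUs 4-7 are isolated, CPUs 0-3 do housekeeping. */
static const unsigned int isolated_mask = 0xF0u;

enum wq_model { MODEL_PER_CPU, MODEL_UNBOUND };

static bool cpu_is_isolated(int cpu)
{
	return (isolated_mask >> cpu) & 1u;
}

/*
 * Model of where a work item queued with "no CPU preference" executes.
 * Per-CPU model: on the CPU that queued it, isolated or not.
 * Unbound model: on the queueing CPU if it is a housekeeping CPU,
 * otherwise on the first housekeeping CPU found.
 */
static int model_target_cpu(enum wq_model model, int queueing_cpu)
{
	assert(queueing_cpu >= 0 && queueing_cpu < NR_CPUS);

	if (model == MODEL_PER_CPU || !cpu_is_isolated(queueing_cpu))
		return queueing_cpu;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (!cpu_is_isolated(cpu))
			return cpu;

	return queueing_cpu; /* no housekeeping CPU configured in this model */
}
```

With this layout, model_target_cpu(MODEL_PER_CPU, 5) stays on isolated CPU 5, while model_target_cpu(MODEL_UNBOUND, 5) moves to housekeeping CPU 0.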
Cheers,
Longman
* Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
2024-07-25 17:02 ` Waiman Long
@ 2024-07-25 19:33 ` Neeraj Upadhyay
2024-07-25 19:52 ` Waiman Long
0 siblings, 1 reply; 9+ messages in thread
From: Neeraj Upadhyay @ 2024-07-25 19:33 UTC
To: Waiman Long
Cc: Paul E. McKenney, Frederic Weisbecker, Joel Fernandes,
Josh Triplett, Boqun Feng, Uladzislau Rezki, Steven Rostedt,
Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu, linux-kernel,
Vratislav Bendel
On Thu, Jul 25, 2024 at 01:02:01PM -0400, Waiman Long wrote:
> On 7/25/24 11:35, Neeraj Upadhyay wrote:
> > On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
> > > It was discovered that isolated CPUs could sometimes be disturbed by
> > > kworkers processing kfree_rcu() works causing higher than expected
> > > latency. It is because the RCU core uses "system_wq" which doesn't have
> > > the WQ_UNBOUND flag to handle all its work items. Fix this violation of
> > > latency limits by using "system_unbound_wq" in the RCU core instead.
> > > This will ensure that those work items will not be run on CPUs marked
> > > as isolated.
> > >
> > Alternative approach here could be, in case we want to keep per CPU worker
> > pools, define a wq with WQ_CPU_INTENSIVE flag. Are there cases where
> > WQ_CPU_INTENSIVE wq won't be sufficient for the problem this patch
> > is fixing?
>
> What exactly will we gain by defining a WQ_CPU_INTENSIVE workqueue? Or what
> will we lose by using system_unbound_wq? All the calls that are modified to
> use system_unbound_wq are using WORK_CPU_UNBOUND as their cpu. IOW, they
> doesn't care which CPUs are used to run the work items. The only downside I
> can see is the possible loss of some cache locality.
>
For the nohz_full case, where unbound pool workers run only on the
housekeeping CPU (cpu0), if multiple other CPUs are queuing work, the
execution of those work items could get delayed. However, this should
not generally happen, as the other CPUs would mostly be running in user mode.
> In fact, WQ_CPU_INTENSIVE can be considered a subset of WQ_UNBOUND. An
> WQ_UNBOUND workqueue will avoid using isolated CPUs, but not a
> WQ_CPU_INTENSIVE workqueue.
Got it, thanks!
I have picked the patch for further review and testing [1]
[1] https://git.kernel.org/pub/scm/linux/kernel/git/neeraj.upadhyay/linux-rcu.git/log/?h=next
- Neeraj
>
> Cheers,
> Longman
>
>
* Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
2024-07-25 19:33 ` Neeraj Upadhyay
@ 2024-07-25 19:52 ` Waiman Long
0 siblings, 0 replies; 9+ messages in thread
From: Waiman Long @ 2024-07-25 19:52 UTC
To: Neeraj Upadhyay
Cc: Paul E. McKenney, Frederic Weisbecker, Joel Fernandes,
Josh Triplett, Boqun Feng, Uladzislau Rezki, Steven Rostedt,
Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu, linux-kernel,
Vratislav Bendel
On 7/25/24 15:33, Neeraj Upadhyay wrote:
> On Thu, Jul 25, 2024 at 01:02:01PM -0400, Waiman Long wrote:
>> On 7/25/24 11:35, Neeraj Upadhyay wrote:
>>> On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
>>>> It was discovered that isolated CPUs could sometimes be disturbed by
>>>> kworkers processing kfree_rcu() works causing higher than expected
>>>> latency. It is because the RCU core uses "system_wq" which doesn't have
>>>> the WQ_UNBOUND flag to handle all its work items. Fix this violation of
>>>> latency limits by using "system_unbound_wq" in the RCU core instead.
>>>> This will ensure that those work items will not be run on CPUs marked
>>>> as isolated.
>>>>
>>> Alternative approach here could be, in case we want to keep per CPU worker
>>> pools, define a wq with WQ_CPU_INTENSIVE flag. Are there cases where
>>> WQ_CPU_INTENSIVE wq won't be sufficient for the problem this patch
>>> is fixing?
>> What exactly will we gain by defining a WQ_CPU_INTENSIVE workqueue? Or what
>> will we lose by using system_unbound_wq? All the calls that are modified to
>> use system_unbound_wq are using WORK_CPU_UNBOUND as their cpu. IOW, they
>> doesn't care which CPUs are used to run the work items. The only downside I
>> can see is the possible loss of some cache locality.
>>
> For the nohz_full case, where unbounded pool workers run only on housekeeping CPU
> (cpu0), if multiple other CPUs are queuing work, the execution of those
> works could get delayed. However, this should not generally happen as
> other CPUs would be mostly running in user mode.
Well, if there is only one housekeeping CPU, a lot of background kernel
tasks will be slowed down. Users should be careful about the proper
balance between the number of housekeeping and nohz-full CPUs.
>
>
>> In fact, WQ_CPU_INTENSIVE can be considered a subset of WQ_UNBOUND. An
>> WQ_UNBOUND workqueue will avoid using isolated CPUs, but not a
>> WQ_CPU_INTENSIVE workqueue.
> Got it, thanks!
>
> I have picked the patch for further review and testing [1]
>
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/neeraj.upadhyay/linux-rcu.git/log/?h=next
Thanks, let me know if you see any problem.
Cheers,
Longman
* Re: [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs
2024-07-24 13:30 ` Breno Leitao
2024-07-24 14:23 ` Uladzislau Rezki
@ 2024-07-29 3:06 ` Waiman Long
1 sibling, 0 replies; 9+ messages in thread
From: Waiman Long @ 2024-07-29 3:06 UTC
To: Breno Leitao
Cc: Paul E. McKenney, Frederic Weisbecker, Neeraj Upadhyay,
Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu,
linux-kernel, Vratislav Bendel
On 7/24/24 09:30, Breno Leitao wrote:
> On Tue, Jul 23, 2024 at 02:10:25PM -0400, Waiman Long wrote:
>> It was discovered that isolated CPUs could sometimes be disturbed by
>> kworkers processing kfree_rcu() works causing higher than expected
>> latency. It is because the RCU core uses "system_wq" which doesn't have
>> the WQ_UNBOUND flag to handle all its work items. Fix this violation of
>> latency limits by using "system_unbound_wq" in the RCU core instead.
>> This will ensure that those work items will not be run on CPUs marked
>> as isolated.
>>
>> Beside the WQ_UNBOUND flag, the other major difference between system_wq
>> and system_unbound_wq is their max_active count. The system_unbound_wq
>> has a max_active of WQ_MAX_ACTIVE (512) while system_wq's max_active
>> is WQ_DFL_ACTIVE (256) which is half of WQ_MAX_ACTIVE.
>>
>> Reported-by: Vratislav Bendel <vbendel@redhat.com>
> I've seen this problem a while ago and reported to the list:
>
> https://lore.kernel.org/all/Zp906X7VJGNKl5fW@gmail.com/
>
> I've just applied this test, and run my workload for 2 hours without
> hitting this issue. Thanks for solving it.
>
> Tested-by: Breno Leitao <leitao@debian.org>
Thanks for testing this patch. So it is just us that saw this problem.
Cheers,
Longman
Thread overview: 9+ messages
2024-07-23 18:10 [PATCH] rcu: Use system_unbound_wq to avoid disturbing isolated CPUs Waiman Long
2024-07-24 10:30 ` Uladzislau Rezki
2024-07-24 13:30 ` Breno Leitao
2024-07-24 14:23 ` Uladzislau Rezki
2024-07-29 3:06 ` Waiman Long
2024-07-25 15:35 ` Neeraj Upadhyay
2024-07-25 17:02 ` Waiman Long
2024-07-25 19:33 ` Neeraj Upadhyay
2024-07-25 19:52 ` Waiman Long