* btrfs thread_pool_size logic out of sync with workqueue
@ 2025-04-28 23:23 Junxuan Liao
From: Junxuan Liao @ 2025-04-28 23:23 UTC (permalink / raw)
To: Chris Mason, Josef Bacik, David Sterba, Tejun Heo, Lai Jiangshan,
linux-btrfs, linux-kernel
Hi all,
Commit 636b927eba5bc63375 ("workqueue: Make unbound workqueues to use
per-cpu pool_workqueues") makes max_active apply per CPU for unbound
workqueues as well, but thread_pool_size in btrfs_fs_info and the mount
option thread_pool still assume the limit is global.

E.g., the default value of 8 allows a total of 64 workers on an 8-CPU
machine.
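The arithmetic behind that example can be sketched as follows. This is an illustrative model only, not kernel code; `effective_max_workers` is a hypothetical helper, not a btrfs or workqueue function:

```python
def effective_max_workers(thread_pool_size: int, num_cpus: int,
                          per_cpu: bool) -> int:
    """Upper bound on concurrent workers for one workqueue.

    per_cpu=True models the behavior after commit 636b927eba5b, where
    max_active applied to each CPU independently; per_cpu=False models
    the global limit that btrfs's thread_pool sizing assumed.
    """
    if per_cpu:
        # Each CPU gets the full limit, so the bound multiplies.
        return thread_pool_size * num_cpus
    return thread_pool_size

# The report's example: thread_pool=8 on an 8-CPU machine.
assert effective_max_workers(8, 8, per_cpu=True) == 64
assert effective_max_workers(8, 8, per_cpu=False) == 8
```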
As far as I know, this means that on the Btrfs side we can no longer
control the concurrency level at the same granularity as before. We
should rewrite the auto-scaling logic, change the default
thread_pool_size value, and update the documentation.
Am I missing something?
--
Thanks,
Junxuan
* Re: btrfs thread_pool_size logic out of sync with workqueue
From: Tejun Heo @ 2025-04-29 18:12 UTC (permalink / raw)
To: Junxuan Liao
Cc: Chris Mason, Josef Bacik, David Sterba, Lai Jiangshan,
linux-btrfs, linux-kernel
On Mon, Apr 28, 2025 at 06:23:39PM -0500, Junxuan Liao wrote:
> Hi all,
>
> Commit 636b927eba5bc63375 ("workqueue: Make unbound workqueues to use
> per-cpu pool_workqueues") makes max_active apply per CPU for unbound
> workqueues as well, but thread_pool_size in btrfs_fs_info and the mount
> option thread_pool still assume the limit is global.
>
> E.g., the default value of 8 allows a total of 64 workers on an 8-CPU
> machine.
>
> As far as I know, this means that on the Btrfs side we can no longer
> control the concurrency level at the same granularity as before. We
> should rewrite the auto-scaling logic, change the default
> thread_pool_size value, and update the documentation.
>
> Am I missing something?
5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for
unbound workqueues") made max_active system-wide. The count is now split
across nodes in proportion to the number of CPUs each node has. This is
still different from the old behavior, where max_active was applied per
node, but there is no behavior change on single-node machines and the new
behavior is easier to work with.
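A rough model of that proportional split, for illustration only; the actual logic lives in kernel/workqueue.c and its exact rounding and per-node minimums differ from this sketch:

```python
def split_max_active(max_active: int, cpus_per_node: list[int]) -> list[int]:
    """Divide a system-wide max_active across NUMA nodes in proportion
    to each node's CPU count (illustrative; not the kernel's exact math)."""
    total_cpus = sum(cpus_per_node)
    # Each node gets its CPU-proportional share, with a floor of 1 so
    # no node is starved entirely.
    return [max(1, max_active * cpus // total_cpus) for cpus in cpus_per_node]

# Single-node machine: behavior is unchanged, the node gets the full limit.
assert split_max_active(8, [8]) == [8]
# Two equal nodes split the limit evenly.
assert split_max_active(8, [4, 4]) == [4, 4]
```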
Thanks.
--
tejun
* Re: btrfs thread_pool_size logic out of sync with workqueue
From: Junxuan Liao @ 2025-04-29 18:28 UTC (permalink / raw)
To: Tejun Heo
Cc: Chris Mason, Josef Bacik, David Sterba, Lai Jiangshan,
linux-btrfs, linux-kernel
On 4/29/25 1:12 PM, Tejun Heo wrote:
> 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for
> unbound workqueues") made max_active system-wide. The count is now split
> across nodes in proportion to the number of CPUs each node has. This is
> still different from the old behavior, where max_active was applied per
> node, but there is no behavior change on single-node machines and the new
> behavior is easier to work with.
Thanks! I missed that. workqueue.rst doesn't reflect the change though.
--
Junxuan
* Re: btrfs thread_pool_size logic out of sync with workqueue
From: Tejun Heo @ 2025-04-29 18:35 UTC (permalink / raw)
To: Junxuan Liao
Cc: Chris Mason, Josef Bacik, David Sterba, Lai Jiangshan,
linux-btrfs, linux-kernel
On Tue, Apr 29, 2025 at 01:28:46PM -0500, Junxuan Liao wrote:
>
>
> On 4/29/25 1:12 PM, Tejun Heo wrote:
> > 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for
> > unbound workqueues") made max_active system-wide. The count is now split
> > across nodes in proportion to the number of CPUs each node has. This is
> > still different from the old behavior, where max_active was applied per
> > node, but there is no behavior change on single-node machines and the new
> > behavior is easier to work with.
>
> Thanks! I missed that. workqueue.rst doesn't reflect the change though.
Yeah, I really should update it, but patches welcome. :-)
Thanks.
--
tejun
* Re: btrfs thread_pool_size logic out of sync with workqueue
From: Junxuan Liao @ 2025-04-29 21:15 UTC (permalink / raw)
To: Tejun Heo
Cc: Chris Mason, Josef Bacik, David Sterba, Lai Jiangshan,
linux-btrfs, linux-kernel
> 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for
> unbound workqueues") made max_active system-wide.
So versions from 6.6 to 6.8 do have the bug, right? I guess the
performance regression isn't easy to trigger so no one noticed and
reported it.
--
Thanks,
Junxuan
* Re: btrfs thread_pool_size logic out of sync with workqueue
From: Junxuan Liao @ 2025-04-29 21:26 UTC (permalink / raw)
To: Tejun Heo
Cc: Chris Mason, Josef Bacik, David Sterba, Lai Jiangshan,
linux-btrfs, linux-kernel
On 4/29/25 4:15 PM, Junxuan Liao wrote:
> So versions from 6.6 to 6.8 do have the bug, right? I guess the
> performance regression isn't easy to trigger so no one noticed and
> reported it.
>
My bad. I missed that it had already been reported in 2023.
https://lore.kernel.org/all/dbu6wiwu3sdhmhikb2w6lns7b27gbobfavhjj57kwi2quafgwl@htjcc5oikcr3/
--
Junxuan
* Re: btrfs thread_pool_size logic out of sync with workqueue
From: Tejun Heo @ 2025-04-30 18:46 UTC (permalink / raw)
To: Junxuan Liao
Cc: Chris Mason, Josef Bacik, David Sterba, Lai Jiangshan,
linux-btrfs, linux-kernel
On Tue, Apr 29, 2025 at 04:26:52PM -0500, Junxuan Liao wrote:
> On 4/29/25 4:15 PM, Junxuan Liao wrote:
> > So versions from 6.6 to 6.8 do have the bug, right? I guess the
> > performance regression isn't easy to trigger so no one noticed and
> > reported it.
>
> My bad. I missed that it had already been reported in 2023.
> https://lore.kernel.org/all/dbu6wiwu3sdhmhikb2w6lns7b27gbobfavhjj57kwi2quafgwl@htjcc5oikcr3/
Yeah, the changes were too invasive to backport through -stable, especially
at the time, as I didn't know how well the new scheme would work. There have
been some updates since, but it seems to have held up fine, so if some
distros want to backport them, that should be pretty safe now. However, I
think the window has already passed for -stable backports.
Thanks.
--
tejun