Netdev List
* [PATCH net-next 0/2] net: Replace system_unbound_wq with system_dfl_wq
@ 2026-05-11 13:47 Marco Crivellari
  2026-05-11 13:47 ` [PATCH net-next 1/2] ipmr: Replace use of " Marco Crivellari
  2026-05-11 13:47 ` [PATCH net-next 2/2] ipvs: " Marco Crivellari
  0 siblings, 2 replies; 5+ messages in thread
From: Marco Crivellari @ 2026-05-11 13:47 UTC
  To: linux-kernel, netdev
  Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko,
	Simon Horman, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, David Ahern, Florian Westphal, Ido Schimmel,
	Julian Anastasov, Pablo Neira Ayuso, Phil Sutter, Simon Horman

Hi,

=== Current situation: problems ===

Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask
is set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is
selected.

This leads to different scenarios when a work item is scheduled on an
isolated CPU, depending on whether the "delay" value is 0 or greater
than 0:

        schedule_delayed_work(, 0);

This will be handled by __queue_work(), which will queue the work item on
the current local (isolated) CPU, while:

        schedule_delayed_work(, 1);

will move the timer to a housekeeping CPU and schedule the work there.

Currently, if a user enqueues a work item using schedule_delayed_work(),
the wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

=== Changes to the WQ API ===

This series follows up on the recent changes to the workqueue API:

- commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
- commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The old workqueues will be removed in a future release cycle.

=== Changes introduced by this series ===

1) [P 1-2]  Replace uses of system_unbound_wq with system_dfl_wq

No behavioral changes are introduced.


Thanks!

Marco Crivellari (2):
  ipmr: Replace use of system_unbound_wq with system_dfl_wq
  ipvs: Replace use of system_unbound_wq with system_dfl_wq

 net/ipv4/ipmr_base.c            |  2 +-
 net/netfilter/ipvs/ip_vs_conn.c |  4 ++--
 net/netfilter/ipvs/ip_vs_ctl.c  | 10 +++++-----
 3 files changed, 8 insertions(+), 8 deletions(-)

-- 
2.54.0



* [PATCH net-next 1/2] ipmr: Replace use of system_unbound_wq with system_dfl_wq
  2026-05-11 13:47 [PATCH net-next 0/2] net: Replace system_unbound_wq with system_dfl_wq Marco Crivellari
@ 2026-05-11 13:47 ` Marco Crivellari
  2026-05-11 13:47 ` [PATCH net-next 2/2] ipvs: " Marco Crivellari
  1 sibling, 0 replies; 5+ messages in thread
From: Marco Crivellari @ 2026-05-11 13:47 UTC
  To: linux-kernel, netdev
  Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko,
	Simon Horman, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, David Ahern, Ido Schimmel, Simon Horman

This patch continues the effort to refactor workqueue APIs, which has begun
with the changes introducing new workqueues and a new alloc_workqueue flag:

   commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
   commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.

Before that can happen, workqueue users must be converted to the
better-named new workqueues with no intended behaviour changes:

   system_wq -> system_percpu_wq
   system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.

Cc: David Ahern <dsahern@kernel.org>
Cc: Ido Schimmel <idosch@nvidia.com>
Cc: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
 net/ipv4/ipmr_base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/ipmr_base.c b/net/ipv4/ipmr_base.c
index 3930d612c3de..867b24beded1 100644
--- a/net/ipv4/ipmr_base.c
+++ b/net/ipv4/ipmr_base.c
@@ -39,7 +39,7 @@ static void __mr_free_table(struct work_struct *work)
 
 void mr_table_free(struct mr_table *mrt)
 {
-	queue_rcu_work(system_unbound_wq, &mrt->work);
+	queue_rcu_work(system_dfl_wq, &mrt->work);
 }
 
 struct mr_table *
-- 
2.54.0



* [PATCH net-next 2/2] ipvs: Replace use of system_unbound_wq with system_dfl_wq
  2026-05-11 13:47 [PATCH net-next 0/2] net: Replace system_unbound_wq with system_dfl_wq Marco Crivellari
  2026-05-11 13:47 ` [PATCH net-next 1/2] ipmr: Replace use of " Marco Crivellari
@ 2026-05-11 13:47 ` Marco Crivellari
  2026-05-12  4:22   ` Julian Anastasov
  1 sibling, 1 reply; 5+ messages in thread
From: Marco Crivellari @ 2026-05-11 13:47 UTC
  To: linux-kernel, netdev
  Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Marco Crivellari, Michal Hocko,
	Simon Horman, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, Julian Anastasov, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, lvs-devel, netfilter-devel,
	coreteam

This patch continues the effort to refactor workqueue APIs, which has begun
with the changes introducing new workqueues and a new alloc_workqueue flag:

   commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
   commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

The point of the refactoring is to eventually alter the default behavior of
workqueues to become unbound by default so that their workload placement is
optimized by the scheduler.

Before that can happen, workqueue users must be converted to the
better-named new workqueues with no intended behaviour changes:

   system_wq -> system_percpu_wq
   system_unbound_wq -> system_dfl_wq

This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
removed in the future.

Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: Phil Sutter <phil@nwl.cc>
Cc: lvs-devel@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
 net/netfilter/ipvs/ip_vs_conn.c |  4 ++--
 net/netfilter/ipvs/ip_vs_ctl.c  | 10 +++++-----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 9ea6b4fa78bf..2625c0379556 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -285,7 +285,7 @@ static inline int ip_vs_conn_hash(struct ip_vs_conn *cp)
 	/* Schedule resizing if load increases */
 	if (atomic_read(&ipvs->conn_count) > t->u_thresh &&
 	    !test_and_set_bit(IP_VS_WORK_CONN_RESIZE, &ipvs->work_flags))
-		mod_delayed_work(system_unbound_wq, &ipvs->conn_resize_work, 0);
+		mod_delayed_work(system_dfl_wq, &ipvs->conn_resize_work, 0);
 
 	return ret;
 }
@@ -916,7 +916,7 @@ static void conn_resize_work_handler(struct work_struct *work)
 
 out:
 	/* Monitor if we need to shrink table */
-	queue_delayed_work(system_unbound_wq, &ipvs->conn_resize_work,
+	queue_delayed_work(system_dfl_wq, &ipvs->conn_resize_work,
 			   more_work ? 1 : 2 * HZ);
 }
 
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index c7c7f6a7a9f6..f8fe1c8981d8 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -800,7 +800,7 @@ static void svc_resize_work_handler(struct work_struct *work)
 	if (!READ_ONCE(ipvs->enable) || !more_work ||
 	    test_bit(IP_VS_WORK_SVC_NORESIZE, &ipvs->work_flags))
 		return;
-	queue_delayed_work(system_unbound_wq, &ipvs->svc_resize_work, 1);
+	queue_delayed_work(system_dfl_wq, &ipvs->svc_resize_work, 1);
 }
 
 static inline void
@@ -1833,7 +1833,7 @@ ip_vs_add_service(struct netns_ipvs *ipvs, struct ip_vs_service_user_kern *u,
 	/* Schedule resize work */
 	if (t && ip_vs_get_num_services(ipvs) > t->u_thresh &&
 	    !test_and_set_bit(IP_VS_WORK_SVC_RESIZE, &ipvs->work_flags))
-		queue_delayed_work(system_unbound_wq, &ipvs->svc_resize_work,
+		queue_delayed_work(system_dfl_wq, &ipvs->svc_resize_work,
 				   1);
 
 	*svc_p = svc;
@@ -2078,7 +2078,7 @@ static int ip_vs_del_service(struct ip_vs_service *svc)
 	} else if (ns <= t->l_thresh &&
 		   !test_and_set_bit(IP_VS_WORK_SVC_RESIZE,
 				     &ipvs->work_flags)) {
-		queue_delayed_work(system_unbound_wq, &ipvs->svc_resize_work,
+		queue_delayed_work(system_dfl_wq, &ipvs->svc_resize_work,
 				   1);
 	}
 	return 0;
@@ -2511,7 +2511,7 @@ static int ipvs_proc_conn_lfactor(const struct ctl_table *table, int write,
 		} else {
 			WRITE_ONCE(*valp, val);
 			if (rcu_access_pointer(ipvs->conn_tab))
-				mod_delayed_work(system_unbound_wq,
+				mod_delayed_work(system_dfl_wq,
 						 &ipvs->conn_resize_work, 0);
 		}
 	}
@@ -2543,7 +2543,7 @@ static int ipvs_proc_svc_lfactor(const struct ctl_table *table, int write,
 			    READ_ONCE(ipvs->enable) &&
 			    !test_bit(IP_VS_WORK_SVC_NORESIZE,
 				      &ipvs->work_flags))
-				mod_delayed_work(system_unbound_wq,
+				mod_delayed_work(system_dfl_wq,
 						 &ipvs->svc_resize_work, 0);
 			mutex_unlock(&ipvs->service_mutex);
 		}
-- 
2.54.0



* Re: [PATCH net-next 2/2] ipvs: Replace use of system_unbound_wq with system_dfl_wq
  2026-05-11 13:47 ` [PATCH net-next 2/2] ipvs: " Marco Crivellari
@ 2026-05-12  4:22   ` Julian Anastasov
  2026-05-12  7:36     ` Marco Crivellari
  0 siblings, 1 reply; 5+ messages in thread
From: Julian Anastasov @ 2026-05-12  4:22 UTC
  To: Marco Crivellari
  Cc: linux-kernel, netdev, Tejun Heo, Lai Jiangshan,
	Frederic Weisbecker, Sebastian Andrzej Siewior, Michal Hocko,
	Simon Horman, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	lvs-devel, netfilter-devel, coreteam


	Hello,

On Mon, 11 May 2026, Marco Crivellari wrote:

> This patch continues the effort to refactor workqueue APIs, which has begun
> with the changes introducing new workqueues and a new alloc_workqueue flag:
> 
>    commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
>    commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
> 
> The point of the refactoring is to eventually alter the default behavior of
> workqueues to become unbound by default so that their workload placement is
> optimized by the scheduler.
> 
> Before that can happen, workqueue users must be converted to the
> better-named new workqueues with no intended behaviour changes:
> 
>    system_wq -> system_percpu_wq
>    system_unbound_wq -> system_dfl_wq
> 
> This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
> removed in the future.
> 
> Cc: Julian Anastasov <ja@ssi.bg>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: Florian Westphal <fw@strlen.de>
> Cc: Phil Sutter <phil@nwl.cc>
> Cc: lvs-devel@vger.kernel.org
> Cc: netfilter-devel@vger.kernel.org
> Cc: coreteam@netfilter.org
> Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>

	Sorry that this change was delayed, but there have been
many changes in IPVS over the last month. The latest one that may
delay this patch is:

v3 of "ipvs: avoid possible loop in ip_vs_dst_event on resizing"
https://lore.kernel.org/lvs-devel/20260510104605.24218-1-ja@ssi.bg/T/#u

	Maybe we have to wait for this change to reach net and
net-next. Also, we can reconsider which queue to use: these work items
resize hash tables and call synchronize_rcu(). Should we switch to
system_dfl_long_wq if such a job is considered "long"?

> ---
>  net/netfilter/ipvs/ip_vs_conn.c |  4 ++--
>  net/netfilter/ipvs/ip_vs_ctl.c  | 10 +++++-----
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
> index 9ea6b4fa78bf..2625c0379556 100644
> --- a/net/netfilter/ipvs/ip_vs_conn.c
> +++ b/net/netfilter/ipvs/ip_vs_conn.c
> @@ -285,7 +285,7 @@ static inline int ip_vs_conn_hash(struct ip_vs_conn *cp)
>  	/* Schedule resizing if load increases */
>  	if (atomic_read(&ipvs->conn_count) > t->u_thresh &&
>  	    !test_and_set_bit(IP_VS_WORK_CONN_RESIZE, &ipvs->work_flags))
> -		mod_delayed_work(system_unbound_wq, &ipvs->conn_resize_work, 0);
> +		mod_delayed_work(system_dfl_wq, &ipvs->conn_resize_work, 0);
>  
>  	return ret;
>  }
> @@ -916,7 +916,7 @@ static void conn_resize_work_handler(struct work_struct *work)
>  
>  out:
>  	/* Monitor if we need to shrink table */
> -	queue_delayed_work(system_unbound_wq, &ipvs->conn_resize_work,
> +	queue_delayed_work(system_dfl_wq, &ipvs->conn_resize_work,
>  			   more_work ? 1 : 2 * HZ);
>  }
>  
> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
> index c7c7f6a7a9f6..f8fe1c8981d8 100644
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
> @@ -800,7 +800,7 @@ static void svc_resize_work_handler(struct work_struct *work)
>  	if (!READ_ONCE(ipvs->enable) || !more_work ||
>  	    test_bit(IP_VS_WORK_SVC_NORESIZE, &ipvs->work_flags))
>  		return;
> -	queue_delayed_work(system_unbound_wq, &ipvs->svc_resize_work, 1);
> +	queue_delayed_work(system_dfl_wq, &ipvs->svc_resize_work, 1);
>  }
>  
>  static inline void
> @@ -1833,7 +1833,7 @@ ip_vs_add_service(struct netns_ipvs *ipvs, struct ip_vs_service_user_kern *u,
>  	/* Schedule resize work */
>  	if (t && ip_vs_get_num_services(ipvs) > t->u_thresh &&
>  	    !test_and_set_bit(IP_VS_WORK_SVC_RESIZE, &ipvs->work_flags))
> -		queue_delayed_work(system_unbound_wq, &ipvs->svc_resize_work,
> +		queue_delayed_work(system_dfl_wq, &ipvs->svc_resize_work,
>  				   1);
>  
>  	*svc_p = svc;
> @@ -2078,7 +2078,7 @@ static int ip_vs_del_service(struct ip_vs_service *svc)
>  	} else if (ns <= t->l_thresh &&
>  		   !test_and_set_bit(IP_VS_WORK_SVC_RESIZE,
>  				     &ipvs->work_flags)) {
> -		queue_delayed_work(system_unbound_wq, &ipvs->svc_resize_work,
> +		queue_delayed_work(system_dfl_wq, &ipvs->svc_resize_work,
>  				   1);
>  	}
>  	return 0;
> @@ -2511,7 +2511,7 @@ static int ipvs_proc_conn_lfactor(const struct ctl_table *table, int write,
>  		} else {
>  			WRITE_ONCE(*valp, val);
>  			if (rcu_access_pointer(ipvs->conn_tab))
> -				mod_delayed_work(system_unbound_wq,
> +				mod_delayed_work(system_dfl_wq,
>  						 &ipvs->conn_resize_work, 0);
>  		}
>  	}
> @@ -2543,7 +2543,7 @@ static int ipvs_proc_svc_lfactor(const struct ctl_table *table, int write,
>  			    READ_ONCE(ipvs->enable) &&
>  			    !test_bit(IP_VS_WORK_SVC_NORESIZE,
>  				      &ipvs->work_flags))
> -				mod_delayed_work(system_unbound_wq,
> +				mod_delayed_work(system_dfl_wq,
>  						 &ipvs->svc_resize_work, 0);
>  			mutex_unlock(&ipvs->service_mutex);
>  		}
> -- 
> 2.54.0

Regards

--
Julian Anastasov <ja@ssi.bg>



* Re: [PATCH net-next 2/2] ipvs: Replace use of system_unbound_wq with system_dfl_wq
  2026-05-12  4:22   ` Julian Anastasov
@ 2026-05-12  7:36     ` Marco Crivellari
  0 siblings, 0 replies; 5+ messages in thread
From: Marco Crivellari @ 2026-05-12  7:36 UTC
  To: Julian Anastasov
  Cc: linux-kernel, netdev, Tejun Heo, Lai Jiangshan,
	Frederic Weisbecker, Sebastian Andrzej Siewior, Michal Hocko,
	Simon Horman, Eric Dumazet, David S . Miller, Jakub Kicinski,
	Paolo Abeni, Pablo Neira Ayuso, Florian Westphal, Phil Sutter,
	lvs-devel, netfilter-devel, coreteam

On Tue, May 12, 2026 at 6:22 AM Julian Anastasov <ja@ssi.bg> wrote:
> [...]
>         Sorry that this change was delayed, but there have been
> many changes in IPVS over the last month. The latest one that may
> delay this patch is:
>
> v3 of "ipvs: avoid possible loop in ip_vs_dst_event on resizing"
> https://lore.kernel.org/lvs-devel/20260510104605.24218-1-ja@ssi.bg/T/#u
>
>         Maybe we have to wait for this change to reach net and
> net-next. Also, we can reconsider which queue to use: these work items
> resize hash tables and call synchronize_rcu(). Should we switch to
> system_dfl_long_wq if such a job is considered "long"?

Hello Julian,

Thanks for letting me know.

Yes, if it is considered long we can switch to the long unbound version.
I will prepare the v2 with this change.

Thanks!

--

Marco Crivellari

SUSE Labs
