public inbox for netdev@vger.kernel.org
* [PATCH 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask
@ 2026-03-24 15:18 Waiman Long
  2026-03-24 15:18 ` [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
  2026-03-24 15:18 ` [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
  0 siblings, 2 replies; 5+ messages in thread
From: Waiman Long @ 2026-03-24 15:18 UTC
  To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
	Phil Auld
  Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks, Waiman Long

Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
affinity management"), the HK_TYPE_KTHREAD housekeeping cpumask may no
longer reflect the actual CPU affinity of kthreads that have no
predefined CPU affinity. As the ipvs networking code is still using
HK_TYPE_KTHREAD, we need to make HK_TYPE_KTHREAD match the current
kthread behavior.

This patch series makes HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
and uses RCU to protect access to the HK_TYPE_KTHREAD housekeeping
cpumask.

Waiman Long (2):
  sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
  ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU

 include/linux/sched/isolation.h |  6 +++++-
 include/net/ip_vs.h             | 20 ++++++++++++++++----
 net/netfilter/ipvs/ip_vs_ctl.c  | 13 ++++++++-----
 3 files changed, 29 insertions(+), 10 deletions(-)

-- 
2.53.0



* [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
  2026-03-24 15:18 [PATCH 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Waiman Long
@ 2026-03-24 15:18 ` Waiman Long
  2026-03-24 18:59   ` David Dull
  2026-03-24 15:18 ` [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
  1 sibling, 1 reply; 5+ messages in thread
From: Waiman Long @ 2026-03-24 15:18 UTC
  To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
	Phil Auld
  Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks, Waiman Long

Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
affinity management"), kthreads default to using the HK_TYPE_DOMAIN
cpumask. IOW, they are no longer affected by the setting of the
nohz_full boot kernel parameter.

That means HK_TYPE_KTHREAD should now be an alias of HK_TYPE_DOMAIN
instead of HK_TYPE_KERNEL_NOISE to correctly reflect the current kthread
behavior. Make the change as HK_TYPE_KTHREAD is still being used in
some networking code.

Fixes: 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/sched/isolation.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index dc3975ff1b2e..cf0fd03dd7a2 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -20,6 +20,11 @@ enum hk_type {
 	HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_MAX,
 
+	/*
+	 * HK_TYPE_KTHREAD is now an alias of HK_TYPE_DOMAIN
+	 */
+	HK_TYPE_KTHREAD = HK_TYPE_DOMAIN,
+
 	/*
 	 * The following housekeeping types are only set by the nohz_full
 	 * boot commandline option. So they can share the same value.
@@ -29,7 +34,6 @@ enum hk_type {
 	HK_TYPE_RCU     = HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_MISC    = HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_WQ      = HK_TYPE_KERNEL_NOISE,
-	HK_TYPE_KTHREAD = HK_TYPE_KERNEL_NOISE
 };
 
 #ifdef CONFIG_CPU_ISOLATION
-- 
2.53.0



* [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
  2026-03-24 15:18 [PATCH 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Waiman Long
  2026-03-24 15:18 ` [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
@ 2026-03-24 15:18 ` Waiman Long
  2026-03-26  8:32   ` Julian Anastasov
  1 sibling, 1 reply; 5+ messages in thread
From: Waiman Long @ 2026-03-24 15:18 UTC
  To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
	Phil Auld
  Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks, Waiman Long

The ip_vs_ctl.c file and the associated ip_vs.h file are the only places
in the kernel where the HK_TYPE_KTHREAD cpumask is retrieved and used.
Now that the HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask can be changed at
run time, we need to use RCU to guard access to this cpumask to avoid a
potential UAF problem, as the returned cpumask may be freed before it
is used.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/net/ip_vs.h            | 20 ++++++++++++++++----
 net/netfilter/ipvs/ip_vs_ctl.c | 13 ++++++++-----
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 29a36709e7f3..17c85a575ef4 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1155,7 +1155,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
 	return ipvs->sysctl_run_estimation;
 }
 
-static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
+static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
 {
 	if (ipvs->est_cpulist_valid)
 		return ipvs->sysctl_est_cpulist;
@@ -1273,7 +1273,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
 	return 1;
 }
 
-static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
+static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
 {
 	return housekeeping_cpumask(HK_TYPE_KTHREAD);
 }
@@ -1290,6 +1290,18 @@ static inline int sysctl_est_nice(struct netns_ipvs *ipvs)
 
 #endif
 
+static inline bool sysctl_est_cpulist_empty(struct netns_ipvs *ipvs)
+{
+	guard(rcu)();
+	return cpumask_empty(__sysctl_est_cpulist(ipvs));
+}
+
+static inline unsigned int sysctl_est_cpulist_weight(struct netns_ipvs *ipvs)
+{
+	guard(rcu)();
+	return cpumask_weight(__sysctl_est_cpulist(ipvs));
+}
+
 /* IPVS core functions
  * (from ip_vs_core.c)
  */
@@ -1604,7 +1616,7 @@ static inline void ip_vs_est_stopped_recalc(struct netns_ipvs *ipvs)
 	/* Stop tasks while cpulist is empty or if disabled with flag */
 	ipvs->est_stopped = !sysctl_run_estimation(ipvs) ||
 			    (ipvs->est_cpulist_valid &&
-			     cpumask_empty(sysctl_est_cpulist(ipvs)));
+			     sysctl_est_cpulist_empty(ipvs));
 #endif
 }
 
@@ -1620,7 +1632,7 @@ static inline bool ip_vs_est_stopped(struct netns_ipvs *ipvs)
 static inline int ip_vs_est_max_threads(struct netns_ipvs *ipvs)
 {
 	unsigned int limit = IPVS_EST_CPU_KTHREADS *
-			     cpumask_weight(sysctl_est_cpulist(ipvs));
+			     sysctl_est_cpulist_weight(ipvs);
 
 	return max(1U, limit);
 }
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 35642de2a0fe..f38a2e2a9dc5 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -1973,11 +1973,14 @@ static int ipvs_proc_est_cpumask_get(const struct ctl_table *table,
 
 	mutex_lock(&ipvs->est_mutex);
 
-	if (ipvs->est_cpulist_valid)
-		mask = *valp;
-	else
-		mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
-	ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
+	/* HK_TYPE_KTHREAD cpumask needs RCU protection */
+	scoped_guard(rcu) {
+		if (ipvs->est_cpulist_valid)
+			mask = *valp;
+		else
+			mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
+		ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
+	}
 
 	mutex_unlock(&ipvs->est_mutex);
 
-- 
2.53.0



* Re: [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
  2026-03-24 15:18 ` [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
@ 2026-03-24 18:59   ` David Dull
  0 siblings, 0 replies; 5+ messages in thread
From: David Dull @ 2026-03-24 18:59 UTC
  To: longman
  Cc: linux-kernel, netdev, lvs-devel, linux-sched, kuba, pabeni, horms,
	David Dull

I went through the series.

Patch 1 correctly updates HK_TYPE_KTHREAD to reflect the post-041ee6f3727a
behavior where kthreads follow HK_TYPE_DOMAIN. Keeping it mapped to
HK_TYPE_KERNEL_NOISE is no longer accurate, so the alias is the right fix.
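
To make the aliasing concrete, here is a minimal userspace sketch. It is
not the kernel code: the enum is a trimmed, hypothetical copy of
enum hk_type, and hk_masks[] and hk_mask() are made-up names used only
for illustration. It shows that an alias declared after HK_TYPE_MAX
reuses an existing slot, so a lookup through HK_TYPE_KTHREAD lands on
the HK_TYPE_DOMAIN entry.

```c
#include <assert.h>

/* Trimmed, hypothetical copy of enum hk_type; not the kernel's full list. */
enum hk_type {
	HK_TYPE_DOMAIN,
	HK_TYPE_KERNEL_NOISE,
	HK_TYPE_MAX,

	/* Aliases placed after HK_TYPE_MAX reuse a slot instead of adding one. */
	HK_TYPE_KTHREAD = HK_TYPE_DOMAIN,
	HK_TYPE_WQ      = HK_TYPE_KERNEL_NOISE,
};

/* The alias must index an existing slot of the per-type table. */
static_assert(HK_TYPE_KTHREAD < HK_TYPE_MAX, "alias must index an existing slot");

/* Made-up per-type mask table, one word per real housekeeping type. */
static unsigned long hk_masks[HK_TYPE_MAX] = {
	[HK_TYPE_DOMAIN]       = 0x0fUL,
	[HK_TYPE_KERNEL_NOISE] = 0x03UL,
};

/* Made-up accessor: a lookup through the alias hits the DOMAIN slot. */
static const unsigned long *hk_mask(enum hk_type type)
{
	return &hk_masks[type];
}
```

With this layout, hk_mask(HK_TYPE_KTHREAD) and hk_mask(HK_TYPE_DOMAIN)
return the same pointer, which is the property the alias is meant to
provide.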

Patch 2 properly introduces RCU protection for accessing the
HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask in IPVS. Since the mask can now
change at runtime, guarding against potential UAF is required. The
changes are minimal and the helper wrappers keep call sites clean.
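
For readers less familiar with the guard() style used by the wrappers,
the following is a rough userspace sketch of the scoping behavior only.
It assumes a GCC/Clang compiler (it relies on the cleanup attribute) and
replaces RCU with a plain nesting counter; every mock_* name and
MOCK_GUARD_RCU() are hypothetical and exist only in this example. It
provides no real RCU protection. It just shows that the "read-side
section" is entered at the declaration and left automatically when the
function returns, after the return value has been computed.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock nesting counter standing in for an RCU read-side section. */
static int mock_rcu_depth;
/* Depth observed while the mask was being read, for inspection. */
static int mock_depth_seen;

static void mock_rcu_read_lock(void)
{
	mock_rcu_depth++;
}

static void mock_rcu_read_unlock(void)
{
	assert(mock_rcu_depth > 0);	/* unbalanced unlock would be a bug */
	mock_rcu_depth--;
}

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void mock_guard_exit(int *unused)
{
	(void)unused;
	mock_rcu_read_unlock();
}

/* Hypothetical stand-in for guard(rcu)(): lock now, unlock at scope exit. */
#define MOCK_GUARD_RCU() \
	int mock_guard_token __attribute__((cleanup(mock_guard_exit), unused)) = \
		(mock_rcu_read_lock(), 0)

/* Made-up mask that, in the real code, could be replaced at run time. */
static unsigned long mock_mask = 0x0fUL;

static bool mock_cpulist_empty(void)
{
	MOCK_GUARD_RCU();
	/* The section is still held while the return value is evaluated. */
	return mock_mask == 0;
}

static unsigned int mock_cpulist_weight(void)
{
	MOCK_GUARD_RCU();
	mock_depth_seen = mock_rcu_depth;
	return (unsigned int)__builtin_popcountl(mock_mask);
}
```

The point of wrapping the mask access in a helper, as the patch does, is
that the pointer never escapes the guarded scope: callers only see the
derived bool or count.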

Reviewed-by: David Dull <monderasdor@gmail.com>


* Re: [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
  2026-03-24 15:18 ` [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
@ 2026-03-26  8:32   ` Julian Anastasov
  0 siblings, 0 replies; 5+ messages in thread
From: Julian Anastasov @ 2026-03-26  8:32 UTC
  To: Waiman Long
  Cc: Simon Horman, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso, Florian Westphal,
	Phil Sutter, Frederic Weisbecker, Chen Ridong, Phil Auld,
	linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks


	Hello,

On Tue, 24 Mar 2026, Waiman Long wrote:

> The ip_vs_ctl.c file and the associated ip_vs.h file are the only places
> in the kernel where HK_TYPE_KTHREAD cpumask is being retrieved and used.
> Now that HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask can be changed at run
> time. We need to use RCU to guard access to this cpumask to avoid a
> potential UAF problem as the returned cpumask may be freed before it
> is being used.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  include/net/ip_vs.h            | 20 ++++++++++++++++----
>  net/netfilter/ipvs/ip_vs_ctl.c | 13 ++++++++-----
>  2 files changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index 29a36709e7f3..17c85a575ef4 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1155,7 +1155,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
>  	return ipvs->sysctl_run_estimation;
>  }
>  
> -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
> +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
>  {
>  	if (ipvs->est_cpulist_valid)
>  		return ipvs->sysctl_est_cpulist;
> @@ -1273,7 +1273,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
>  	return 1;
>  }
>  
> -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
> +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
>  {
>  	return housekeeping_cpumask(HK_TYPE_KTHREAD);
>  }
> @@ -1290,6 +1290,18 @@ static inline int sysctl_est_nice(struct netns_ipvs *ipvs)
>  
>  #endif
>  

	There may be a little fuzz here, due to the recent
changes in the nf-next tree. If this is a bugfix for the
missing RCU protection, maybe you should add a Fixes line too
and use the nf tree. There will probably be fuzz/collisions with
the changes in the nf-next tree...

> +static inline bool sysctl_est_cpulist_empty(struct netns_ipvs *ipvs)
> +{
> +	guard(rcu)();
> +	return cpumask_empty(__sysctl_est_cpulist(ipvs));
> +}
> +
> +static inline unsigned int sysctl_est_cpulist_weight(struct netns_ipvs *ipvs)
> +{
> +	guard(rcu)();
> +	return cpumask_weight(__sysctl_est_cpulist(ipvs));
> +}
> +
>  /* IPVS core functions
>   * (from ip_vs_core.c)
>   */
> @@ -1604,7 +1616,7 @@ static inline void ip_vs_est_stopped_recalc(struct netns_ipvs *ipvs)
>  	/* Stop tasks while cpulist is empty or if disabled with flag */
>  	ipvs->est_stopped = !sysctl_run_estimation(ipvs) ||
>  			    (ipvs->est_cpulist_valid &&
> -			     cpumask_empty(sysctl_est_cpulist(ipvs)));
> +			     sysctl_est_cpulist_empty(ipvs));
>  #endif
>  }
>  
> @@ -1620,7 +1632,7 @@ static inline bool ip_vs_est_stopped(struct netns_ipvs *ipvs)
>  static inline int ip_vs_est_max_threads(struct netns_ipvs *ipvs)
>  {
>  	unsigned int limit = IPVS_EST_CPU_KTHREADS *
> -			     cpumask_weight(sysctl_est_cpulist(ipvs));
> +			     sysctl_est_cpulist_weight(ipvs);
>  
>  	return max(1U, limit);
>  }
> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
> index 35642de2a0fe..f38a2e2a9dc5 100644
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
> @@ -1973,11 +1973,14 @@ static int ipvs_proc_est_cpumask_get(const struct ctl_table *table,
>  
>  	mutex_lock(&ipvs->est_mutex);
>  
> -	if (ipvs->est_cpulist_valid)
> -		mask = *valp;
> -	else
> -		mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
> -	ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
> +	/* HK_TYPE_KTHREAD cpumask needs RCU protection */

	Can we switch IPVS to use HK_TYPE_DOMAIN? The initial
intention was to follow the code in kthread.c. Then you can
reconsider whether HK_TYPE_KTHREAD should be an alias of
HK_TYPE_DOMAIN, maybe not if there are no other users...

> +	scoped_guard(rcu) {
> +		if (ipvs->est_cpulist_valid)
> +			mask = *valp;
> +		else
> +			mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
> +		ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
> +	}
>  
>  	mutex_unlock(&ipvs->est_mutex);
>  
> -- 
> 2.53.0

Regards

--
Julian Anastasov <ja@ssi.bg>

