public inbox for linux-kernel@vger.kernel.org
* [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask
@ 2026-03-31 16:50 Waiman Long
  2026-03-31 16:50 ` [PATCH-next v2 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Waiman Long @ 2026-03-31 16:50 UTC (permalink / raw)
  To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
	Phil Auld
  Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks, Waiman Long

 v2:
  - Rebased on top of linux-next

Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
affinity management"), the HK_TYPE_KTHREAD housekeeping cpumask may no
longer reflect the actual CPU affinity of kthreads that have no
predefined CPU affinity. As the ipvs networking code still uses
HK_TYPE_KTHREAD, we need to make HK_TYPE_KTHREAD match reality.

This patch series makes HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
and uses RCU to protect access to the HK_TYPE_KTHREAD housekeeping
cpumask.

Waiman Long (2):
  sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
  ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU

 include/linux/sched/isolation.h |  6 +++++-
 include/net/ip_vs.h             | 20 ++++++++++++++++----
 net/netfilter/ipvs/ip_vs_ctl.c  | 13 ++++++++-----
 3 files changed, 29 insertions(+), 10 deletions(-)

-- 
2.53.0



* [PATCH-next v2 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
  2026-03-31 16:50 [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Waiman Long
@ 2026-03-31 16:50 ` Waiman Long
  2026-04-01 12:43   ` Frederic Weisbecker
  2026-03-31 16:50 ` [PATCH-next v2 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
  2026-04-03 14:15 ` [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Julian Anastasov
  2 siblings, 1 reply; 9+ messages in thread
From: Waiman Long @ 2026-03-31 16:50 UTC (permalink / raw)
  To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
	Phil Auld
  Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks, Waiman Long

Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
affinity management"), kthreads default to the HK_TYPE_DOMAIN cpumask.
IOW, they are no longer affected by the setting of the nohz_full
kernel boot parameter.

That means HK_TYPE_KTHREAD should now be an alias of HK_TYPE_DOMAIN
instead of HK_TYPE_KERNEL_NOISE to correctly reflect the current kthread
behavior. Make the change as HK_TYPE_KTHREAD is still being used in
some networking code.

Fixes: 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/sched/isolation.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index dc3975ff1b2e..cf0fd03dd7a2 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -20,6 +20,11 @@ enum hk_type {
 	HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_MAX,
 
+	/*
+	 * HK_TYPE_KTHREAD is now an alias of HK_TYPE_DOMAIN
+	 */
+	HK_TYPE_KTHREAD = HK_TYPE_DOMAIN,
+
 	/*
 	 * The following housekeeping types are only set by the nohz_full
 	 * boot commandline option. So they can share the same value.
@@ -29,7 +34,6 @@ enum hk_type {
 	HK_TYPE_RCU     = HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_MISC    = HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_WQ      = HK_TYPE_KERNEL_NOISE,
-	HK_TYPE_KTHREAD = HK_TYPE_KERNEL_NOISE
 };
 
 #ifdef CONFIG_CPU_ISOLATION
-- 
2.53.0
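
For readers unfamiliar with the aliasing trick, here is a minimal userspace
sketch of it, not the kernel code. The enum is a simplified stand-in that
only uses names visible in the patch; demo_masks and demo_housekeeping_mask()
are hypothetical, and the mask values are made up. It shows that an alias
declared after HK_TYPE_MAX adds no table slot and that the lookup result
follows whichever base type the alias points at.

```c
#include <assert.h>

/* Simplified stand-in for enum hk_type; only names from the patch appear. */
enum hk_type {
	HK_TYPE_DOMAIN,
	HK_TYPE_KERNEL_NOISE,
	HK_TYPE_MAX,

	/* Aliases sit after HK_TYPE_MAX, so they add no table slots. */
	HK_TYPE_KTHREAD = HK_TYPE_DOMAIN,	/* was HK_TYPE_KERNEL_NOISE */
	HK_TYPE_WQ      = HK_TYPE_KERNEL_NOISE,
};

/* Made-up masks, one bit per CPU: CPUs 0-3 and CPUs 0-1. */
static const unsigned long demo_masks[HK_TYPE_MAX] = {
	[HK_TYPE_DOMAIN]       = 0x0fUL,
	[HK_TYPE_KERNEL_NOISE] = 0x03UL,
};

/* Hypothetical lookup, loosely modelled on housekeeping_cpumask(). */
static unsigned long demo_housekeeping_mask(enum hk_type type)
{
	return demo_masks[type];
}

/* Compile-time check that the alias resolves to the intended base type. */
static_assert(HK_TYPE_KTHREAD == HK_TYPE_DOMAIN, "alias must match");
```

With the alias moved, demo_housekeeping_mask(HK_TYPE_KTHREAD) returns the
HK_TYPE_DOMAIN entry rather than the HK_TYPE_KERNEL_NOISE one.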



* [PATCH-next v2 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
  2026-03-31 16:50 [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Waiman Long
  2026-03-31 16:50 ` [PATCH-next v2 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
@ 2026-03-31 16:50 ` Waiman Long
  2026-04-01 12:54   ` Frederic Weisbecker
  2026-04-03 14:15 ` [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Julian Anastasov
  2 siblings, 1 reply; 9+ messages in thread
From: Waiman Long @ 2026-03-31 16:50 UTC (permalink / raw)
  To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
	Phil Auld
  Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks, Waiman Long

The ip_vs_ctl.c file and the associated ip_vs.h file are the only places
in the kernel where the HK_TYPE_KTHREAD cpumask is retrieved and used.
Now that the HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask can be changed at run
time, we need to use RCU to guard access to this cpumask to avoid a
potential UAF problem, as the returned cpumask may be freed while it
is still in use.

We can replace HK_TYPE_KTHREAD by HK_TYPE_DOMAIN as they are aliases
of each other, but keeping the HK_TYPE_KTHREAD name can highlight the
fact that it is the kthread initiated by ipvs that is being controlled.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/net/ip_vs.h            | 20 ++++++++++++++++----
 net/netfilter/ipvs/ip_vs_ctl.c | 13 ++++++++-----
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 72d325c81313..7bda92fd3fe6 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1411,7 +1411,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
 	return ipvs->sysctl_run_estimation;
 }
 
-static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
+static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
 {
 	if (ipvs->est_cpulist_valid)
 		return ipvs->sysctl_est_cpulist;
@@ -1529,7 +1529,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
 	return 1;
 }
 
-static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
+static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
 {
 	return housekeeping_cpumask(HK_TYPE_KTHREAD);
 }
@@ -1564,6 +1564,18 @@ static inline int sysctl_svc_lfactor(struct netns_ipvs *ipvs)
 	return READ_ONCE(ipvs->sysctl_svc_lfactor);
 }
 
+static inline bool sysctl_est_cpulist_empty(struct netns_ipvs *ipvs)
+{
+	guard(rcu)();
+	return cpumask_empty(__sysctl_est_cpulist(ipvs));
+}
+
+static inline unsigned int sysctl_est_cpulist_weight(struct netns_ipvs *ipvs)
+{
+	guard(rcu)();
+	return cpumask_weight(__sysctl_est_cpulist(ipvs));
+}
+
 /* IPVS core functions
  * (from ip_vs_core.c)
  */
@@ -1895,7 +1907,7 @@ static inline void ip_vs_est_stopped_recalc(struct netns_ipvs *ipvs)
 	/* Stop tasks while cpulist is empty or if disabled with flag */
 	ipvs->est_stopped = !sysctl_run_estimation(ipvs) ||
 			    (ipvs->est_cpulist_valid &&
-			     cpumask_empty(sysctl_est_cpulist(ipvs)));
+			     sysctl_est_cpulist_empty(ipvs));
 #endif
 }
 
@@ -1911,7 +1923,7 @@ static inline bool ip_vs_est_stopped(struct netns_ipvs *ipvs)
 static inline int ip_vs_est_max_threads(struct netns_ipvs *ipvs)
 {
 	unsigned int limit = IPVS_EST_CPU_KTHREADS *
-			     cpumask_weight(sysctl_est_cpulist(ipvs));
+			     sysctl_est_cpulist_weight(ipvs);
 
 	return max(1U, limit);
 }
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 032425025d88..e253a1ceef48 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2338,11 +2338,14 @@ static int ipvs_proc_est_cpumask_get(const struct ctl_table *table,
 
 	mutex_lock(&ipvs->est_mutex);
 
-	if (ipvs->est_cpulist_valid)
-		mask = *valp;
-	else
-		mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
-	ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
+	/* HK_TYPE_KTHREAD cpumask needs RCU protection */
+	scoped_guard(rcu) {
+		if (ipvs->est_cpulist_valid)
+			mask = *valp;
+		else
+			mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
+		ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
+	}
 
 	mutex_unlock(&ipvs->est_mutex);
 
-- 
2.53.0
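
The guard(rcu)() helpers above rely on scope-based cleanup. Below is a rough
userspace sketch of that pattern, assuming a GCC or Clang compiler (it uses
__attribute__((cleanup)) and __builtin_popcountl). Every demo_ name is
hypothetical, and the "lock" is only a counter standing in for the RCU
read-side critical section. The point is that the mask is read and consumed
while the guard is held, and the unlock runs automatically at scope exit.

```c
#include <assert.h>

static int demo_rcu_depth;	/* mock read-side nesting counter */
static int demo_depth_seen;	/* depth observed while the mask was read */
static unsigned long demo_mask = 0x0fUL;	/* made-up mask: CPUs 0-3 */

static void demo_rcu_read_lock(void)   { demo_rcu_depth++; }
static void demo_rcu_read_unlock(void) { demo_rcu_depth--; }

/* Cleanup handler: invoked when the guard variable goes out of scope. */
static void demo_guard_exit(int *unused)
{
	(void)unused;
	demo_rcu_read_unlock();
}

/* Take the mock lock now, release it when the enclosing scope ends. */
#define DEMO_GUARD_RCU() \
	int demo_guard __attribute__((cleanup(demo_guard_exit), unused)) = \
		(demo_rcu_read_lock(), 0)

/* Mirrors the shape of sysctl_est_cpulist_weight() in the patch. */
static unsigned int demo_est_cpulist_weight(void)
{
	DEMO_GUARD_RCU();
	demo_depth_seen = demo_rcu_depth;	/* still inside the guard */
	return (unsigned int)__builtin_popcountl(demo_mask);
}

/* Small self-check: the guard is balanced once the function returns. */
static void demo_selfcheck(void)
{
	assert(demo_est_cpulist_weight() == 4);
	assert(demo_rcu_depth == 0);
}
```

The return value is computed before the cleanup handler runs, which is why
the helper can return a plain integer instead of the mask pointer itself.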



* Re: [PATCH-next v2 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
  2026-03-31 16:50 ` [PATCH-next v2 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
@ 2026-04-01 12:43   ` Frederic Weisbecker
  0 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2026-04-01 12:43 UTC (permalink / raw)
  To: Waiman Long
  Cc: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Chen Ridong, Phil Auld,
	linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks

On Tue, Mar 31, 2026 at 12:50:14PM -0400, Waiman Long wrote:
> Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
> affinity management"), kthreads default to use the HK_TYPE_DOMAIN
> cpumask. IOW, it is no longer affected by the setting of the nohz_full
> boot kernel parameter.
> 
> That means HK_TYPE_KTHREAD should now be an alias of HK_TYPE_DOMAIN
> instead of HK_TYPE_KERNEL_NOISE to correctly reflect the current kthread
> behavior. Make the change as HK_TYPE_KTHREAD is still being used in
> some networking code.
> 
> Fixes: 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management")
> Signed-off-by: Waiman Long <longman@redhat.com>

This makes ipvs_proc_est_cpumask_get() racy because, without the RCU read
lock held, the mask can now be freed at any point.

Other users:

sysctl_est_cpulist() -> ip_vs_est_stopped_recalc() -> ip_vs_est_reload_start()

Here sysctl_est_cpulist() is only invoked if ->est_cpulist_valid
(->est_mutex makes it stable). So housekeeping_cpumask() should not be called.

But ip_vs_est_max_threads() is more complicated. And it's a sign we should
probably call something like ipvs_proc_est_cpumask_set() when the
HK_TYPE_DOMAIN mask is modified (and ipvs->est_cpulist_valid is 0) in order
to update the ipvs kthreads accordingly.

Thanks.

-- 
Frederic Weisbecker
SUSE Labs


* Re: [PATCH-next v2 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
  2026-03-31 16:50 ` [PATCH-next v2 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
@ 2026-04-01 12:54   ` Frederic Weisbecker
  2026-04-01 15:13     ` Waiman Long
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2026-04-01 12:54 UTC (permalink / raw)
  To: Waiman Long
  Cc: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Chen Ridong, Phil Auld,
	linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks

On Tue, Mar 31, 2026 at 12:50:15PM -0400, Waiman Long wrote:
> The ip_vs_ctl.c file and the associated ip_vs.h file are the only places
> in the kernel where HK_TYPE_KTHREAD cpumask is being retrieved and used.
> Now that HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask can be changed at run
> time. We need to use RCU to guard access to this cpumask to avoid a
> potential UAF problem as the returned cpumask may be freed before it
> is being used.
> 
> We can replace HK_TYPE_KTHREAD by HK_TYPE_DOMAIN as they are aliases
> of each other, but keeping the HK_TYPE_KTHREAD name can highlight the
> fact that it is the kthread initiated by ipvs that is being controlled.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>

Oh I see you're handling a few concerns here. But it's too late, the previous
patch broke bisection.

> ---
>  include/net/ip_vs.h            | 20 ++++++++++++++++----
>  net/netfilter/ipvs/ip_vs_ctl.c | 13 ++++++++-----
>  2 files changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index 72d325c81313..7bda92fd3fe6 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1411,7 +1411,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
>  	return ipvs->sysctl_run_estimation;
>  }
>  
> -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
> +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
>  {
>  	if (ipvs->est_cpulist_valid)
>  		return ipvs->sysctl_est_cpulist;
> @@ -1529,7 +1529,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
>  	return 1;
>  }
>  
> -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
> +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
>  {
>  	return housekeeping_cpumask(HK_TYPE_KTHREAD);
>  }
> @@ -1564,6 +1564,18 @@ static inline int sysctl_svc_lfactor(struct netns_ipvs *ipvs)
>  	return READ_ONCE(ipvs->sysctl_svc_lfactor);
>  }
>  
> +static inline bool sysctl_est_cpulist_empty(struct netns_ipvs *ipvs)
> +{
> +	guard(rcu)();
> +	return cpumask_empty(__sysctl_est_cpulist(ipvs));
> +}
> +
> +static inline unsigned int sysctl_est_cpulist_weight(struct netns_ipvs *ipvs)
> +{
> +	guard(rcu)();
> +	return cpumask_weight(__sysctl_est_cpulist(ipvs));
> +}
> +
>  /* IPVS core functions
>   * (from ip_vs_core.c)
>   */
> @@ -1895,7 +1907,7 @@ static inline void ip_vs_est_stopped_recalc(struct netns_ipvs *ipvs)
>  	/* Stop tasks while cpulist is empty or if disabled with flag */
>  	ipvs->est_stopped = !sysctl_run_estimation(ipvs) ||
>  			    (ipvs->est_cpulist_valid &&
> -			     cpumask_empty(sysctl_est_cpulist(ipvs)));
> +			     sysctl_est_cpulist_empty(ipvs));

It's not needed: if ipvs->est_cpulist_valid is set, sysctl_est_cpulist()
doesn't refer to housekeeping.

>  #endif
>  }
>  
> @@ -1911,7 +1923,7 @@ static inline bool ip_vs_est_stopped(struct netns_ipvs *ipvs)
>  static inline int ip_vs_est_max_threads(struct netns_ipvs *ipvs)
>  {
>  	unsigned int limit = IPVS_EST_CPU_KTHREADS *
> -			     cpumask_weight(sysctl_est_cpulist(ipvs));
> +			     sysctl_est_cpulist_weight(ipvs);

That probably works for callers ip_vs_start_estimator().

But this is not handling the core issue that related kthreads should be updated,
as is done in ipvs_proc_est_cpumask_set(), when HK_TYPE_DOMAIN mask changes.

>  
>  	return max(1U, limit);
>  }
> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
> index 032425025d88..e253a1ceef48 100644
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
> @@ -2338,11 +2338,14 @@ static int ipvs_proc_est_cpumask_get(const struct ctl_table *table,
>  
>  	mutex_lock(&ipvs->est_mutex);
>  
> -	if (ipvs->est_cpulist_valid)
> -		mask = *valp;
> -	else
> -		mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
> -	ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
> +	/* HK_TYPE_KTHREAD cpumask needs RCU protection */
> +	scoped_guard(rcu) {
> +		if (ipvs->est_cpulist_valid)
> +			mask = *valp;
> +		else
> +			mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
> +		ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
> +	}

And that works.

Thanks.

>  
>  	mutex_unlock(&ipvs->est_mutex);
>  
> -- 
> 2.53.0
> 

-- 
Frederic Weisbecker
SUSE Labs


* Re: [PATCH-next v2 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
  2026-04-01 12:54   ` Frederic Weisbecker
@ 2026-04-01 15:13     ` Waiman Long
  0 siblings, 0 replies; 9+ messages in thread
From: Waiman Long @ 2026-04-01 15:13 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	Florian Westphal, Phil Sutter, Chen Ridong, Phil Auld,
	linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks

On 4/1/26 8:54 AM, Frederic Weisbecker wrote:
> On Tue, Mar 31, 2026 at 12:50:15PM -0400, Waiman Long wrote:
>> The ip_vs_ctl.c file and the associated ip_vs.h file are the only places
>> in the kernel where HK_TYPE_KTHREAD cpumask is being retrieved and used.
>> Now that HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask can be changed at run
>> time. We need to use RCU to guard access to this cpumask to avoid a
>> potential UAF problem as the returned cpumask may be freed before it
>> is being used.
>>
>> We can replace HK_TYPE_KTHREAD by HK_TYPE_DOMAIN as they are aliases
>> of each other, but keeping the HK_TYPE_KTHREAD name can highlight the
>> fact that it is the kthread initiated by ipvs that is being controlled.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
> Oh I see you're handling a few concerns here. But it's too late, the previous
> patch broke bisection.
Good point. So I have to either reverse the patch order, or just change 
HK_TYPE_KTHREAD to HK_TYPE_DOMAIN & drop the first one.
>
>> ---
>>   include/net/ip_vs.h            | 20 ++++++++++++++++----
>>   net/netfilter/ipvs/ip_vs_ctl.c | 13 ++++++++-----
>>   2 files changed, 24 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
>> index 72d325c81313..7bda92fd3fe6 100644
>> --- a/include/net/ip_vs.h
>> +++ b/include/net/ip_vs.h
>> @@ -1411,7 +1411,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
>>   	return ipvs->sysctl_run_estimation;
>>   }
>>   
>> -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
>> +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
>>   {
>>   	if (ipvs->est_cpulist_valid)
>>   		return ipvs->sysctl_est_cpulist;
>> @@ -1529,7 +1529,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
>>   	return 1;
>>   }
>>   
>> -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
>> +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
>>   {
>>   	return housekeeping_cpumask(HK_TYPE_KTHREAD);
>>   }
>> @@ -1564,6 +1564,18 @@ static inline int sysctl_svc_lfactor(struct netns_ipvs *ipvs)
>>   	return READ_ONCE(ipvs->sysctl_svc_lfactor);
>>   }
>>   
>> +static inline bool sysctl_est_cpulist_empty(struct netns_ipvs *ipvs)
>> +{
>> +	guard(rcu)();
>> +	return cpumask_empty(__sysctl_est_cpulist(ipvs));
>> +}
>> +
>> +static inline unsigned int sysctl_est_cpulist_weight(struct netns_ipvs *ipvs)
>> +{
>> +	guard(rcu)();
>> +	return cpumask_weight(__sysctl_est_cpulist(ipvs));
>> +}
>> +
>>   /* IPVS core functions
>>    * (from ip_vs_core.c)
>>    */
>> @@ -1895,7 +1907,7 @@ static inline void ip_vs_est_stopped_recalc(struct netns_ipvs *ipvs)
>>   	/* Stop tasks while cpulist is empty or if disabled with flag */
>>   	ipvs->est_stopped = !sysctl_run_estimation(ipvs) ||
>>   			    (ipvs->est_cpulist_valid &&
>> -			     cpumask_empty(sysctl_est_cpulist(ipvs)));
>> +			     sysctl_est_cpulist_empty(ipvs));
> It's not needed, if ipvs->est_cpulist_valid, sysctl_est_cpulist() doesn't
> refer to housekeeping.
Right.
>>   #endif
>>   }
>>   
>> @@ -1911,7 +1923,7 @@ static inline bool ip_vs_est_stopped(struct netns_ipvs *ipvs)
>>   static inline int ip_vs_est_max_threads(struct netns_ipvs *ipvs)
>>   {
>>   	unsigned int limit = IPVS_EST_CPU_KTHREADS *
>> -			     cpumask_weight(sysctl_est_cpulist(ipvs));
>> +			     sysctl_est_cpulist_weight(ipvs);
> That probably works for callers ip_vs_start_estimator().
>
> But this is not handling the core issue that related kthreads should be updated,
> as is done in ipvs_proc_est_cpumask_set(), when HK_TYPE_DOMAIN mask changes.

If ipvs_proc_est_cpumask_set() has been called, the real affinity should 
be the intersection of the given cpumask and HK_TYPE_DOMAIN cpumask. Is 
that what you are referring to?
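
As a rough illustration of that intersection (a userspace sketch only, with
64-bit masks standing in for struct cpumask, made-up mask values, and
hypothetical demo_ names):

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU: bit n set means CPU n is allowed. */
static uint64_t demo_effective_est_mask(uint64_t user_mask,
					uint64_t domain_mask)
{
	/* Keep only the CPUs present in both masks. */
	return user_mask & domain_mask;
}

/* Small self-check with made-up masks. */
static void demo_selfcheck(void)
{
	/* User asks for CPUs 2-5, domain mask allows CPUs 0-3. */
	assert(demo_effective_est_mask(0x3c, 0x0f) == 0x0c);
}
```

Note that the result can be empty when the two masks do not overlap, e.g.
demo_effective_est_mask(0xf0, 0x0f).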

Cheers,
Longman

>
>>   
>>   	return max(1U, limit);
>>   }
>> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
>> index 032425025d88..e253a1ceef48 100644
>> --- a/net/netfilter/ipvs/ip_vs_ctl.c
>> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
>> @@ -2338,11 +2338,14 @@ static int ipvs_proc_est_cpumask_get(const struct ctl_table *table,
>>   
>>   	mutex_lock(&ipvs->est_mutex);
>>   
>> -	if (ipvs->est_cpulist_valid)
>> -		mask = *valp;
>> -	else
>> -		mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
>> -	ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
>> +	/* HK_TYPE_KTHREAD cpumask needs RCU protection */
>> +	scoped_guard(rcu) {
>> +		if (ipvs->est_cpulist_valid)
>> +			mask = *valp;
>> +		else
>> +			mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
>> +		ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
>> +	}
> And that works.
>
> Thanks.
>
>>   
>>   	mutex_unlock(&ipvs->est_mutex);
>>   
>> -- 
>> 2.53.0
>>



* Re: [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask
  2026-03-31 16:50 [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Waiman Long
  2026-03-31 16:50 ` [PATCH-next v2 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
  2026-03-31 16:50 ` [PATCH-next v2 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
@ 2026-04-03 14:15 ` Julian Anastasov
  2026-04-03 14:29   ` Pablo Neira Ayuso
  2 siblings, 1 reply; 9+ messages in thread
From: Julian Anastasov @ 2026-04-03 14:15 UTC (permalink / raw)
  To: Waiman Long
  Cc: Simon Horman, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso, Florian Westphal,
	Phil Sutter, Frederic Weisbecker, Chen Ridong, Phil Auld,
	linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks


	Hello,

On Tue, 31 Mar 2026, Waiman Long wrote:

>  v2:
>   - Rebased on top of linux-next
> 
> Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
> affinity management"), the HK_TYPE_KTHREAD housekeeping cpumask may no
> longer be correct in showing the actual CPU affinity of kthreads that
> have no predefined CPU affinity. As the ipvs networking code is still
> using HK_TYPE_KTHREAD, we need to make HK_TYPE_KTHREAD reflect the
> reality.
> 
> This patch series makes HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
> and uses RCU to protect access to the HK_TYPE_KTHREAD housekeeping
> cpumask.
> 
> Waiman Long (2):
>   sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
>   ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU

	The patchset looks good to me for nf-next, thanks!

Acked-by: Julian Anastasov <ja@ssi.bg>

	Pablo, Florian, as a bugfix, this patchset missed
the chance to be applied before the ip_vs.h changes that
are in nf-next, so there is a little fuzz there. If there
is no chance to resolve it somehow, we can apply it
on top of nf-next, where it now applies successfully.

> 
>  include/linux/sched/isolation.h |  6 +++++-
>  include/net/ip_vs.h             | 20 ++++++++++++++++----
>  net/netfilter/ipvs/ip_vs_ctl.c  | 13 ++++++++-----
>  3 files changed, 29 insertions(+), 10 deletions(-)

Regards

--
Julian Anastasov <ja@ssi.bg>



* Re: [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask
  2026-04-03 14:15 ` [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Julian Anastasov
@ 2026-04-03 14:29   ` Pablo Neira Ayuso
  2026-04-03 15:00     ` Julian Anastasov
  0 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2026-04-03 14:29 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Waiman Long, Simon Horman, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Florian Westphal,
	Phil Sutter, Frederic Weisbecker, Chen Ridong, Phil Auld,
	linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks

On Fri, Apr 03, 2026 at 05:15:50PM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Tue, 31 Mar 2026, Waiman Long wrote:
> 
> >  v2:
> >   - Rebased on top of linux-next
> > 
> > Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
> > affinity management"), the HK_TYPE_KTHREAD housekeeping cpumask may no
> > longer be correct in showing the actual CPU affinity of kthreads that
> > have no predefined CPU affinity. As the ipvs networking code is still
> > using HK_TYPE_KTHREAD, we need to make HK_TYPE_KTHREAD reflect the
> > reality.
> > 
> > This patch series makes HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
> > and uses RCU to protect access to the HK_TYPE_KTHREAD housekeeping
> > cpumask.
> > 
> > Waiman Long (2):
> >   sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
> >   ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
> 
> 	The patchset looks good to me for nf-next, thanks!
> 
> Acked-by: Julian Anastasov <ja@ssi.bg>
> 
> 	Pablo, Florian, as a bugfix this patchset missed
> the chance to be applied before the changes that are in
> nf-next in ip_vs.h, there is little fuzz there. If there
> is no chance to resolve it somehow, we can apply it
> on top of nf-next where it now applies successfully.

One way to handle this is to follow up with nf-next as you suggest,
then send a backport that applies cleanly for -stable once it is
released.

Else, let me know if I am misunderstanding.


* Re: [PATCH-next v2 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask
  2026-04-03 14:29   ` Pablo Neira Ayuso
@ 2026-04-03 15:00     ` Julian Anastasov
  0 siblings, 0 replies; 9+ messages in thread
From: Julian Anastasov @ 2026-04-03 15:00 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Waiman Long, Simon Horman, David S. Miller, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Florian Westphal,
	Phil Sutter, Frederic Weisbecker, Chen Ridong, Phil Auld,
	linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
	sheviks


	Hello,

On Fri, 3 Apr 2026, Pablo Neira Ayuso wrote:

> On Fri, Apr 03, 2026 at 05:15:50PM +0300, Julian Anastasov wrote:
> > 
> > 	Hello,
> > 
> > On Tue, 31 Mar 2026, Waiman Long wrote:
> > 
> > >  v2:
> > >   - Rebased on top of linux-next
> > > 
> > > Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
> > > affinity management"), the HK_TYPE_KTHREAD housekeeping cpumask may no
> > > longer be correct in showing the actual CPU affinity of kthreads that
> > > have no predefined CPU affinity. As the ipvs networking code is still
> > > using HK_TYPE_KTHREAD, we need to make HK_TYPE_KTHREAD reflect the
> > > reality.
> > > 
> > > This patch series makes HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
> > > and uses RCU to protect access to the HK_TYPE_KTHREAD housekeeping
> > > cpumask.
> > > 
> > > Waiman Long (2):
> > >   sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
> > >   ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
> > 
> > 	The patchset looks good to me for nf-next, thanks!
> > 
> > Acked-by: Julian Anastasov <ja@ssi.bg>
> > 
> > 	Pablo, Florian, as a bugfix this patchset missed
> > the chance to be applied before the changes that are in
> > nf-next in ip_vs.h, there is little fuzz there. If there
> > is no chance to resolve it somehow, we can apply it
> > on top of nf-next where it now applies successfully.
> 
> One way to handle this is to follow up with nf-next as you suggest,
> then send a backport that applies cleanly for -stable once it is
> released.

	Let's do it this way, thanks!

> Else, let me know if I am misunderstanding.

Regards

--
Julian Anastasov <ja@ssi.bg>

