* [PATCH 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask
@ 2026-03-24 15:18 Waiman Long
2026-03-24 15:18 ` [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
2026-03-24 15:18 ` [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
0 siblings, 2 replies; 5+ messages in thread
From: Waiman Long @ 2026-03-24 15:18 UTC (permalink / raw)
To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
Phil Auld
Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
sheviks, Waiman Long
Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
affinity management"), the HK_TYPE_KTHREAD housekeeping cpumask may no
longer be correct in showing the actual CPU affinity of kthreads that
have no predefined CPU affinity. As the ipvs networking code is still
using HK_TYPE_KTHREAD, we need to make HK_TYPE_KTHREAD reflect the
reality.
This patch series makes HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
and uses RCU to protect access to the HK_TYPE_KTHREAD housekeeping
cpumask.
Waiman Long (2):
sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
include/linux/sched/isolation.h | 6 +++++-
include/net/ip_vs.h | 20 ++++++++++++++++----
net/netfilter/ipvs/ip_vs_ctl.c | 13 ++++++++-----
3 files changed, 29 insertions(+), 10 deletions(-)
--
2.53.0
* [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
2026-03-24 15:18 [PATCH 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Waiman Long
@ 2026-03-24 15:18 ` Waiman Long
2026-03-24 18:59 ` David Dull
2026-03-24 15:18 ` [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
1 sibling, 1 reply; 5+ messages in thread
From: Waiman Long @ 2026-03-24 15:18 UTC (permalink / raw)
To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
Phil Auld
Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
sheviks, Waiman Long

Since commit 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred
affinity management"), kthreads default to use the HK_TYPE_DOMAIN
cpumask. IOW, it is no longer affected by the setting of the nohz_full
boot kernel parameter.

That means HK_TYPE_KTHREAD should now be an alias of HK_TYPE_DOMAIN
instead of HK_TYPE_KERNEL_NOISE to correctly reflect the current kthread
behavior. Make the change as HK_TYPE_KTHREAD is still being used in
some networking code.

Fixes: 041ee6f3727a ("kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management")
Signed-off-by: Waiman Long <longman@redhat.com>
---
include/linux/sched/isolation.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index dc3975ff1b2e..cf0fd03dd7a2 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -20,6 +20,11 @@ enum hk_type {
 	HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_MAX,
 
+	/*
+	 * HK_TYPE_KTHREAD is now an alias of HK_TYPE_DOMAIN
+	 */
+	HK_TYPE_KTHREAD = HK_TYPE_DOMAIN,
+
 	/*
 	 * The following housekeeping types are only set by the nohz_full
 	 * boot commandline option. So they can share the same value.
@@ -29,7 +34,6 @@ enum hk_type {
 	HK_TYPE_RCU = HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_MISC = HK_TYPE_KERNEL_NOISE,
 	HK_TYPE_WQ = HK_TYPE_KERNEL_NOISE,
-	HK_TYPE_KTHREAD = HK_TYPE_KERNEL_NOISE
 };
 
 #ifdef CONFIG_CPU_ISOLATION
--
2.53.0
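Editor's illustration, not part of the patch: the change above relies on the
C rule that an enumerator declared after HK_TYPE_MAX can take the value of an
earlier one without enlarging anything sized by HK_TYPE_MAX. The following
userspace sketch shows that property in isolation. The enum is a trimmed
stand-in that keeps only names visible in the diff, the ordering of
HK_TYPE_DOMAIN before HK_TYPE_KERNEL_NOISE is an assumption, and
hk_mask_name[] and hk_name() are hypothetical helpers that do not exist in
the kernel.

```c
#include <assert.h>

/*
 * Trimmed stand-in for the kernel's enum hk_type. Only entries that appear
 * in the patch are kept; the relative order of the first two is assumed.
 */
enum hk_type {
	HK_TYPE_DOMAIN,
	HK_TYPE_KERNEL_NOISE,
	HK_TYPE_MAX,

	/* Aliases declared after HK_TYPE_MAX add no new distinct value. */
	HK_TYPE_KTHREAD = HK_TYPE_DOMAIN,
	HK_TYPE_WQ = HK_TYPE_KERNEL_NOISE,
};

/* The alias resolves at compile time and HK_TYPE_MAX is unchanged. */
static_assert(HK_TYPE_KTHREAD == HK_TYPE_DOMAIN, "KTHREAD must alias DOMAIN");
static_assert(HK_TYPE_MAX == 2, "aliases must not grow HK_TYPE_MAX");

/* Hypothetical per-type table, sized by HK_TYPE_MAX like a per-type mask array. */
static const char *const hk_mask_name[HK_TYPE_MAX] = {
	[HK_TYPE_DOMAIN] = "domain",
	[HK_TYPE_KERNEL_NOISE] = "kernel_noise",
};

/* Hypothetical lookup: an alias indexes the same slot as its target. */
static const char *hk_name(enum hk_type type)
{
	return hk_mask_name[type];
}
```

With this layout, hk_name(HK_TYPE_KTHREAD) and hk_name(HK_TYPE_DOMAIN) return
the same pointer, which is the effect the patch wants for the housekeeping
cpumask lookup.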
* Re: [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN
2026-03-24 15:18 ` [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
@ 2026-03-24 18:59 ` David Dull
0 siblings, 0 replies; 5+ messages in thread
From: David Dull @ 2026-03-24 18:59 UTC (permalink / raw)
To: longman
Cc: linux-kernel, netdev, lvs-devel, linux-sched, kuba, pabeni, horms,
David Dull

I went through the series.

Patch 1 correctly updates HK_TYPE_KTHREAD to reflect the
post-041ee6f3727a behavior where kthreads follow HK_TYPE_DOMAIN.
Keeping it mapped to HK_TYPE_KERNEL_NOISE is no longer accurate, so the
alias is the right fix.

Patch 2 properly introduces RCU protection for accessing the
HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask in IPVS. Since the mask can now
change at runtime, guarding against potential UAF is required.

The changes are minimal and the helper wrappers keep call sites clean.

Reviewed-by: David Dull <monderasdor@gmail.com>
* [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
2026-03-24 15:18 [PATCH 0/2] ipvs: Fix incorrect use of HK_TYPE_KTHREAD housekeeping cpumask Waiman Long
2026-03-24 15:18 ` [PATCH 1/2] sched/isolation: Make HK_TYPE_KTHREAD an alias of HK_TYPE_DOMAIN Waiman Long
@ 2026-03-24 15:18 ` Waiman Long
2026-03-26 8:32 ` Julian Anastasov
1 sibling, 1 reply; 5+ messages in thread
From: Waiman Long @ 2026-03-24 15:18 UTC (permalink / raw)
To: Simon Horman, Julian Anastasov, David S. Miller, David Ahern,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
Florian Westphal, Phil Sutter, Frederic Weisbecker, Chen Ridong,
Phil Auld
Cc: linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
sheviks, Waiman Long

The ip_vs_ctl.c file and the associated ip_vs.h file are the only places
in the kernel where HK_TYPE_KTHREAD cpumask is being retrieved and used.
Now that the HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask can be changed at run
time, we need to use RCU to guard access to this cpumask to avoid a
potential UAF problem, as the returned cpumask may be freed before it
is used.

Signed-off-by: Waiman Long <longman@redhat.com>
---
include/net/ip_vs.h | 20 ++++++++++++++++----
net/netfilter/ipvs/ip_vs_ctl.c | 13 ++++++++-----
2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 29a36709e7f3..17c85a575ef4 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1155,7 +1155,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
 	return ipvs->sysctl_run_estimation;
 }
 
-static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
+static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
 {
 	if (ipvs->est_cpulist_valid)
 		return ipvs->sysctl_est_cpulist;
@@ -1273,7 +1273,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
 	return 1;
 }
 
-static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
+static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
 {
 	return housekeeping_cpumask(HK_TYPE_KTHREAD);
 }
@@ -1290,6 +1290,18 @@ static inline int sysctl_est_nice(struct netns_ipvs *ipvs)
 
 #endif
 
+static inline bool sysctl_est_cpulist_empty(struct netns_ipvs *ipvs)
+{
+	guard(rcu)();
+	return cpumask_empty(__sysctl_est_cpulist(ipvs));
+}
+
+static inline unsigned int sysctl_est_cpulist_weight(struct netns_ipvs *ipvs)
+{
+	guard(rcu)();
+	return cpumask_weight(__sysctl_est_cpulist(ipvs));
+}
+
 /* IPVS core functions
  * (from ip_vs_core.c)
  */
@@ -1604,7 +1616,7 @@ static inline void ip_vs_est_stopped_recalc(struct netns_ipvs *ipvs)
 	/* Stop tasks while cpulist is empty or if disabled with flag */
 	ipvs->est_stopped = !sysctl_run_estimation(ipvs) ||
 			    (ipvs->est_cpulist_valid &&
-			     cpumask_empty(sysctl_est_cpulist(ipvs)));
+			     sysctl_est_cpulist_empty(ipvs));
 #endif
 }
 
@@ -1620,7 +1632,7 @@ static inline bool ip_vs_est_stopped(struct netns_ipvs *ipvs)
 static inline int ip_vs_est_max_threads(struct netns_ipvs *ipvs)
 {
 	unsigned int limit = IPVS_EST_CPU_KTHREADS *
-			     cpumask_weight(sysctl_est_cpulist(ipvs));
+			     sysctl_est_cpulist_weight(ipvs);
 
 	return max(1U, limit);
 }
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 35642de2a0fe..f38a2e2a9dc5 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -1973,11 +1973,14 @@ static int ipvs_proc_est_cpumask_get(const struct ctl_table *table,
 
 	mutex_lock(&ipvs->est_mutex);
 
-	if (ipvs->est_cpulist_valid)
-		mask = *valp;
-	else
-		mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
-	ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
+	/* HK_TYPE_KTHREAD cpumask needs RCU protection */
+	scoped_guard(rcu) {
+		if (ipvs->est_cpulist_valid)
+			mask = *valp;
+		else
+			mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
+		ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
+	}
 
 	mutex_unlock(&ipvs->est_mutex);
 
--
2.53.0
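Editor's illustration, not part of the patch: the new helpers use
guard(rcu)() so that the read-side section is dropped automatically when the
function returns, after the return expression has been evaluated. The kernel
builds such guards on the compiler's cleanup attribute. The userspace sketch
below imitates that shape only. It is not real RCU: a nesting counter stands
in for rcu_read_lock()/rcu_read_unlock(), a 64-bit word stands in for
struct cpumask, and every mock_* and MOCK_* name is hypothetical. It assumes
GCC or Clang for __attribute__((cleanup)) and __builtin_popcountll().

```c
#include <assert.h>
#include <stdbool.h>

/* Nesting counter standing in for the RCU read-side critical section. */
static int mock_rcu_depth;

static void mock_rcu_read_lock(void)   { mock_rcu_depth++; }
static void mock_rcu_read_unlock(void) { mock_rcu_depth--; }

/* Runs automatically when the guard variable goes out of scope. */
static void mock_rcu_guard_exit(int *unused)
{
	(void)unused;
	mock_rcu_read_unlock();
}

/*
 * Hypothetical scope guard: lock at the declaration, unlock when the
 * enclosing scope ends, including on every return path.
 */
#define MOCK_GUARD_RCU() \
	int mock_guard __attribute__((cleanup(mock_rcu_guard_exit), unused)) = \
		(mock_rcu_read_lock(), 0)

/* 64-bit stand-in for the housekeeping cpumask: CPUs 0-3 set. */
static unsigned long long mock_hk_mask = 0x0fULL;

/* Only valid inside the mock read-side section, like the real accessor under RCU. */
static const unsigned long long *mock_housekeeping_mask(void)
{
	assert(mock_rcu_depth > 0);
	return &mock_hk_mask;
}

/* Mirrors the shape of sysctl_est_cpulist_empty(): read under the guard. */
static bool mock_est_cpulist_empty(void)
{
	MOCK_GUARD_RCU();
	return *mock_housekeeping_mask() == 0;
}

/* Mirrors the shape of sysctl_est_cpulist_weight(): count bits under the guard. */
static unsigned int mock_est_cpulist_weight(void)
{
	MOCK_GUARD_RCU();
	return (unsigned int)__builtin_popcountll(*mock_housekeeping_mask());
}
```

The point of wrapping the access in small helpers is that the pointer to the
mask never escapes the guarded scope; callers only see a bool or a count.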
* Re: [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU
2026-03-24 15:18 ` [PATCH 2/2] ipvs: Guard access of HK_TYPE_KTHREAD cpumask with RCU Waiman Long
@ 2026-03-26 8:32 ` Julian Anastasov
0 siblings, 0 replies; 5+ messages in thread
From: Julian Anastasov @ 2026-03-26 8:32 UTC (permalink / raw)
To: Waiman Long
Cc: Simon Horman, David S. Miller, David Ahern, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso, Florian Westphal,
Phil Sutter, Frederic Weisbecker, Chen Ridong, Phil Auld,
linux-kernel, netdev, lvs-devel, netfilter-devel, coreteam,
sheviks

	Hello,

On Tue, 24 Mar 2026, Waiman Long wrote:

> The ip_vs_ctl.c file and the associated ip_vs.h file are the only places
> in the kernel where HK_TYPE_KTHREAD cpumask is being retrieved and used.
> Now that the HK_TYPE_KTHREAD/HK_TYPE_DOMAIN cpumask can be changed at run
> time, we need to use RCU to guard access to this cpumask to avoid a
> potential UAF problem, as the returned cpumask may be freed before it
> is used.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> include/net/ip_vs.h | 20 ++++++++++++++++----
> net/netfilter/ipvs/ip_vs_ctl.c | 13 ++++++++-----
> 2 files changed, 24 insertions(+), 9 deletions(-)
>
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index 29a36709e7f3..17c85a575ef4 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1155,7 +1155,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
>  	return ipvs->sysctl_run_estimation;
>  }
>  
> -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
> +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
>  {
>  	if (ipvs->est_cpulist_valid)
>  		return ipvs->sysctl_est_cpulist;
> @@ -1273,7 +1273,7 @@ static inline int sysctl_run_estimation(struct netns_ipvs *ipvs)
>  	return 1;
>  }
>  
> -static inline const struct cpumask *sysctl_est_cpulist(struct netns_ipvs *ipvs)
> +static inline const struct cpumask *__sysctl_est_cpulist(struct netns_ipvs *ipvs)
>  {
>  	return housekeeping_cpumask(HK_TYPE_KTHREAD);
>  }
> @@ -1290,6 +1290,18 @@ static inline int sysctl_est_nice(struct netns_ipvs *ipvs)
>  
>  #endif
>  

	Maybe there is a little fuzz here, due to the recent
changes in the nf-next tree. If this is a bugfix due to the missing
RCU protection, maybe you should add a Fixes line too and use
the nf tree. Probably, there will be fuzz/collisions with the
changes in the nf-next tree...

> +static inline bool sysctl_est_cpulist_empty(struct netns_ipvs *ipvs)
> +{
> +	guard(rcu)();
> +	return cpumask_empty(__sysctl_est_cpulist(ipvs));
> +}
> +
> +static inline unsigned int sysctl_est_cpulist_weight(struct netns_ipvs *ipvs)
> +{
> +	guard(rcu)();
> +	return cpumask_weight(__sysctl_est_cpulist(ipvs));
> +}
> +
>  /* IPVS core functions
>   * (from ip_vs_core.c)
>   */
> @@ -1604,7 +1616,7 @@ static inline void ip_vs_est_stopped_recalc(struct netns_ipvs *ipvs)
>  	/* Stop tasks while cpulist is empty or if disabled with flag */
>  	ipvs->est_stopped = !sysctl_run_estimation(ipvs) ||
>  			    (ipvs->est_cpulist_valid &&
> -			     cpumask_empty(sysctl_est_cpulist(ipvs)));
> +			     sysctl_est_cpulist_empty(ipvs));
>  #endif
>  }
>  
> @@ -1620,7 +1632,7 @@ static inline bool ip_vs_est_stopped(struct netns_ipvs *ipvs)
>  static inline int ip_vs_est_max_threads(struct netns_ipvs *ipvs)
>  {
>  	unsigned int limit = IPVS_EST_CPU_KTHREADS *
> -			     cpumask_weight(sysctl_est_cpulist(ipvs));
> +			     sysctl_est_cpulist_weight(ipvs);
>  
>  	return max(1U, limit);
>  }
> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
> index 35642de2a0fe..f38a2e2a9dc5 100644
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
> @@ -1973,11 +1973,14 @@ static int ipvs_proc_est_cpumask_get(const struct ctl_table *table,
>  
>  	mutex_lock(&ipvs->est_mutex);
>  
> -	if (ipvs->est_cpulist_valid)
> -		mask = *valp;
> -	else
> -		mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
> -	ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
> +	/* HK_TYPE_KTHREAD cpumask needs RCU protection */

	Can we switch IPVS to use HK_TYPE_DOMAIN? The initial
intention was to follow the code in kthread.c. Then you can
reconsider if HK_TYPE_KTHREAD should be an alias of HK_TYPE_DOMAIN,
maybe not if there are no other users...

> +	scoped_guard(rcu) {
> +		if (ipvs->est_cpulist_valid)
> +			mask = *valp;
> +		else
> +			mask = (struct cpumask *)housekeeping_cpumask(HK_TYPE_KTHREAD);
> +		ret = scnprintf(buffer, size, "%*pbl\n", cpumask_pr_args(mask));
> +	}
>  
>  	mutex_unlock(&ipvs->est_mutex);
>  
> --
> 2.53.0

Regards

--
Julian Anastasov <ja@ssi.bg>