* [PATCH 0/2] genirq: Fix IRQ threads VS cpuset
@ 2025-11-18 14:30 Frederic Weisbecker
2025-11-18 14:30 ` [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions Frederic Weisbecker
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Frederic Weisbecker @ 2025-11-18 14:30 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, Marco Crivellari, Waiman Long, cgroups
Hi,
See the previous version here:
https://lore.kernel.org/lkml/20251105131726.46364-1-frederic@kernel.org/
Changes since v1:
* Remove handling of cpumask availability
Frederic Weisbecker (2):
genirq: Fix IRQ threads affinity VS cpuset isolated partitions
genirq: Remove cpumask availability check on kthread affinity setting
kernel/irq/manage.c | 43 +++++++++++++++++--------------------------
1 file changed, 17 insertions(+), 26 deletions(-)
--
2.51.0
^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions
  2025-11-18 14:30 [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Frederic Weisbecker
@ 2025-11-18 14:30 ` Frederic Weisbecker
  2025-11-18 16:27   ` Waiman Long
  2025-11-20 11:51   ` Marek Szyprowski
  2025-11-18 14:30 ` [PATCH 2/2] genirq: Remove cpumask availability check on kthread affinity setting Frederic Weisbecker
  2025-11-18 15:14 ` [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Thomas Gleixner
  2 siblings, 2 replies; 15+ messages in thread
From: Frederic Weisbecker @ 2025-11-18 14:30 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, Marco Crivellari, Waiman Long, cgroups

When a cpuset isolated partition is created, updated or destroyed,
the IRQ threads are blindly affined to all the non-isolated CPUs. This
happens without taking into account the IRQ thread's initial affinity,
which is then ignored.

For example, in a system with 8 CPUs, if an IRQ and its kthread are
initially affine to CPU 5, creating an isolated partition with only
CPU 2 inside will eventually end up affining the IRQ kthread to all
CPUs but CPU 2 (that is, CPUs 0,1,3-7), losing the kthread's preference
for CPU 5.

Besides the blind re-affining, this doesn't take care of the actual
low-level interrupt, which isn't migrated. As of today the only way to
isolate non-managed interrupts, along with their kthreads, is to
overwrite their affinity separately, for example through /proc/irq/

To avoid doing that manually, future development should focus on
updating the IRQs' affinity whenever cpuset isolated partitions are
updated.

In the meantime, cpuset shouldn't fiddle with IRQ threads directly.
To prevent that, set the PF_NO_SETAFFINITY flag on them.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/irq/manage.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 400856abf672..76e2cbe21d1f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -176,7 +176,7 @@ bool irq_can_set_affinity_usr(unsigned int irq)
 }
 
 /**
- * irq_set_thread_affinity - Notify irq threads to adjust affinity
+ * irq_thread_update_affinity - Notify irq threads to adjust affinity
  * @desc: irq descriptor which has affinity changed
  *
  * Just set IRQTF_AFFINITY and delegate the affinity setting to the
@@ -184,7 +184,7 @@ bool irq_can_set_affinity_usr(unsigned int irq)
  * we hold desc->lock and this code can be called from hard interrupt
  * context.
  */
-static void irq_set_thread_affinity(struct irq_desc *desc)
+static void irq_thread_update_affinity(struct irq_desc *desc)
 {
 	struct irqaction *action;
 
@@ -283,7 +283,7 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 		fallthrough;
 	case IRQ_SET_MASK_OK_NOCOPY:
 		irq_validate_effective_affinity(data);
-		irq_set_thread_affinity(desc);
+		irq_thread_update_affinity(desc);
 		ret = 0;
 	}
 
@@ -1035,8 +1035,16 @@ static void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *a
 	set_cpus_allowed_ptr(current, mask);
 	free_cpumask_var(mask);
 }
+
+static inline void irq_thread_set_affinity(struct task_struct *t,
+					   struct irq_desc *desc)
+{
+	kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data));
+}
 #else
 static inline void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { }
+static inline void irq_thread_set_affinity(struct task_struct *t,
+					   struct irq_desc *desc) { }
 #endif
 
 static int irq_wait_for_interrupt(struct irq_desc *desc,
@@ -1221,6 +1229,7 @@ static void wake_up_and_wait_for_irq_thread_ready(struct irq_desc *desc,
 	if (!action || !action->thread)
 		return;
 
+	irq_thread_set_affinity(action->thread, desc);
 	wake_up_process(action->thread);
 	wait_event(desc->wait_for_threads,
 		   test_bit(IRQTF_READY, &action->thread_flags));
@@ -1405,16 +1414,7 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
 	 * references an already freed task_struct.
 	 */
 	new->thread = get_task_struct(t);
-	/*
-	 * Tell the thread to set its affinity. This is
-	 * important for shared interrupt handlers as we do
-	 * not invoke setup_affinity() for the secondary
-	 * handlers as everything is already set up. Even for
-	 * interrupts marked with IRQF_NO_BALANCE this is
-	 * correct as we want the thread to move to the cpu(s)
-	 * on which the requesting code placed the interrupt.
-	 */
-	set_bit(IRQTF_AFFINITY, &new->thread_flags);
+
 	return 0;
 }
-- 
2.51.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread
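The changelog's 8-CPU example can be sanity-checked with a standalone sketch (plain Python, no kernel code involved; the set arithmetic below is only a model of what cpuset currently does, not the kernel implementation):

```python
# Model of the behavior described in the changelog: when a partition is
# isolated, cpuset re-affines IRQ kthreads to *all* non-isolated CPUs,
# discarding whatever affinity the thread had before.
cpus = set(range(8))       # 8-CPU system
thread_affinity = {5}      # IRQ kthread initially affine to CPU 5
isolated = {2}             # isolated partition contains only CPU 2

# cpuset's blind re-affining: the complement of the isolated set
new_affinity = cpus - isolated

print(sorted(new_affinity))             # [0, 1, 3, 4, 5, 6, 7]
print(new_affinity == thread_affinity)  # False: the CPU 5 preference is lost
```

The kthread still runs, but its placement preference is gone, which is exactly the problem the patch sidesteps by making the threads PF_NO_SETAFFINITY.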
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions
  2025-11-18 14:30 ` [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions Frederic Weisbecker
@ 2025-11-18 16:27   ` Waiman Long
  2025-11-18 17:23     ` Thomas Gleixner
  2025-11-20 11:51   ` Marek Szyprowski
  1 sibling, 1 reply; 15+ messages in thread
From: Waiman Long @ 2025-11-18 16:27 UTC (permalink / raw)
To: Frederic Weisbecker, Thomas Gleixner
Cc: LKML, Marco Crivellari, Waiman Long, cgroups

On 11/18/25 9:30 AM, Frederic Weisbecker wrote:
> When a cpuset isolated partition is created, updated or destroyed,
> the IRQ threads are blindly affined to all the non-isolated CPUs. This
> happens without taking into account the IRQ thread's initial affinity,
> which is then ignored.
>
> For example, in a system with 8 CPUs, if an IRQ and its kthread are
> initially affine to CPU 5, creating an isolated partition with only
> CPU 2 inside will eventually end up affining the IRQ kthread to all
> CPUs but CPU 2 (that is, CPUs 0,1,3-7), losing the kthread's preference
> for CPU 5.
>
> Besides the blind re-affining, this doesn't take care of the actual
> low-level interrupt, which isn't migrated. As of today the only way to
> isolate non-managed interrupts, along with their kthreads, is to
> overwrite their affinity separately, for example through /proc/irq/
>
> To avoid doing that manually, future development should focus on
> updating the IRQs' affinity whenever cpuset isolated partitions are
> updated.
>
> In the meantime, cpuset shouldn't fiddle with IRQ threads directly.
> To prevent that, set the PF_NO_SETAFFINITY flag on them.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>  kernel/irq/manage.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index 400856abf672..76e2cbe21d1f 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -176,7 +176,7 @@ bool irq_can_set_affinity_usr(unsigned int irq)
>  }
>  
>  /**
> - * irq_set_thread_affinity - Notify irq threads to adjust affinity
> + * irq_thread_update_affinity - Notify irq threads to adjust affinity
>   * @desc: irq descriptor which has affinity changed
>   *
>   * Just set IRQTF_AFFINITY and delegate the affinity setting to the
> @@ -184,7 +184,7 @@ bool irq_can_set_affinity_usr(unsigned int irq)
>   * we hold desc->lock and this code can be called from hard interrupt
>   * context.
>   */
> -static void irq_set_thread_affinity(struct irq_desc *desc)
> +static void irq_thread_update_affinity(struct irq_desc *desc)
>  {
>  	struct irqaction *action;
>  
> @@ -283,7 +283,7 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>  		fallthrough;
>  	case IRQ_SET_MASK_OK_NOCOPY:
>  		irq_validate_effective_affinity(data);
> -		irq_set_thread_affinity(desc);
> +		irq_thread_update_affinity(desc);
>  		ret = 0;
>  	}
>  
> @@ -1035,8 +1035,16 @@ static void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *a
>  	set_cpus_allowed_ptr(current, mask);
>  	free_cpumask_var(mask);
>  }
> +
> +static inline void irq_thread_set_affinity(struct task_struct *t,
> +					   struct irq_desc *desc)
> +{
> +	kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data));
> +}

According to irq_thread_check_affinity(), accessing the cpumask returned
from irq_data_get_effective_affinity_mask(&desc->irq_data) requires
taking desc->lock to ensure its stability. Do we need something similar
here? Other than that, it looks good to me.

Cheers,
Longman

> #else
>  static inline void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { }
> +static inline void irq_thread_set_affinity(struct task_struct *t,
> +					   struct irq_desc *desc) { }
>  #endif
>  
>  static int irq_wait_for_interrupt(struct irq_desc *desc,
> @@ -1221,6 +1229,7 @@ static void wake_up_and_wait_for_irq_thread_ready(struct irq_desc *desc,
>  	if (!action || !action->thread)
>  		return;
>  
> +	irq_thread_set_affinity(action->thread, desc);
>  	wake_up_process(action->thread);
>  	wait_event(desc->wait_for_threads,
>  		   test_bit(IRQTF_READY, &action->thread_flags));
> @@ -1405,16 +1414,7 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
>  	 * references an already freed task_struct.
>  	 */
>  	new->thread = get_task_struct(t);
> -	/*
> -	 * Tell the thread to set its affinity. This is
> -	 * important for shared interrupt handlers as we do
> -	 * not invoke setup_affinity() for the secondary
> -	 * handlers as everything is already set up. Even for
> -	 * interrupts marked with IRQF_NO_BALANCE this is
> -	 * correct as we want the thread to move to the cpu(s)
> -	 * on which the requesting code placed the interrupt.
> -	 */
> -	set_bit(IRQTF_AFFINITY, &new->thread_flags);
> +
>  	return 0;
>  }
>

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions
  2025-11-18 16:27   ` Waiman Long
@ 2025-11-18 17:23     ` Thomas Gleixner
  2025-11-18 17:29       ` Waiman Long
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2025-11-18 17:23 UTC (permalink / raw)
To: Waiman Long, Frederic Weisbecker
Cc: LKML, Marco Crivellari, Waiman Long, cgroups

On Tue, Nov 18 2025 at 11:27, Waiman Long wrote:
>> +static inline void irq_thread_set_affinity(struct task_struct *t,
>> +					    struct irq_desc *desc)
>> +{
>> +	kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data));
>> +}
>
> According to irq_thread_check_affinity(), accessing the cpumask returned
> from irq_data_get_effective_affinity_mask(&desc->irq_data) requires
> taking desc->lock to ensure its stability. Do we need something similar
> here? Other than that, it looks good to me.

That's during interrupt setup so it should be stable (famous last words)

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions
  2025-11-18 17:23     ` Thomas Gleixner
@ 2025-11-18 17:29       ` Waiman Long
  0 siblings, 0 replies; 15+ messages in thread
From: Waiman Long @ 2025-11-18 17:29 UTC (permalink / raw)
To: Thomas Gleixner, Waiman Long, Frederic Weisbecker
Cc: LKML, Marco Crivellari, cgroups

On 11/18/25 12:23 PM, Thomas Gleixner wrote:
> On Tue, Nov 18 2025 at 11:27, Waiman Long wrote:
>>> +static inline void irq_thread_set_affinity(struct task_struct *t,
>>> +					     struct irq_desc *desc)
>>> +{
>>> +	kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data));
>>> +}
>> According to irq_thread_check_affinity(), accessing the cpumask returned
>> from irq_data_get_effective_affinity_mask(&desc->irq_data) requires
>> taking desc->lock to ensure its stability. Do we need something similar
>> here? Other than that, it looks good to me.
> That's during interrupt setup so it should be stable (famous last words)

Thanks for the clarification. We should probably add a comment to
mention that.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions
  2025-11-18 14:30 ` [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions Frederic Weisbecker
  2025-11-18 16:27   ` Waiman Long
@ 2025-11-20 11:51   ` Marek Szyprowski
  2025-11-20 13:20     ` Frederic Weisbecker
  2025-11-20 13:45     ` Thomas Gleixner
  1 sibling, 2 replies; 15+ messages in thread
From: Marek Szyprowski @ 2025-11-20 11:51 UTC (permalink / raw)
To: Frederic Weisbecker, Thomas Gleixner
Cc: LKML, Marco Crivellari, Waiman Long, cgroups

On 18.11.2025 15:30, Frederic Weisbecker wrote:
> When a cpuset isolated partition is created, updated or destroyed,
> the IRQ threads are blindly affined to all the non-isolated CPUs. This
> happens without taking into account the IRQ thread's initial affinity,
> which is then ignored.
>
> For example, in a system with 8 CPUs, if an IRQ and its kthread are
> initially affine to CPU 5, creating an isolated partition with only
> CPU 2 inside will eventually end up affining the IRQ kthread to all
> CPUs but CPU 2 (that is, CPUs 0,1,3-7), losing the kthread's preference
> for CPU 5.
>
> Besides the blind re-affining, this doesn't take care of the actual
> low-level interrupt, which isn't migrated. As of today the only way to
> isolate non-managed interrupts, along with their kthreads, is to
> overwrite their affinity separately, for example through /proc/irq/
>
> To avoid doing that manually, future development should focus on
> updating the IRQs' affinity whenever cpuset isolated partitions are
> updated.
>
> In the meantime, cpuset shouldn't fiddle with IRQ threads directly.
> To prevent that, set the PF_NO_SETAFFINITY flag on them.
>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

This patch landed in today's linux-next as commit 844dcacab287 ("genirq:
Fix interrupt threads affinity vs. cpuset isolated partitions"). In my
tests I found that it triggers warnings on some of my test systems.
This is an example of such a warning:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/kthread.c:599 kthread_bind_mask+0x2c/0x84
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.0-rc1-00031-g844dcacab287 #16177 PREEMPT
Hardware name: Samsung Exynos (Flattened Device Tree)
Call trace:
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x68/0x88
 dump_stack_lvl from __warn+0x80/0x1d0
 __warn from warn_slowpath_fmt+0x1b0/0x1bc
 warn_slowpath_fmt from kthread_bind_mask+0x2c/0x84
 kthread_bind_mask from wake_up_and_wait_for_irq_thread_ready+0x3c/0xd4
 wake_up_and_wait_for_irq_thread_ready from __setup_irq+0x3e8/0x894
 __setup_irq from request_threaded_irq+0xe4/0x15c
 request_threaded_irq from devm_request_threaded_irq+0x78/0x104
 devm_request_threaded_irq from max8997_irq_init+0x15c/0x24c
 max8997_irq_init from max8997_i2c_probe+0x118/0x208
 max8997_i2c_probe from i2c_device_probe+0x1bc/0x358
 i2c_device_probe from really_probe+0xe0/0x3d8
 really_probe from __driver_probe_device+0x9c/0x1e0
 __driver_probe_device from driver_probe_device+0x30/0xc0
 driver_probe_device from __device_attach_driver+0xa8/0x120
 __device_attach_driver from bus_for_each_drv+0x84/0xdc
 bus_for_each_drv from __device_attach+0xb0/0x20c
 __device_attach from bus_probe_device+0x8c/0x90
 bus_probe_device from device_add+0x5b0/0x7f0
 device_add from i2c_new_client_device+0x170/0x360
 i2c_new_client_device from of_i2c_register_device+0x80/0xc8
 of_i2c_register_device from of_i2c_register_devices+0x84/0xf8
 of_i2c_register_devices from i2c_register_adapter+0x240/0x7b0
 i2c_register_adapter from s3c24xx_i2c_probe+0x2a0/0x570
 s3c24xx_i2c_probe from platform_probe+0x5c/0x98
 platform_probe from really_probe+0xe0/0x3d8
 really_probe from __driver_probe_device+0x9c/0x1e0
 __driver_probe_device from driver_probe_device+0x30/0xc0
 driver_probe_device from __driver_attach+0x124/0x1d4
 __driver_attach from bus_for_each_dev+0x70/0xc4
 bus_for_each_dev from bus_add_driver+0xe0/0x220
 bus_add_driver from driver_register+0x7c/0x118
 driver_register from do_one_initcall+0x70/0x328
 do_one_initcall from kernel_init_freeable+0x1c0/0x234
 kernel_init_freeable from kernel_init+0x1c/0x12c
 kernel_init from ret_from_fork+0x14/0x28
Exception stack(0xf082dfb0 to 0xf082dff8)
...
irq event stamp: 78529
hardirqs last  enabled at (78805): [<c01bc4f8>] __up_console_sem+0x50/0x60
hardirqs last disabled at (78816): [<c01bc4e4>] __up_console_sem+0x3c/0x60
softirqs last  enabled at (78784): [<c013bb14>] handle_softirqs+0x328/0x520
softirqs last disabled at (78779): [<c013beb8>] __irq_exit_rcu+0x144/0x1f0
---[ end trace 0000000000000000 ]---

Reverting $subject on top of linux-next fixes this issue. Let me know
how I can help debugging it.

> ---
>  kernel/irq/manage.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index 400856abf672..76e2cbe21d1f 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -176,7 +176,7 @@ bool irq_can_set_affinity_usr(unsigned int irq)
>  }
>  
>  /**
> - * irq_set_thread_affinity - Notify irq threads to adjust affinity
> + * irq_thread_update_affinity - Notify irq threads to adjust affinity
>   * @desc: irq descriptor which has affinity changed
>   *
>   * Just set IRQTF_AFFINITY and delegate the affinity setting to the
> @@ -184,7 +184,7 @@ bool irq_can_set_affinity_usr(unsigned int irq)
>   * we hold desc->lock and this code can be called from hard interrupt
>   * context.
>   */
> -static void irq_set_thread_affinity(struct irq_desc *desc)
> +static void irq_thread_update_affinity(struct irq_desc *desc)
>  {
>  	struct irqaction *action;
>  
> @@ -283,7 +283,7 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>  		fallthrough;
>  	case IRQ_SET_MASK_OK_NOCOPY:
>  		irq_validate_effective_affinity(data);
> -		irq_set_thread_affinity(desc);
> +		irq_thread_update_affinity(desc);
>  		ret = 0;
>  	}
>  
> @@ -1035,8 +1035,16 @@ static void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *a
>  	set_cpus_allowed_ptr(current, mask);
>  	free_cpumask_var(mask);
>  }
> +
> +static inline void irq_thread_set_affinity(struct task_struct *t,
> +					   struct irq_desc *desc)
> +{
> +	kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data));
> +}
> #else
>  static inline void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { }
> +static inline void irq_thread_set_affinity(struct task_struct *t,
> +					   struct irq_desc *desc) { }
>  #endif
>  
>  static int irq_wait_for_interrupt(struct irq_desc *desc,
> @@ -1221,6 +1229,7 @@ static void wake_up_and_wait_for_irq_thread_ready(struct irq_desc *desc,
>  	if (!action || !action->thread)
>  		return;
>  
> +	irq_thread_set_affinity(action->thread, desc);
>  	wake_up_process(action->thread);
>  	wait_event(desc->wait_for_threads,
>  		   test_bit(IRQTF_READY, &action->thread_flags));
> @@ -1405,16 +1414,7 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
>  	 * references an already freed task_struct.
>  	 */
>  	new->thread = get_task_struct(t);
> -	/*
> -	 * Tell the thread to set its affinity. This is
> -	 * important for shared interrupt handlers as we do
> -	 * not invoke setup_affinity() for the secondary
> -	 * handlers as everything is already set up. Even for
> -	 * interrupts marked with IRQF_NO_BALANCE this is
> -	 * correct as we want the thread to move to the cpu(s)
> -	 * on which the requesting code placed the interrupt.
> -	 */
> -	set_bit(IRQTF_AFFINITY, &new->thread_flags);
> +
>  	return 0;
>  }
>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

^ permalink raw reply	[flat|nested] 15+ messages in thread
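For readers unfamiliar with the warning: kthread_bind_mask() may only be called on a kthread that has never been scheduled, and warns otherwise. A toy userspace model of that invariant (hypothetical names, not the kernel implementation) shows why an early interrupt firing breaks the new call site:

```python
# Toy model of the kthread_bind_mask() precondition that fires here
# (made-up names, not kernel code): binding is only legal before the
# thread has ever run.
class KThread:
    def __init__(self):
        self.has_run = False
        self.mask = None

    def wake_up(self):
        # An early interrupt firing wakes the thread before __setup_irq()
        # reaches its "official" first wakeup.
        self.has_run = True

    def bind_mask(self, mask):
        if self.has_run:
            # Mirrors the WARN_ON in the kernel's kthread code: too late.
            print("WARNING: kthread already ran, bind ignored")
            return False
        self.mask = set(mask)
        return True

t = KThread()
t.wake_up()               # interrupt fired during setup
ok = t.bind_mask({0, 1})  # the wake_up_and_wait_for_irq_thread_ready() call site
print(ok, t.mask)         # False None
```

Binding before the first wakeup succeeds; binding afterwards is refused, which is exactly the race Marek's boot hits.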
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions
  2025-11-20 11:51   ` Marek Szyprowski
@ 2025-11-20 13:20     ` Frederic Weisbecker
  2025-11-20 15:00       ` Thomas Gleixner
  1 sibling, 1 reply; 15+ messages in thread
From: Frederic Weisbecker @ 2025-11-20 13:20 UTC (permalink / raw)
To: Marek Szyprowski
Cc: Thomas Gleixner, LKML, Marco Crivellari, Waiman Long, cgroups

Le Thu, Nov 20, 2025 at 12:51:31PM +0100, Marek Szyprowski a écrit :
> On 18.11.2025 15:30, Frederic Weisbecker wrote:
> > When a cpuset isolated partition is created, updated or destroyed,
> > the IRQ threads are blindly affined to all the non-isolated CPUs. This
> > happens without taking into account the IRQ thread's initial affinity,
> > which is then ignored.
> >
> > For example, in a system with 8 CPUs, if an IRQ and its kthread are
> > initially affine to CPU 5, creating an isolated partition with only
> > CPU 2 inside will eventually end up affining the IRQ kthread to all
> > CPUs but CPU 2 (that is, CPUs 0,1,3-7), losing the kthread's preference
> > for CPU 5.
> >
> > Besides the blind re-affining, this doesn't take care of the actual
> > low-level interrupt, which isn't migrated. As of today the only way to
> > isolate non-managed interrupts, along with their kthreads, is to
> > overwrite their affinity separately, for example through /proc/irq/
> >
> > To avoid doing that manually, future development should focus on
> > updating the IRQs' affinity whenever cpuset isolated partitions are
> > updated.
> >
> > In the meantime, cpuset shouldn't fiddle with IRQ threads directly.
> > To prevent that, set the PF_NO_SETAFFINITY flag on them.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
>
> This patch landed in today's linux-next as commit 844dcacab287 ("genirq:
> Fix interrupt threads affinity vs. cpuset isolated partitions"). In my
> tests I found that it triggers warnings on some of my test systems.
> This is an example of such a warning:
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at kernel/kthread.c:599 kthread_bind_mask+0x2c/0x84

Erm, does this mean that the IRQ thread got woken up before the first
official wakeup in wake_up_and_wait_for_irq_thread_ready()? This looks
wrong...

irq_startup() may be called on a few occasions before. So perhaps
the IRQ already fired and woke up the kthread once before the "official"
first wakeup?

There seems to be some initialization ordering issue here...

Thomas?

Thanks.

-- 
Frederic Weisbecker
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions
  2025-11-20 13:20     ` Frederic Weisbecker
@ 2025-11-20 15:00       ` Thomas Gleixner
  2025-11-20 15:50         ` Frederic Weisbecker
  2025-11-20 16:34         ` Marek Szyprowski
  0 siblings, 2 replies; 15+ messages in thread
From: Thomas Gleixner @ 2025-11-20 15:00 UTC (permalink / raw)
To: Frederic Weisbecker, Marek Szyprowski
Cc: LKML, Marco Crivellari, Waiman Long, cgroups

On Thu, Nov 20 2025 at 14:20, Frederic Weisbecker wrote:
> Le Thu, Nov 20, 2025 at 12:51:31PM +0100, Marek Szyprowski a écrit :
>> On 18.11.2025 15:30, Frederic Weisbecker wrote:
>> > When a cpuset isolated partition is created, updated or destroyed,
>> > the IRQ threads are blindly affined to all the non-isolated CPUs. This
>> > happens without taking into account the IRQ thread's initial affinity,
>> > which is then ignored.
>> >
>> > For example, in a system with 8 CPUs, if an IRQ and its kthread are
>> > initially affine to CPU 5, creating an isolated partition with only
>> > CPU 2 inside will eventually end up affining the IRQ kthread to all
>> > CPUs but CPU 2 (that is, CPUs 0,1,3-7), losing the kthread's preference
>> > for CPU 5.
>> >
>> > Besides the blind re-affining, this doesn't take care of the actual
>> > low-level interrupt, which isn't migrated. As of today the only way to
>> > isolate non-managed interrupts, along with their kthreads, is to
>> > overwrite their affinity separately, for example through /proc/irq/
>> >
>> > To avoid doing that manually, future development should focus on
>> > updating the IRQs' affinity whenever cpuset isolated partitions are
>> > updated.
>> >
>> > In the meantime, cpuset shouldn't fiddle with IRQ threads directly.
>> > To prevent that, set the PF_NO_SETAFFINITY flag on them.
>> >
>> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
>>
>> This patch landed in today's linux-next as commit 844dcacab287 ("genirq:
>> Fix interrupt threads affinity vs. cpuset isolated partitions"). In my
>> tests I found that it triggers warnings on some of my test systems.
>> This is an example of such a warning:
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1 at kernel/kthread.c:599 kthread_bind_mask+0x2c/0x84
>
> Erm, does this mean that the IRQ thread got woken up before the first
> official wakeup in wake_up_and_wait_for_irq_thread_ready()? This looks
> wrong...
>
> irq_startup() may be called on a few occasions before. So perhaps
> the IRQ already fired and woke up the kthread once before the "official"
> first wakeup?
>
> There seems to be some initialization ordering issue here...

That's unavoidable because of locking and chicken-and-egg ordering
problems.

When the thread is created the interrupt is not yet started up and
therefore the effective affinity is not known. So doing a bind there is
pointless.

The action has to be made visible _before_ starting the interrupt up,
as once the interrupt is unmasked it might fire. That's even more true
for shared interrupts.

The wakeup/bind/... muck cannot be invoked with the descriptor lock
held, so it has to be done _after_ the thing went live.

So yes, an interrupt which fires before kthread_bind() is invoked might
have exactly that effect. It wakes the thread from the kthread() wait
and it goes straight through to the thread function.

That's why the original code just set the affinity bit and let the
thread itself handle it.

There are three options to solve that:

 1) Bind the thread right after kthread_create() to a random
    housekeeping CPU and let it move itself over to the real place
    once it came up (at this point the affinity is established)

 2) Serialize the setup so that the thread (even when woken up
    early) gets stuck in UNINTERRUPTIBLE state _before_ it can
    reach wait_for_interrupt(), which waits with INTERRUPTIBLE

 3) Teach kthread_create() that the thread is subject to being
    bound to something later on and implement that serialization
    there.

#1 is pretty straightforward. See the untested patch below.

#2 is ugly (I tried).

#3 could be useful in general, but that needs more thought.

Thanks

        tglx
---
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -133,7 +133,15 @@ void __irq_wake_thread(struct irq_desc *
 	 */
 	atomic_inc(&desc->threads_active);
 
-	wake_up_process(action->thread);
+	/*
+	 * This might be a premature wakeup before the thread reached the
+	 * thread function and set the IRQTF_READY bit. It's waiting in
+	 * kthread code with state UNINTERRUPTIBLE. Once it reaches the
+	 * thread function it waits with INTERRUPTIBLE. The wakeup is not
+	 * lost in that case because the thread is guaranteed to observe
+	 * the RUN flag before it goes to sleep in wait_for_interrupt().
+	 */
+	wake_up_state(action->thread, TASK_INTERRUPTIBLE);
 }
 
 static DEFINE_STATIC_KEY_FALSE(irqhandler_duration_check_enabled);
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1026,16 +1026,8 @@ static void irq_thread_check_affinity(st
 	set_cpus_allowed_ptr(current, mask);
 	free_cpumask_var(mask);
 }
-
-static inline void irq_thread_set_affinity(struct task_struct *t,
-					   struct irq_desc *desc)
-{
-	kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data));
-}
 #else
 static inline void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { }
-static inline void irq_thread_set_affinity(struct task_struct *t,
-					   struct irq_desc *desc) { }
 #endif
 
 static int irq_wait_for_interrupt(struct irq_desc *desc,
@@ -1220,7 +1212,6 @@ static void wake_up_and_wait_for_irq_thr
 	if (!action || !action->thread)
 		return;
 
-	irq_thread_set_affinity(action->thread, desc);
 	wake_up_process(action->thread);
 	wait_event(desc->wait_for_threads,
 		   test_bit(IRQTF_READY, &action->thread_flags));
@@ -1389,26 +1380,35 @@ static void irq_nmi_teardown(struct irq_
 static int
 setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
 {
+	/*
+	 * At this point the interrupt affinity is not known. Just assume
+	 * that the current CPU is not isolated and valid to bring the
+	 * thread up. The thread will move itself over to the right place
+	 * once the whole setup is complete.
+	 */
+	unsigned int cpu = raw_smp_processor_id();
 	struct task_struct *t;
 
-	if (!secondary) {
-		t = kthread_create(irq_thread, new, "irq/%d-%s", irq,
-				   new->name);
-	} else {
-		t = kthread_create(irq_thread, new, "irq/%d-s-%s", irq,
-				   new->name);
-	}
+	if (!secondary)
+		t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-%s", irq, new->name);
+	else
+		t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-s-%s", irq, new->name);
 
 	if (IS_ERR(t))
 		return PTR_ERR(t);
 
 	/*
-	 * We keep the reference to the task struct even if
-	 * the thread dies to avoid that the interrupt code
-	 * references an already freed task_struct.
+	 * We keep the reference to the task struct even if the thread dies
+	 * to avoid that the interrupt code references an already freed
+	 * task_struct.
	 */
 	new->thread = get_task_struct(t);
+	/*
+	 * Ensure the thread adjusts the affinity once it reaches the
+	 * thread function.
+	 */
+	new->thread_flags = BIT(IRQTF_AFFINITY);
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 15+ messages in thread
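The wake_up_state(TASK_INTERRUPTIBLE) trick relies on the thread sleeping UNINTERRUPTIBLE inside the kthread code until setup wakes it, and sleeping INTERRUPTIBLE only once it reaches wait_for_interrupt(). A userspace sketch of that state machine (a model only, with made-up names, not the kernel implementation):

```python
# Sketch of the wake_up_state() serialization from the proposed fix:
# a wakeup targeted at INTERRUPTIBLE sleepers is a no-op while the
# thread still sits in its UNINTERRUPTIBLE kthread-setup sleep.
UNINTERRUPTIBLE, INTERRUPTIBLE, RUNNING = "D", "S", "R"

class IrqThread:
    def __init__(self):
        self.state = UNINTERRUPTIBLE  # parked in kthread() setup sleep
        self.run_flag = False         # stand-in for the RUN flag

def wake_up_state(t, required):
    # Wake the thread only if it sleeps in the expected state.
    if t.state == required:
        t.state = RUNNING
        return True
    return False

t = IrqThread()

# An interrupt fires before setup finished: the flag is set and a wakeup
# is attempted, but it targets INTERRUPTIBLE sleepers only.
t.run_flag = True
woke = wake_up_state(t, INTERRUPTIBLE)
print(woke, t.state)   # False D -> the premature wakeup is a no-op

# Later the thread reaches wait_for_interrupt(); it checks the run flag
# before sleeping, so the "lost" wakeup is still observed.
t.state = INTERRUPTIBLE
print(t.run_flag)      # True -> the handler runs, no event is missed
```

This is why the premature wakeup can no longer defeat kthread_bind(): the thread stays parked until the setup path explicitly wakes it with wake_up_process().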
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions 2025-11-20 15:00 ` Thomas Gleixner @ 2025-11-20 15:50 ` Frederic Weisbecker 2025-11-20 19:09 ` Thomas Gleixner 2025-11-20 16:34 ` Marek Szyprowski 1 sibling, 1 reply; 15+ messages in thread From: Frederic Weisbecker @ 2025-11-20 15:50 UTC (permalink / raw) To: Thomas Gleixner Cc: Marek Szyprowski, LKML, Marco Crivellari, Waiman Long, cgroups Le Thu, Nov 20, 2025 at 04:00:39PM +0100, Thomas Gleixner a écrit : > On Thu, Nov 20 2025 at 14:20, Frederic Weisbecker wrote: > > Erm, does this means that the IRQ thread got awaken before the first official > > wakeup in wake_up_and_wait_for_irq_thread_ready()? This looks wrong... > > > > irq_startup() may be called on a few occcasions before. So perhaps > > the IRQ already fired and woke up the kthread once before the "official" > > first wake up? > > > > There seem to be some initialization ordering issue here... > > That's unavoidable because of locking and hen and egg ordering problems. > > When the thread is created the interrupt is not yet started up and > therefore effective affinity is not known. So doing a bind there is > pointless. > > The action has to be made visible _before_ starting it up as once the > interrupt is unmasked it might fire. That's even more true for shared > interrupts. > > The wakeup/bind/... muck cannot be invoked with the descriptor lock > held, so it has to be done _after_ the thing went live. > > So yes an interrupt which fires before kthread_bind() is invoked might > exactly have that effect. It wakes it from the kthread() wait and it > goes straight through to the thread function. > > That's why the original code just set the affinity bit and let the > thread itself handle it. 
> > There are three options to solve that: > > 1) Bind the thread right after kthread_create() to a random > housekeeping task and let move itself over to the real place > once it came up (at this point the affinity is established) > > 2) Serialize the setup so that the thread (even, when woken up > early) get's stuck in UNINTERRUPTIBLE state _before_ it can > reach wait_for_interrupt() which waits with INTERRUPTIBLE > > 3) Teach kthread_create() that the thread is subject to being > bound to something later on and implement that serialization > there. > > #1 is pretty straight forward. See untested below. Right. > > #2 is ugly (I tried) Ok... > > #3 could be useful in general, but that needs more thoughts A lot of thoughts yes given the many things to consider (kthread automatic affinity handling, etc...). And if we do that we must always affine the IRQ threads remotely to avoid conflict with them. But that can be a problem with locking, etc... > --- a/kernel/irq/handle.c > +++ b/kernel/irq/handle.c > @@ -133,7 +133,15 @@ void __irq_wake_thread(struct irq_desc * > */ > atomic_inc(&desc->threads_active); > > - wake_up_process(action->thread); > + /* > + * This might be a premature wakeup before the thread reached the > + * thread function and set the IRQTF_READY bit. It's waiting in > + * kthread code with state UNINTERRUPTIBLE. Once it reaches the > + * thread function it waits with INTERRUPTIBLE. The wakeup is not > + * lost in that case because the thread is guaranteed to observe > + * the RUN flag before it goes to sleep in wait_for_interrupt(). 
> + */ > + wake_up_state(action->thread, TASK_INTERRUPTIBLE); > } > > static DEFINE_STATIC_KEY_FALSE(irqhandler_duration_check_enabled); > --- a/kernel/irq/manage.c > +++ b/kernel/irq/manage.c > @@ -1026,16 +1026,8 @@ static void irq_thread_check_affinity(st > set_cpus_allowed_ptr(current, mask); > free_cpumask_var(mask); > } > - > -static inline void irq_thread_set_affinity(struct task_struct *t, > - struct irq_desc *desc) > -{ > - kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data)); > -} > #else > static inline void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { } > -static inline void irq_thread_set_affinity(struct task_struct *t, > - struct irq_desc *desc) { } > #endif > > static int irq_wait_for_interrupt(struct irq_desc *desc, > @@ -1220,7 +1212,6 @@ static void wake_up_and_wait_for_irq_thr > if (!action || !action->thread) > return; > > - irq_thread_set_affinity(action->thread, desc); > wake_up_process(action->thread); > wait_event(desc->wait_for_threads, > test_bit(IRQTF_READY, &action->thread_flags)); > @@ -1389,26 +1380,35 @@ static void irq_nmi_teardown(struct irq_ > static int > setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary) > { > + /* > + * At this point interrupt affinity is not known. just assume that > + * the current CPU is not isolated and valid to bring the thread > + * up. The thread will move itself over to the right place once the > + * whole setup is complete. 
> + */ > + unsigned int cpu = raw_smp_processor_id(); > struct task_struct *t; > > - if (!secondary) { > - t = kthread_create(irq_thread, new, "irq/%d-%s", irq, > - new->name); > - } else { > - t = kthread_create(irq_thread, new, "irq/%d-s-%s", irq, > - new->name); > - } > + if (!secondary) > + t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-%s", irq, new->name); > + else > + t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-s-%s", irq, new->name); Right, I thought about something like that, it involved: kthread_bind_mask(t, cpu_possible_mask); Which do you prefer? Also do you prefer such a fixup or should I refactor the patches you merged? Thanks. -- Frederic Weisbecker SUSE Labs ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions 2025-11-20 15:50 ` Frederic Weisbecker @ 2025-11-20 19:09 ` Thomas Gleixner 0 siblings, 0 replies; 15+ messages in thread From: Thomas Gleixner @ 2025-11-20 19:09 UTC (permalink / raw) To: Frederic Weisbecker Cc: Marek Szyprowski, LKML, Marco Crivellari, Waiman Long, cgroups On Thu, Nov 20 2025 at 16:50, Frederic Weisbecker wrote: > On Thu, Nov 20, 2025 at 04:00:39PM +0100, Thomas Gleixner wrote: >> + if (!secondary) >> + t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-%s", irq, new->name); >> + else >> + t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-s-%s", >> + irq, new->name); > > Right, I thought about something like that, it involved: > > kthread_bind_mask(t, cpu_possible_mask); That's way simpler and also solves the problem with the kthread_create_on_cpu() name which Marek pointed out. > Which do you prefer? Also do you prefer such a fixup or should I refactor the > patches you merged? Can you split out the wakeup change into a separate patch (Suggested-by-me) with its own change log and fold the kthread_bind_mask() + set(AFFINITY) bit into this one. I'll just go and zap the existing commits (they are on top of the branch). Thanks, tglx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions 2025-11-20 15:00 ` Thomas Gleixner 2025-11-20 15:50 ` Frederic Weisbecker @ 2025-11-20 16:34 ` Marek Szyprowski 1 sibling, 0 replies; 15+ messages in thread From: Marek Szyprowski @ 2025-11-20 16:34 UTC (permalink / raw) To: Thomas Gleixner, Frederic Weisbecker Cc: LKML, Marco Crivellari, Waiman Long, cgroups On 20.11.2025 16:00, Thomas Gleixner wrote: > On Thu, Nov 20 2025 at 14:20, Frederic Weisbecker wrote: >> On Thu, Nov 20, 2025 at 12:51:31PM +0100, Marek Szyprowski wrote: >>> On 18.11.2025 15:30, Frederic Weisbecker wrote: >>>> When a cpuset isolated partition is created / updated or destroyed, >>>> the IRQ threads are affine blindly to all the non-isolated CPUs. And >>>> this happens without taking into account the IRQ thread initial >>>> affinity that becomes ignored. >>>> >>>> For example in a system with 8 CPUs, if an IRQ and its kthread are >>>> initially affine to CPU 5, creating an isolated partition with only >>>> CPU 2 inside will eventually end up affining the IRQ kthread to all >>>> CPUs but CPU 2 (that is CPUs 0,1,3-7), losing the kthread preference for >>>> CPU 5. >>>> >>>> Besides the blind re-affinity, this doesn't take care of the actual >>>> low level interrupt which isn't migrated. As of today the only way to >>>> isolate non managed interrupts, along with their kthreads, is to >>>> overwrite their affinity separately, for example through /proc/irq/ >>>> >>>> To avoid doing that manually, future development should focus on >>>> updating the IRQs affinity whenever cpuset isolated partitions are >>>> updated. >>>> >>>> In the meantime, cpuset shouldn't fiddle with IRQ threads directly. >>>> To prevent from that, set the PF_NO_SETAFFINITY flag to them. >>>> >>>> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> >>> This patch landed in today's linux-next as commit 844dcacab287 ("genirq: >>> Fix interrupt threads affinity vs. cpuset isolated partitions"). 
In my >>> tests I found that it triggers warnings on some of my test systems. >>> This is an example of such a warning: >>> >>> ------------[ cut here ]------------ >>> WARNING: CPU: 0 PID: 1 at kernel/kthread.c:599 kthread_bind_mask+0x2c/0x84 >> Erm, does this mean that the IRQ thread got woken up before the first official >> wakeup in wake_up_and_wait_for_irq_thread_ready()? This looks wrong... >> >> irq_startup() may be called on a few occasions before. So perhaps >> the IRQ already fired and woke up the kthread once before the "official" >> first wake up? >> >> There seems to be an initialization ordering issue here... > That's unavoidable because of locking and chicken-and-egg ordering problems. > > When the thread is created the interrupt is not yet started up and > therefore the effective affinity is not known. So doing a bind there is > pointless. > > The action has to be made visible _before_ starting it up as once the > interrupt is unmasked it might fire. That's even more true for shared > interrupts. > > The wakeup/bind/... muck cannot be invoked with the descriptor lock > held, so it has to be done _after_ the thing went live. > > So yes, an interrupt which fires before kthread_bind() is invoked might > have exactly that effect. It wakes it from the kthread() wait and it > goes straight through to the thread function. > > That's why the original code just set the affinity bit and let the > thread itself handle it. > > There are three options to solve that: > > 1) Bind the thread right after kthread_create() to a random > housekeeping CPU and let it move itself over to the real place > once it comes up (at this point the affinity is established) > > 2) Serialize the setup so that the thread (even when woken up > early) gets stuck in UNINTERRUPTIBLE state _before_ it can > reach wait_for_interrupt() which waits with INTERRUPTIBLE > > 3) Teach kthread_create() that the thread is subject to being > bound to something later on and implement that serialization > there. 
> > #1 is pretty straightforward. See untested below. This fixes the observed issue. The only problem with that patch is an incorrect set of arguments to kthread_create_on_cpu() (it doesn't support printf-style args). If that matters: Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> > #2 is ugly (I tried) > > #3 could be useful in general, but that needs more thoughts > > Thanks > > tglx > --- > --- a/kernel/irq/handle.c > +++ b/kernel/irq/handle.c > @@ -133,7 +133,15 @@ void __irq_wake_thread(struct irq_desc * > */ > atomic_inc(&desc->threads_active); > > - wake_up_process(action->thread); > + /* > + * This might be a premature wakeup before the thread reached the > + * thread function and set the IRQTF_READY bit. It's waiting in > + * kthread code with state UNINTERRUPTIBLE. Once it reaches the > + * thread function it waits with INTERRUPTIBLE. The wakeup is not > + * lost in that case because the thread is guaranteed to observe > + * the RUN flag before it goes to sleep in wait_for_interrupt(). 
> + */ > + wake_up_state(action->thread, TASK_INTERRUPTIBLE); > } > > static DEFINE_STATIC_KEY_FALSE(irqhandler_duration_check_enabled); > --- a/kernel/irq/manage.c > +++ b/kernel/irq/manage.c > @@ -1026,16 +1026,8 @@ static void irq_thread_check_affinity(st > set_cpus_allowed_ptr(current, mask); > free_cpumask_var(mask); > } > - > -static inline void irq_thread_set_affinity(struct task_struct *t, > - struct irq_desc *desc) > -{ > - kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data)); > -} > #else > static inline void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { } > -static inline void irq_thread_set_affinity(struct task_struct *t, > - struct irq_desc *desc) { } > #endif > > static int irq_wait_for_interrupt(struct irq_desc *desc, > @@ -1220,7 +1212,6 @@ static void wake_up_and_wait_for_irq_thr > if (!action || !action->thread) > return; > > - irq_thread_set_affinity(action->thread, desc); > wake_up_process(action->thread); > wait_event(desc->wait_for_threads, > test_bit(IRQTF_READY, &action->thread_flags)); > @@ -1389,26 +1380,35 @@ static void irq_nmi_teardown(struct irq_ > static int > setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary) > { > + /* > + * At this point interrupt affinity is not known. just assume that > + * the current CPU is not isolated and valid to bring the thread > + * up. The thread will move itself over to the right place once the > + * whole setup is complete. 
> + */ > + unsigned int cpu = raw_smp_processor_id(); > struct task_struct *t; > > - if (!secondary) { > - t = kthread_create(irq_thread, new, "irq/%d-%s", irq, > - new->name); > - } else { > - t = kthread_create(irq_thread, new, "irq/%d-s-%s", irq, > - new->name); > - } > + if (!secondary) > + t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-%s", irq, new->name); > + else > + t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-s-%s", irq, new->name); > > if (IS_ERR(t)) > return PTR_ERR(t); > > /* > - * We keep the reference to the task struct even if > - * the thread dies to avoid that the interrupt code > - * references an already freed task_struct. > + * We keep the reference to the task struct even if the thread dies > + * to avoid that the interrupt code references an already freed > + * task_struct. > */ > new->thread = get_task_struct(t); > > + /* > + * Ensure the thread adjusts the affinity once it reaches the > + * thread function. > + */ > + new->thread_flags = BIT(IRQTF_AFFINITY); > return 0; > } > > Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions 2025-11-20 11:51 ` Marek Szyprowski 2025-11-20 13:20 ` Frederic Weisbecker @ 2025-11-20 13:45 ` Thomas Gleixner 1 sibling, 0 replies; 15+ messages in thread From: Thomas Gleixner @ 2025-11-20 13:45 UTC (permalink / raw) To: Marek Szyprowski, Frederic Weisbecker Cc: LKML, Marco Crivellari, Waiman Long, cgroups On Thu, Nov 20 2025 at 12:51, Marek Szyprowski wrote: > On 18.11.2025 15:30, Frederic Weisbecker wrote: >> In the meantime, cpuset shouldn't fiddle with IRQ threads directly. >> To prevent from that, set the PF_NO_SETAFFINITY flag to them. >> >> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> > > This patch landed in today's linux-next as commit 844dcacab287 ("genirq: > Fix interrupt threads affinity vs. cpuset isolated partitions"). In my > tests I found that it triggers warnings on some of my test systems. > This is an example of such a warning: > > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 1 at kernel/kthread.c:599 kthread_bind_mask+0x2c/0x84 > Modules linked in: > CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted > 6.18.0-rc1-00031-g844dcacab287 #16177 PREEMPT > Hardware name: Samsung Exynos (Flattened Device Tree) > Call trace: > unwind_backtrace from show_stack+0x10/0x14 > show_stack from dump_stack_lvl+0x68/0x88 > dump_stack_lvl from __warn+0x80/0x1d0 > __warn from warn_slowpath_fmt+0x1b0/0x1bc > warn_slowpath_fmt from kthread_bind_mask+0x2c/0x84 > kthread_bind_mask from wake_up_and_wait_for_irq_thread_ready+0x3c/0xd4 > wake_up_and_wait_for_irq_thread_ready from __setup_irq+0x3e8/0x894 Hmm. The only explanation for that is that the thread got woken up already and left the initial UNINTERRUPTIBLE state and is now waiting for an interrupt wakeup with INTERRUPTIBLE state. To validate that theory, can you please apply the patch below? The extra warning I added should trigger first. Let me think about a proper cure... 
Thanks, tglx --- --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -615,6 +615,8 @@ static void __kthread_bind(struct task_s void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask) { struct kthread *kthread = to_kthread(p); + + WARN_ON_ONCE(kthread->started); __kthread_bind_mask(p, mask, TASK_UNINTERRUPTIBLE); WARN_ON_ONCE(kthread->started); } @@ -631,6 +633,8 @@ void kthread_bind_mask(struct task_struc void kthread_bind(struct task_struct *p, unsigned int cpu) { struct kthread *kthread = to_kthread(p); + + WARN_ON_ONCE(kthread->started); __kthread_bind(p, cpu, TASK_UNINTERRUPTIBLE); WARN_ON_ONCE(kthread->started); } ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/2] genirq: Remove cpumask availability check on kthread affinity setting 2025-11-18 14:30 [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Frederic Weisbecker 2025-11-18 14:30 ` [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions Frederic Weisbecker @ 2025-11-18 14:30 ` Frederic Weisbecker 2025-11-18 15:14 ` [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Thomas Gleixner 2 siblings, 0 replies; 15+ messages in thread From: Frederic Weisbecker @ 2025-11-18 14:30 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, Frederic Weisbecker, Marco Crivellari, Waiman Long, cgroups Failing to allocate the affinity mask of an irqdesc eventually fails the whole irqdesc initialization. It is then guaranteed that the cpumask is always available whenever the related IRQ objects are alive, such as the kthread handler. Therefore remove the superfluous check since it is merely a historical leftover. Also get rid of the comments above it, which are either obsolete or useless. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> --- kernel/irq/manage.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-) diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 76e2cbe21d1f..e9316cba311c 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1001,7 +1001,6 @@ static irqreturn_t irq_forced_secondary_handler(int irq, void *dev_id) static void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { cpumask_var_t mask; - bool valid = false; if (!test_and_clear_bit(IRQTF_AFFINITY, &action->thread_flags)) return; @@ -1018,21 +1017,13 @@ static void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *a } scoped_guard(raw_spinlock_irq, &desc->lock) { - /* - * This code is triggered unconditionally. Check the affinity - * mask pointer. For CPU_MASK_OFFSTACK=n this is optimized out. 
- */ - if (cpumask_available(desc->irq_common_data.affinity)) { - const struct cpumask *m; + const struct cpumask *m; - m = irq_data_get_effective_affinity_mask(&desc->irq_data); - cpumask_copy(mask, m); - valid = true; - } + m = irq_data_get_effective_affinity_mask(&desc->irq_data); + cpumask_copy(mask, m); } - if (valid) - set_cpus_allowed_ptr(current, mask); + set_cpus_allowed_ptr(current, mask); free_cpumask_var(mask); } -- 2.51.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] genirq: Fix IRQ threads VS cpuset 2025-11-18 14:30 [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Frederic Weisbecker 2025-11-18 14:30 ` [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions Frederic Weisbecker 2025-11-18 14:30 ` [PATCH 2/2] genirq: Remove cpumask availability check on kthread affinity setting Frederic Weisbecker @ 2025-11-18 15:14 ` Thomas Gleixner 2025-11-18 17:44 ` Frederic Weisbecker 2 siblings, 1 reply; 15+ messages in thread From: Thomas Gleixner @ 2025-11-18 15:14 UTC (permalink / raw) To: Frederic Weisbecker Cc: LKML, Frederic Weisbecker, Marco Crivellari, Waiman Long, cgroups On Tue, Nov 18 2025 at 15:30, Frederic Weisbecker wrote: > See here for previous patch: > > https://lore.kernel.org/lkml/20251105131726.46364-1-frederic@kernel.org/ It would be really useful if you had 'PATCH V2' in the subject lines of the series so it's easily distinguishable for humans and tools. Thanks, tglx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/2] genirq: Fix IRQ threads VS cpuset 2025-11-18 15:14 ` [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Thomas Gleixner @ 2025-11-18 17:44 ` Frederic Weisbecker 0 siblings, 0 replies; 15+ messages in thread From: Frederic Weisbecker @ 2025-11-18 17:44 UTC (permalink / raw) To: Thomas Gleixner; +Cc: LKML, Marco Crivellari, Waiman Long, cgroups On Tue, Nov 18, 2025 at 04:14:40PM +0100, Thomas Gleixner wrote: > On Tue, Nov 18 2025 at 15:30, Frederic Weisbecker wrote: > > See here for previous patch: > > > > https://lore.kernel.org/lkml/20251105131726.46364-1-frederic@kernel.org/ > > It would be really useful if you had 'PATCH V2' in the subject > lines of the series so it's easily distinguishable for humans and tools. Right, I usually do but failed this time... Thanks! -- Frederic Weisbecker SUSE Labs ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads: [~2025-11-20 19:09 UTC | newest]

Thread overview: 15+ messages:
2025-11-18 14:30 [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Frederic Weisbecker
2025-11-18 14:30 ` [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions Frederic Weisbecker
2025-11-18 16:27 ` Waiman Long
2025-11-18 17:23 ` Thomas Gleixner
2025-11-18 17:29 ` Waiman Long
2025-11-20 11:51 ` Marek Szyprowski
2025-11-20 13:20 ` Frederic Weisbecker
2025-11-20 15:00 ` Thomas Gleixner
2025-11-20 15:50 ` Frederic Weisbecker
2025-11-20 19:09 ` Thomas Gleixner
2025-11-20 16:34 ` Marek Szyprowski
2025-11-20 13:45 ` Thomas Gleixner
2025-11-18 14:30 ` [PATCH 2/2] genirq: Remove cpumask availability check on kthread affinity setting Frederic Weisbecker
2025-11-18 15:14 ` [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Thomas Gleixner
2025-11-18 17:44 ` Frederic Weisbecker