public inbox for cgroups@vger.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Marco Crivellari <marco.crivellari@suse.com>,
	Waiman Long <llong@redhat.com>,
	cgroups@vger.kernel.org
Subject: Re: [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions
Date: Thu, 20 Nov 2025 16:50:03 +0100	[thread overview]
Message-ID: <aR84qyZp3PyH5xFg@localhost.localdomain> (raw)
In-Reply-To: <87tsyokaug.ffs@tglx>

Le Thu, Nov 20, 2025 at 04:00:39PM +0100, Thomas Gleixner a écrit :
> On Thu, Nov 20 2025 at 14:20, Frederic Weisbecker wrote:
> > Erm, does this means that the IRQ thread got awaken before the first official
> > wakeup in wake_up_and_wait_for_irq_thread_ready()? This looks wrong...
> >
> > irq_startup() may be called on a few occcasions before. So perhaps
> > the IRQ already fired and woke up the kthread once before the "official"
> > first wake up?
> >
> > There seem to be some initialization ordering issue here...
> 
> That's unavoidable because of locking and hen and egg ordering problems.
> 
> When the thread is created the interrupt is not yet started up and
> therefore effective affinity is not known. So doing a bind there is
> pointless.
> 
> The action has to be made visible _before_ starting it up as once the
> interrupt is unmasked it might fire. That's even more true for shared
> interrupts.
> 
> The wakeup/bind/... muck cannot be invoked with the descriptor lock
> held, so it has to be done _after_ the thing went live.
> 
> So yes an interrupt which fires before kthread_bind() is invoked might
> exactly have that effect. It wakes it from the kthread() wait and it
> goes straight through to the thread function.
> 
> That's why the original code just set the affinity bit and let the
> thread itself handle it.
> 
> There are three options to solve that:
> 
>       1) Bind the thread right after kthread_create() to a random
>          housekeeping task and let move itself over to the real place
>          once it came up (at this point the affinity is established)
> 
>       2) Serialize the setup so that the thread (even, when woken up
>          early) get's stuck in UNINTERRUPTIBLE state _before_ it can
>          reach wait_for_interrupt() which waits with INTERRUPTIBLE
> 
>       3) Teach kthread_create() that the thread is subject to being
>          bound to something later on and implement that serialization
>          there.
> 
> #1 is pretty straight forward. See untested below.

Right.

> 
> #2 is ugly (I tried)

Ok...

> 
> #3 could be useful in general, but that needs more thoughts

A lot of thoughts yes given the many things to consider (kthread
automatic affinity handling, etc...).

And if we do that we must always affine the IRQ threads remotely
to avoid conflict with them. But that can be a problem with locking,
etc...

> --- a/kernel/irq/handle.c
> +++ b/kernel/irq/handle.c
> @@ -133,7 +133,15 @@ void __irq_wake_thread(struct irq_desc *
>  	 */
>  	atomic_inc(&desc->threads_active);
>  
> -	wake_up_process(action->thread);
> +	/*
> +	 * This might be a premature wakeup before the thread reached the
> +	 * thread function and set the IRQTF_READY bit. It's waiting in
> +	 * kthread code with state UNINTERRUPTIBLE. Once it reaches the
> +	 * thread function it waits with INTERRUPTIBLE. The wakeup is not
> +	 * lost in that case because the thread is guaranteed to observe
> +	 * the RUN flag before it goes to sleep in wait_for_interrupt().
> +	 */
> +	wake_up_state(action->thread, TASK_INTERRUPTIBLE);
>  }
>  
>  static DEFINE_STATIC_KEY_FALSE(irqhandler_duration_check_enabled);
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1026,16 +1026,8 @@ static void irq_thread_check_affinity(st
>  	set_cpus_allowed_ptr(current, mask);
>  	free_cpumask_var(mask);
>  }
> -
> -static inline void irq_thread_set_affinity(struct task_struct *t,
> -					   struct irq_desc *desc)
> -{
> -	kthread_bind_mask(t, irq_data_get_effective_affinity_mask(&desc->irq_data));
> -}
>  #else
>  static inline void irq_thread_check_affinity(struct irq_desc *desc, struct irqaction *action) { }
> -static inline void irq_thread_set_affinity(struct task_struct *t,
> -					   struct irq_desc *desc) { }
>  #endif
>  
>  static int irq_wait_for_interrupt(struct irq_desc *desc,
> @@ -1220,7 +1212,6 @@ static void wake_up_and_wait_for_irq_thr
>  	if (!action || !action->thread)
>  		return;
>  
> -	irq_thread_set_affinity(action->thread, desc);
>  	wake_up_process(action->thread);
>  	wait_event(desc->wait_for_threads,
>  		   test_bit(IRQTF_READY, &action->thread_flags));
> @@ -1389,26 +1380,35 @@ static void irq_nmi_teardown(struct irq_
>  static int
>  setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
>  {
> +	/*
> +	 * At this point interrupt affinity is not known. just assume that
> +	 * the current CPU is not isolated and valid to bring the thread
> +	 * up. The thread will move itself over to the right place once the
> +	 * whole setup is complete.
> +	 */
> +	unsigned int cpu = raw_smp_processor_id();
>  	struct task_struct *t;
>  
> -	if (!secondary) {
> -		t = kthread_create(irq_thread, new, "irq/%d-%s", irq,
> -				   new->name);
> -	} else {
> -		t = kthread_create(irq_thread, new, "irq/%d-s-%s", irq,
> -				   new->name);
> -	}
> +	if (!secondary)
> +		t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-%s", irq, new->name);
> +	else
> +		t = kthread_create_on_cpu(irq_thread, new, cpu, "irq/%d-s-%s",
> -		irq, new->name);

Right I though about something like that, it involved:

kthread_bind_mask(t, cpu_possible_mask);

Which do you prefer? Also do you prefer such a fixup or should I refactor my
patches you merged?

Thanks.

-- 
Frederic Weisbecker
SUSE Labs

  reply	other threads:[~2025-11-20 15:50 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-18 14:30 [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Frederic Weisbecker
2025-11-18 14:30 ` [PATCH 1/2] genirq: Fix IRQ threads affinity VS cpuset isolated partitions Frederic Weisbecker
2025-11-18 16:27   ` Waiman Long
2025-11-18 17:23     ` Thomas Gleixner
2025-11-18 17:29       ` Waiman Long
2025-11-20 11:51   ` Marek Szyprowski
2025-11-20 13:20     ` Frederic Weisbecker
2025-11-20 15:00       ` Thomas Gleixner
2025-11-20 15:50         ` Frederic Weisbecker [this message]
2025-11-20 19:09           ` Thomas Gleixner
2025-11-20 16:34         ` Marek Szyprowski
2025-11-20 13:45     ` Thomas Gleixner
2025-11-18 14:30 ` [PATCH 2/2] genirq: Remove cpumask availability check on kthread affinity setting Frederic Weisbecker
2025-11-18 15:14 ` [PATCH 0/2] genirq: Fix IRQ threads VS cpuset Thomas Gleixner
2025-11-18 17:44   ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aR84qyZp3PyH5xFg@localhost.localdomain \
    --to=frederic@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llong@redhat.com \
    --cc=m.szyprowski@samsung.com \
    --cc=marco.crivellari@suse.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox