Re: IRQ thread timeouts and affinity

From: Marc Zyngier <maz@kernel.org>
To: Thierry Reding <thierry.reding@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	linux-tegra@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: IRQ thread timeouts and affinity
Date: Thu, 09 Oct 2025 18:04:58 +0100
Message-ID: <86o6qgxayt.wl-maz@kernel.org>
In-Reply-To: <loeliplxuvek4nh4plt4hup3ibqorpiv4eljiiwltgmyqa4nki@xpzymugslcvf>

On Thu, 09 Oct 2025 17:05:15 +0100,
Thierry Reding <thierry.reding@gmail.com> wrote:
> 
> On Thu, Oct 09, 2025 at 03:30:56PM +0100, Marc Zyngier wrote:
> > Hi Thierry,
> > 
> > On Thu, 09 Oct 2025 12:38:55 +0100,
> > Thierry Reding <thierry.reding@gmail.com> wrote:
> > > 
> > > Which brings me to the actual question: what is the right way to solve
> > > this? I had, maybe naively, assumed that the default CPU affinity, which
> > > includes all available CPUs, would be sufficient to have interrupts
> > > balanced across all of those CPUs, but that doesn't appear to be the
> > > case. At least not with the GIC (v3) driver which selects one CPU (CPU 0
> > > in this particular case) from the affinity mask to set the "effective
> > > affinity", which then dictates where IRQs are handled and where the
> > > corresponding IRQ thread function is run.
> > 
> > There's a (GIC-specific) answer to that, and that's the "1 of N"
> > distribution model. The problem is that it is a massive headache (it
> > completely breaks with per-CPU context).
> 
> Heh, that started out as a very promising first paragraph but turned
> ugly very quickly... =)
> 
> > We could try and hack this in somehow, but defining a reasonable API
> > is complicated. The set of CPUs receiving 1:N interrupts is a *global*
> > set, which means you cannot have one interrupt targeting CPUs 0-1, and
> > another targeting CPUs 2-3. You can only have a single set for all 1:N
> > interrupts. How would you define such a set in a platform agnostic
> > manner so that a random driver could use this? I definitely don't want
> > to have a GIC-specific API.
> 
> I see. I've been thinking that maybe the only way to solve this is using
> some sort of policy. A very simple policy might be: use CPU 0 as the
> "default" interrupt (much like it is now) because like you said there
> might be assumptions built-in that break when the interrupt is scheduled
> elsewhere. But then let individual drivers opt into the 1:N set, which
> would perhaps span all available CPUs but the first one. From an API PoV
> this would just be a flag that's passed to request_irq() (or one of its
> derivatives).

The $10k question is how do you pick the victim CPUs? I can't see how
to do it in a reasonable way unless we decide that interrupts that
have an affinity matching cpu_possible_mask are 1:N. And then we're
left wondering what to do about CPU hotplug.
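
To make that heuristic concrete, here is a rough userspace sketch, not
kernel code. Everything in it is hypothetical: the cpumask_model_t type
and the three irq_*() helpers are made up for illustration, masks are
plain 64-bit words, and "lowest-numbered online CPU" is only an assumed
stand-in for whatever the irqchip driver really picks. Returning the
online set for 1:N interrupts is one possible answer to the hotplug
question, not a settled one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: bit n set means CPU n is a member of the mask. */
typedef uint64_t cpumask_model_t;

/*
 * Heuristic from the discussion: an interrupt counts as 1:N only when its
 * requested affinity covers every possible CPU.
 */
static bool irq_is_one_of_n(cpumask_model_t affinity, cpumask_model_t possible)
{
	assert(possible != 0);	/* a system with no possible CPUs is meaningless */
	return affinity == possible;
}

/*
 * Single-target case: pick one CPU out of the requested affinity.
 * Assumption: take the lowest-numbered CPU that is also online.
 * Returns -1 when no CPU in the affinity mask is online.
 */
static int irq_pick_single_target(cpumask_model_t affinity,
				  cpumask_model_t online)
{
	cpumask_model_t candidates = affinity & online;
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (candidates & ((cpumask_model_t)1 << cpu))
			return cpu;
	return -1;
}

/*
 * CPUs that may currently receive a 1:N interrupt. The set is global, so
 * it does not depend on the individual interrupt beyond the yes/no
 * classification above. Assumption: offline CPUs simply drop out.
 */
static cpumask_model_t irq_one_of_n_targets(cpumask_model_t affinity,
					    cpumask_model_t possible,
					    cpumask_model_t online)
{
	if (!irq_is_one_of_n(affinity, possible))
		return 0;
	return online;
}
```

With four possible CPUs (mask 0xF), an interrupt requesting 0xF would be
classified as 1:N, while one requesting 0xC would fall back to a single
target (CPU 2 in this model). Taking CPU 1 offline shrinks the 1:N set
to 0xD, which is exactly where the hotplug question starts to bite.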

> 
> > Overall, there is quite a lot of work to be done in this space: the
> > machine I'm typing this from doesn't have affinity control *at
> > all*. Any interrupt can target any CPU,
> 
> Well, that actually sounds pretty nice for the use-case that we have...
> 
> >                                         and if Linux doesn't expect
> > that, tough.
> 
> ... but yeah, it may also break things.

Yeah. With GICv3, only SPIs can be 1:N, but on this (fruity) box, even
MSIs can be arbitrarily moved from one CPU to another. This is a
ticking bomb.

I'll see if I can squeeze out some time to look into this -- no
promises though.

	M.

-- 
Without deviation from the norm, progress is not possible.