public inbox for linux-rt-users@vger.kernel.org
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Florian Bezdeka <florian.bezdeka@siemens.com>
Cc: "Preclik, Tobias" <tobias.preclik@siemens.com>,
	"linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>,
	Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: Control of IRQ Affinities from Userspace
Date: Tue, 11 Nov 2025 14:58:35 +0100
Message-ID: <20251111135835.EXCy4ajR@linutronix.de>
In-Reply-To: <3cbc0cf5301350d87c03b7ceb646a3d7c549167b.camel@siemens.com>

On 2025-11-03 18:12:48 [+0100], Florian Bezdeka wrote:
> I'm trying to jump in and add some thoughts and results we got while
> analyzing this issue:
> 
> What stmmac (and some more drivers) are trying to achieve here is some
> kind of handcrafted IRQ balancing, like the good old irqbalanced did in
> the past from usermode. Turns out that the situation about IRQ balancing
> is a bit inconsistent. Some IRQ chips (like the APIC on x86) do that
> "automatically" on driver level, many others don't. So drivers end up
> fiddling with affinities.

Doing it once during startup is probably okay. The problem is probably
that it forgets everything when it removes the IRQ and requests it
again during down/up. I guess this is simpler because the number of
interrupts can change if the networking queues have been changed. And
this path is probably also invoked in that case.

> We can nicely tune IRQs and affected affinities that have been
> requested during system boot. Tools like tuned can configure them using
> the APIs Tobias described. IRQs that are requested / setup after boot,
> during runtime, are kind of "problematic" for us, as there is no API
> that informs about new IRQs. We would have to rescan /proc. But even if
> there were such an API: That would be too late. The IRQ might have
> fired already.
> 
> Once an affinity has been set (e.g. by tuned) this affinity is being
> restored when the IRQ comes back after a link up/down or bpf load. But:
> It might have happened that the situation on the system has changed.
> Even the default affinity could be different now. In case of the stmmac
> - and probably way more drivers - the default affinity is not taken into
> account anymore. The previous affinity is being restored
> unconditionally.
> 
> I tried to modify stmmac and let it evaluate the default affinity while
> doing the IRQ balancing dance. That turned out to be working at the end,
> but each line violated several coding/style/abstraction rules. There is
> no API at driver level to read the current default affinity - or I
> missed it. I could send that hack out as RFC if requested. Just let me
> know.
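
For reference, the "rescan /proc" approach mentioned in the quoted text
could look roughly like the sketch below. It only reads
/proc/irq/<N>/smp_affinity_list and compares two snapshots. The helper
names snapshot_irqs() and new_irqs() are made up for illustration, and
the sketch shares the limitation described above: by the time a new IRQ
shows up in a rescan, it may already have fired.

```python
import os


def snapshot_irqs(proc_irq="/proc/irq"):
    """Map IRQ number -> current affinity list string (e.g. "0-3").

    Hypothetical helper: it only walks the numeric directories below
    /proc/irq and skips files such as default_smp_affinity.
    """
    snap = {}
    for entry in os.listdir(proc_irq):
        if not entry.isdigit():
            continue
        path = os.path.join(proc_irq, entry, "smp_affinity_list")
        try:
            with open(path) as f:
                snap[int(entry)] = f.read().strip()
        except OSError:
            # The IRQ may have been freed between listdir() and open().
            continue
    return snap


def new_irqs(before, after):
    """Return IRQ numbers present in 'after' but not in 'before'."""
    return sorted(set(after) - set(before))
```

A polling tool would take one snapshot at startup, take another later,
and apply its policy to whatever new_irqs() reports.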

Several drivers tune the affinity based on what they think is best. The
usual approach is to start with the current CPU and move to the next CPU
with each queue. This is not unique to networking but also happens with
storage.
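
As an illustration only, the "start with the current CPU and increment
with each queue" pattern can be modelled like this. It is not taken
from any particular driver; spread_queues() and its parameters are
hypothetical names, and wrapping around at the last online CPU is an
assumption of this sketch.

```python
def spread_queues(start_cpu, num_queues, online_cpus):
    """Assign one CPU per queue, round-robin, beginning at start_cpu.

    Hypothetical model of per-queue IRQ spreading; real drivers work
    on cpumasks inside the kernel.
    """
    cpus = sorted(set(online_cpus))
    if not cpus:
        raise ValueError("no online CPUs")
    # Begin at the first online CPU >= start_cpu, else wrap to the lowest.
    start_idx = next((i for i, c in enumerate(cpus) if c >= start_cpu), 0)
    return [cpus[(start_idx + q) % len(cpus)] for q in range(num_queues)]
```

For example, spread_queues(2, 4, range(4)) places the four queues on
CPUs 2, 3, 0 and 1. Note that nothing here looks at the default
affinity or at isolated CPUs, which is the problem discussed in this
thread.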

But we do have the "managed API" already.

> Thinking more about this problem - and trying to abstract that in a
> generalized way - triggered some ideas about "IRQ namespaces", similar
> to what we have for CPUs/Memory/... in the cgroup world. Devices, or
> classes of devices could be moved into namespaces, instead of
> configuring them one by one. Thoughts welcome. The main challenge here
> is that we do not think about rt vs. non-rt. It's more about multiple RT
> applications running in parallel, well isolated from each other and the
> non-rt world.

The excluded "affinity" would be a good place to start. So if you have
16 CPUs but declare only two CPUs as housekeeping, it would make sense to
limit it to two interrupts if possible. Otherwise shuffle them among the
two available CPUs.
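
A rough sketch of that policy, purely to make the idea concrete:
plan_irq_placement() and the can_reduce flag are invented for this
example and do not correspond to an existing kernel interface.

```python
def plan_irq_placement(num_irqs, housekeeping_cpus, can_reduce=True):
    """Return the target CPU for each interrupt.

    Hypothetical policy: if the device can run with fewer interrupts
    (can_reduce), cap the count at the number of housekeeping CPUs.
    Otherwise keep all interrupts and distribute them round-robin over
    the housekeeping CPUs.
    """
    hk = sorted(set(housekeeping_cpus))
    if not hk:
        raise ValueError("no housekeeping CPUs")
    if can_reduce:
        num_irqs = min(num_irqs, len(hk))
    return [hk[i % len(hk)] for i in range(num_irqs)]
```

With 16 requested interrupts and housekeeping CPUs 0 and 1 this yields
two interrupts, one per CPU. With can_reduce=False and four interrupts
it yields CPUs 0, 1, 0, 1.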

> Florian
Sebastian


Thread overview: 25+ messages
2025-10-30 14:20 Control of IRQ Affinities from Userspace Preclik, Tobias
2025-11-03 15:53 ` Sebastian Andrzej Siewior
2025-11-03 17:12   ` Florian Bezdeka
2025-11-05 13:11     ` Preclik, Tobias
2025-11-05 13:18       ` Preclik, Tobias
2025-11-11 14:35         ` bigeasy
2025-11-11 14:34       ` bigeasy
2025-11-21 13:25         ` Preclik, Tobias
2025-11-24  9:59           ` bigeasy
2025-11-25 11:32             ` Florian Bezdeka
2025-11-25 11:50               ` bigeasy
2025-11-25 14:36                 ` Florian Bezdeka
2025-11-25 16:31                   ` Thomas Gleixner
2025-11-26  9:20                     ` Florian Bezdeka
2025-11-26 14:26                       ` Thomas Gleixner
2025-11-26 15:07                         ` Florian Bezdeka
2025-11-26 19:15                           ` Thomas Gleixner
2025-11-27 14:06                             ` Preclik, Tobias
2025-11-27 14:52                             ` Florian Bezdeka
2025-11-27 18:09                               ` Thomas Gleixner
2025-11-28  7:33                                 ` Florian Bezdeka
2025-11-26 15:45                       ` Frederic Weisbecker
2025-11-26 15:31                 ` Frederic Weisbecker
2025-11-26 15:24               ` Frederic Weisbecker
2025-11-11 13:58     ` Sebastian Andrzej Siewior [this message]
