From: Thomas Gleixner <tglx@linutronix.de>
To: Luigi Rizzo <lrizzo@google.com>, Marc Zyngier <maz@kernel.org>,
Luigi Rizzo <rizzo.unipi@gmail.com>,
Paolo Abeni <pabeni@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Sean Christopherson <seanjc@google.com>,
Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
Bjorn Helgaas <bhelgaas@google.com>,
Willem de Bruijn <willemb@google.com>,
Luigi Rizzo <lrizzo@google.com>
Subject: Re: [PATCH v2 5/8] x86/irq: soft_moderation: add support for posted_msi (intel)
Date: Mon, 17 Nov 2025 22:14:56 +0100 [thread overview]
Message-ID: <87h5us744v.ffs@tglx> (raw)
In-Reply-To: <20251116182839.939139-6-lrizzo@google.com>
On Sun, Nov 16 2025 at 18:28, Luigi Rizzo wrote:
> On recent Intel CPUs, kernels compiled with CONFIG_X86_POSTED_MSI=y,
> and the boot option "intremap=posted_msi", all MSI interrupts
> that hit a CPU issue a single POSTED_MSI interrupt processed by
> sysvec_posted_msi_notification() instead of having separate interrupts.
>
> This change adds soft moderation hooks to the above handler.
'This change' is not any better than 'This patch'.
> Soft moderation on posted_msi does not require per-source enable,
> irq_moderation.delay_us > 0 suffices.
>
> To test it, run a kernel with the above options and enable moderation by
> setting delay_us > 0. The column "from_msi" in /proc/irq/soft_moderation
> will show a non-zero value.
Impressive. But it would be more impressive and useful to actualy have
an explanation why this is required and what the benefits are.
> CFLAGS_head32.o := -fno-stack-protector
> CFLAGS_head64.o := -fno-stack-protector
> -CFLAGS_irq.o := -I $(src)/../include/asm/trace
> +CFLAGS_irq.o := -I $(src)/../include/asm/trace -I $(srctree)/kernel/irq
Not going to happen. Architecture code has no business to tap into
generic core infrastructure.
> --- a/kernel/irq/irq_moderation.c
> +++ b/kernel/irq/irq_moderation.c
> @@ -11,6 +11,13 @@
> #include "internals.h"
> #include "irq_moderation.h"
>
> +#ifdef CONFIG_X86
Architecture specific muck has absolutely no place in generic code.
> +#include <asm/apic.h>
> +#include <asm/irq_remapping.h>
> +#else
> +static inline bool posted_msi_supported(void) { return false; }
> +#endif
> +
> /*
> * Platform-wide software interrupt moderation.
> *
> @@ -32,6 +39,10 @@
> * moderation on that CPU/irq. If so, calls disable_irq_nosync() and starts
> * an hrtimer with appropriate delay.
> *
> + * - Intel only: using "intremap=posted_msi", all the above is done in
> + * sysvec_posted_msi_notification(). In this case all host device interrupts
> + * are subject to moderation.
> + *
> * - the timer callback calls enable_irq() for all disabled interrupts on that
> * CPU. That in turn will generate interrupts if there are pending events.
> *
> @@ -230,6 +241,17 @@ static enum hrtimer_restart timer_cb(struct hrtimer *timer)
>
> ms->rounds_left--;
>
> +#ifdef CONFIG_X86_POSTED_MSI
> + if (ms->kick_posted_msi) {
> + if (ms->rounds_left == 0)
> + ms->kick_posted_msi = false;
> + /* Next call will be from timer, count it conditionally. */
> + ms->dont_count = !irq_mod_info.count_timer_calls;
TBH, this is one of the worst hacks I've seen in a long time. It's
admittedly creative, but that's unmaintainable and the worst precedent
to open the flood gates for random architecture hacks in generic core
code.
If you want this to work for that posted MSI muck, then this needs
proper integration of those posted vectors into the interrupt core and a
sane mechanism to do the accounting.
> @@ -476,6 +500,18 @@ static ssize_t mode_write(struct file *f, const char __user *buf, size_t count,
> ret = kstrtobool(cmd, &enable);
> if (!ret)
> ret = set_moderation_mode(desc, enable);
> + if (ret) {
> + /* extra helpers for prodkernel */
> + if (cmd[count - 1] == '\n')
> + cmd[count - 1] = '\0';
> + ret = 0;
> + if (!strcmp(cmd, "managed"))
> + irqd_set(&desc->irq_data, IRQD_AFFINITY_MANAGED);
> + else if (!strcmp(cmd, "unmanaged"))
> + irqd_clear(&desc->irq_data, IRQD_AFFINITY_MANAGED);
What the heck? Can you spare me the exposure to random google internal
garbage? My eyes are bleeding already enough.
Thanks,
tglx
next prev parent reply other threads:[~2025-11-17 21:14 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-16 18:28 [PATCH v2 0/8] platform wide software interrupt moderation Luigi Rizzo
2025-11-16 18:28 ` [PATCH v2 1/8] genirq: platform wide interrupt moderation: Documentation, Kconfig, irq_desc Luigi Rizzo
2025-11-16 18:28 ` [PATCH v2 2/8] genirq: soft_moderation: add base files, procfs Luigi Rizzo
2025-11-17 11:12 ` kernel test robot
2025-11-17 16:01 ` Thomas Gleixner
2025-11-17 16:16 ` Luigi Rizzo
2025-11-17 19:35 ` Thomas Gleixner
2025-11-17 21:56 ` kernel test robot
2025-11-16 18:28 ` [PATCH v2 3/8] genirq: soft_moderation: implement fixed moderation Luigi Rizzo
2025-11-17 19:30 ` Thomas Gleixner
2025-11-17 23:16 ` Thomas Gleixner
2025-11-17 23:59 ` Luigi Rizzo
2025-11-18 8:34 ` Thomas Gleixner
2025-11-18 10:09 ` Luigi Rizzo
2025-11-18 16:31 ` Thomas Gleixner
2025-11-18 18:25 ` Luigi Rizzo
2025-11-18 23:06 ` Luigi Rizzo
2025-11-19 14:43 ` Thomas Gleixner
2025-11-21 10:58 ` Luigi Rizzo
2025-11-21 14:33 ` Luigi Rizzo
2025-11-22 14:08 ` Thomas Gleixner
2025-11-16 18:28 ` [PATCH v2 4/8] genirq: soft_moderation: implement adaptive moderation Luigi Rizzo
2025-11-17 20:51 ` Thomas Gleixner
2025-11-17 21:34 ` Luigi Rizzo
2025-11-17 23:05 ` Thomas Gleixner
2025-11-18 9:00 ` Thomas Gleixner
2025-11-16 18:28 ` [PATCH v2 5/8] x86/irq: soft_moderation: add support for posted_msi (intel) Luigi Rizzo
2025-11-17 21:14 ` Thomas Gleixner [this message]
2025-11-17 21:36 ` kernel test robot
2025-11-16 18:28 ` [PATCH v2 6/8] genirq: soft_moderation: helpers for per-driver defaults Luigi Rizzo
2025-11-16 18:28 ` [PATCH v2 7/8] nvme-pci: add module parameter for default moderation mode Luigi Rizzo
2025-11-16 18:28 ` [PATCH v2 8/8] vfio-pci: " Luigi Rizzo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h5us744v.ffs@tglx \
--to=tglx@linutronix.de \
--cc=akpm@linux-foundation.org \
--cc=bhelgaas@google.com \
--cc=jacob.jun.pan@linux.intel.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lrizzo@google.com \
--cc=maz@kernel.org \
--cc=pabeni@redhat.com \
--cc=rizzo.unipi@gmail.com \
--cc=seanjc@google.com \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).