* [GIT pull] irq/urgent for v6.15-rc1
@ 2025-03-26 19:08 Thomas Gleixner
2025-03-27 0:57 ` pr-tracker-bot
2025-03-27 1:07 ` Sasha Levin
0 siblings, 2 replies; 4+ messages in thread
From: Thomas Gleixner @ 2025-03-26 19:08 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, x86
Linus,
please pull the latest irq/urgent branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-urgent-2025-03-26
up to: 3ece3e8e5976: PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends
An urgent fix for the XEN related PCI/MSI changes:
XEN used a global variable to disable the masking of MSI interrupts as
XEN handles that on the hypervisor side. This turned out to be a problem
with VMD as the PCI devices behind a VMD bridge are not always handled
by the hypervisor and then require masking by the guest.
To solve this, the global variable was replaced by an interrupt domain
specific flag, which is set by the generic XEN PCI/MSI domain, but not by
VMD or any other domain in the system.
So far, so good. But the implementation (and the reviewer) missed the
fact that the domain flag cannot be accessed directly, because there are
at least two situations where this fails. Legacy architectures do not
provide interrupt domains at all. The new MSI parent domains are not
required to have a domain info pointer. Both cases result in an
unconditional NULL pointer dereference.
The PCI/MSI code already has a function to query the MSI domain specific
flag in a safe way, which handles all possible cases of PCI/MSI backends.
So the fix is simply to replace the open coded checks by invoking the
safe helper to query the flag.
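The difference between the open coded check and a safe query can be sketched in a small userspace model. This is only an illustration under stated assumptions: every `model_*` and `MODEL_*` name below is hypothetical, the structures are heavily simplified, and the code is not the kernel's implementation of pci_msi_domain_supports(). It only mirrors the two failure cases described above (no domain at all, and a parent domain without an info pointer).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
#define MODEL_FLAG_NO_MASK (1u << 0)

enum model_mode { MODEL_ALLOW_LEGACY, MODEL_DENY_LEGACY };

struct model_info {
	unsigned int flags;
};

struct model_domain {
	bool is_msi_parent;            /* parent domains carry their own flags */
	unsigned int parent_flags;     /* consulted only when is_msi_parent */
	const struct model_info *info; /* may be NULL for parent domains */
};

/*
 * Safe query: never dereferences a pointer that may legitimately be NULL.
 * The open coded variant would read d->info->flags unconditionally and
 * fault in the first two cases handled here.
 */
static bool model_domain_supports(const struct model_domain *d,
				  unsigned int mask, enum model_mode mode)
{
	unsigned int supported;

	/* Legacy architecture: no interrupt domain at all. */
	if (!d)
		return mode == MODEL_ALLOW_LEGACY;

	/* MSI parent domain: an info pointer is not required. */
	if (d->is_msi_parent)
		supported = d->parent_flags;
	else if (d->info)
		supported = d->info->flags;
	else
		return false;

	return (supported & mask) == mask;
}
```

With MODEL_DENY_LEGACY, a missing domain reports the flag as not supported, so in this model masking stays enabled unless a domain explicitly sets the flag.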
Note: This is hot off the press, but has been tested and validated. As it
affects a lot of people, I fast-tracked it.
Thanks,
tglx
------------------>
Thomas Gleixner (1):
PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends
drivers/pci/msi/msi.c | 18 ++++++------------
1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index d74162880d83..7058d59e7c5f 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -285,8 +285,6 @@ static void pci_msi_set_enable(struct pci_dev *dev, int enable)
 static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
 			      struct irq_affinity_desc *masks)
 {
-	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
-	const struct msi_domain_info *info = d->host_data;
 	struct msi_desc desc;
 	u16 control;
 
@@ -297,7 +295,7 @@ static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
 	/* Lies, damned lies, and MSIs */
 	if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
 		control |= PCI_MSI_FLAGS_MASKBIT;
-	if (info->flags & MSI_FLAG_NO_MASK)
+	if (pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY))
 		control &= ~PCI_MSI_FLAGS_MASKBIT;
 
 	desc.nvec_used = nvec;
@@ -604,20 +602,18 @@ static void __iomem *msix_map_region(struct pci_dev *dev,
  */
 void msix_prepare_msi_desc(struct pci_dev *dev, struct msi_desc *desc)
 {
-	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
-	const struct msi_domain_info *info = d->host_data;
-
 	desc->nvec_used = 1;
 	desc->pci.msi_attrib.is_msix = 1;
 	desc->pci.msi_attrib.is_64 = 1;
 	desc->pci.msi_attrib.default_irq = dev->irq;
 	desc->pci.mask_base = dev->msix_base;
-	desc->pci.msi_attrib.can_mask = !(info->flags & MSI_FLAG_NO_MASK) &&
-					!desc->pci.msi_attrib.is_virtual;
 
-	if (desc->pci.msi_attrib.can_mask) {
+
+	if (!pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY) &&
+	    !desc->pci.msi_attrib.is_virtual) {
 		void __iomem *addr = pci_msix_desc_addr(desc);
 
+		desc->pci.msi_attrib.can_mask = 1;
 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 	}
 }
 
@@ -715,8 +711,6 @@ static int msix_setup_interrupts(struct pci_dev *dev, struct msix_entry *entries
 static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 				int nvec, struct irq_affinity *affd)
 {
-	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
-	const struct msi_domain_info *info = d->host_data;
 	int ret, tsize;
 	u16 control;
 
@@ -747,7 +741,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	/* Disable INTX */
 	pci_intx_for_msi(dev, 0);
 
-	if (!(info->flags & MSI_FLAG_NO_MASK)) {
+	if (!pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY)) {
 		/*
 		 * Ensure that all table entries are masked to prevent
 		 * stale entries from firing in a crash kernel.
* Re: [GIT pull] irq/urgent for v6.15-rc1
2025-03-26 19:08 [GIT pull] irq/urgent for v6.15-rc1 Thomas Gleixner
@ 2025-03-27 0:57 ` pr-tracker-bot
2025-03-27 1:07 ` Sasha Levin
1 sibling, 0 replies; 4+ messages in thread
From: pr-tracker-bot @ 2025-03-27 0:57 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
The pull request you sent on Wed, 26 Mar 2025 20:08:16 +0100 (CET):
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-urgent-2025-03-26
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/87dc996d56b7871da02ea6047cd46bb879443974
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT pull] irq/urgent for v6.15-rc1
2025-03-26 19:08 [GIT pull] irq/urgent for v6.15-rc1 Thomas Gleixner
2025-03-27 0:57 ` pr-tracker-bot
@ 2025-03-27 1:07 ` Sasha Levin
2025-03-27 7:43 ` Thomas Gleixner
1 sibling, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2025-03-27 1:07 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86
On Wed, Mar 26, 2025 at 08:08:16PM +0100, Thomas Gleixner wrote:
>Thomas Gleixner (1):
> PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends
Hi Thomas,
I haven't bisected this, but I suspect that this commit is causing
boot-time panics that are observed on LKFT. Note the line numbers are
off by a bit.
Full logs of the run are available at:
https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-16083-gc13edfd29c29/testrun/27775255/suite/log-parser-test/test/bug-bug-kernel-null-pointer-dereference-address/details/
<1>[ 1.540630] BUG: kernel NULL pointer dereference, address: 0000000000000002
<1>[ 1.540630] #PF: supervisor read access in kernel mode
<1>[ 1.540630] #PF: error_code(0x0000) - not-present page
<6>[ 1.540630] PGD 0 P4D 0
<4>[ 1.540630] Oops: Oops: 0000 [#1] SMP PTI
<4>[ 1.540630] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0 #1 PREEMPT(voluntary)
<4>[ 1.540630] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
<4>[ 1.540630] RIP: 0010:__pci_enable_msi_range (drivers/pci/msi/msi.c:300 drivers/pci/msi/msi.c:342 drivers/pci/msi/msi.c:412 drivers/pci/msi/msi.c:463)
<4>[ 1.540630] Code: ff ff ff e8 4e 18 fe ff f6 83 9f 06 00 00 10 0f b7 85 66 ff ff ff 74 0c 0d 00 01 00 00 66 89 85 66 ff ff ff 8b 8d 60 ff ff ff <41> f6 47 02 40 74 0c 25 ff fe 00 00 66 89 85 66 ff ff ff 89 8d 6c
All code
========
0: ff (bad)
1: ff (bad)
2: ff (bad)
3: e8 4e 18 fe ff call 0xfffffffffffe1856
8: f6 83 9f 06 00 00 10 testb $0x10,0x69f(%rbx)
f: 0f b7 85 66 ff ff ff movzwl -0x9a(%rbp),%eax
16: 74 0c je 0x24
18: 0d 00 01 00 00 or $0x100,%eax
1d: 66 89 85 66 ff ff ff mov %ax,-0x9a(%rbp)
24: 8b 8d 60 ff ff ff mov -0xa0(%rbp),%ecx
2a:* 41 f6 47 02 40 testb $0x40,0x2(%r15) <-- trapping instruction
2f: 74 0c je 0x3d
31: 25 ff fe 00 00 and $0xfeff,%eax
36: 66 89 85 66 ff ff ff mov %ax,-0x9a(%rbp)
3d: 89 .byte 0x89
3e: 8d .byte 0x8d
3f: 6c insb (%dx),%es:(%rdi)
Code starting with the faulting instruction
===========================================
0: 41 f6 47 02 40 testb $0x40,0x2(%r15)
5: 74 0c je 0x13
7: 25 ff fe 00 00 and $0xfeff,%eax
c: 66 89 85 66 ff ff ff mov %ax,-0x9a(%rbp)
13: 89 .byte 0x89
14: 8d .byte 0x8d
15: 6c insb (%dx),%es:(%rdi)
<4>[ 1.540630] RSP: 0000:ffffa0df00013748 EFLAGS: 00010246
<4>[ 1.540630] RAX: 0000000000000080 RBX: ffff932e00981000 RCX: 0000000000000001
<4>[ 1.540630] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffffffff85e6e74c
<4>[ 1.540630] RBP: ffffa0df00013820 R08: 0000000000000002 R09: ffffa0df00013714
<4>[ 1.540630] R10: 0000000000000001 R11: ffffffff84ef46c0 R12: ffff932e009810c0
<4>[ 1.540630] R13: 0000000000000001 R14: ffff932e00981000 R15: 0000000000000000
<4>[ 1.540630] FS: 0000000000000000(0000) GS:ffff932ef5f71000(0000) knlGS:0000000000000000
<4>[ 1.540630] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 1.540630] CR2: 0000000000000002 CR3: 0000000020a2e000 CR4: 00000000000006f0
<4>[ 1.540630] Call Trace:
<4>[ 1.540630] <TASK>
<4>[ 1.540630] pci_alloc_irq_vectors_affinity (drivers/pci/msi/api.c:?)
<4>[ 1.540630] pci_alloc_irq_vectors (drivers/pci/msi/api.c:235)
<4>[ 1.540630] ahci_init_irq (drivers/ata/ahci.c:1720)
<4>[ 1.540630] ahci_init_one (drivers/ata/ahci.c:2004)
<4>[ 1.540630] pci_device_probe (drivers/pci/pci-driver.c:325 drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:451)
<4>[ 1.540630] really_probe (drivers/base/dd.c:?)
<4>[ 1.540630] __driver_probe_device (drivers/base/dd.c:?)
<4>[ 1.540630] driver_probe_device (drivers/base/dd.c:830)
<4>[ 1.540630] __driver_attach (drivers/base/dd.c:1217)
<4>[ 1.540630] bus_for_each_dev (drivers/base/bus.c:369)
<4>[ 1.540630] driver_attach (drivers/base/dd.c:1234)
<4>[ 1.540630] bus_add_driver (drivers/base/bus.c:678)
<4>[ 1.540630] driver_register (drivers/base/driver.c:250)
<4>[ 1.540630] __pci_register_driver (drivers/pci/pci-driver.c:1448)
<4>[ 1.540630] ahci_pci_driver_init (drivers/ata/ahci.c:2090)
<4>[ 1.540630] do_one_initcall (init/main.c:1257)
<4>[ 1.540630] do_initcall_level (init/main.c:1318)
<4>[ 1.540630] do_initcalls (init/main.c:1332)
<4>[ 1.540630] do_basic_setup (init/main.c:1355)
<4>[ 1.540630] kernel_init_freeable (init/main.c:1571)
<4>[ 1.540630] kernel_init (init/main.c:1459)
<4>[ 1.540630] ret_from_fork (arch/x86/kernel/process.c:159)
<4>[ 1.540630] ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
<4>[ 1.540630] </TASK>
<4>[ 1.540630] Modules linked in:
<4>[ 1.540630] CR2: 0000000000000002
<4>[ 1.540630] ---[ end trace 0000000000000000 ]---
<4>[ 1.540630] RIP: 0010:__pci_enable_msi_range (drivers/pci/msi/msi.c:300 drivers/pci/msi/msi.c:342 drivers/pci/msi/msi.c:412 drivers/pci/msi/msi.c:463)
<4>[ 1.540630] Code: ff ff ff e8 4e 18 fe ff f6 83 9f 06 00 00 10 0f b7 85 66 ff ff ff 74 0c 0d 00 01 00 00 66 89 85 66 ff ff ff 8b 8d 60 ff ff ff <41> f6 47 02 40 74 0c 25 ff fe 00 00 66 89 85 66 ff ff ff 89 8d 6c
All code
========
0: ff (bad)
1: ff (bad)
2: ff (bad)
3: e8 4e 18 fe ff call 0xfffffffffffe1856
8: f6 83 9f 06 00 00 10 testb $0x10,0x69f(%rbx)
f: 0f b7 85 66 ff ff ff movzwl -0x9a(%rbp),%eax
16: 74 0c je 0x24
18: 0d 00 01 00 00 or $0x100,%eax
1d: 66 89 85 66 ff ff ff mov %ax,-0x9a(%rbp)
24: 8b 8d 60 ff ff ff mov -0xa0(%rbp),%ecx
2a:* 41 f6 47 02 40 testb $0x40,0x2(%r15) <-- trapping instruction
2f: 74 0c je 0x3d
31: 25 ff fe 00 00 and $0xfeff,%eax
36: 66 89 85 66 ff ff ff mov %ax,-0x9a(%rbp)
3d: 89 .byte 0x89
3e: 8d .byte 0x8d
3f: 6c insb (%dx),%es:(%rdi)
Code starting with the faulting instruction
===========================================
0: 41 f6 47 02 40 testb $0x40,0x2(%r15)
5: 74 0c je 0x13
7: 25 ff fe 00 00 and $0xfeff,%eax
c: 66 89 85 66 ff ff ff mov %ax,-0x9a(%rbp)
13: 89 .byte 0x89
14: 8d .byte 0x8d
15: 6c insb (%dx),%es:(%rdi)
<4>[ 1.540630] RSP: 0000:ffffa0df00013748 EFLAGS: 00010246
<4>[ 1.540630] RAX: 0000000000000080 RBX: ffff932e00981000 RCX: 0000000000000001
<4>[ 1.540630] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffffffff85e6e74c
<4>[ 1.540630] RBP: ffffa0df00013820 R08: 0000000000000002 R09: ffffa0df00013714
<4>[ 1.540630] R10: 0000000000000001 R11: ffffffff84ef46c0 R12: ffff932e009810c0
<4>[ 1.540630] R13: 0000000000000001 R14: ffff932e00981000 R15: 0000000000000000
<4>[ 1.540630] FS: 0000000000000000(0000) GS:ffff932ef5f71000(0000) knlGS:0000000000000000
<4>[ 1.540630] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 1.540630] CR2: 0000000000000002 CR3: 0000000020a2e000 CR4: 00000000000006f0
<6>[ 1.540630] note: swapper/0[1] exited with irqs disabled
<0>[ 1.574039] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
<0>[ 1.574664] Kernel Offset: 0x2c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
<0>[ 1.574664] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---
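A hedged reading of the trapping instruction, using only values from the dump above: `testb $0x40,0x2(%r15)` with R15 = 0 gives an effective address of 0x2, which matches CR2. The sketch below redoes that arithmetic. The helper names are hypothetical, and the bit index only applies under the assumption that the tested byte belongs to a little-endian flags word starting at offset 0 of the structure, which the dump itself does not show.

```c
#include <stdint.h>

/* Effective address of a base + displacement memory operand. */
static uint64_t effective_address(uint64_t base, int32_t disp)
{
	return base + (uint64_t)(int64_t)disp;
}

/*
 * Bit index, within a little-endian word starting at offset 0, of the
 * lowest bit set in a byte-wide mask applied at byte_offset.
 * The mask must be nonzero.
 */
static unsigned int flag_bit_index(unsigned int byte_offset, uint8_t mask)
{
	unsigned int bit = 0;

	while (!(mask & 1u)) {
		mask >>= 1;
		bit++;
	}
	return byte_offset * 8 + bit;
}
```

Here `effective_address(0, 0x2)` evaluates to 0x2 and `flag_bit_index(2, 0x40)` evaluates to 22, so under the stated assumption the instruction tests bit 22 of a word reached through a NULL base pointer.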
--
Thanks,
Sasha
* Re: [GIT pull] irq/urgent for v6.15-rc1
2025-03-27 1:07 ` Sasha Levin
@ 2025-03-27 7:43 ` Thomas Gleixner
0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2025-03-27 7:43 UTC (permalink / raw)
To: Sasha Levin; +Cc: Linus Torvalds, linux-kernel, x86
Sasha!
On Wed, Mar 26 2025 at 21:07, Sasha Levin wrote:
> On Wed, Mar 26, 2025 at 08:08:16PM +0100, Thomas Gleixner wrote:
>>Thomas Gleixner (1):
>> PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends
>
> I haven't bisected this, but I suspect that this commit is causing
> boot-time panics that are observed on LKFT. Note the line numbers are
> off by a bit.
I'm not sure which commit you are referring to, but the one which causes
this type of failure is:
c3164d2e0d18 ("PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag")
which is fixed by
3ece3e8e5976 ("PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends")
i.e. this pull request.
> Full logs of the run are available at:
> https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-16083-gc13edfd29c29/testrun/27775255/suite/log-parser-test/test/bug-bug-kernel-null-pointer-dereference-address/details/
TBH, and I know this is not your fault, this report page is a
masterpiece of bad engineering. It contains tons of useless information,
but fails to provide the most important basic data:
1) There is no date of the failure
Am I supposed to reverse engineer this out of this horrible user
interface?
I haven't even found a way to figure it out within a reasonable
time. I just gave up.
2) There is no useful reference to the actually used source tree and
commit
v6.13-rc7-16083-gc13edfd29c29 is _NOT_ a helpful reference as it
suggests that this is a 6.13-rc7 based tree, but the log file says:
Linux version 6.14.0
sashal-linus-next at least gives a hint where to rummage, and 100
clicks later I'm able to see what this is actually testing.
Seriously?
Thanks,
tglx