public inbox for linux-kernel@vger.kernel.org
* [GIT pull] irq/urgent for v6.15-rc1
@ 2025-03-26 19:08 Thomas Gleixner
  2025-03-27  0:57 ` pr-tracker-bot
  2025-03-27  1:07 ` Sasha Levin
  0 siblings, 2 replies; 4+ messages in thread
From: Thomas Gleixner @ 2025-03-26 19:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, x86

Linus,

please pull the latest irq/urgent branch from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-urgent-2025-03-26

up to:  3ece3e8e5976: PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends

An urgent fix for the XEN-related PCI/MSI changes:

  XEN used a global variable to disable the masking of MSI interrupts as
  XEN handles that on the hypervisor side. This turned out to be a problem
  with VMD as the PCI devices behind a VMD bridge are not always handled
  by the hypervisor and then require masking by the guest.

  To solve this, the global variable was replaced by an interrupt domain
  specific flag, which is set by the generic XEN PCI/MSI domain, but not by
  VMD or any other domain in the system.

  So far, so good. But the implementation (and the reviewer) missed the
  fact that the domain flag cannot be accessed directly, because there
  are at least two situations where this fails. Legacy architectures do
  not provide interrupt domains at all. The new MSI parent domains are
  not required to have a domain info pointer. Both cases result in an
  unconditional NULL pointer dereference.

  The PCI/MSI code already has a function to query the MSI domain specific
  flag in a safe way, which handles all possible cases of PCI/MSI backends.

  So the fix is simply to replace the open-coded checks by invoking the
  safe helper to query the flag.
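
  The difference between the open-coded check and a guarded query can be
  sketched in plain userspace C. This is only an illustration under stated
  assumptions: the struct layouts, FLAG_NO_MASK and domain_supports() are
  simplified, hypothetical stand-ins and not the kernel's definitions. The
  real pci_msi_domain_supports() takes an additional mode argument and
  covers more cases than this sketch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins. These are NOT the kernel's types. */
#define FLAG_NO_MASK (UINT32_C(1) << 22)	/* bit position chosen for illustration */

struct domain_info { uint32_t flags; };
struct domain { const struct domain_info *host_data; };
struct device { const struct domain *msi_domain; };

/*
 * Open-coded pattern, as in the removed lines of the diff:
 *
 *	info = dev->msi_domain->host_data;
 *	if (info->flags & FLAG_NO_MASK) ...
 *
 * It faults as soon as msi_domain or host_data is NULL.
 */

/* Guarded query: every pointer is checked before it is dereferenced. */
static bool domain_supports(const struct device *dev, uint32_t flag)
{
	const struct domain *d = dev->msi_domain;

	if (!d)			/* legacy case: no interrupt domain at all */
		return false;
	if (!d->host_data)	/* domain without an info pointer */
		return false;
	return (d->host_data->flags & flag) == flag;
}

/* Exercise the cases named in the text plus the normal ones. */
static void domain_supports_selftest(void)
{
	const struct domain_info nomask = { .flags = FLAG_NO_MASK };
	const struct domain_info plain = { .flags = 0 };
	const struct domain with_flag = { .host_data = &nomask };
	const struct domain without_flag = { .host_data = &plain };
	const struct domain no_info = { .host_data = NULL };
	const struct device devs[] = {
		{ .msi_domain = NULL },		/* legacy architecture */
		{ .msi_domain = &no_info },	/* no info pointer */
		{ .msi_domain = &without_flag },
		{ .msi_domain = &with_flag },
	};

	assert(!domain_supports(&devs[0], FLAG_NO_MASK));
	assert(!domain_supports(&devs[1], FLAG_NO_MASK));
	assert(!domain_supports(&devs[2], FLAG_NO_MASK));
	assert(domain_supports(&devs[3], FLAG_NO_MASK));
}
```

  In this sketch a missing domain or a missing info pointer simply means
  "flag not set", so masking stays enabled in those cases.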

Note: This is hot off the press, but has been tested and validated. As it
      affects a lot of people, I fast-tracked it.

Thanks,

	tglx

------------------>
Thomas Gleixner (1):
      PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends


 drivers/pci/msi/msi.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index d74162880d83..7058d59e7c5f 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -285,8 +285,6 @@ static void pci_msi_set_enable(struct pci_dev *dev, int enable)
 static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
 			      struct irq_affinity_desc *masks)
 {
-	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
-	const struct msi_domain_info *info = d->host_data;
 	struct msi_desc desc;
 	u16 control;
 
@@ -297,7 +295,7 @@ static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
 	/* Lies, damned lies, and MSIs */
 	if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
 		control |= PCI_MSI_FLAGS_MASKBIT;
-	if (info->flags & MSI_FLAG_NO_MASK)
+	if (pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY))
 		control &= ~PCI_MSI_FLAGS_MASKBIT;
 
 	desc.nvec_used			= nvec;
@@ -604,20 +602,18 @@ static void __iomem *msix_map_region(struct pci_dev *dev,
  */
 void msix_prepare_msi_desc(struct pci_dev *dev, struct msi_desc *desc)
 {
-	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
-	const struct msi_domain_info *info = d->host_data;
-
 	desc->nvec_used				= 1;
 	desc->pci.msi_attrib.is_msix		= 1;
 	desc->pci.msi_attrib.is_64		= 1;
 	desc->pci.msi_attrib.default_irq	= dev->irq;
 	desc->pci.mask_base			= dev->msix_base;
-	desc->pci.msi_attrib.can_mask		= !(info->flags & MSI_FLAG_NO_MASK) &&
-						  !desc->pci.msi_attrib.is_virtual;
 
-	if (desc->pci.msi_attrib.can_mask) {
+
+	if (!pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY) &&
+	    !desc->pci.msi_attrib.is_virtual) {
 		void __iomem *addr = pci_msix_desc_addr(desc);
 
+		desc->pci.msi_attrib.can_mask = 1;
 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 	}
 }
@@ -715,8 +711,6 @@ static int msix_setup_interrupts(struct pci_dev *dev, struct msix_entry *entries
 static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 				int nvec, struct irq_affinity *affd)
 {
-	const struct irq_domain *d = dev_get_msi_domain(&dev->dev);
-	const struct msi_domain_info *info = d->host_data;
 	int ret, tsize;
 	u16 control;
 
@@ -747,7 +741,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 	/* Disable INTX */
 	pci_intx_for_msi(dev, 0);
 
-	if (!(info->flags & MSI_FLAG_NO_MASK)) {
+	if (!pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY)) {
 		/*
 		 * Ensure that all table entries are masked to prevent
 		 * stale entries from firing in a crash kernel.



* Re: [GIT pull] irq/urgent for v6.15-rc1
  2025-03-26 19:08 [GIT pull] irq/urgent for v6.15-rc1 Thomas Gleixner
@ 2025-03-27  0:57 ` pr-tracker-bot
  2025-03-27  1:07 ` Sasha Levin
  1 sibling, 0 replies; 4+ messages in thread
From: pr-tracker-bot @ 2025-03-27  0:57 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86

The pull request you sent on Wed, 26 Mar 2025 20:08:16 +0100 (CET):

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq-urgent-2025-03-26

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/87dc996d56b7871da02ea6047cd46bb879443974

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


* Re: [GIT pull] irq/urgent for v6.15-rc1
  2025-03-26 19:08 [GIT pull] irq/urgent for v6.15-rc1 Thomas Gleixner
  2025-03-27  0:57 ` pr-tracker-bot
@ 2025-03-27  1:07 ` Sasha Levin
  2025-03-27  7:43   ` Thomas Gleixner
  1 sibling, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2025-03-27  1:07 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Linus Torvalds, linux-kernel, x86

On Wed, Mar 26, 2025 at 08:08:16PM +0100, Thomas Gleixner wrote:
>Thomas Gleixner (1):
>      PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends

Hi Thomas,

I haven't bisected this, but I suspect that this commit is causing
boot-time panics that are observed on LKFT. Note the line numbers are
off by a bit.

Full logs of the run are available at:
https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-16083-gc13edfd29c29/testrun/27775255/suite/log-parser-test/test/bug-bug-kernel-null-pointer-dereference-address/details/

<1>[    1.540630] BUG: kernel NULL pointer dereference, address: 0000000000000002
<1>[    1.540630] #PF: supervisor read access in kernel mode
<1>[    1.540630] #PF: error_code(0x0000) - not-present page
<6>[    1.540630] PGD 0 P4D 0
<4>[    1.540630] Oops: Oops: 0000 [#1] SMP PTI
<4>[    1.540630] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0 #1 PREEMPT(voluntary)
<4>[    1.540630] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
<4>[ 1.540630] RIP: 0010:__pci_enable_msi_range (drivers/pci/msi/msi.c:300 drivers/pci/msi/msi.c:342 drivers/pci/msi/msi.c:412 drivers/pci/msi/msi.c:463)
<4>[ 1.540630] Code: ff ff ff e8 4e 18 fe ff f6 83 9f 06 00 00 10 0f b7 85 66 ff ff ff 74 0c 0d 00 01 00 00 66 89 85 66 ff ff ff 8b 8d 60 ff ff ff <41> f6 47 02 40 74 0c 25 ff fe 00 00 66 89 85 66 ff ff ff 89 8d 6c
All code
========
    0:	ff                   	(bad)
    1:	ff                   	(bad)
    2:	ff                   	(bad)
    3:	e8 4e 18 fe ff       	call   0xfffffffffffe1856
    8:	f6 83 9f 06 00 00 10 	testb  $0x10,0x69f(%rbx)
    f:	0f b7 85 66 ff ff ff 	movzwl -0x9a(%rbp),%eax
   16:	74 0c                	je     0x24
   18:	0d 00 01 00 00       	or     $0x100,%eax
   1d:	66 89 85 66 ff ff ff 	mov    %ax,-0x9a(%rbp)
   24:	8b 8d 60 ff ff ff    	mov    -0xa0(%rbp),%ecx
   2a:*	41 f6 47 02 40       	testb  $0x40,0x2(%r15)		<-- trapping instruction
   2f:	74 0c                	je     0x3d
   31:	25 ff fe 00 00       	and    $0xfeff,%eax
   36:	66 89 85 66 ff ff ff 	mov    %ax,-0x9a(%rbp)
   3d:	89                   	.byte 0x89
   3e:	8d                   	.byte 0x8d
   3f:	6c                   	insb   (%dx),%es:(%rdi)

Code starting with the faulting instruction
===========================================
    0:	41 f6 47 02 40       	testb  $0x40,0x2(%r15)
    5:	74 0c                	je     0x13
    7:	25 ff fe 00 00       	and    $0xfeff,%eax
    c:	66 89 85 66 ff ff ff 	mov    %ax,-0x9a(%rbp)
   13:	89                   	.byte 0x89
   14:	8d                   	.byte 0x8d
   15:	6c                   	insb   (%dx),%es:(%rdi)
<4>[    1.540630] RSP: 0000:ffffa0df00013748 EFLAGS: 00010246
<4>[    1.540630] RAX: 0000000000000080 RBX: ffff932e00981000 RCX: 0000000000000001
<4>[    1.540630] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffffffff85e6e74c
<4>[    1.540630] RBP: ffffa0df00013820 R08: 0000000000000002 R09: ffffa0df00013714
<4>[    1.540630] R10: 0000000000000001 R11: ffffffff84ef46c0 R12: ffff932e009810c0
<4>[    1.540630] R13: 0000000000000001 R14: ffff932e00981000 R15: 0000000000000000
<4>[    1.540630] FS:  0000000000000000(0000) GS:ffff932ef5f71000(0000) knlGS:0000000000000000
<4>[    1.540630] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[    1.540630] CR2: 0000000000000002 CR3: 0000000020a2e000 CR4: 00000000000006f0
<4>[    1.540630] Call Trace:
<4>[    1.540630]  <TASK>
<4>[ 1.540630] pci_alloc_irq_vectors_affinity (drivers/pci/msi/api.c:?)
<4>[ 1.540630] pci_alloc_irq_vectors (drivers/pci/msi/api.c:235)
<4>[ 1.540630] ahci_init_irq (drivers/ata/ahci.c:1720)
<4>[ 1.540630] ahci_init_one (drivers/ata/ahci.c:2004)
<4>[ 1.540630] pci_device_probe (drivers/pci/pci-driver.c:325 drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:451)
<4>[ 1.540630] really_probe (drivers/base/dd.c:?)
<4>[ 1.540630] __driver_probe_device (drivers/base/dd.c:?)
<4>[ 1.540630] driver_probe_device (drivers/base/dd.c:830)
<4>[ 1.540630] __driver_attach (drivers/base/dd.c:1217)
<4>[ 1.540630] bus_for_each_dev (drivers/base/bus.c:369)
<4>[ 1.540630] driver_attach (drivers/base/dd.c:1234)
<4>[ 1.540630] bus_add_driver (drivers/base/bus.c:678)
<4>[ 1.540630] driver_register (drivers/base/driver.c:250)
<4>[ 1.540630] __pci_register_driver (drivers/pci/pci-driver.c:1448)
<4>[ 1.540630] ahci_pci_driver_init (drivers/ata/ahci.c:2090)
<4>[ 1.540630] do_one_initcall (init/main.c:1257)
<4>[ 1.540630] do_initcall_level (init/main.c:1318)
<4>[ 1.540630] do_initcalls (init/main.c:1332)
<4>[ 1.540630] do_basic_setup (init/main.c:1355)
<4>[ 1.540630] kernel_init_freeable (init/main.c:1571)
<4>[ 1.540630] kernel_init (init/main.c:1459)
<4>[ 1.540630] ret_from_fork (arch/x86/kernel/process.c:159)
<4>[ 1.540630] ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
<4>[    1.540630]  </TASK>
<4>[    1.540630] Modules linked in:
<4>[    1.540630] CR2: 0000000000000002
<4>[    1.540630] ---[ end trace 0000000000000000 ]---
<4>[ 1.540630] RIP: 0010:__pci_enable_msi_range (drivers/pci/msi/msi.c:300 drivers/pci/msi/msi.c:342 drivers/pci/msi/msi.c:412 drivers/pci/msi/msi.c:463)
<4>[ 1.540630] Code: ff ff ff e8 4e 18 fe ff f6 83 9f 06 00 00 10 0f b7 85 66 ff ff ff 74 0c 0d 00 01 00 00 66 89 85 66 ff ff ff 8b 8d 60 ff ff ff <41> f6 47 02 40 74 0c 25 ff fe 00 00 66 89 85 66 ff ff ff 89 8d 6c
All code
========
    0:	ff                   	(bad)
    1:	ff                   	(bad)
    2:	ff                   	(bad)
    3:	e8 4e 18 fe ff       	call   0xfffffffffffe1856
    8:	f6 83 9f 06 00 00 10 	testb  $0x10,0x69f(%rbx)
    f:	0f b7 85 66 ff ff ff 	movzwl -0x9a(%rbp),%eax
   16:	74 0c                	je     0x24
   18:	0d 00 01 00 00       	or     $0x100,%eax
   1d:	66 89 85 66 ff ff ff 	mov    %ax,-0x9a(%rbp)
   24:	8b 8d 60 ff ff ff    	mov    -0xa0(%rbp),%ecx
   2a:*	41 f6 47 02 40       	testb  $0x40,0x2(%r15)		<-- trapping instruction
   2f:	74 0c                	je     0x3d
   31:	25 ff fe 00 00       	and    $0xfeff,%eax
   36:	66 89 85 66 ff ff ff 	mov    %ax,-0x9a(%rbp)
   3d:	89                   	.byte 0x89
   3e:	8d                   	.byte 0x8d
   3f:	6c                   	insb   (%dx),%es:(%rdi)

Code starting with the faulting instruction
===========================================
    0:	41 f6 47 02 40       	testb  $0x40,0x2(%r15)
    5:	74 0c                	je     0x13
    7:	25 ff fe 00 00       	and    $0xfeff,%eax
    c:	66 89 85 66 ff ff ff 	mov    %ax,-0x9a(%rbp)
   13:	89                   	.byte 0x89
   14:	8d                   	.byte 0x8d
   15:	6c                   	insb   (%dx),%es:(%rdi)
<4>[    1.540630] RSP: 0000:ffffa0df00013748 EFLAGS: 00010246
<4>[    1.540630] RAX: 0000000000000080 RBX: ffff932e00981000 RCX: 0000000000000001
<4>[    1.540630] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffffffff85e6e74c
<4>[    1.540630] RBP: ffffa0df00013820 R08: 0000000000000002 R09: ffffa0df00013714
<4>[    1.540630] R10: 0000000000000001 R11: ffffffff84ef46c0 R12: ffff932e009810c0
<4>[    1.540630] R13: 0000000000000001 R14: ffff932e00981000 R15: 0000000000000000
<4>[    1.540630] FS:  0000000000000000(0000) GS:ffff932ef5f71000(0000) knlGS:0000000000000000
<4>[    1.540630] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[    1.540630] CR2: 0000000000000002 CR3: 0000000020a2e000 CR4: 00000000000006f0
<6>[    1.540630] note: swapper/0[1] exited with irqs disabled
<0>[    1.574039] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
<0>[    1.574664] Kernel Offset: 0x2c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
<0>[    1.574664] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---


-- 
Thanks,
Sasha


* Re: [GIT pull] irq/urgent for v6.15-rc1
  2025-03-27  1:07 ` Sasha Levin
@ 2025-03-27  7:43   ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2025-03-27  7:43 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Linus Torvalds, linux-kernel, x86

Sasha!

On Wed, Mar 26 2025 at 21:07, Sasha Levin wrote:
> On Wed, Mar 26, 2025 at 08:08:16PM +0100, Thomas Gleixner wrote:
>>Thomas Gleixner (1):
>>      PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends
>
> I haven't bisected this, but I suspect that this commit is causing
> boot-time panics that are observed on LKFT. Note the line numbers are
> off by a bit.

I'm not sure which commit you are referring to, but the one which causes
this type of failure is:

  c3164d2e0d18 ("PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag")

which is fixed by

  3ece3e8e5976 ("PCI/MSI: Handle the NOMASK flag correctly for all PCI/MSI backends")

i.e. this pull request.
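
The register dump in the quoted report is consistent with that diagnosis:
the trapping instruction is testb $0x40,0x2(%r15) with R15 = 0, and CR2 = 2.
A byte-wide test of a single bit in a little-endian 32-bit flags word reads
the byte at offset bit / 8 with the mask 1 << (bit % 8). The sketch below
works through that arithmetic. That the tested bit is bit 22 of a flags
word at offset 0 of the info structure is an assumption inferred from the
disassembly, not something stated in this thread, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct byte_test {
	unsigned int offset;	/* byte offset from the start of the word */
	uint8_t mask;		/* immediate operand of the byte-wide test */
};

/* Little-endian: bit N of a 32-bit word lives in byte N / 8, at bit N % 8. */
static struct byte_test bit_to_byte_test(unsigned int bit)
{
	struct byte_test t = {
		.offset = bit / 8,
		.mask = (uint8_t)(1u << (bit % 8)),
	};
	return t;
}

/* Address touched when the structure pointer holds the value `base`. */
static uintptr_t fault_address(uintptr_t base, unsigned int bit)
{
	return base + bit_to_byte_test(bit).offset;
}

/* Compare the arithmetic against the values in the quoted oops. */
static void decode_selftest(void)
{
	struct byte_test t = bit_to_byte_test(22);	/* assumed flag bit */

	assert(t.offset == 2);			/* matches 0x2(%r15) */
	assert(t.mask == 0x40);			/* matches $0x40 */
	assert(fault_address(0, 22) == 2);	/* matches CR2 with R15 = 0 */
}
```

Under that assumption, a NULL info pointer yields exactly the reported
fault address of 2, which is what the open-coded flag check would produce.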

> Full logs of the run are available at:
> https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-16083-gc13edfd29c29/testrun/27775255/suite/log-parser-test/test/bug-bug-kernel-null-pointer-dereference-address/details/

TBH, and I know this is not your fault, this report page is a
masterpiece of bad engineering. It contains tons of useless information,
but fails to provide the most important basic data:

  1) There is no date of the failure

     Am I supposed to reverse engineer this out of this horrible user
     interface?

     I haven't even found a way to figure it out within a reasonable
     time. I just gave up.

  2) There is no useful reference to the actually used source tree and
     commit

     v6.13-rc7-16083-gc13edfd29c29 is _NOT_ a helpful reference as it
     suggests that this is a 6.13-rc7 based tree, but the log file says:
     Linux version 6.14.0

     sashal-linus-next at least gives a hint where to rummage, and 100
     clicks later I'm able to see what this is actually testing.

Seriously?

Thanks,

        tglx


