Linux PCI subsystem development
From: Bjorn Helgaas <helgaas@kernel.org>
To: Akshay Jindal <akshayaj.lkd@gmail.com>
Cc: bhelgaas@google.com, mani@kernel.org,
	manivannan.sadhasivam@linaro.org, kwilczynski@kernel.org,
	mahesh@linux.ibm.com, oohall@gmail.com,
	ilpo.jarvinen@linux.intel.com, Jonathan.Cameron@huawei.com,
	sathyanarayanan.kuppuswamy@linux.intel.com, lukas@wunner.de,
	shuah@kernel.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] PCI/AER: Add error message when AER_MAX_MULTI_ERR_DEVICES limit is hit during AER handling
Date: Thu, 26 Jun 2025 15:35:55 -0500
Message-ID: <20250626203555.GA1637877@bhelgaas>
In-Reply-To: <20250619185041.73240-1-akshayaj.lkd@gmail.com>

On Fri, Jun 20, 2025 at 12:20:30AM +0530, Akshay Jindal wrote:
> When a PCIe error is detected, the root port receives the error message
> and the threaded IRQ handler, aer_isr, traverses the hierarchy downward
> from the root port. It populates the e_info->dev[] array with the PCIe
> devices that have recorded error status, so that appropriate error
> handling and recovery can be performed.
> 
> The e_info->dev[] array is limited in size by AER_MAX_MULTI_ERR_DEVICES,
> which is currently defined as 5. If more than five devices report errors
> in the same event, the array silently truncates the list, and those
> extra devices are not included in the recovery flow.
> 
> Emit an error message when this limit is reached, fulfilling a TODO
> comment in drivers/pci/pcie/aer.c.
> /* TODO: Should print error message here? */
> 
> Signed-off-by: Akshay Jindal <akshayaj.lkd@gmail.com>

Applied to pci/aer for v6.17, thanks!

> ---
> 
> Changes since v1:
> - Reworded commit message in imperative mood (per Shuah’s feedback)
> - Mentioned and quoted related TODO in the message
> - Updated recipient list
> 
> Testing:
> ========
> Verified log in dmesg on QEMU.
> 
> 1. The following command created the test environment: a pcie-root-port
> and a virtio-net-pci device on a Q35 machine model.
> ./qemu-system-x86_64 \
> 	-M q35,accel=kvm \
> 	-m 2G -cpu host -nographic \
> 	-serial mon:stdio \
> 	-kernel /home/akshayaj/pci/arch/x86/boot/bzImage \
> 	-initrd /home/akshayaj/Embedded_System_Using_QEMU/rootfs/rootfs.cpio.gz \
> 	-append "console=ttyS0 root=/ pci=pcie_scan_all" \
> 	-device pcie-root-port,id=rp0,chassis=1,slot=1 \
> 	-device virtio-net-pci,bus=rp0
> 
> ~ # mylspci -t
> -[0000:00]-+-00.0
>            +-01.0
>            +-02.0
>            +-03.0-[01]----00.0
>            +-1f.0
>            +-1f.2
>            \-1f.3
> 00:03.0--> pcie-root-port
> 
> 2. Kernel bzImage compiled with the following changes:
> 	2.1 CONFIG_PCIEAER=y in config
> 	2.2 AER_MAX_MULTI_ERR_DEVICES set to 0
> 	Since there is no pcie-testdev in QEMU, a 5-level hierarchy of PCIe
> 	devices could not be created in QEMU, so the error scenario was
> 	simulated by changing the limit to 0.
> 	2.3 Log added at the required place in aer.c.
> 
> 3. Both correctable and uncorrectable errors were injected on the
> pcie-root-port via the HMP command pcie_aer_inject_error in QEMU.
> The HMP commands used were:
> 	3.1 pcie_aer_inject_error -c rp0 0x1
> 	3.2 pcie_aer_inject_error -c rp0 0x40
> 	3.3 pcie_aer_inject_error rp0 0x10
> 
> Resulting dmesg:
> ================
> [    0.380534] pcieport 0000:00:03.0: AER: enabled with IRQ 24
> [   55.729530] pcieport 0000:00:03.0: AER: Exceeded max allowed (0) addition of PCIe devices for AER handling
> [  225.484456] pcieport 0000:00:03.0: AER: Exceeded max allowed (0) addition of PCIe devices for AER handling
> [  356.976253] pcieport 0000:00:03.0: AER: Exceeded max allowed (0) addition of PCIe devices for AER handling
> 
>  drivers/pci/pcie/aer.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 70ac66188367..3995a1db5699 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1039,7 +1039,8 @@ static int find_device_iter(struct pci_dev *dev, void *data)
>  		/* List this device */
>  		if (add_error_device(e_info, dev)) {
>  			/* We cannot handle more... Stop iteration */
> -			/* TODO: Should print error message here? */
> +			pci_err(dev, "Exceeded max allowed (%d) addition of PCIe "
> +				"devices for AER handling\n", AER_MAX_MULTI_ERR_DEVICES);
>  			return 1;
>  		}
>  
> -- 
> 2.43.0
> 
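
For readers following the thread, here is a minimal userspace sketch of
the bounded-array behavior the commit message describes. It is not the
kernel code: struct mock_dev, struct mock_err_info, the error_dev_num
counter, and every mock_* function are hypothetical stand-ins, and
fprintf() stands in for pci_err(). Only the limit of 5 and the message
text are taken from the patch above.

```c
#include <stdio.h>

#define AER_MAX_MULTI_ERR_DEVICES 5	/* value quoted in the commit message */

/* Hypothetical stand-in for struct pci_dev */
struct mock_dev {
	const char *name;
};

/* Hypothetical stand-in for the e_info structure; error_dev_num is assumed */
struct mock_err_info {
	struct mock_dev *dev[AER_MAX_MULTI_ERR_DEVICES];
	int error_dev_num;
};

/* Record one device; return 0 on success, -1 if the array is already full */
static int mock_add_error_device(struct mock_err_info *info,
				 struct mock_dev *dev)
{
	if (info->error_dev_num >= AER_MAX_MULTI_ERR_DEVICES)
		return -1;

	info->dev[info->error_dev_num++] = dev;
	return 0;
}

/* Per-device callback; a nonzero return tells the caller to stop walking */
static int mock_find_device_iter(struct mock_dev *dev,
				 struct mock_err_info *info)
{
	if (mock_add_error_device(info, dev)) {
		/* Report the overflow instead of truncating silently */
		fprintf(stderr,
			"%s: Exceeded max allowed (%d) addition of PCIe devices for AER handling\n",
			dev->name, AER_MAX_MULTI_ERR_DEVICES);
		return 1;
	}
	return 0;
}

/* Walk a flat list of devices; return how many were recorded */
static int mock_walk(struct mock_dev *devs, int ndevs,
		     struct mock_err_info *info)
{
	int i;

	info->error_dev_num = 0;
	for (i = 0; i < ndevs; i++) {
		if (mock_find_device_iter(&devs[i], info))
			break;
	}
	return info->error_dev_num;
}
```

Calling mock_walk() with seven devices records the first five, prints the
message once for the sixth, and stops, so the seventh is never visited.
The sketch walks a flat array for brevity, whereas the patch description
has the handler traversing the hierarchy below the root port.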

