From: Bjorn Helgaas <helgaas@kernel.org>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@linux.intel.com>,
Bjorn Helgaas <bhelgaas@google.com>,
Oliver O'Halloran <oohall@gmail.com>,
linux-pci@vger.kernel.org, Ben Widawsky <bwidawsk@kernel.org>,
Dan Williams <dan.j.williams@intel.com>,
Stefan Roese <sr@denx.de>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH RFC] PCI/AER: Enable internal AER errors by default
Date: Mon, 13 Feb 2023 15:38:20 -0600
Message-ID: <20230213213820.GA2935044@bhelgaas>
In-Reply-To: <20230209-cxl-pci-aer-v1-1-f9a817fa4016@intel.com>

On Fri, Feb 10, 2023 at 02:33:23PM -0800, Ira Weiny wrote:
> The CXL driver expects internal error reporting to be enabled via
> pci_enable_pcie_error_reporting(). It is likely other drivers expect the same.
> Dave submitted a patch to enable the CXL side[1] but the PCI AER registers
> still mask errors.
>
> PCIe v6.0 Uncorrectable Mask Register (7.8.4.3) and Correctable Mask
> Register (7.8.4.6) default to masking internal errors. The
> Uncorrectable Error Severity Register (7.8.4.4) defaults internal errors
> as fatal.
>
> Enable internal errors to be reported via the standard
> pci_enable_pcie_error_reporting() call. Ensure uncorrectable errors are set
> non-fatal to limit any impact to other drivers.

Do you have any background on why the spec makes these errors masked
by default? I'm sympathetic to wanting to learn about all the errors
we can, but I'm a little wary if the spec authors thought it was
important to mask these by default.

> [1] https://lore.kernel.org/all/167604864163.2392965.5102660329807283871.stgit@djiang5-mobl3.local/
>
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Stefan Roese <sr@denx.de>
> Cc: "Kuppuswamy Sathyanarayanan" <sathyanarayanan.kuppuswamy@linux.intel.com>
> Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
> Cc: Oliver O'Halloran <oohall@gmail.com>
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-pci@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
> This is RFC to see if it is acceptable to be part of the standard
> pci_enable_pcie_error_reporting() call or perhaps a separate pci core
> call should be introduced. It is anticipated that enabling this error
> reporting is what existing drivers are expecting. The errors are marked
> non-fatal therefore it should not adversely affect existing devices.
> ---
> drivers/pci/pcie/aer.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 625f7b2cafe4..9d3ed3a5fc23 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -229,11 +229,28 @@ int pcie_aer_is_native(struct pci_dev *dev)
>
> int pci_enable_pcie_error_reporting(struct pci_dev *dev)
> {
> + int pos_cap_err;
> + u32 reg;
> int rc;
>
> if (!pcie_aer_is_native(dev))
> return -EIO;
>
> + pos_cap_err = dev->aer_cap;
> +
> + /* Unmask correctable and uncorrectable (non-fatal) internal errors */
> + pci_read_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK, &reg);
> + reg &= ~PCI_ERR_COR_INTERNAL;
> + pci_write_config_dword(dev, pos_cap_err + PCI_ERR_COR_MASK, reg);
> +
> + pci_read_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_SEVER, &reg);
> + reg &= ~PCI_ERR_UNC_INTN;
> + pci_write_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_SEVER, reg);
> +
> + pci_read_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_MASK, &reg);
> + reg &= ~PCI_ERR_UNC_INTN;
> + pci_write_config_dword(dev, pos_cap_err + PCI_ERR_UNCOR_MASK, reg);
> +
> rc = pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_AER_FLAGS);
> return pcibios_err_to_errno(rc);
> }
>
> ---
> base-commit: e5ab7f206ffc873160bd0f1a52cae17ab692a9d1
> change-id: 20230209-cxl-pci-aer-18dda61c8239
>
> Best regards,
> --
> Ira Weiny <ira.weiny@intel.com>
>
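
For readers following the thread, the three read-modify-write steps in
the quoted patch can be modeled outside the kernel as a minimal sketch.
The two bit definitions match include/uapi/linux/pci_regs.h; the struct
aer_regs type and the unmask_internal_errors() helper are hypothetical
stand-ins for config-space accesses and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in include/uapi/linux/pci_regs.h */
#define PCI_ERR_UNC_INTN	0x00400000u	/* Uncorrectable Internal Error */
#define PCI_ERR_COR_INTERNAL	0x00004000u	/* Corrected Internal Error */

/*
 * Hypothetical in-memory copy of the three AER registers the patch
 * touches. A real driver reads and writes these through
 * pci_read_config_dword() / pci_write_config_dword().
 */
struct aer_regs {
	uint32_t cor_mask;	/* Correctable Error Mask */
	uint32_t uncor_mask;	/* Uncorrectable Error Mask */
	uint32_t uncor_sever;	/* Uncorrectable Error Severity */
};

/* Mirror the patch: unmask internal errors and mark them non-fatal. */
static void unmask_internal_errors(struct aer_regs *r)
{
	/* Clearing a mask bit lets that error be reported. */
	r->cor_mask &= ~PCI_ERR_COR_INTERNAL;

	/* Clearing a severity bit classifies the error as non-fatal. */
	r->uncor_sever &= ~PCI_ERR_UNC_INTN;

	r->uncor_mask &= ~PCI_ERR_UNC_INTN;
}
```

Starting from illustrative (assumed, not spec-quoted) register values of
cor_mask = 0x0000e000, uncor_mask = 0x00400000 and
uncor_sever = 0x00462030, the helper leaves 0x0000a000, 0x00000000 and
0x00062030 respectively. Only the internal-error bit changes in each
register.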