Message-ID: <788af5cd-4e2d-4949-9d5a-9943f13da481@intel.com>
Date: Tue, 3 Jun 2025 14:33:43 -0700
Subject: Re: [PATCH 3/4 v3] ACPI: extlog: Trace CPER PCI Express Error Section
To: Fabio M. De Francesco, Rafael J. Wysocki, Len Brown, Davidlohr Bueso,
 Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
 Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas, Tony Luck,
 Borislav Petkov, linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
 linux-cxl@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-pci@vger.kernel.org, linux-edac@vger.kernel.org
Cc: Yazen Ghannam
References: <20250603155536.577493-1-fabio.m.de.francesco@linux.intel.com>
 <20250603155536.577493-4-fabio.m.de.francesco@linux.intel.com>
From: Dave Jiang
In-Reply-To: <20250603155536.577493-4-fabio.m.de.francesco@linux.intel.com>

On 6/3/25 8:54 AM, Fabio M. De Francesco wrote:
> I/O Machine Check Architecture events may signal failing PCIe components
> or links. The AER event contains details on what was happening on the wire
> when the error was signaled.
>
> Trace the CPER PCIe Error section (UEFI v2.10, Appendix N.2.7) reported
> by the I/O MCA.
>
> Cc: Dan Williams
> Signed-off-by: Fabio M. De Francesco

Reviewed-by: Dave Jiang

> ---
>  drivers/acpi/Kconfig       |  1 +
>  drivers/acpi/acpi_extlog.c | 32 ++++++++++++++++++++++++++++++++
>  drivers/pci/pcie/aer.c     |  2 +-
>  include/linux/aer.h        |  8 ++++++--
>  4 files changed, 40 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index 7bc40c2735ac0..2bbd9e4868ad7 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -493,6 +493,7 @@ config ACPI_EXTLOG
>  	tristate "Extended Error Log support"
>  	depends on X86_MCE && X86_LOCAL_APIC && EDAC
>  	select UEFI_CPER
> +	select ACPI_APEI_PCIEAER
>  	help
>  	  Certain usages such as Predictive Failure Analysis (PFA) require
>  	  more information about the error than what can be described in
> diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
> index 47d11cb5c9120..b2928ff297eda 100644
> --- a/drivers/acpi/acpi_extlog.c
> +++ b/drivers/acpi/acpi_extlog.c
> @@ -132,6 +132,34 @@ static int print_extlog_rcd(const char *pfx,
>  	return 1;
>  }
>
> +static void extlog_print_pcie(struct cper_sec_pcie *pcie_err,
> +			      int severity)
> +{
> +	struct aer_capability_regs *aer;
> +	struct pci_dev *pdev;
> +	unsigned int devfn;
> +	unsigned int bus;
> +	int aer_severity;
> +	int domain;
> +
> +	if (!(pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID ||
> +	      pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO))
> +		return;
> +
> +	aer_severity = cper_severity_to_aer(severity);
> +	aer = (struct aer_capability_regs *)pcie_err->aer_info;
> +	domain = pcie_err->device_id.segment;
> +	bus = pcie_err->device_id.bus;
> +	devfn = PCI_DEVFN(pcie_err->device_id.device,
> +			  pcie_err->device_id.function);
> +	pdev = pci_get_domain_bus_and_slot(domain, bus, devfn);
> +	if (!pdev)
> +		return;
> +
> +	pci_print_aer(KERN_DEBUG, pdev, aer_severity, aer);
> +	pci_dev_put(pdev);
> +}
> +
>  static int extlog_print(struct notifier_block *nb, unsigned long val,
>  			void *data)
>  {
> @@ -183,6 +211,10 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
>  			if (gdata->error_data_length >= sizeof(*mem))
>  				trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
>  						       (u8)gdata->error_severity);
> +		} else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
> +			struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
> +
> +			extlog_print_pcie(pcie_err, gdata->error_severity);
>  		} else {
>  			void *err = acpi_hest_get_payload(gdata);
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index d0ebf7c15afa9..627fcf4346983 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -801,7 +801,7 @@ void pci_print_aer(char *level, struct pci_dev *dev, int aer_severity,
>  	trace_aer_event(dev_name(&dev->dev), (status & ~mask),
>  			aer_severity, tlp_header_valid, &aer->header_log);
>  }
> -EXPORT_SYMBOL_NS_GPL(pci_print_aer, "CXL");
> +EXPORT_SYMBOL_GPL(pci_print_aer);
>
>  /**
>   * add_error_device - list device to be handled
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index 45d0fb2e2e759..6ce433cee4625 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -56,16 +56,20 @@ struct aer_capability_regs {
>  #if defined(CONFIG_PCIEAER)
>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>  int pcie_aer_is_native(struct pci_dev *dev);
> +void pci_print_aer(char *level, struct pci_dev *dev, int aer_severity,
> +		   struct aer_capability_regs *aer);
>  #else
>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  {
>  	return -EINVAL;
>  }
>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
> +static inline void pci_print_aer(char *level, struct pci_dev *dev,
> +				 int aer_severity,
> +				 struct aer_capability_regs *aer)
> +{ }
>  #endif
>
> -void pci_print_aer(char *level, struct pci_dev *dev, int aer_severity,
> -		   struct aer_capability_regs *aer);
>  int cper_severity_to_aer(int cper_severity);
>  void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
>  		       int severity, struct aer_capability_regs *aer_regs);
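
For readers following the extlog_print_pcie() hunk, here is a minimal userspace sketch of two things it does: gating on the CPER validation bits, and packing device and function numbers into a devfn. Every sketch_* and SKETCH_* name is hypothetical. The two mask values are assumed to mirror CPER_PCIE_VALID_DEVICE_ID and CPER_PCIE_VALID_AER_INFO from include/linux/cper.h, and the packing is assumed to match the kernel's PCI_DEVFN() macro. The quoted condition returns early only when neither bit is set, so the sketch shows that predicate next to a stricter one requiring both bits. Which of the two the patch intends is a question for the author.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; assumed to mirror the CPER_PCIE_VALID_* kernel masks. */
#define SKETCH_PCIE_VALID_DEVICE_ID 0x0008u
#define SKETCH_PCIE_VALID_AER_INFO  0x0080u

/* Pack a 5-bit device number and a 3-bit function number into one byte,
 * the same layout the kernel's PCI_DEVFN() macro is assumed to use. */
static inline unsigned int sketch_pci_devfn(unsigned int device,
					    unsigned int function)
{
	return ((device & 0x1fu) << 3) | (function & 0x07u);
}

/* Predicate equivalent to the quoted hunk: continue if either bit is set. */
static inline bool sketch_either_valid(uint64_t validation_bits)
{
	return (validation_bits & SKETCH_PCIE_VALID_DEVICE_ID) ||
	       (validation_bits & SKETCH_PCIE_VALID_AER_INFO);
}

/* Stricter alternative: continue only if both fields are marked valid. */
static inline bool sketch_both_valid(uint64_t validation_bits)
{
	const uint64_t need = SKETCH_PCIE_VALID_DEVICE_ID |
			      SKETCH_PCIE_VALID_AER_INFO;

	return (validation_bits & need) == need;
}
```

With only the device ID bit set, sketch_either_valid() is true while sketch_both_valid() is false, which is the case where the two predicates differ.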
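
The include/linux/aer.h hunk moves the pci_print_aer() declaration under the CONFIG_PCIEAER guard and adds an empty inline stub for the other case. A small userspace sketch of that pattern follows. SKETCH_HAVE_AER is a hypothetical macro standing in for the config option, and the sketch_* functions are illustrative only, not kernel APIs.

```c
#include <stdio.h>

/* Hypothetical feature switch standing in for CONFIG_PCIEAER.
 * Leave it undefined to select the stub branch below. */
/* #define SKETCH_HAVE_AER 1 */

static int sketch_print_calls; /* counts real (non-stub) invocations */

#if defined(SKETCH_HAVE_AER)
/* Real implementation: does the work and records that it ran. */
static void sketch_print_aer(const char *level, int severity)
{
	sketch_print_calls++;
	printf("%sAER severity %d\n", level, severity);
}
#else
/* Empty inline stub: callers still compile and link, and the call
 * becomes a no-op when the feature is not built. */
static inline void sketch_print_aer(const char *level, int severity)
{
	(void)level;
	(void)severity;
}
#endif

/* The caller is written once, with no #ifdef of its own. */
static int sketch_report(int severity)
{
	sketch_print_aer("debug: ", severity);
	return sketch_print_calls;
}
```

The benefit of this layout is that call sites such as extlog_print_pcie() stay free of conditional compilation. The header alone decides whether the call does anything.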