X-IronPort-AV: E=Sophos;i="6.16,207,1744095600"; d="scan'208";a="175921907"
Received: from iherna2-mobl4.amr.corp.intel.com (HELO [10.125.110.198]) ([10.125.110.198])
  by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Jun 2025 14:33:46 -0700
Message-ID: <788af5cd-4e2d-4949-9d5a-9943f13da481@intel.com>
Date: Tue, 3 Jun 2025 14:33:43 -0700
X-Mailing-List: linuxppc-dev@lists.ozlabs.org
Precedence: list
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH 3/4 v3] ACPI: extlog: Trace CPER PCI Express Error Section
To: "Fabio M. De Francesco", "Rafael J . Wysocki", Len Brown,
 Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
 Ira Weiny, Dan Williams, Mahesh J Salgaonkar, Oliver O'Halloran,
 Bjorn Helgaas, Tony Luck, Borislav Petkov,
 linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
 linux-cxl@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-pci@vger.kernel.org, linux-edac@vger.kernel.org
Cc: Yazen Ghannam
References: <20250603155536.577493-1-fabio.m.de.francesco@linux.intel.com>
 <20250603155536.577493-4-fabio.m.de.francesco@linux.intel.com>
Content-Language: en-US
From: Dave Jiang
In-Reply-To: <20250603155536.577493-4-fabio.m.de.francesco@linux.intel.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

On 6/3/25 8:54 AM, Fabio M. De Francesco wrote:
> I/O Machine Check Architecture events may signal failing PCIe components
> or links. The AER event contains details on what was happening on the wire
> when the error was signaled.
>
> Trace the CPER PCIe Error section (UEFI v2.10, Appendix N.2.7) reported
> by the I/O MCA.
>
> Cc: Dan Williams
> Signed-off-by: Fabio M. De Francesco

Reviewed-by: Dave Jiang

> ---
>  drivers/acpi/Kconfig       |  1 +
>  drivers/acpi/acpi_extlog.c | 32 ++++++++++++++++++++++++++++++++
>  drivers/pci/pcie/aer.c     |  2 +-
>  include/linux/aer.h        |  8 ++++++--
>  4 files changed, 40 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index 7bc40c2735ac0..2bbd9e4868ad7 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -493,6 +493,7 @@ config ACPI_EXTLOG
>  	tristate "Extended Error Log support"
>  	depends on X86_MCE && X86_LOCAL_APIC && EDAC
>  	select UEFI_CPER
> +	select ACPI_APEI_PCIEAER
>  	help
>  	  Certain usages such as Predictive Failure Analysis (PFA) require
>  	  more information about the error than what can be described in
> diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
> index 47d11cb5c9120..b2928ff297eda 100644
> --- a/drivers/acpi/acpi_extlog.c
> +++ b/drivers/acpi/acpi_extlog.c
> @@ -132,6 +132,34 @@ static int print_extlog_rcd(const char *pfx,
>  	return 1;
>  }
>
> +static void extlog_print_pcie(struct cper_sec_pcie *pcie_err,
> +			      int severity)
> +{
> +	struct aer_capability_regs *aer;
> +	struct pci_dev *pdev;
> +	unsigned int devfn;
> +	unsigned int bus;
> +	int aer_severity;
> +	int domain;
> +
> +	if (!(pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID ||
> +	      pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO))
> +		return;
> +
> +	aer_severity = cper_severity_to_aer(severity);
> +	aer = (struct aer_capability_regs *)pcie_err->aer_info;
> +	domain = pcie_err->device_id.segment;
> +	bus = pcie_err->device_id.bus;
> +	devfn = PCI_DEVFN(pcie_err->device_id.device,
> +			  pcie_err->device_id.function);
> +	pdev = pci_get_domain_bus_and_slot(domain, bus, devfn);
> +	if (!pdev)
> +		return;
> +
> +	pci_print_aer(KERN_DEBUG, pdev, aer_severity, aer);
> +	pci_dev_put(pdev);
> +}
> +
>  static int extlog_print(struct notifier_block *nb, unsigned long val,
>  			void *data)
>  {
> @@ -183,6 +211,10 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
>  			if (gdata->error_data_length >= sizeof(*mem))
>  				trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
>  						       (u8)gdata->error_severity);
> +		} else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
> +			struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
> +
> +			extlog_print_pcie(pcie_err, gdata->error_severity);
>  		} else {
>  			void *err = acpi_hest_get_payload(gdata);
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index d0ebf7c15afa9..627fcf4346983 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -801,7 +801,7 @@ void pci_print_aer(char *level, struct pci_dev *dev, int aer_severity,
>  	trace_aer_event(dev_name(&dev->dev), (status & ~mask),
>  			aer_severity, tlp_header_valid, &aer->header_log);
>  }
> -EXPORT_SYMBOL_NS_GPL(pci_print_aer, "CXL");
> +EXPORT_SYMBOL_GPL(pci_print_aer);
>
>  /**
>   * add_error_device - list device to be handled
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index 45d0fb2e2e759..6ce433cee4625 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -56,16 +56,20 @@ struct aer_capability_regs {
>  #if defined(CONFIG_PCIEAER)
>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>  int pcie_aer_is_native(struct pci_dev *dev);
> +void pci_print_aer(char *level, struct pci_dev *dev, int aer_severity,
> +		   struct aer_capability_regs *aer);
>  #else
>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  {
>  	return -EINVAL;
>  }
>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
> +static inline void pci_print_aer(char *level, struct pci_dev *dev,
> +				 int aer_severity,
> +				 struct aer_capability_regs *aer)
> +{ }
>  #endif
>
> -void pci_print_aer(char *level, struct pci_dev *dev, int aer_severity,
> -		   struct aer_capability_regs *aer);
>  int cper_severity_to_aer(int cper_severity);
>  void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
>  		       int severity, struct aer_capability_regs *aer_regs);
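
For readers following the extlog_print_pcie() hunk, here is a minimal userspace sketch of its two small pieces of logic: the validation-bits gate and the PCI_DEVFN() packing. It is an illustration under stated assumptions, not kernel code. The demo_* structs and helpers are hypothetical stand-ins, and the two CPER_PCIE_VALID_* values are assumed to match the kernel's definitions. The sketch shows the gate as posted (the function proceeds when either field is marked valid) next to a stricter variant that requires both. Which of the two is intended is for the patch author to say.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's CPER PCIe validation-bit values. */
#define CPER_PCIE_VALID_DEVICE_ID 0x0008u
#define CPER_PCIE_VALID_AER_INFO  0x0080u

/* Same packing as the kernel's PCI_DEVFN(): 5-bit slot, 3-bit function. */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical, trimmed-down stand-ins for parts of struct cper_sec_pcie. */
struct demo_device_id {
	uint16_t segment;
	uint8_t bus;
	uint8_t device;
	uint8_t function;
};

struct demo_pcie_sec {
	uint64_t validation_bits;
	struct demo_device_id device_id;
};

/* Gate as posted: proceed when at least one of the two fields is valid. */
static inline bool demo_gate_either(const struct demo_pcie_sec *sec)
{
	assert(sec);
	return (sec->validation_bits & CPER_PCIE_VALID_DEVICE_ID) ||
	       (sec->validation_bits & CPER_PCIE_VALID_AER_INFO);
}

/* Stricter variant (an assumption, not from the patch): require both. */
static inline bool demo_gate_both(const struct demo_pcie_sec *sec)
{
	assert(sec);
	return (sec->validation_bits & CPER_PCIE_VALID_DEVICE_ID) &&
	       (sec->validation_bits & CPER_PCIE_VALID_AER_INFO);
}

/* Pack device and function numbers the way the hunk does before lookup. */
static inline unsigned int demo_devfn(const struct demo_pcie_sec *sec)
{
	assert(sec);
	return PCI_DEVFN(sec->device_id.device, sec->device_id.function);
}
```

With only the device ID marked valid, demo_gate_either() lets the record through while demo_gate_both() rejects it. That is the case where the two variants differ, since the function goes on to read both device_id and aer_info.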