All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Cc: Alison Schofield <alison.schofield@intel.com>,
	<linux-efi@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-cxl@vger.kernel.org>, Ard Biesheuvel <ardb@kernel.org>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Yazen Ghannam <yazen.ghannam@amd.com>,
	"Bowman Terry" <terry.bowman@amd.com>
Subject: Re: [PATCH 2/4] acpi/ghes, efi/cper: Recognize and process CXL Protocol Errors.
Date: Fri, 7 Jun 2024 16:26:22 +0100	[thread overview]
Message-ID: <20240607162622.00000819@Huawei.com> (raw)
In-Reply-To: <09e0d961-e19f-30d6-5306-1b35609b7d79@amd.com>

On Thu, 23 May 2024 14:21:40 -0700
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> wrote:

> Hi Alison,
> 
> On 5/22/2024 5:03 PM, Alison Schofield wrote:
> > On Wed, May 22, 2024 at 03:08:37PM +0000, Smita Koralahalli wrote:  
> >> UEFI v2.10 section N.2.13 defines a CPER record for CXL Protocol errors.
> >>
> >> Add GHES support to detect CXL CPER Protocol Error Record and Cache Error
> >> Severity, Device ID, Device Serial number and CXL RAS capability struct in
> >> struct cxl_cper_prot_err. Include this struct as a member of struct
> >> cxl_cper_work_data.
> >>
> >> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> >> ---
> >>   drivers/acpi/apei/ghes.c        | 10 +++++
> >>   drivers/firmware/efi/cper_cxl.c | 66 +++++++++++++++++++++++++++++++++
> >>   include/linux/cxl-event.h       | 26 +++++++++++++
> >>   3 files changed, 102 insertions(+)
> >>
> >> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> >> index 623cc0cb4a65..1a58032770ee 100644
> >> --- a/drivers/acpi/apei/ghes.c
> >> +++ b/drivers/acpi/apei/ghes.c
> >> @@ -717,6 +717,14 @@ static void cxl_cper_post_event(enum cxl_event_type event_type,
> >>   	schedule_work(cxl_cper_work);
> >>   }
> >>   
> >> +static void cxl_cper_handle_prot_err(struct acpi_hest_generic_data *gdata)
> >> +{
> >> +	struct cxl_cper_work_data wd;
> >> +
> >> +	if (cxl_cper_handle_prot_err_info(gdata, &wd.p_err))
> >> +		return;
> >> +}
> >> +
> >>   int cxl_cper_register_work(struct work_struct *work)
> >>   {
> >>   	if (cxl_cper_work)
> >> @@ -791,6 +799,8 @@ static bool ghes_do_proc(struct ghes *ghes,
> >>   			struct cxl_cper_event_rec *rec = acpi_hest_get_payload(gdata);
> >>   
> >>   			cxl_cper_post_event(CXL_CPER_EVENT_MEM_MODULE, rec);
> >> +		} else if (guid_equal(sec_type, &CPER_SEC_CXL_PROT_ERR)) {
> >> +			cxl_cper_handle_prot_err(gdata);
> >>   		} else {
> >>   			void *err = acpi_hest_get_payload(gdata);
> >>   
> >> diff --git a/drivers/firmware/efi/cper_cxl.c b/drivers/firmware/efi/cper_cxl.c
> >> index 4fd8d783993e..03b9839f3b73 100644
> >> --- a/drivers/firmware/efi/cper_cxl.c
> >> +++ b/drivers/firmware/efi/cper_cxl.c
> >> @@ -8,6 +8,7 @@
> >>    */
> >>   
> >>   #include <linux/cper.h>
> >> +#include <acpi/ghes.h>
> >>   #include "cper_cxl.h"
> >>   
> >>   #define PROT_ERR_VALID_AGENT_TYPE		BIT_ULL(0)
> >> @@ -44,6 +45,17 @@ enum {
> >>   	USP,	/* CXL Upstream Switch Port */
> >>   };
> >>   
> >> +static enum cxl_aer_err_type cper_severity_cxl_aer(int cper_severity)
> >> +{
> >> +	switch (cper_severity) {
> >> +	case CPER_SEV_RECOVERABLE:
> >> +	case CPER_SEV_FATAL:
> >> +		return CXL_AER_UNCORRECTABLE;
> >> +	default:
> >> +		return CXL_AER_CORRECTABLE;
> >> +	}
> >> +}
> >> +
> >>   void cper_print_prot_err(const char *pfx, const struct cper_sec_prot_err *prot_err)
> >>   {
> >>   	if (prot_err->valid_bits & PROT_ERR_VALID_AGENT_TYPE)
> >> @@ -176,3 +188,57 @@ void cper_print_prot_err(const char *pfx, const struct cper_sec_prot_err *prot_e
> >>   			       sizeof(cxl_ras->header_log), 0);
> >>   	}
> >>   }
> >> +
> >> +int cxl_cper_handle_prot_err_info(struct acpi_hest_generic_data *gdata,
> >> +				  struct cxl_cper_prot_err *p_err)
> >> +{
> >> +	struct cper_sec_prot_err *prot_err = acpi_hest_get_payload(gdata);
> >> +	u8 *dvsec_start, *cap_start;
> >> +
> >> +	if (!(prot_err->valid_bits & PROT_ERR_VALID_DEVICE_ID)) {
> >> +		pr_err(FW_WARN "No Device ID\n");
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	/*
> >> +	 * The device ID or agent address is required for CXL RCD, CXL
> >> +	 * SLD, CXL LD, CXL Fabric Manager Managed LD, CXL Root Port,
> >> +	 * CXL Downstream Switch Port and CXL Upstream Switch Port.
> >> +	 */
> >> +	if (prot_err->agent_type <= 0x7 && prot_err->agent_type != RCH_DP) {  
> > 
> > For this check against agent_type, and the similar one below, would a boolean
> > array indexed by the agent type work? That would avoid the <= 0x7 and > 0x4
> > below. It seems one array would suffice for this case, but naming it isn't obvious
> > to me. Maybe it'll be to you.
> > 
> > Something similar to what is done for prot_err_agent_type_strs[]
> > 
> > static const bool agent_requires_id_address_serial[] = {
> > 	true,	/* RDC */ 	
> > 	false,	/* RCH_DP */
	[RCD] = false,

etc rather than comments would be neater.
Given two similar things already. Maybe time for a little structure.

//with better name than this
struct agent_reqs {
	bool sn;
	bool sbdf;
};

static const agent_reqs agent_reqs[] = {
	[RCD] = { .sn = false, .sbdf = true, },
};

etc.

Maybe just bring the the string in as well

struct agent_info {
	const char *string;
	bool req_sn;
	bool req_sbdf;
};

static const agent_info agent_info[] = {
	[RD] = {
		.string = "Restricted CXL Device",
		.req_sn = false,
		.req_sbdf = true,
	},
};

Values made up, but hopefully conveys that moving to having
all the data in one place makes it harder to forget stuff
for new entries etc.

> > 	.
> > 	.
> > 	.
> > };
> > 
> >   
> 
> Noted. Will implement it this way!
> 
> Thanks
> Smita
> 
> >> +		p_err->segment = prot_err->agent_addr.segment;
> >> +		p_err->bus = prot_err->agent_addr.bus;
> >> +		p_err->device = prot_err->agent_addr.device;
> >> +		p_err->function = prot_err->agent_addr.function;
> >> +	} else {
> >> +		pr_err(FW_WARN "Invalid agent type\n");
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	if (!(prot_err->valid_bits & PROT_ERR_VALID_ERROR_LOG)) {
> >> +		pr_err(FW_WARN "Invalid Protocol Error log\n");
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	dvsec_start = (u8 *)(prot_err + 1);
> >> +	cap_start = dvsec_start + prot_err->dvsec_len;
> >> +	p_err->cxl_ras = *(struct cxl_ras_capability_regs *)cap_start;
> >> +
> >> +	/*
> >> +	 * Set device serial number unconditionally.
> >> +	 *
> >> +	 * Print a warning message if it is not valid. The device serial
> >> +	 * number is required for CXL RCD, CXL SLD, CXL LD and CXL Fabric
> >> +	 * Manager Managed LD.
> >> +	 */
> >> +	if (!(prot_err->valid_bits & PROT_ERR_VALID_SERIAL_NUMBER) ||
> >> +	      prot_err->agent_type > 0x4 || prot_err->agent_type == RCH_DP)  
> > 
> > then this also can be replaced with
> > 		agent_requires_id_address_serial[prot_err->agent_type]
> > 
> > 
> > -- Alison
> > 
> >   
> >> +		pr_warn(FW_WARN "No Device Serial number\n");
> >> +
> >> +	p_err->lower_dw = prot_err->dev_serial_num.lower_dw;
> >> +	p_err->upper_dw = prot_err->dev_serial_num.upper_dw;
> >> +
> >> +	p_err->severity = cper_severity_cxl_aer(gdata->error_severity);
> >> +
> >> +	return 0;
> >> +}  
> > 
> > snip
> >   


  reply	other threads:[~2024-06-07 15:26 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-22 15:08 [PATCH 0/4] acpi/ghes, cper, cxl: Trace FW-First CXL Protocol Errors Smita Koralahalli
2024-05-22 15:08 ` [PATCH 1/4] efi/cper, cxl: Make definitions and structures global Smita Koralahalli
2024-05-22 17:28   ` Dave Jiang
2024-05-22 23:40   ` Alison Schofield
2024-06-07 15:14   ` Jonathan Cameron
2024-06-10 18:31     ` Smita Koralahalli
2024-05-22 15:08 ` [PATCH 2/4] acpi/ghes, efi/cper: Recognize and process CXL Protocol Errors Smita Koralahalli
2024-05-22 17:59   ` Dave Jiang
2024-05-23 21:19     ` Smita Koralahalli
2024-05-23 22:51       ` Dave Jiang
2024-05-23  0:03   ` Alison Schofield
2024-05-23 21:21     ` Smita Koralahalli
2024-06-07 15:26       ` Jonathan Cameron [this message]
2024-06-10 19:07         ` Smita Koralahalli
2024-05-22 15:08 ` [PATCH 3/4] acpi/ghes, cxl/pci: Trace FW-First " Smita Koralahalli
2024-05-22 18:05   ` Dave Jiang
2024-05-23  4:38     ` Alison Schofield
2024-05-23 21:23     ` Smita Koralahalli
2024-05-23  0:22   ` Alison Schofield
2024-05-23 21:35     ` Smita Koralahalli
2024-06-12  0:07   ` Dan Williams
2024-06-13 17:47     ` Smita Koralahalli
2024-06-24 22:04       ` Smita Koralahalli
2024-05-22 15:08 ` [PATCH 4/4] cxl/pci: Define a common function get_cxl_dev() Smita Koralahalli
2024-05-22 19:42   ` Dave Jiang
2024-05-23 21:37     ` Smita Koralahalli
2024-05-23  0:45   ` Alison Schofield
2024-06-07 15:40   ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240607162622.00000819@Huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=Smita.KoralahalliChannabasappa@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=ardb@kernel.org \
    --cc=dan.j.williams@intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-efi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=terry.bowman@amd.com \
    --cc=vishal.l.verma@intel.com \
    --cc=yazen.ghannam@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.