Re: [PATCH V6 03/10] efi: parse ARM processor error

public inbox for linux-acpi@vger.kernel.org
 help / color / mirror / Atom feed

From: "Baicar, Tyler" <tbaicar@codeaurora.org>
To: James Morse <james.morse@arm.com>
Cc: linux-efi@vger.kernel.org, kvm@vger.kernel.org,
	matt@codeblueprint.co.uk, catalin.marinas@arm.com,
	will.deacon@arm.com, robert.moore@intel.com,
	paul.gortmaker@windriver.com, lv.zheng@intel.com,
	kvmarm@lists.cs.columbia.edu, fu.wei@linaro.org,
	zjzhang@codeaurora.org, linux@armlinux.org.uk,
	linux-acpi@vger.kernel.org, eun.taik.lee@samsung.com,
	shijie.huang@arm.com, labbott@redhat.com, lenb@kernel.org,
	harba@codeaurora.org, john.garry@huawei.com,
	marc.zyngier@arm.com, punit.agrawal@arm.com, rostedt@goodmis.org,
	nkaje@codeaurora.org, sandeepa.s.prabhu@gmail.com,
	linux-arm-kernel@lists.infradead.org, devel@acpica.org,
	rjw@rjwysocki.net, rruigrok@codeaurora.org,
	linux-kernel@vger.kernel.org, astone@redhat.com,
	hanjun.guo@linaro.org, pbonzini@redhat.com,
	akpm@linux-foundation.org, bristot@redhat.com,
	shiju.jose@huawei.com
Subject: Re: [PATCH V6 03/10] efi: parse ARM processor error
Date: Thu, 5 Jan 2017 14:17:54 -0700	[thread overview]
Message-ID: <63d439c0-0e21-dcf5-72da-e84ae0cc2df8@codeaurora.org> (raw)
In-Reply-To: <5852A3CA.807@arm.com>

On 12/15/2016 7:08 AM, James Morse wrote:
> Hi Tyler,
>
> On 07/12/16 21:48, Tyler Baicar wrote:
>> Add support for ARM Common Platform Error Record (CPER).
>> UEFI 2.6 specification adds support for ARM specific
>> processor error information to be reported as part of the
>> CPER records. This provides more detail on for processor error logs.
> Looks good to me, a few minor comments below.
>
> Reviewed-by: James Morse <james.morse@arm.com>
Thanks!
>> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
>> index 8fa4e23..1ac2572 100644
>> --- a/drivers/firmware/efi/cper.c
>> +++ b/drivers/firmware/efi/cper.c
>> @@ -184,6 +199,110 @@ static void cper_print_proc_generic(const char *pfx,
>>   		printk("%s""IP: 0x%016llx\n", pfx, proc->ip);
>>   }
>>   
>> +static void cper_print_proc_arm(const char *pfx,
>> +				const struct cper_sec_proc_arm *proc)
>> +{
>> +	int i, len, max_ctx_type;
>> +	struct cper_arm_err_info *err_info;
>> +	struct cper_arm_ctx_info *ctx_info;
>> +	char newpfx[64];
>> +
>> +	printk("%ssection length: %d\n", pfx, proc->section_length);
> Compared to the rest of the file, this:
>> 	printk("%s""section length: %d\n", pfx, proc->section_length);
> would be more in keeping. I guess its done this way to avoid some spurious
> warning about %ssection not being recognised by printk().
Makes sense, I'll make this change next patchset.
>> +	printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
>> +
>> +	len = proc->section_length - (sizeof(*proc) +
>> +		proc->err_info_num * (sizeof(*err_info)));
>> +	if (len < 0) {
>> +		printk("%ssection length is too small\n", pfx);
> This calculation is all based on values in the 'struct cper_sec_proc_arm', is it
> worth making more noise about how the firmware-generated record is incorrectly
> formatted? If we see this message its not the kernel's fault!
I can make the print more clear saying that the firmware-generated 
record is incorrect to make
it clear it is not a kernel issue.
>> +		printk("%sERR_INFO_NUM is %d\n", pfx, proc->err_info_num);
>> +		return;
>> +	}
>> +
>> +	if (proc->validation_bits & CPER_ARM_VALID_MPIDR)
>> +		printk("%sMPIDR: 0x%016llx\n", pfx, proc->mpidr);
>> +	if (proc->validation_bits & CPER_ARM_VALID_AFFINITY_LEVEL)
>> +		printk("%serror affinity level: %d\n", pfx,
>> +			proc->affinity_level);
>> +	if (proc->validation_bits & CPER_ARM_VALID_RUNNING_STATE) {
>> +		printk("%srunning state: %d\n", pfx, proc->running_state);
> This field is described as a bit field in table 260, can we print it as 0x%lx in
> case additional bits are set?
Yes, will do.
>> +		printk("%sPSCI state: %d\n", pfx, proc->psci_state);
>> +	}
>> +
>> +	snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
>> +
>> +	err_info = (struct cper_arm_err_info *)(proc + 1);
>> +	for (i = 0; i < proc->err_info_num; i++) {
>> +		printk("%sError info structure %d:\n", pfx, i);
>> +		printk("%sversion:%d\n", newpfx, err_info->version);
>> +		printk("%slength:%d\n", newpfx, err_info->length);
>> +		if (err_info->validation_bits &
>> +		    CPER_ARM_INFO_VALID_MULTI_ERR) {
>> +			if (err_info->multiple_error == 0)
>> +				printk("%ssingle error\n", newpfx);
>> +			else if (err_info->multiple_error == 1)
>> +				printk("%smultiple errors\n", newpfx);
>> +			else
>> +				printk("%smultiple errors count:%d\n",
>> +				newpfx, err_info->multiple_error);
> This is described as unsigned in table 261.
Will change.
>> +		}
>> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_FLAGS) {
>> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_FIRST)
>> +				printk("%sfirst error captured\n", newpfx);
>> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_LAST)
>> +				printk("%slast error captured\n", newpfx);
>> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_PROPAGATED)
>> +				printk("%spropagated error captured\n",
>> +				       newpfx);
> Table 261 also has an 'overflow' bit in flags. It may be worth printing a
> warning if this is set:
>> Note: Overflow bit indicates that firmware/hardware error
>> buffers had experience an overflow, and it is possible that
>> some error information has been lost.
I will add that in.
>> +		}
>> +		printk("%serror_type: %d, %s\n", newpfx, err_info->type,
>> +			err_info->type < ARRAY_SIZE(proc_error_type_strs) ?
>> +			proc_error_type_strs[err_info->type] : "unknown");
>> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO)
>> +			printk("%serror_info: 0x%016llx\n", newpfx,
>> +			       err_info->error_info);
>> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_VIRT_ADDR)
>> +			printk("%svirtual fault address: 0x%016llx\n",
>> +				newpfx, err_info->virt_fault_addr);
>> +		if (err_info->validation_bits &
>> +		    CPER_ARM_INFO_VALID_PHYSICAL_ADDR)
>> +			printk("%sphysical fault address: 0x%016llx\n",
>> +				newpfx, err_info->physical_fault_addr);
>> +		err_info += 1;
>> +	}
>> +	ctx_info = (struct cper_arm_ctx_info *)err_info;
>> +	max_ctx_type = (sizeof(arm_reg_ctx_strs) /
>> +			sizeof(arm_reg_ctx_strs[0]) - 1);
> ARRAY_SIZE() - 1?
I'll use ARRAY_SIZE in the next patchset.
>> +	for (i = 0; i < proc->context_info_num; i++) {
>> +		int size = sizeof(*ctx_info) + ctx_info->size;
>> +
>> +		printk("%sContext info structure %d:\n", pfx, i);
>> +		if (len < size) {
>> +			printk("%ssection length is too small\n", newpfx);
>> +			return;
>> +		}
>> +		if (ctx_info->type > max_ctx_type) {
>> +			printk("%sInvalid context type: %d\n",	newpfx,
>> +						ctx_info->type);
>> +			printk("%sMax context type: %d\n", newpfx,
>> +						max_ctx_type);
>> +			return;
>> +		}
>> +		printk("%sregister context type %d: %s\n", newpfx,
>> +			ctx_info->type, arm_reg_ctx_strs[ctx_info->type]);
>> +		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
>> +				(ctx_info + 1), ctx_info->size, 0);
>> +		len -= size;
>> +		ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + size);
>> +	}
>> +
>> +	if (len > 0) {
>> +		printk("%sVendor specific error info has %d bytes:\n", pfx,
>> +		       len);
> %u - just in case it is surprisingly large!
>
Will do.
>> +		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4, ctx_info,
>> +				len, 0);
>> +	}
>> +}
>> +
>>   static const char * const mem_err_type_strs[] = {
>>   	"unknown",
>>   	"no error",
>> @@ -458,6 +577,15 @@ static void cper_estatus_print_section(
>>   			cper_print_pcie(newpfx, pcie, gdata);
>>   		else
>>   			goto err_section_too_small;
>> +	} else if (!uuid_le_cmp(*sec_type, CPER_SEC_PROC_ARM)) {
>> +		struct cper_sec_proc_arm *arm_err;
>> +
>> +		arm_err = acpi_hest_generic_data_payload(gdata);
>> +		printk("%ssection_type: ARM processor error\n", newpfx);
>> +		if (gdata->error_data_length >= sizeof(*arm_err))
>> +			cper_print_proc_arm(newpfx, arm_err);
>> +		else
>> +			goto err_section_too_small;
>>   	} else
>>   		printk("%s""section type: unknown, %pUl\n", newpfx, sec_type);
>>   
> This is the only processor-specific entry in this function,
> CPER_SEC_PROC_{IA,IPF} don't appear anywhere else in the tree.
>
> Is it worth adding an (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM)) in
> the if()? This would let the compiler remove cper_print_proc_arm(() on x86/ia64
> systems which won't ever see a record of this type.
Yes, I can add that.

Thank you for the feedback!

-Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

next prev parent reply	other threads:[~2017-01-05 21:17 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-07 21:48 [PATCH V6 00/10] Add UEFI 2.6 and ACPI 6.1 updates for RAS on ARM64 Tyler Baicar
2016-12-07 21:48 ` [PATCH V6 01/10] acpi: apei: read ack upon ghes record consumption Tyler Baicar
2016-12-07 21:48 ` [PATCH V6 02/10] ras: acpi/apei: cper: generic error data entry v3 per ACPI 6.1 Tyler Baicar
2016-12-07 21:48 ` [PATCH V6 03/10] efi: parse ARM processor error Tyler Baicar
2016-12-15 14:08   ` James Morse
2017-01-05 21:17     ` Baicar, Tyler [this message]
2016-12-07 21:48 ` [PATCH V6 04/10] arm64: exception: handle Synchronous External Abort Tyler Baicar
2017-01-04 13:54   ` Will Deacon
2017-01-06 16:58     ` Baicar, Tyler
2016-12-07 21:48 ` [PATCH V6 05/10] acpi: apei: handle SEA notification type for ARMv8 Tyler Baicar
     [not found]   ` <1481147303-7979-6-git-send-email-tbaicar-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2016-12-20 15:29     ` James Morse
2017-01-05 22:31       ` Baicar, Tyler
2017-01-06 10:43         ` James Morse
2017-01-10 17:50           ` Baicar, Tyler
2016-12-07 21:48 ` [PATCH V6 06/10] acpi: apei: panic OS with fatal error status block Tyler Baicar
2016-12-07 21:48 ` [PATCH V6 07/10] efi: print unrecognized CPER section Tyler Baicar
2016-12-07 21:48 ` [PATCH V6 08/10] ras: acpi / apei: generate trace event for " Tyler Baicar
2016-12-07 21:48 ` [PATCH V6 09/10] trace, ras: add ARM processor error trace event Tyler Baicar
2016-12-07 21:48 ` [PATCH V6 10/10] arm/arm64: KVM: add guest SEA support Tyler Baicar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=63d439c0-0e21-dcf5-72da-e84ae0cc2df8@codeaurora.org \
    --to=tbaicar@codeaurora.org \
    --cc=akpm@linux-foundation.org \
    --cc=astone@redhat.com \
    --cc=bristot@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=devel@acpica.org \
    --cc=eun.taik.lee@samsung.com \
    --cc=fu.wei@linaro.org \
    --cc=hanjun.guo@linaro.org \
    --cc=harba@codeaurora.org \
    --cc=james.morse@arm.com \
    --cc=john.garry@huawei.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=labbott@redhat.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-efi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=lv.zheng@intel.com \
    --cc=marc.zyngier@arm.com \
    --cc=matt@codeblueprint.co.uk \
    --cc=nkaje@codeaurora.org \
    --cc=paul.gortmaker@windriver.com \
    --cc=pbonzini@redhat.com \
    --cc=punit.agrawal@arm.com \
    --cc=rjw@rjwysocki.net \
    --cc=robert.moore@intel.com \
    --cc=rostedt@goodmis.org \
    --cc=rruigrok@codeaurora.org \
    --cc=sandeepa.s.prabhu@gmail.com \
    --cc=shijie.huang@arm.com \
    --cc=shiju.jose@huawei.com \
    --cc=will.deacon@arm.com \
    --cc=zjzhang@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox