Re: [PATCH V6 03/10] efi: parse ARM processor error


From: "Baicar, Tyler" <tbaicar@codeaurora.org>
To: James Morse <james.morse@arm.com>
Cc: linux-efi@vger.kernel.org, kvm@vger.kernel.org,
	matt@codeblueprint.co.uk, catalin.marinas@arm.com,
	will.deacon@arm.com, robert.moore@intel.com,
	paul.gortmaker@windriver.com, lv.zheng@intel.com,
	kvmarm@lists.cs.columbia.edu, fu.wei@linaro.org,
	zjzhang@codeaurora.org, linux@armlinux.org.uk,
	linux-acpi@vger.kernel.org, eun.taik.lee@samsung.com,
	shijie.huang@arm.com, labbott@redhat.com, lenb@kernel.org,
	harba@codeaurora.org, john.garry@huawei.com,
	marc.zyngier@arm.com, punit.agrawal@arm.com, rostedt@goodmis.org,
	nkaje@codeaurora.org, sandeepa.s.prabhu@gmail.com,
	linux-arm-kernel@lists.infradead.org, devel@acpica.org,
	rjw@rjwysocki.net, rruigrok@codeaurora.org,
	linux-kernel@vger.kernel.org, astone@redhat.com,
	hanjun.guo@linaro.org, pbonzini@redhat.com,
	akpm@linux-foundation.org, bristot@redhat.com,
	shiju.jose@huawei.com
Subject: Re: [PATCH V6 03/10] efi: parse ARM processor error
Date: Thu, 5 Jan 2017 14:17:54 -0700
Message-ID: <63d439c0-0e21-dcf5-72da-e84ae0cc2df8@codeaurora.org>
In-Reply-To: <5852A3CA.807@arm.com>

On 12/15/2016 7:08 AM, James Morse wrote:
> Hi Tyler,
>
> On 07/12/16 21:48, Tyler Baicar wrote:
>> Add support for ARM Common Platform Error Record (CPER).
>> UEFI 2.6 specification adds support for ARM specific
>> processor error information to be reported as part of the
>> CPER records. This provides more detail for processor error logs.
> Looks good to me, a few minor comments below.
>
> Reviewed-by: James Morse <james.morse@arm.com>
Thanks!
>> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
>> index 8fa4e23..1ac2572 100644
>> --- a/drivers/firmware/efi/cper.c
>> +++ b/drivers/firmware/efi/cper.c
>> @@ -184,6 +199,110 @@ static void cper_print_proc_generic(const char *pfx,
>>   		printk("%s""IP: 0x%016llx\n", pfx, proc->ip);
>>   }
>>   
>> +static void cper_print_proc_arm(const char *pfx,
>> +				const struct cper_sec_proc_arm *proc)
>> +{
>> +	int i, len, max_ctx_type;
>> +	struct cper_arm_err_info *err_info;
>> +	struct cper_arm_ctx_info *ctx_info;
>> +	char newpfx[64];
>> +
>> +	printk("%ssection length: %d\n", pfx, proc->section_length);
> Compared to the rest of the file, this:
>> 	printk("%s""section length: %d\n", pfx, proc->section_length);
> would be more in keeping. I guess it's done this way to avoid some spurious
> warning about %ssection not being recognised by printk().
Makes sense, I'll make this change in the next patchset.
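
As a small illustration of why the two spellings are interchangeable: C merges adjacent string literals at compile time, so "%s""section length" and "%ssection length" produce the same format string. The sketch below is plain userspace C, not kernel code, and the names fmt_joined, fmt_split and formats_match are hypothetical.

```c
#include <string.h>

/* Format string written as one literal, as in the patch. */
static const char fmt_joined[] = "%ssection length: %d\n";

/* Format string written as two adjacent literals, as in the rest of cper.c.
 * The compiler concatenates them into a single string. */
static const char fmt_split[] = "%s""section length: %d\n";

/* Returns 1 when both spellings yield byte-identical format strings. */
int formats_match(void)
{
	return strcmp(fmt_joined, fmt_split) == 0;
}
```

The split form only changes how the source reads; the string handed to printk() is the same.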
>> +	printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
>> +
>> +	len = proc->section_length - (sizeof(*proc) +
>> +		proc->err_info_num * (sizeof(*err_info)));
>> +	if (len < 0) {
>> +		printk("%ssection length is too small\n", pfx);
> This calculation is all based on values in the 'struct cper_sec_proc_arm', is it
> worth making more noise about how the firmware-generated record is incorrectly
> formatted? If we see this message it's not the kernel's fault!
I can make the message clearer by stating that the firmware-generated
record is incorrect, so it is obvious that this is not a kernel issue.
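
One way to make the length check explicit is sketched below. It is a userspace illustration only: the function name remaining_after_err_info is hypothetical, and the header and entry sizes are passed in as parameters because the real structure layouts are not reproduced here. The sum is done in 64 bits so a large err_info_num from a malformed record cannot wrap.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the number of bytes left in the section after the fixed header
 * and the err_info array, or -1 if the firmware-supplied section_length
 * is too small to hold them (a malformed record).
 */
long remaining_after_err_info(uint32_t section_length, size_t hdr_size,
			      uint16_t err_info_num, size_t err_info_size)
{
	/* 64-bit arithmetic so the product and sum cannot wrap. */
	uint64_t needed = (uint64_t)hdr_size +
			  (uint64_t)err_info_num * (uint64_t)err_info_size;

	if (needed > section_length)
		return -1;

	return (long)(section_length - needed);
}
```

A caller would report the -1 case as a firmware formatting error rather than as a kernel problem.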
>> +		printk("%sERR_INFO_NUM is %d\n", pfx, proc->err_info_num);
>> +		return;
>> +	}
>> +
>> +	if (proc->validation_bits & CPER_ARM_VALID_MPIDR)
>> +		printk("%sMPIDR: 0x%016llx\n", pfx, proc->mpidr);
>> +	if (proc->validation_bits & CPER_ARM_VALID_AFFINITY_LEVEL)
>> +		printk("%serror affinity level: %d\n", pfx,
>> +			proc->affinity_level);
>> +	if (proc->validation_bits & CPER_ARM_VALID_RUNNING_STATE) {
>> +		printk("%srunning state: %d\n", pfx, proc->running_state);
> This field is described as a bit field in table 260, can we print it as 0x%lx in
> case additional bits are set?
Yes, will do.
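
A rough sketch of the hex form, assuming running_state is a 32-bit field (so the conversion here is the 32-bit hex one rather than %lx; the conversion has to match whatever type the field really has). This is userspace C using snprintf() in place of printk(), and format_running_state is a hypothetical helper.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Formats the running state as hex so that any additional bits that are
 * set remain visible in the log line. Returns snprintf()'s result.
 */
int format_running_state(char *buf, size_t n, const char *pfx,
			 uint32_t running_state)
{
	return snprintf(buf, n, "%srunning state: 0x%" PRIx32 "\n",
			pfx, running_state);
}
```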
>> +		printk("%sPSCI state: %d\n", pfx, proc->psci_state);
>> +	}
>> +
>> +	snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
>> +
>> +	err_info = (struct cper_arm_err_info *)(proc + 1);
>> +	for (i = 0; i < proc->err_info_num; i++) {
>> +		printk("%sError info structure %d:\n", pfx, i);
>> +		printk("%sversion:%d\n", newpfx, err_info->version);
>> +		printk("%slength:%d\n", newpfx, err_info->length);
>> +		if (err_info->validation_bits &
>> +		    CPER_ARM_INFO_VALID_MULTI_ERR) {
>> +			if (err_info->multiple_error == 0)
>> +				printk("%ssingle error\n", newpfx);
>> +			else if (err_info->multiple_error == 1)
>> +				printk("%smultiple errors\n", newpfx);
>> +			else
>> +				printk("%smultiple errors count:%d\n",
>> +				newpfx, err_info->multiple_error);
> This is described as unsigned in table 261.
Will change.
>> +		}
>> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_FLAGS) {
>> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_FIRST)
>> +				printk("%sfirst error captured\n", newpfx);
>> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_LAST)
>> +				printk("%slast error captured\n", newpfx);
>> +			if (err_info->flags & CPER_ARM_INFO_FLAGS_PROPAGATED)
>> +				printk("%spropagated error captured\n",
>> +				       newpfx);
> Table 261 also has an 'overflow' bit in flags. It may be worth printing a
> warning if this is set:
>> Note: Overflow bit indicates that firmware/hardware error
>> buffers had experience an overflow, and it is possible that
>> some error information has been lost.
I will add that in.
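
Roughly what I have in mind for the overflow check is below. The macro names and bit positions are assumptions for this sketch (first, last, propagated and overflow in bits 0 to 3, going by my reading of Table 261) and should be checked against the spec; flags_lost_data is a hypothetical helper, written as userspace C.

```c
#include <stdint.h>

/* Assumed bit layout of the err_info flags field, for illustration only. */
#define SKETCH_FLAG_FIRST	(1u << 0)
#define SKETCH_FLAG_LAST	(1u << 1)
#define SKETCH_FLAG_PROPAGATED	(1u << 2)
#define SKETCH_FLAG_OVERFLOW	(1u << 3)

/*
 * Returns 1 when the overflow bit is set, meaning the firmware/hardware
 * error buffers overflowed and some error information may have been lost.
 */
int flags_lost_data(uint8_t flags)
{
	return (flags & SKETCH_FLAG_OVERFLOW) != 0;
}
```

The caller would print a warning line when this returns 1, alongside the existing first/last/propagated messages.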
>> +		}
>> +		printk("%serror_type: %d, %s\n", newpfx, err_info->type,
>> +			err_info->type < ARRAY_SIZE(proc_error_type_strs) ?
>> +			proc_error_type_strs[err_info->type] : "unknown");
>> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO)
>> +			printk("%serror_info: 0x%016llx\n", newpfx,
>> +			       err_info->error_info);
>> +		if (err_info->validation_bits & CPER_ARM_INFO_VALID_VIRT_ADDR)
>> +			printk("%svirtual fault address: 0x%016llx\n",
>> +				newpfx, err_info->virt_fault_addr);
>> +		if (err_info->validation_bits &
>> +		    CPER_ARM_INFO_VALID_PHYSICAL_ADDR)
>> +			printk("%sphysical fault address: 0x%016llx\n",
>> +				newpfx, err_info->physical_fault_addr);
>> +		err_info += 1;
>> +	}
>> +	ctx_info = (struct cper_arm_ctx_info *)err_info;
>> +	max_ctx_type = (sizeof(arm_reg_ctx_strs) /
>> +			sizeof(arm_reg_ctx_strs[0]) - 1);
> ARRAY_SIZE() - 1?
I'll use ARRAY_SIZE in the next patchset.
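
For reference, a userspace sketch of the ARRAY_SIZE() pattern. The kernel provides its own ARRAY_SIZE(); the simplified definition, the ctx_strs table contents and the ctx_type_name helper below are all hypothetical stand-ins.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's ARRAY_SIZE() macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical string table; the real arm_reg_ctx_strs entries differ. */
static const char * const ctx_strs[] = {
	"context type A",
	"context type B",
	"context type C",
};

/* Looks up a type name, falling back to "unknown" for out-of-range values. */
const char *ctx_type_name(unsigned int type)
{
	return type < ARRAY_SIZE(ctx_strs) ? ctx_strs[type] : "unknown";
}
```

Comparing against ARRAY_SIZE() directly also avoids carrying a separate "max index" variable around.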
>> +	for (i = 0; i < proc->context_info_num; i++) {
>> +		int size = sizeof(*ctx_info) + ctx_info->size;
>> +
>> +		printk("%sContext info structure %d:\n", pfx, i);
>> +		if (len < size) {
>> +			printk("%ssection length is too small\n", newpfx);
>> +			return;
>> +		}
>> +		if (ctx_info->type > max_ctx_type) {
>> +			printk("%sInvalid context type: %d\n",	newpfx,
>> +						ctx_info->type);
>> +			printk("%sMax context type: %d\n", newpfx,
>> +						max_ctx_type);
>> +			return;
>> +		}
>> +		printk("%sregister context type %d: %s\n", newpfx,
>> +			ctx_info->type, arm_reg_ctx_strs[ctx_info->type]);
>> +		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
>> +				(ctx_info + 1), ctx_info->size, 0);
>> +		len -= size;
>> +		ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + size);
>> +	}
>> +
>> +	if (len > 0) {
>> +		printk("%sVendor specific error info has %d bytes:\n", pfx,
>> +		       len);
> %u - just in case it is surprisingly large!
>
Will do.
>> +		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4, ctx_info,
>> +				len, 0);
>> +	}
>> +}
>> +
>>   static const char * const mem_err_type_strs[] = {
>>   	"unknown",
>>   	"no error",
>> @@ -458,6 +577,15 @@ static void cper_estatus_print_section(
>>   			cper_print_pcie(newpfx, pcie, gdata);
>>   		else
>>   			goto err_section_too_small;
>> +	} else if (!uuid_le_cmp(*sec_type, CPER_SEC_PROC_ARM)) {
>> +		struct cper_sec_proc_arm *arm_err;
>> +
>> +		arm_err = acpi_hest_generic_data_payload(gdata);
>> +		printk("%ssection_type: ARM processor error\n", newpfx);
>> +		if (gdata->error_data_length >= sizeof(*arm_err))
>> +			cper_print_proc_arm(newpfx, arm_err);
>> +		else
>> +			goto err_section_too_small;
>>   	} else
>>   		printk("%s""section type: unknown, %pUl\n", newpfx, sec_type);
>>   
> This is the only processor-specific entry in this function,
> CPER_SEC_PROC_{IA,IPF} don't appear anywhere else in the tree.
>
> Is it worth adding an (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM)) in
> the if()? This would let the compiler remove cper_print_proc_arm() on x86/ia64
> systems which won't ever see a record of this type.
Yes, I can add that.
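
The idea, sketched in userspace C: when the guard is a compile-time constant 0, the compiler can drop the guarded call and, with it, the otherwise unused static function. The kernel's IS_ENABLED() is not available outside the tree, so SKETCH_ARM_BUILD below is a hypothetical stand-in derived from compiler-defined architecture macros; print_arm_section and handle_section are also hypothetical.

```c
#include <stdio.h>

/* Stand-in for IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM). */
#if defined(__aarch64__) || defined(__arm__)
#define SKETCH_ARM_BUILD 1
#else
#define SKETCH_ARM_BUILD 0
#endif

/* Stand-in for cper_print_proc_arm(). */
static void print_arm_section(void)
{
	puts("section_type: ARM processor error");
}

/*
 * Returns 1 if the ARM section was handled. On non-ARM builds the
 * condition is constant false, so the call is dead code.
 */
int handle_section(int is_arm_section)
{
	if (SKETCH_ARM_BUILD && is_arm_section) {
		print_arm_section();
		return 1;
	}
	return 0;
}
```

On a non-ARM build a record of this type would then fall through to the existing "section type: unknown" branch.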

Thank you for the feedback!

-Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


