linux-arm-kernel.lists.infradead.org archive mirror
From: harba@codeaurora.org (Abdulhamid, Harb)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH V1 4/6] arm64: exception: handle Synchronous External Abort
Date: Wed, 10 Feb 2016 21:40:44 -0500
Message-ID: <56BBF4AC.7000208@codeaurora.org>
In-Reply-To: <20160210180344.GV1052@arm.com>

On 2/10/2016 1:03 PM, Will Deacon wrote:
> On Fri, Feb 05, 2016 at 12:13:26PM -0700, Tyler Baicar wrote:

<snip>

>> +static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
>> +{
>> +	struct siginfo info;
>> +
>> +	atomic_notifier_call_chain(&sea_handler_chain, 0, NULL);
>> +
>> +	pr_err("Synchronous External Abort: %s (0x%08x) at 0x%016lx\n",
>> +		 fault_name(esr), esr, addr);
>> +
>> +	info.si_signo = SIGBUS;
>> +	info.si_errno = 0;
>> +	info.si_code  = 0;
>> +	info.si_addr  = (void __user *)addr;
>> +	arm64_notify_die("", regs, &info, esr);
> 
> Surely we don't want to call this if the notifier chain handled the
> exception?
You are correct. Ideally, you should not die if the notifier chain
handled the exception (e.g. via memory fault handling).  However, this
patch was intended as a first step to provide the user with more useful
information about the hardware error (e.g. details of a cache error, bus
error, or memory error that led to the SEA).

The thought was to do what you're suggesting as a next step (i.e. adding
actual recovery mechanisms in the SEA handler). However, there are a
couple of questions enumerated below that I think need more discussion.

First, you need a way to get information returned from the notifier
chain to understand whether or not it recovered from the error. (If this
is easier than I'm making it out to be, please set me straight here, as it
was not clear to me at first glance how to do that.)
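
For reference, atomic_notifier_call_chain() does return an int (the value
returned by the last callback it invoked), so a handler could report
success through the usual NOTIFY_* codes. Below is a user-space sketch of
that idea, not kernel code. The NOTIFY_* values mirror
<linux/notifier.h>; call_chain(), sea_recovered() and the two example
callbacks are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the kernel's <linux/notifier.h>. */
#define NOTIFY_DONE      0x0000	/* callback did not care about the event */
#define NOTIFY_OK        0x0001	/* callback dealt with the event */
#define NOTIFY_STOP_MASK 0x8000	/* do not call further callbacks */
#define NOTIFY_STOP      (NOTIFY_OK | NOTIFY_STOP_MASK)

struct notifier {
	int (*call)(struct notifier *self, unsigned long action, void *data);
	struct notifier *next;
};

/* Hypothetical user-space stand-in for atomic_notifier_call_chain(). */
static int call_chain(struct notifier *head, unsigned long action, void *data)
{
	struct notifier *n;
	int ret = NOTIFY_DONE;

	for (n = head; n != NULL; n = n->next) {
		assert(n->call != NULL);
		ret = n->call(n, action, data);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;	/* result of the last callback that ran */
}

/* Hypothetical check: 1 = recovered, skip the die path; 0 = fall through. */
static int sea_recovered(struct notifier *head, unsigned long addr)
{
	int ret = call_chain(head, 0, &addr);

	return (ret & ~NOTIFY_STOP_MASK) == NOTIFY_OK;
}

/* Example callback that does not handle the error. */
static int ignore_event(struct notifier *self, unsigned long action, void *data)
{
	(void)self; (void)action; (void)data;
	return NOTIFY_DONE;
}

/* Example callback that claims recovery and stops the chain. */
static int recover_event(struct notifier *self, unsigned long action, void *data)
{
	(void)self; (void)action; (void)data;
	return NOTIFY_STOP;
}
```

Because only the last callback's result is returned, a callback that
recovers should stop the chain (NOTIFY_STOP here); otherwise a later
NOTIFY_DONE would hide the earlier success.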

Second, you need a way to kill/abort the thread that encountered this
error, which (I assume) would only be a valid/possible thing to do if it
was a user thread that encountered the hardware error.

For example, let's say we encounter an SEA due to a memory error that
was successfully handled by the memory fault handling code (e.g. offline
a page owned by some user application).  Since this is a synchronous
error that may have occurred either on a load, store, or instruction
fetch, the SEA handler must also know to kill the user thread that
encountered that hardware error.  It is not clear to me how we do that
cleanly, and what the repercussions would be. Would it get handled
naturally after the page has become invalid (e.g. it would just result
in a translation fault when attempting to continue the thread, and the
existing kernel software error handling takes it from there)?

Also, keep in mind that our current assumption is that *all* kernel data
and threads should be considered critical, and any
corruption/termination of kernel data/threads should always be treated
as fatal.  Please let us know if you disagree.
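
Stated as code, that assumption amounts to a two-way policy keyed on
where the abort was taken. This is a minimal sketch with hypothetical
names (sea_policy, SEA_SIGNAL_TASK, SEA_FATAL); in the kernel the
from_user input would presumably come from user_mode(regs).

```c
#include <stdbool.h>

enum sea_action {
	SEA_SIGNAL_TASK,	/* deliver SIGBUS to the faulting user thread */
	SEA_FATAL,		/* kernel context: treat the error as fatal */
};

/* Hypothetical policy: only user threads are considered killable. */
static enum sea_action sea_policy(bool from_user)
{
	return from_user ? SEA_SIGNAL_TASK : SEA_FATAL;
}
```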

Harb
-- 
Qualcomm Technologies, Inc.
on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
