From: james.morse@arm.com (James Morse)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 04/13] arm64: kernel: Survive corrected RAS errors notified by SError
Date: Fri, 05 Jan 2018 18:28:46 +0000
Message-ID: <5A4FC3DE.3010907@arm.com>
In-Reply-To: <720fa5cc-93a9-8844-c2f3-83116a724d1b@huawei.com>

Hi gengdongjiu,

On 16/12/17 04:51, gengdongjiu wrote:
> On 2017/12/16 12:08, gengdongjiu wrote:
>> On 2017/12/15 23:50, James Morse wrote:
>>> +	case ESR_ELx_AET_UER:	/* Uncorrected Recoverable */
>>> +		/*
>>> +		 * The CPU can't make progress. The exception may have
>>> +		 * been imprecise.
>>> +		 */
>>> +		return true;

>>         For Recoverable error (UER), the error has not been  silently propagated,
>>         and has not been architecturally consumed by the PE, and
>>         The exception is precise and PE can recover execution from the preferred return address of the exception.

>>         so I do not think it should be panic here if the SError come from user space instead of coming from kernel space.

'coming from' doesn't mean an awful lot unless we know what the error is.
To repeat the earlier examples, it could be a fault in the page tables, or pages
shared between processes, e.g. the vdso data page.

I don't want this crude panic/continue to consider anything other than the ESR.
Let's keep it crude, it's a stop-gap: both kernel-first and firmware-first can do
a better job - this is just some glue to hold things together until we have
one/both implemented.
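
To illustrate what "consider nothing other than the ESR" means, here is a
stand-alone userspace sketch, not the patch itself. The sketch_/SKETCH_ names
are hypothetical, and the field encodings (IDS at bit 24, DFSC in bits [5:0]
with 0x11 meaning SError, AET in bits [12:10]) are my reading of the
architecture's SError syndrome layout, so check them against the Arm ARM
before relying on them. The decision takes the syndrome value and nothing
else: no pt_regs, no 'did this come from user space'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed SError syndrome layout; SKETCH_ names are hypothetical. */
#define SKETCH_ESR_IDS          (UINT32_C(1) << 24)  /* impdef syndrome */
#define SKETCH_ESR_DFSC_MASK    UINT32_C(0x3f)
#define SKETCH_ESR_DFSC_SERROR  UINT32_C(0x11)
#define SKETCH_ESR_AET_SHIFT    10
#define SKETCH_ESR_AET_MASK     (UINT32_C(0x7) << SKETCH_ESR_AET_SHIFT)

enum sketch_aet {
	SKETCH_AET_UC  = 0,	/* Uncontainable or uncategorized */
	SKETCH_AET_UEU = 1,	/* Uncorrected Unrecoverable */
	SKETCH_AET_UEO = 2,	/* Uncorrected Restartable */
	SKETCH_AET_UER = 3,	/* Uncorrected Recoverable */
	SKETCH_AET_CE  = 6,	/* Corrected */
};

/* Extract the severity, treating anything we can't decode as UC. */
static uint32_t sketch_severity(uint32_t esr)
{
	/* Implementation-defined syndrome: the AET field means nothing. */
	if (esr & SKETCH_ESR_IDS)
		return SKETCH_AET_UC;

	/* AET is only valid when DFSC says this is an SError. */
	if ((esr & SKETCH_ESR_DFSC_MASK) != SKETCH_ESR_DFSC_SERROR)
		return SKETCH_AET_UC;

	return (esr & SKETCH_ESR_AET_MASK) >> SKETCH_ESR_AET_SHIFT;
}

/* Crude panic/continue: true means this CPU cannot safely carry on. */
static bool sketch_is_fatal(uint32_t esr)
{
	switch (sketch_severity(esr)) {
	case SKETCH_AET_CE:
	case SKETCH_AET_UEO:
		/* The CPU can make progress. */
		return false;
	case SKETCH_AET_UEU:
	case SKETCH_AET_UER:
		/* The CPU can't make progress without consuming the error. */
		return true;
	case SKETCH_AET_UC:
	default:
		/* Possibly silently propagated: assume the worst. */
		return true;
	}
}

/* Small usage example: corrected errors survive, UER does not. */
static void sketch_selfcheck(void)
{
	uint32_t serror = SKETCH_ESR_DFSC_SERROR;

	assert(!sketch_is_fatal(serror | (SKETCH_AET_CE << SKETCH_ESR_AET_SHIFT)));
	assert(sketch_is_fatal(serror | (SKETCH_AET_UER << SKETCH_ESR_AET_SHIFT)));
}
```

The sketch folds 'panic the machine' and 'this CPU can't continue' into a
single true/false for brevity. UER lands on the fatal side because, without
kernel-first or firmware-first handling, nothing has located the error.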


[...]

> Recoverable error (UER)
> The state of the PE is Recoverable if all of the following are true:
> - The error has not been silently propagated.
> - The error has not been architecturally consumed by the PE. (The PE architectural state is not infected.)
> - The exception is precise and PE can recover execution from the preferred return address of the exception, if software locates and repairs the error.


It's this bit that made me err on the side of caution/panic():

> The PE cannot make correct progress without either consuming the error or
> otherwise making the error unrecoverable. The error remains latent in the system.

Without firmware-first or kernel-first we can't know where the error is. What
should we do?:

> If software cannot locate and repair the error, either the application or the
> VM, or both, must be isolated by software.


Thanks,

James
