From: Xiaofei Tan <tanxiaofei@huawei.com>
To: James Morse <james.morse@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Linuxarm <linuxarm@huawei.com>, Will Deacon <will@kernel.org>,
	Dave Martin <Dave.Martin@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Shiju Jose <shiju.jose@huawei.com>
Subject: Re: Question about SEA handling process happened in user space
Date: Sat, 18 Apr 2020 18:49:19 +0800
Message-ID: <5E9ADB2F.1020004@huawei.com>
In-Reply-To: <5d00a4d4-9633-74a1-25f2-cf195e939290@arm.com>

Hi James,

On 2020/4/16 21:27, James Morse wrote:
> On 10/04/2020 03:55, Xiaofei Tan wrote:
>> On 2020/4/9 22:28, James Morse wrote:
>>> On 09/04/2020 09:42, Xiaofei Tan wrote:
>>>> James Morse wrote:
>>>>> Do you have patches to get linux to do something useful with the processor error nodes?
>>>>>
>>>>> We'd need it to handle uncorrected cache errors with a physical address, as if they were
>>>>> memory errors...
>>>
>>>> Yes, we have some internal patches that do this. With them, memory_failure() is called for
>>>> an ARM processor error section when a physical address is available.
>>>
>>> I look forward to reading them!
> 
>> https://lkml.org/lkml/2018/1/26/197
>>
>> Our guy tried to upstream it, but it was not accepted. :(
> 
> Wrong series?

No

> 
> https://lkml.org/lkml/2018/1/26/194 is not creating any handling for processor error nodes.
> 

The main patch is this one. It just rewrites the function ghes_arm_process_error():
https://lkml.org/lkml/2018/1/26/198


> That series tried to suck all the pending errors out of the core code, into an arch
> specific queue:
> | arch/arm64/kernel/ras.c              | 173 +++++++++++++++++++++++++++++++++++
> 
> As far as I understand it, that was to ensure the memory_failure() work was done before we
> return to user-space.
> 
> My attempt to fix that got rolled up in the SDEI series. It was posted again here:
> https://lore.kernel.org/linux-acpi/20200228174817.74278-1-james.morse@arm.com/
> 
> 
> If you need processor error handling, there should be code added to the
> CPER_SEC_PROC_ARM else-if in ghes_do_proc() to do the handling.
> 
> You may end up duplicating bits of ghes_handle_memory_failure(), to report the memory
> errors that happened in the cache.
> If you want to count corrected errors, a device in ghes_edac is probably the way to do that.
> 

OK. I will do some research on this. Thanks.
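
For reference, here is a minimal user-space sketch of the suggestion above:
walk the error-info entries of an ARM processor error section and report
cache errors that carry a valid physical address as if they were memory
errors. It is not the kernel code. Every sketch_* name is hypothetical, the
structure is a cut-down stand-in for the kernel's CPER definitions, the
validation-bit and type values are assumptions that should be checked
against include/linux/cper.h, and the queue stands in for whatever would
eventually call memory_failure().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; verify against the kernel's CPER header before relying on them. */
#define SKETCH_ARM_INFO_VALID_PHYS_ADDR	(1u << 4)
#define SKETCH_ARM_CACHE_ERROR		0
#define SKETCH_PAGE_SHIFT		12	/* assumes 4K pages */

/* Cut-down, hypothetical stand-in for one ARM processor error-info entry. */
struct sketch_arm_err_info {
	uint8_t  type;
	uint16_t validation_bits;
	uint64_t physical_fault_addr;
};

/* Stand-in for the memory_failure() work queue: just records the pfns. */
struct sketch_queue {
	uint64_t pfn[8];
	size_t count;
};

static bool sketch_queue_memory_failure(struct sketch_queue *q, uint64_t pfn)
{
	if (q->count >= sizeof(q->pfn) / sizeof(q->pfn[0]))
		return false;	/* queue full, nothing recorded */
	q->pfn[q->count++] = pfn;
	return true;
}

/*
 * Returns true if at least one entry was queued. Entries that are not
 * cache errors, or that have no valid physical address, are skipped.
 */
static bool sketch_handle_arm_proc_error(const struct sketch_arm_err_info *info,
					 size_t n, struct sketch_queue *q)
{
	bool queued = false;

	for (size_t i = 0; i < n; i++) {
		if (info[i].type != SKETCH_ARM_CACHE_ERROR)
			continue;
		if (!(info[i].validation_bits & SKETCH_ARM_INFO_VALID_PHYS_ADDR))
			continue;
		if (sketch_queue_memory_failure(q,
				info[i].physical_fault_addr >> SKETCH_PAGE_SHIFT))
			queued = true;
	}
	return queued;
}
```

A false return here corresponds to the case where nothing could be handled,
which the caller would have to treat separately.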

> 
>>> [...]
>>>
>>>> I think this part is worth improving.
>>>
>>>> BTW, should an ARM processor record the physical address when it consumes a memory poison
>>>> error for SEA? It would help error recovery. Is this mandatory in the Arm spec?
>>>
>>> ERR<n>ADDR? It's not mandatory to be filled for any error. It can be some imp-def bus
>>> address or a virtual address. 
>>
>> Virtual address? But the Arm spec calls it a physical address.
> 
> That was my recollection too! But I checked again before writing this:
> 
> "4.4.5 ERR<n>ADDR, Error Record Address Register" in
> https://static.docs.arm.com/ddi0587/cb/2019_07_05_DD_0587_C_b.pdf
> 
> has a VA bit for a virtual-address, and 'AI' for this imp-def bus address, more properly
> described as one that "might not match the programmers' view of the physical address for
> the recorded location."
> 

OK. The spec also allows hardware to record a virtual address.
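
As a sketch of what that means for software, assuming the VA and AI flags
sit at bits 60 and 61 of ERR<n>ADDR and the address field occupies bits
[55:0] (these positions are from my reading of the register description and
must be checked against section 4.4.5 of the spec), the recorded value is
only usable as a physical address when both flags are clear. The sketch_*
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; verify against the ERR<n>ADDR description. */
#define SKETCH_ERRADDR_AI		(1ULL << 61)	/* address incorrect (imp-def bus address) */
#define SKETCH_ERRADDR_VA		(1ULL << 60)	/* address is a virtual address */
#define SKETCH_ERRADDR_PADDR_MASK	((1ULL << 56) - 1)

/*
 * Returns true and stores the address in *paddr only when the record
 * claims to hold a programmer-visible physical address.
 */
static bool sketch_erraddr_to_phys(uint64_t erraddr, uint64_t *paddr)
{
	if (erraddr & (SKETCH_ERRADDR_AI | SKETCH_ERRADDR_VA))
		return false;
	*paddr = erraddr & SKETCH_ERRADDR_PADDR_MASK;
	return true;
}
```

When this returns false there is no physical address to pass on for
recovery.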

> 
> [...]
> 
>>> Does your implementation always give a physical-address for a synchronous external abort?
> 
>> We hope so. But the hardware guys say it is hard to record the physical address in every situation.
> 
> Yeah ...
> 
> Hopefully the situations where it's too hard are also the rarest, we can class these as
> fatal (because we can't handle them).
> 

Agree.

> 
> Thanks,
> 
> James
> 
> .
> 

-- 
 thanks
tanxiaofei


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
