From: Xiaofei Tan <tanxiaofei@huawei.com>
To: James Morse <james.morse@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Linuxarm <linuxarm@huawei.com>, Will Deacon <will@kernel.org>,
Dave Martin <Dave.Martin@arm.com>,
linux-arm-kernel@lists.infradead.org,
Shiju Jose <shiju.jose@huawei.com>
Subject: Re: Question about SEA handling process happened in user space
Date: Sat, 18 Apr 2020 18:49:19 +0800
Message-ID: <5E9ADB2F.1020004@huawei.com>
In-Reply-To: <5d00a4d4-9633-74a1-25f2-cf195e939290@arm.com>
Hi James,
On 2020/4/16 21:27, James Morse wrote:
> On 10/04/2020 03:55, Xiaofei Tan wrote:
>> On 2020/4/9 22:28, James Morse wrote:
>>> On 09/04/2020 09:42, Xiaofei Tan wrote:
>>>> James Morse wrote:
>>>>> Do you have patches to get linux to do something useful with the processor error nodes?
>>>>>
>>>>> We'd need it to handle uncorrected cache errors with a physical address, as if they were
>>>>> memory errors...
>>>
>>>> Yes, we have some internal patches to do this. With them, memory_failure() is called for
>>>> the ARM processor error section when a physical address is available.
>>>
>>> I look forward to reading them!
>
>> https://lkml.org/lkml/2018/1/26/197
>>
>> Our colleague tried to upstream it, but it was not accepted. :(
>
> Wrong series?
No.
>
> https://lkml.org/lkml/2018/1/26/194 is not creating any handling for processor error nodes.
>
The main patch is this one. It just rewrites the function ghes_arm_process_error():
https://lkml.org/lkml/2018/1/26/198
> That series tried to suck all the pending errors out of the core code, into an arch
> specific queue:
> | arch/arm64/kernel/ras.c | 173 +++++++++++++++++++++++++++++++++++
>
> As far as I understand it, that was to ensure the memory_failure() work was done before we
> return to user-space.
>
> My attempt to fix that got rolled up in the SDEI series. It was posted again here:
> https://lore.kernel.org/linux-acpi/20200228174817.74278-1-james.morse@arm.com/
>
>
> If you need processor errors handling, there should be code added to the
> CPER_SEC_PROC_ARM else-if in ghes_do_proc() to do the handling.
>
> You may end up duplicating bits of ghes_handle_memory_failure(), to report the memory
> errors that happened in the cache.
> If you want to count corrected errors, a device in ghes_edac is probably the way to do that.
>
OK. I will look into this. Thanks.
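To make the suggestion above concrete, here is a minimal user-space sketch of
the kind of handling that could sit in the CPER_SEC_PROC_ARM branch: walk the
error-information entries and queue memory-failure work for each entry that
carries a valid physical address. Everything prefixed sketch_ is hypothetical,
the struct is a simplified stand-in rather than the kernel's layout, and the
validation-bit position and 4 KiB page size are assumptions to check against
the UEFI CPER definition and the target configuration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed: bit 4 of the per-entry validation bits marks the physical
 * fault address as valid. Verify against the CPER definition. */
#define SKETCH_ERR_INFO_VALID_PHYS_ADDR (1u << 4)
#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */
#define SKETCH_MAX_QUEUED 8

/* Hypothetical, simplified error-information entry (not the kernel layout). */
struct sketch_arm_err_info {
	uint16_t validation_bits;
	uint64_t physical_fault_addr;
};

/* Stand-in for queueing memory_failure() work: it only records the PFN. */
static uint64_t sketch_queued_pfn[SKETCH_MAX_QUEUED];
static size_t sketch_queued_count;

static void sketch_queue_memory_failure(uint64_t pfn)
{
	if (sketch_queued_count < SKETCH_MAX_QUEUED)
		sketch_queued_pfn[sketch_queued_count++] = pfn;
}

/*
 * Returns true if at least one entry had a usable physical address and
 * was queued. Corrected errors are skipped entirely in this sketch.
 */
static bool sketch_handle_arm_proc_error(const struct sketch_arm_err_info *info,
					 size_t count, bool uncorrected)
{
	bool queued = false;

	if (!uncorrected)
		return false;

	for (size_t i = 0; i < count; i++) {
		if (!(info[i].validation_bits & SKETCH_ERR_INFO_VALID_PHYS_ADDR))
			continue; /* no physical address: cannot isolate a page */
		sketch_queue_memory_failure(info[i].physical_fault_addr >>
					    SKETCH_PAGE_SHIFT);
		queued = true;
	}
	return queued;
}
```

A caller could treat a false return for an uncorrected error as "nothing was
recoverable" and fall back to the existing severity handling.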
>
>>> [...]
>>>
>>>> I think this part is worth improving.
>>>
>>>> BTW, should an ARM processor record the physical address when it consumes a memory poison error for SEA?
>>>> It is helpful for error recovery. Is this mandatory in the Arm spec?
>>>
>>> ERR<n>ADDR? It's not mandatory to be filled for any error. It can be some imp-def bus
>>> address or a virtual address.
>>
>> Virtual address? But the Arm spec calls it a physical address.
>
> That was my recollection too! But I checked again before writing this:
>
> "4.4.5 ERR<n>ADDR, Error Record Address Register" in
> https://static.docs.arm.com/ddi0587/cb/2019_07_05_DD_0587_C_b.pdf
>
> has a VA bit for a virtual-address, and 'AI' for this imp-def bus address, more properly
> described as one that "might not match the programmers' view of the physical address for
> the recorded location."
>
OK. So the spec also allows hardware to record a virtual address.
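As a small illustration of why this matters for recovery, the sketch below
only accepts an ERR<n>ADDR value as a physical address when neither the VA nor
the AI flag is set. The bit positions (AI at bit 61, VA at bit 60, address in
bits [55:0]) are my reading of the register description and should be
verified against section 4.4.5 of the spec; the sketch_ names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field positions; verify against the ERR<n>ADDR description. */
#define SKETCH_ERRADDR_AI    (1ULL << 61)         /* address is imp-def/incorrect */
#define SKETCH_ERRADDR_VA    (1ULL << 60)         /* address is a virtual address */
#define SKETCH_ERRADDR_PADDR ((1ULL << 56) - 1)   /* address bits [55:0] */

/*
 * Stores the address field in *pa and returns true only when the value
 * can be treated as a physical address; otherwise leaves *pa untouched.
 */
static bool sketch_erraddr_to_pa(uint64_t err_addr, uint64_t *pa)
{
	if (err_addr & (SKETCH_ERRADDR_VA | SKETCH_ERRADDR_AI))
		return false;

	*pa = err_addr & SKETCH_ERRADDR_PADDR;
	return true;
}
```

When this returns false there is no page to hand to memory_failure(), so the
error would have to be handled without an address.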
>
> [...]
>
>>> Does your implementation always give a physical-address for a synchronous external abort?
>
>> We hope so. But the hardware guys say it is hard to record the physical address in every situation.
>
> Yeah ...
>
> Hopefully the situations where it's too hard are also the rarest, we can class these as
> fatal (because we can't handle them).
>
Agreed.
>
> Thanks,
>
> James
--
thanks
tanxiaofei