From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Daniel Ferguson <danielf@os.amperecomputing.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
Ard Biesheuvel <ardb@kernel.org>,
Jonathan Corbet <corbet@lwn.net>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Len Brown <lenb@kernel.org>, James Morse <james.morse@arm.com>,
Tony Luck <tony.luck@intel.com>, Borislav Petkov <bp@alien8.de>,
<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-acpi@vger.kernel.org>, <linux-efi@vger.kernel.org>,
<linux-edac@vger.kernel.org>,
Jason Tian <jason@os.amperecomputing.com>,
Shengwei Luo <luoshengwei@huawei.com>,
Shiju Jose <shiju.jose@huawei.com>
Subject: Re: [PATCH v4 1/5] RAS: Report all ARM processor CPER information to userspace
Date: Tue, 12 Aug 2025 15:05:55 +0100 [thread overview]
Message-ID: <20250812150555.000067f7@huawei.com> (raw)
In-Reply-To: <a302543c-7a99-4ea4-9559-e4cf4ea79b5e@os.amperecomputing.com>
On Mon, 11 Aug 2025 15:37:18 -0700
Daniel Ferguson <danielf@os.amperecomputing.com> wrote:
> On 8/11/2025 3:52 AM, Jonathan Cameron wrote:
> > On Sat, 9 Aug 2025 17:55:19 +0200
> > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> >
> >> Em Fri, 8 Aug 2025 16:22:09 +0100
> >> Jonathan Cameron <Jonathan.Cameron@huawei.com> escreveu:
> >>
> >>> On Tue, 05 Aug 2025 11:35:38 -0700
> >>> Daniel Ferguson <danielf@os.amperecomputing.com> wrote:
> >>>
> >>>> From: Jason Tian <jason@os.amperecomputing.com>
> >>>>
> >>>> The ARM processor CPER record was added in UEFI v2.6 and remained
> >>>> unchanged up to v2.10.
> >>>>
> >>>> Yet, the original arm_event trace code added by
> >>>>
> >>>> e9279e83ad1f ("trace, ras: add ARM processor error trace event")
> >>>>
> >>>> is incomplete, as it only traces some fields of UAPI 2.6 table N.16, not
> >>>> exporting any information from tables N.17 to N.29 of the record.
> >>>>
> >>>> This is not enough for the user to be able to figure out what has
> >>>> exactly happened or to take appropriate action.
> >>>>
> >>>> According to the UEFI v2.9 specification chapter N2.4.4, the ARM
> >>>> processor error section includes:
> >>>>
> >>>> - several (ERR_INFO_NUM) ARM processor error information structures
> >>>> (Tables N.17 to N.20);
> >>>> - several (CONTEXT_INFO_NUM) ARM processor context information
> >>>> structures (Tables N.21 to N.29);
> >>>> - several vendor specific error information structures. The
> >>>> size is given by Section Length minus the size of the other
> >>>> fields.
> >>>>
> >>>> In addition, it also exports two fields that are parsed by the GHES
> >>>> driver when firmware reports it, e.g.:
> >>>>
> >>>> - error severity
> >>>> - CPU logical index
> >>>>
> >>>> Report all of these information to userspace via a the ARM tracepoint so
> >>>> that userspace can properly record the error and take decisions related
> >>>> to CPU core isolation according to error severity and other info.
> >>>>
> >>>> The updated ARM trace event now contains the following fields:
> >>>>
> >>>> ====================================== =============================
> >>>> UEFI field on table N.16 ARM Processor trace fields
> >>>> ====================================== =============================
> >>>> Validation handled when filling data for
> >>>> affinity MPIDR and running
> >>>> state.
> >>>> ERR_INFO_NUM pei_len
> >>>> CONTEXT_INFO_NUM ctx_len
> >>>> Section Length indirectly reported by
> >>>> pei_len, ctx_len and oem_len
> >>>> Error affinity level affinity
> >>>> MPIDR_EL1 mpidr
> >>>> MIDR_EL1 midr
> >>>> Running State running_state
> >>>> PSCI State psci_state
> >>>> Processor Error Information Structure pei_err - count at pei_len
> >>>> Processor Context ctx_err- count at ctx_len
> >>>> Vendor Specific Error Info oem - count at oem_len
> >>>> ====================================== =============================
> >>>>
> >>>> It should be noted that decoding of tables N.17 to N.29, if needed, will
> >>>> be handled in userspace. That gives more flexibility, as there won't be
> >>>> any need to flood the kernel with micro-architecture specific error
> >>>> decoding.
> >>>>
> >>>> Also, decoding the other fields require a complex logic, and should be
> >>>> done for each of the several values inside the record field. So, let
> >>>> userspace daemons like rasdaemon decode them, parsing such tables and
> >>>> having vendor-specific micro-architecture-specific decoders.
> >>>>
> >>>> [mchehab: modified description, solved merge conflicts and fixed coding style]
> >>>>
> >>>> Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event")
> >>>>
> >>>
> >>> Fixes tag is part of the main tag block so no blank line here.
> >>> There are at least some scripts running on the kernel tree that trip
> >>> up on this (and one that moans at the submitter ;)
> >>>
> >>> I'd also add something to explain the SoB sequence for the curious.
> >>>
> >>>> Co-developed-by: Jason Tian <jason@os.amperecomputing.com>
> >>>
> >>> Jason's the Author, so shouldn't have a Co-dev tag.
> >>> There is some info on this in
> >>> https://docs.kernel.org/process/submitting-patches.html
> >>
> >> My understanding is that all co-authors shall have co-developed-by
> >> and SoB. Anyway, doesn't matter much in practice, I guess.
> >
> > Nope. In the description the docs say "in addition to the author
> > attribute in the From: tag" There are also examples where there
> > isn't a Co-dev for the From author including the subtle question of
> > ordering if someone else posts that patch.
> >
> > I have a vague recollection one of the scripts checking linux-next
> > might moan about this.
> >
> >>
> >>>
> >>>> Signed-off-by: Jason Tian <jason@os.amperecomputing.com>
> >>>> Co-developed-by: Shengwei Luo <luoshengwei@huawei.com>
> >>>> Signed-off-by: Shengwei Luo <luoshengwei@huawei.com>
> >>>> Co-developed-by: Daniel Ferguson <danielf@os.amperecomputing.com>
> >>>> Signed-off-by: Daniel Ferguson <danielf@os.amperecomputing.com>
> >>>
> >>> As person submitting I'd normally expect your SoB last.
> >>>
> >>>> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> >>>
> >>> I guess this is because Mauro posted an earlier version in which
> >>> case this is arguably correct, but likely to confuse.
> >>> For cases like this I add comments.
> >>
> >> If the patch is identical, and it is just a resubmission,
> >> I would keep the original order.
> >>
> >> Otherwise, if Daniel did some changes at the code (except for a
> >> trivial rebase stuff), better to move his co-developed-by/SoB to
> >> the end, eventually adding:
> >>
> >> [Daniel: <did something>] before the custody chain block.
> >
> > Docs are clear that sender of the patch must be last SoB.
> > That's also checked by the some of the tag block check scripts
> > I think. There's an example of exactly this case in the in the
> > submitting-patches.rst file (last one in the section that talks
> > about Co-developed-by.
> >
> > For meaning I don't care that much, but keeping to the rigid
> > rules in that doc makes it easier for scripts to check for the
> > more important stuff like whether all necessary SoB are there.
> >
> > Jonathan
> >
>
> Just to make sure I get this right...
>
> I will move my SoB and my Co-developed-by after any other SoB's in the tag
> block. As a result, I do not need to provide an explanation in this case,
> because there is nothing curious remaining ? Or should I still add comments
> indicating why I rearranged the tag block ?
>
> Based on your feedback, I'm intending to organize the tag block to look like this:
>
> Signed-off-by: Jason Tian <jason@os.amperecomputing.com>
> Co-developed-by: Shengwei Luo <luoshengwei@huawei.com>
> Signed-off-by: Shengwei Luo <luoshengwei@huawei.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Co-developed-by: Daniel Ferguson <danielf@os.amperecomputing.com>
> Signed-off-by: Daniel Ferguson <danielf@os.amperecomputing.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Tested-by: Shiju Jose <shiju.jose@huawei.com>
> Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
> Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event")
> Link:
> https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
>
>
> Is that satisfactory?
The SoB from Mauro is still out of the normal pattern which
tends to happen only when things are bouncing between different submitters.
So I'd still add a comment and keep that after you Codev/SoB.
#1 [mchehab: modified description, solved merge conflicts and fixed coding style
in version 3.]
Signed-off-by: Jason Tian <jason@os.amperecomputing.com>
Co-developed-by: Shengwei Luo <luoshengwei@huawei.com>
Signed-off-by: Shengwei Luo <luoshengwei@huawei.com>
Co-developed-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> # 1
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
Or something along those lines
Jonathan
>
> Thank you,
> Daniel
>
> >>
> >> Thanks,
> >> Mauro
> >>
> >
>
next prev parent reply other threads:[~2025-08-12 14:06 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-05 18:35 [PATCH v4 0/5] Fix issues with ARM Processor CPER records Daniel Ferguson
2025-08-05 18:35 ` [PATCH v4 1/5] RAS: Report all ARM processor CPER information to userspace Daniel Ferguson
2025-08-08 15:22 ` Jonathan Cameron
2025-08-09 15:55 ` Mauro Carvalho Chehab
2025-08-11 10:52 ` Jonathan Cameron
2025-08-11 22:37 ` Daniel Ferguson
2025-08-12 14:05 ` Jonathan Cameron [this message]
2025-08-05 18:35 ` [PATCH v4 2/5] efi/cper: Adjust infopfx size to accept an extra space Daniel Ferguson
2025-08-05 18:35 ` [PATCH v4 3/5] efi/cper: Add a new helper function to print bitmasks Daniel Ferguson
2025-08-08 15:23 ` Jonathan Cameron
2025-08-05 18:35 ` [PATCH v4 4/5] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Daniel Ferguson
2025-08-05 18:35 ` [PATCH v4 5/5] docs: efi: add CPER functions to driver-api Daniel Ferguson
2025-08-05 18:39 ` [PATCH v4 0/5] Fix issues with ARM Processor CPER records Daniel Ferguson
2025-08-08 15:01 ` Jonathan Cameron
2025-08-08 19:03 ` Daniel Ferguson
2025-08-09 15:49 ` Mauro Carvalho Chehab
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250812150555.000067f7@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=ardb@kernel.org \
--cc=bp@alien8.de \
--cc=corbet@lwn.net \
--cc=danielf@os.amperecomputing.com \
--cc=james.morse@arm.com \
--cc=jason@os.amperecomputing.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-edac@vger.kernel.org \
--cc=linux-efi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luoshengwei@huawei.com \
--cc=mchehab+huawei@kernel.org \
--cc=rafael@kernel.org \
--cc=shiju.jose@huawei.com \
--cc=tony.luck@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).