* [PATCH v2] powerpc/mce: log the error for all unrecoverable errors
@ 2023-01-27 18:29 Ganesh Goudar
2023-01-31 11:29 ` Michael Ellerman
0 siblings, 1 reply; 3+ messages in thread
From: Ganesh Goudar @ 2023-01-27 18:29 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Ganesh Goudar, Mahesh Salgaonkar
Unrecoverable errors are currently not logged, because
machine_check_log_err() is not called for them.

Raise irq work in save_mce_event() for unrecoverable errors, so
that the error is logged from the MCE event handling block in the
timer handler.
Log without this change
MCE: CPU27: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
MCE: CPU27: PID: 10580 Comm: inject-ra-err NIP: [0000000010000df4]
MCE: CPU27: Initiator CPU
MCE: CPU27: Unknown
Log with this change
MCE: CPU24: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
MCE: CPU24: PID: 1589811 Comm: inject-ra-err NIP: [0000000010000e48]
MCE: CPU24: Initiator CPU
MCE: CPU24: Unknown
RTAS: event: 5, Type: Platform Error (224), Severity: 3
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
---
V2: Rephrasing the commit message.
---
arch/powerpc/kernel/mce.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 6c5d30fba766..a1cb2172eb7b 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -131,6 +131,13 @@ void save_mce_event(struct pt_regs *regs, long handled,
 	if (mce->error_type == MCE_ERROR_TYPE_UE)
 		mce->u.ue_error.ignore_event = mce_err->ignore_event;
 
+	/*
+	 * Raise irq work, So that we don't miss to log the error for
+	 * unrecoverable errors.
+	 */
+	if (mce->disposition == MCE_DISPOSITION_NOT_RECOVERED)
+		mce_irq_work_queue();
+
 	if (!addr)
 		return;
 
@@ -235,7 +242,6 @@ static void machine_check_ue_event(struct machine_check_event *evt)
 	       evt, sizeof(*evt));
 
 	/* Queue work to process this event later. */
-	mce_irq_work_queue();
 }
 
 /*
--
2.38.1
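
As a reading aid for the diff above, here is a minimal user-space sketch of where irq work is raised before and after the patch. It is not kernel code. The names model_disposition, model_event, reaches_ue_event, raises_irq_work_before() and raises_irq_work_after() are hypothetical and exist only for this illustration. The model covers only the two call sites touched by the diff; any other place in the kernel that raises irq work or logs the error is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's MCE disposition values. */
enum model_disposition {
	MODEL_RECOVERED,
	MODEL_NOT_RECOVERED,
};

/*
 * Hypothetical summary of one machine check event.
 * reaches_ue_event is an assumption of this sketch: it records whether
 * machine_check_ue_event() would run for the event.
 */
struct model_event {
	enum model_disposition disposition;
	bool reaches_ue_event;
};

/*
 * Before the patch: of the two call sites in the diff, only
 * machine_check_ue_event() raised irq work, regardless of disposition.
 */
bool raises_irq_work_before(const struct model_event *ev)
{
	return ev->reaches_ue_event;
}

/*
 * After the patch: only save_mce_event() raises irq work, and only
 * when the event was not recovered.
 */
bool raises_irq_work_after(const struct model_event *ev)
{
	return ev->disposition == MODEL_NOT_RECOVERED;
}
```

In this model, an unrecoverable event that never reaches machine_check_ue_event() raises irq work only after the patch, while a recovered UE raises it only before the patch.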
* Re: [PATCH v2] powerpc/mce: log the error for all unrecoverable errors
2023-01-27 18:29 [PATCH v2] powerpc/mce: log the error for all unrecoverable errors Ganesh Goudar
@ 2023-01-31 11:29 ` Michael Ellerman
2023-02-01 9:28 ` Ganesh G R
0 siblings, 1 reply; 3+ messages in thread
From: Michael Ellerman @ 2023-01-31 11:29 UTC (permalink / raw)
To: Ganesh Goudar, linuxppc-dev; +Cc: Ganesh Goudar, Mahesh Salgaonkar
Ganesh Goudar <ganeshgr@linux.ibm.com> writes:
> Unrecoverable errors are currently not logged, because
> machine_check_log_err() is not called for them.
>
> Raise irq work in save_mce_event() for unrecoverable errors, so
> that the error is logged from the MCE event handling block in the
> timer handler.
But the patch also removes the irq work raise from machine_check_ue_event().
That's currently done unconditionally, regardless of the disposition. So
doesn't this change also drop logging of recoverable UEs?
Maybe that's OK, but the change log should explain it.
> Log without this change
>
> MCE: CPU27: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
> MCE: CPU27: PID: 10580 Comm: inject-ra-err NIP: [0000000010000df4]
> MCE: CPU27: Initiator CPU
> MCE: CPU27: Unknown
>
> Log with this change
>
> MCE: CPU24: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
> MCE: CPU24: PID: 1589811 Comm: inject-ra-err NIP: [0000000010000e48]
> MCE: CPU24: Initiator CPU
> MCE: CPU24: Unknown
> RTAS: event: 5, Type: Platform Error (224), Severity: 3
>
> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> ---
> V2: Rephrasing the commit message.
> ---
> arch/powerpc/kernel/mce.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index 6c5d30fba766..a1cb2172eb7b 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -131,6 +131,13 @@ void save_mce_event(struct pt_regs *regs, long handled,
> if (mce->error_type == MCE_ERROR_TYPE_UE)
> mce->u.ue_error.ignore_event = mce_err->ignore_event;
>
> + /*
> + * Raise irq work, So that we don't miss to log the error for
> + * unrecoverable errors.
> + */
> + if (mce->disposition == MCE_DISPOSITION_NOT_RECOVERED)
> + mce_irq_work_queue();
> +
> if (!addr)
> return;
>
> @@ -235,7 +242,6 @@ static void machine_check_ue_event(struct machine_check_event *evt)
> evt, sizeof(*evt));
>
> /* Queue work to process this event later. */
This comment is meaningless without the function call it's commenting
about, ie. the comment should be removed too.
> - mce_irq_work_queue();
> }
>
cheers
* Re: [PATCH v2] powerpc/mce: log the error for all unrecoverable errors
2023-01-31 11:29 ` Michael Ellerman
@ 2023-02-01 9:28 ` Ganesh G R
0 siblings, 0 replies; 3+ messages in thread
From: Ganesh G R @ 2023-02-01 9:28 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev; +Cc: Mahesh Salgaonkar
On 1/31/23 4:59 PM, Michael Ellerman wrote:
> Ganesh Goudar <ganeshgr@linux.ibm.com> writes:
>> Unrecoverable errors are currently not logged, because
>> machine_check_log_err() is not called for them.
>>
>> Raise irq work in save_mce_event() for unrecoverable errors, so
>> that the error is logged from the MCE event handling block in the
>> timer handler.
> But the patch also removes the irq work raise from machine_check_ue_event().
>
> That's currently done unconditionally, regardless of the disposition. So
> doesn't this change also drop logging of recoverable UEs?
>
> Maybe that's OK, but the change log should explain it.
Yes, it's OK. The exception vector code will do that for recoverable errors. I'll explain
this in the commit message.
>
>> Log without this change
>>
>> MCE: CPU27: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
>> MCE: CPU27: PID: 10580 Comm: inject-ra-err NIP: [0000000010000df4]
>> MCE: CPU27: Initiator CPU
>> MCE: CPU27: Unknown
>>
>> Log with this change
>>
>> MCE: CPU24: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
>> MCE: CPU24: PID: 1589811 Comm: inject-ra-err NIP: [0000000010000e48]
>> MCE: CPU24: Initiator CPU
>> MCE: CPU24: Unknown
>> RTAS: event: 5, Type: Platform Error (224), Severity: 3
>>
>> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
>> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> ---
>> V2: Rephrasing the commit message.
>> ---
>> arch/powerpc/kernel/mce.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
>> index 6c5d30fba766..a1cb2172eb7b 100644
>> --- a/arch/powerpc/kernel/mce.c
>> +++ b/arch/powerpc/kernel/mce.c
>> @@ -131,6 +131,13 @@ void save_mce_event(struct pt_regs *regs, long handled,
>> if (mce->error_type == MCE_ERROR_TYPE_UE)
>> mce->u.ue_error.ignore_event = mce_err->ignore_event;
>>
>> + /*
>> + * Raise irq work, So that we don't miss to log the error for
>> + * unrecoverable errors.
>> + */
>> + if (mce->disposition == MCE_DISPOSITION_NOT_RECOVERED)
>> + mce_irq_work_queue();
>> +
>> if (!addr)
>> return;
>>
>> @@ -235,7 +242,6 @@ static void machine_check_ue_event(struct machine_check_event *evt)
>> evt, sizeof(*evt));
>>
>> /* Queue work to process this event later. */
> This comment is meaningless without the function call it's commenting
> about, ie. the comment should be removed too.
OK.
Thanks.