From: Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>
To: Laurent Dufour <ldufour@linux.vnet.ibm.com>,
linuxppc-dev <linuxppc-dev@ozlabs.org>
Cc: stable@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Subject: Re: [PATCH v4 1/6] powerpc/pseries: Defer the logging of rtas error to irq work queue.
Date: Fri, 29 Jun 2018 10:11:04 +0530 [thread overview]
Message-ID: <e471e695-d712-0ee5-0103-80220374152f@linux.vnet.ibm.com> (raw)
In-Reply-To: <b26f14a3-9f08-798d-1bf9-eb7530f040c0@linux.vnet.ibm.com>
On 06/28/2018 06:49 PM, Laurent Dufour wrote:
> On 28/06/2018 13:10, Mahesh J Salgaonkar wrote:
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> rtas_log_buf is a buffer to hold RTAS event data that are communicated
>> to kernel by hypervisor. This buffer is then used to pass RTAS event
>> data to user through proc fs. This buffer is allocated from vmalloc
>> (non-linear mapping) area.
>>
>> On Machine check interrupt, register r3 points to RTAS extended event
>> log passed by hypervisor that contains the MCE event. The pseries
>> machine check handler then logs this error into rtas_log_buf. The
>> rtas_log_buf is a vmalloc-ed (non-linear) buffer we end up taking up a
>> page fault (vector 0x300) while accessing it. Since machine check
>> interrupt handler runs in NMI context we can not afford to take any
>> page fault. Page faults are not honored in NMI context and causes
>> kernel panic. Apart from that, as Nick pointed out, pSeries_log_error()
>> also takes a spin_lock while logging error which is not safe in NMI
>> context. It may endup in deadlock if we get another MCE before releasing
>> the lock. Fix this by deferring the logging of rtas error to irq work queue.
>>
>> Current implementation uses two different buffers to hold rtas error log
>> depending on whether extended log is provided or not. This makes bit
>> difficult to identify which buffer has valid data that needs to logged
>> later in irq work. Simplify this using single buffer, one per paca, and
>> copy rtas log to it irrespective of whether extended log is provided or
>> not. Allocate this buffer below RMA region so that it can be accessed
>> in real mode mce handler.
>>
>> Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/include/asm/paca.h | 3 ++
>> arch/powerpc/platforms/pseries/ras.c | 39 +++++++++++++++++++++-----------
>> arch/powerpc/platforms/pseries/setup.c | 16 +++++++++++++
>> 3 files changed, 45 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
>> index 3f109a3e3edb..b441fef53077 100644
>> --- a/arch/powerpc/include/asm/paca.h
>> +++ b/arch/powerpc/include/asm/paca.h
>> @@ -251,6 +251,9 @@ struct paca_struct {
>> void *rfi_flush_fallback_area;
>> u64 l1d_flush_size;
>> #endif
>> +#ifdef CONFIG_PPC_PSERIES
>> + u8 *mce_data_buf; /* buffer to hold per cpu rtas errlog */
>> +#endif /* CONFIG_PPC_PSERIES */
>> } ____cacheline_aligned;
>>
>> extern void copy_mm_to_paca(struct mm_struct *mm);
>> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
>> index 5e1ef9150182..f6ba9a2a4f84 100644
>> --- a/arch/powerpc/platforms/pseries/ras.c
>> +++ b/arch/powerpc/platforms/pseries/ras.c
>> @@ -22,6 +22,7 @@
>> #include <linux/of.h>
>> #include <linux/fs.h>
>> #include <linux/reboot.h>
>> +#include <linux/irq_work.h>
>>
>> #include <asm/machdep.h>
>> #include <asm/rtas.h>
>> @@ -32,11 +33,13 @@
>> static unsigned char ras_log_buf[RTAS_ERROR_LOG_MAX];
>> static DEFINE_SPINLOCK(ras_log_buf_lock);
>>
>> -static char global_mce_data_buf[RTAS_ERROR_LOG_MAX];
>> -static DEFINE_PER_CPU(__u64, mce_data_buf);
>> -
>> static int ras_check_exception_token;
>>
>> +static void mce_process_errlog_event(struct irq_work *work);
>> +static struct irq_work mce_errlog_process_work = {
>> + .func = mce_process_errlog_event,
>> +};
>> +
>> #define EPOW_SENSOR_TOKEN 9
>> #define EPOW_SENSOR_INDEX 0
>>
>> @@ -336,10 +339,9 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id)
>> * the actual r3 if possible, and a ptr to the error log entry
>> * will be returned if found.
>> *
>> - * If the RTAS error is not of the extended type, then we put it in a per
>> - * cpu 64bit buffer. If it is the extended type we use global_mce_data_buf.
>> + * Use one buffer mce_data_buf per cpu to store RTAS error.
>> *
>> - * The global_mce_data_buf does not have any locks or protection around it,
>> + * The mce_data_buf does not have any locks or protection around it,
>> * if a second machine check comes in, or a system reset is done
>> * before we have logged the error, then we will get corruption in the
>> * error log. This is preferable over holding off on calling
>> @@ -362,20 +364,19 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
>> savep = __va(regs->gpr[3]);
>> regs->gpr[3] = savep[0]; /* restore original r3 */
>>
>> - /* If it isn't an extended log we can use the per cpu 64bit buffer */
>> h = (struct rtas_error_log *)&savep[1];
>> + /* Use the per cpu buffer from paca to store rtas error log */
>> + memset(local_paca->mce_data_buf, 0, RTAS_ERROR_LOG_MAX);
>> if (!rtas_error_extended(h)) {
>> - memcpy(this_cpu_ptr(&mce_data_buf), h, sizeof(__u64));
>> - errhdr = (struct rtas_error_log *)this_cpu_ptr(&mce_data_buf);
>> + memcpy(local_paca->mce_data_buf, h, sizeof(__u64));
>> } else {
>> int len, error_log_length;
>>
>> error_log_length = 8 + rtas_error_extended_log_length(h);
>> len = max_t(int, error_log_length, RTAS_ERROR_LOG_MAX);
>> - memset(global_mce_data_buf, 0, RTAS_ERROR_LOG_MAX);
>> - memcpy(global_mce_data_buf, h, len);
>> - errhdr = (struct rtas_error_log *)global_mce_data_buf;
>> + memcpy(local_paca->mce_data_buf, h, len);
>> }
>> + errhdr = (struct rtas_error_log *)local_paca->mce_data_buf;
>
> You should drop errhdr in this function and simply do
> return (struct rtas_error_log *)local_paca->mce_data_buf;
sure, will do that in next revision.
Thanks,
-Mahesh.
next prev parent reply other threads:[~2018-06-29 4:46 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-28 11:10 [PATCH v4 0/6] powerpc/pseries: Machien check handler improvements Mahesh J Salgaonkar
2018-06-28 11:10 ` [PATCH v4 1/6] powerpc/pseries: Defer the logging of rtas error to irq work queue Mahesh J Salgaonkar
2018-06-28 13:19 ` Laurent Dufour
2018-06-29 4:41 ` Mahesh Jagannath Salgaonkar [this message]
2018-06-28 21:05 ` kbuild test robot
2018-06-29 4:49 ` Mahesh Jagannath Salgaonkar
2018-06-28 11:11 ` [PATCH v4 2/6] powerpc/pseries: Fix endainness while restoring of r3 in MCE handler Mahesh J Salgaonkar
2018-06-28 11:11 ` [PATCH v4 3/6] powerpc/pseries: Define MCE error event section Mahesh J Salgaonkar
2018-06-28 11:11 ` [PATCH v4 4/6] powerpc/pseries: flush SLB contents on SLB MCE errors Mahesh J Salgaonkar
2018-06-28 11:13 ` [PATCH v4 5/6] powerpc/pseries: Display machine check error details Mahesh J Salgaonkar
2018-06-28 11:13 ` [PATCH v4 6/6] powerpc/pseries: Dump the SLB contents on SLB MCE errors Mahesh J Salgaonkar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e471e695-d712-0ee5-0103-80220374152f@linux.vnet.ibm.com \
--to=mahesh@linux.vnet.ibm.com \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=ldufour@linux.vnet.ibm.com \
--cc=linuxppc-dev@ozlabs.org \
--cc=npiggin@gmail.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).