From: James Morse <james.morse@arm.com>
To: Xie XiuQi <xiexiuqi@huawei.com>
Cc: catalin.marinas@arm.com, will.deacon@arm.com, mingo@redhat.com,
mark.rutland@arm.com, ard.biesheuvel@linaro.org,
Dave.Martin@arm.com, takahiro.akashi@linaro.org,
tbaicar@codeaurora.org, stephen.boyd@linaro.org, bp@suse.de,
julien.thierry@arm.com, shiju.jose@huawei.com,
zjzhang@codeaurora.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
wangxiongfeng2@huawei.com, zhengqiang10@huawei.com,
gengdongjiu@huawei.com, huawei.libin@huawei.com,
wangkefeng.wang@huawei.com, lijinyue@huawei.com,
guohanjun@huawei.com, hanjun.guo@linaro.org,
cj.chengjian@huawei.com
Subject: Re: [PATCH v5 1/3] arm64/ras: support sea error recovery
Date: Thu, 15 Feb 2018 17:56:23 +0000
Message-ID: <5A85C9C7.9060701@arm.com>
In-Reply-To: <7dacf375-4645-ba34-62d1-96d9f67dbcc2@huawei.com>
Hi Xie XiuQi,
On 08/02/18 08:35, Xie XiuQi wrote:
> I am very glad that you are trying to solve the problem, which is very helpful.
> I agree with your proposal, and I'll test it on my box later.
>
> Indeed, we're in process context when we are in the SEA handler. I had thought we
> couldn't call schedule() in the exception handler.
While testing this I've come to the conclusion that the
memory_failure_queue_kick() approach I suggested makes arm64 behave slightly
differently with APEI, and would need re-inventing if we support kernel-first
too. The same race exists with memory-failure notifications signalled by SDEI,
and to a lesser extent IRQ. So by fixing this in arch-code, we are actually
making our lives harder.
Instead, I have the patch below. This is smaller, and not arch specific. It also
saves the arch code secretly knowing that APEI calls memory_failure_queue().
I will post this as part of that series shortly...
Thanks,
James
---------------%<---------------
[PATCH] mm/memory-failure: increase queued recovery work's priority
arm64 can take an NMI-like error notification when user-space steps in
some corrupt memory. APEI's GHES code will call memory_failure_queue()
to schedule the recovery work. We then return to user-space, possibly
taking the fault again.
Currently the arch code unconditionally signals user-space from this
path, so we don't get stuck in this loop, but the affected process
never benefits from memory_failure()'s recovery work. To fix this we
need to know the recovery work will run before we get back to user-space.
Increase the priority of the recovery work by scheduling it on the
system_highpri_wq, then try to bump the current task off this CPU
so that the recovery work starts immediately.
Reported-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
CC: Xie XiuQi <xiexiuqi@huawei.com>
CC: gengdongjiu <gengdongjiu@huawei.com>
---
mm/memory-failure.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 4b80ccee4535..14f44d841e8b 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -55,6 +55,7 @@
 #include <linux/hugetlb.h>
 #include <linux/memory_hotplug.h>
 #include <linux/mm_inline.h>
+#include <linux/preempt.h>
 #include <linux/kfifo.h>
 #include <linux/ratelimit.h>
 #include "internal.h"
@@ -1319,6 +1320,7 @@ static DEFINE_PER_CPU(struct memory_failure_cpu, memory_failure_cpu);
  */
 void memory_failure_queue(unsigned long pfn, int flags)
 {
+	int cpu = smp_processor_id();
 	struct memory_failure_cpu *mf_cpu;
 	unsigned long proc_flags;
 	struct memory_failure_entry entry = {
@@ -1328,11 +1330,14 @@ void memory_failure_queue(unsigned long pfn, int flags)

 	mf_cpu = &get_cpu_var(memory_failure_cpu);
 	spin_lock_irqsave(&mf_cpu->lock, proc_flags);
-	if (kfifo_put(&mf_cpu->fifo, entry))
-		schedule_work_on(smp_processor_id(), &mf_cpu->work);
-	else
+	if (kfifo_put(&mf_cpu->fifo, entry)) {
+		queue_work_on(cpu, system_highpri_wq, &mf_cpu->work);
+		set_tsk_need_resched(current);
+		preempt_set_need_resched();
+	} else {
 		pr_err("Memory failure: buffer overflow when queuing memory failure at %#lx\n",
 		       pfn);
+	}
 	spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
 	put_cpu_var(memory_failure_cpu);
 }
---------------%<---------------