From mboxrd@z Thu Jan 1 00:00:00 1970
From: bp@alien8.de (Borislav Petkov)
Date: Mon, 15 Oct 2018 18:49:13 +0200
Subject: [PATCH v6 17/18] mm/memory-failure: increase queued recovery work's priority
In-Reply-To: <20180921221705.6478-18-james.morse@arm.com>
References: <20180921221705.6478-1-james.morse@arm.com>
 <20180921221705.6478-18-james.morse@arm.com>
Message-ID: <20181015164913.GE11434@zn.tnic>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

+ Peter.

On Fri, Sep 21, 2018 at 11:17:04PM +0100, James Morse wrote:
> arm64 can take an NMI-like error notification when user-space steps in
> some corrupt memory. APEI's GHES code will call memory_failure_queue()
> to schedule the recovery work. We then return to user-space, possibly
> taking the fault again.
>
> Currently the arch code unconditionally signals user-space from this
> path, so we don't get stuck in this loop, but the affected process
> never benefits from memory_failure()'s recovery work. To fix this we
> need to know the recovery work will run before we get back to user-space.
>
> Increase the priority of the recovery work by scheduling it on the
> system_highpri_wq, then try to bump the current task off this CPU
> so that the recovery work starts immediately.
>
> Reported-by: Xie XiuQi
> Signed-off-by: James Morse
> Reviewed-by: Punit Agrawal
> Tested-by: Tyler Baicar
> Tested-by: gengdongjiu
> CC: Xie XiuQi
> CC: gengdongjiu
> ---
>  mm/memory-failure.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 0cd3de3550f0..4e7b115cea5a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -56,6 +56,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -1454,6 +1455,7 @@ static DEFINE_PER_CPU(struct memory_failure_cpu, memory_failure_cpu);
>   */
>  void memory_failure_queue(unsigned long pfn, int flags)
>  {
> +	int cpu = smp_processor_id();
>  	struct memory_failure_cpu *mf_cpu;
>  	unsigned long proc_flags;
>  	struct memory_failure_entry entry = {
> @@ -1463,11 +1465,14 @@ void memory_failure_queue(unsigned long pfn, int flags)
>
>  	mf_cpu = &get_cpu_var(memory_failure_cpu);
>  	spin_lock_irqsave(&mf_cpu->lock, proc_flags);
> -	if (kfifo_put(&mf_cpu->fifo, entry))
> -		schedule_work_on(smp_processor_id(), &mf_cpu->work);
> -	else
> +	if (kfifo_put(&mf_cpu->fifo, entry)) {
> +		queue_work_on(cpu, system_highpri_wq, &mf_cpu->work);
> +		set_tsk_need_resched(current);
> +		preempt_set_need_resched();

What guarantees the workqueue would run before the process?

I see this:

  ``WQ_HIGHPRI``
    Work items of a highpri wq are queued to the highpri
    worker-pool of the target cpu. Highpri worker-pools are
    served by worker threads with elevated nice level.

but is that enough?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.