From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>,
    Peter Zijlstra <peterz@infradead.org>,
    Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] s390/mm,tlb: race of lazy TLB flush vs. recreation of TLB entries
Date: Thu, 14 Nov 2013 09:10:07 +0100
Message-ID: <20131114091007.0b15dde2@mschwide>
In-Reply-To: <CAHkRjk6LAmQpkioBdFxH5htYHi75bR2xhfXzfvmQL5Br0kMgvg@mail.gmail.com>
On Wed, 13 Nov 2013 16:16:35 +0000
Catalin Marinas <catalin.marinas@arm.com> wrote:
> On 13 November 2013 08:16, Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
> > index 5d1f950..e91afeb 100644
> > --- a/arch/s390/include/asm/mmu_context.h
> > +++ b/arch/s390/include/asm/mmu_context.h
> > @@ -48,13 +48,38 @@ static inline void update_mm(struct mm_struct *mm, struct task_struct *tsk)
> >  static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> >  			     struct task_struct *tsk)
> >  {
> > -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
> > -	update_mm(next, tsk);
> > +	int cpu = smp_processor_id();
> > +
> > +	if (prev == next)
> > +		return;
> > +	if (atomic_inc_return(&next->context.attach_count) >> 16) {
> > +		/* Delay update_mm until all TLB flushes are done. */
> > +		set_tsk_thread_flag(tsk, TIF_TLB_WAIT);
> > +	} else {
> > +		cpumask_set_cpu(cpu, mm_cpumask(next));
> > +		update_mm(next, tsk);
> > +		if (next->context.flush_mm)
> > +			/* Flush pending TLBs */
> > +			__tlb_flush_mm(next);
> > +	}
> >  	atomic_dec(&prev->context.attach_count);
> >  	WARN_ON(atomic_read(&prev->context.attach_count) < 0);
> > -	atomic_inc(&next->context.attach_count);
> > -	/* Check for TLBs not flushed yet */
> > -	__tlb_flush_mm_lazy(next);
> > +}
> > +
> > +#define finish_switch_mm finish_switch_mm
> > +static inline void finish_switch_mm(struct mm_struct *mm,
> > +				    struct task_struct *tsk)
> > +{
> > +	if (!test_and_clear_tsk_thread_flag(tsk, TIF_TLB_WAIT))
> > +		return;
> > +
> > +	while (atomic_read(&mm->context.attach_count) >> 16)
> > +		cpu_relax();
> > +
> > +	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
> > +	update_mm(mm, tsk);
> > +	if (mm->context.flush_mm)
> > +		__tlb_flush_mm(mm);
> >  }
>
> Some care is needed here with preemption (we had this on arm and I
> think we need a fix on arm64 as well). Basically you set TIF_TLB_WAIT
> on a thread but you get preempted just before finish_switch_mm(). The
> new thread has the same mm as the preempted one and switch_mm() exits
> early without setting another flag. So finish_switch_mm() wouldn't do
> anything but you still switched to the new mm. The fix is to make the
> flag per mm rather than thread (see commit bdae73cd374e).

Interesting. For s390 I need to make sure that each task attaching an
mm waits for the completion of concurrent TLB flush operations. If the
scheduler does not switch the mm, I don't care; the mm is still attached.
For the s390 issue a TIF bit seems appropriate. But I have to add a
preempt_disable/preempt_enable pair to finish_switch_mm, otherwise the
task can get hit by preemption after the while loop.
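
To make the waiting scheme concrete, here is a small userspace model,
not kernel code and untested against the real patch. It assumes that
the low 16 bits of attach_count count attached CPUs and the bits above
count TLB flushes in progress; that layout is only implied by the
">> 16" tests in the patch. All names (mm_model, flush_begin,
model_preempt_disable, ...) are made up for illustration, and the
placement of the preempt pair is one possible choice.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* ASSUMPTION: bits 0-15 = attached CPUs, bits 16+ = flushes in progress. */
#define FLUSH_UNIT 0x10000

struct mm_model {
	atomic_int attach_count;
};

/* Stand-ins for the kernel's preempt_disable()/preempt_enable(). */
static int preempt_depth;
static void model_preempt_disable(void) { preempt_depth++; }
static void model_preempt_enable(void)  { preempt_depth--; }

/* A flusher announces itself in the upper bits (hypothetical helpers). */
static void flush_begin(struct mm_model *mm)
{
	atomic_fetch_add(&mm->attach_count, FLUSH_UNIT);
}

static void flush_end(struct mm_model *mm)
{
	atomic_fetch_sub(&mm->attach_count, FLUSH_UNIT);
}

static bool flush_pending(struct mm_model *mm)
{
	return (atomic_load(&mm->attach_count) >> 16) != 0;
}

/*
 * Models the attach step of switch_mm(): returns true if the caller has
 * to defer update_mm() (the kernel patch sets TIF_TLB_WAIT in that case).
 */
static bool mm_attach(struct mm_model *mm)
{
	/* fetch_add returns the old value; +1 mimics atomic_inc_return(). */
	int val = atomic_fetch_add(&mm->attach_count, 1) + 1;

	return (val >> 16) != 0;
}

/*
 * Models finish_switch_mm() with the preempt pair around both the wait
 * loop and the update, so the task cannot be preempted between the two.
 * Returns true if the deferred update was performed.
 */
static bool finish_switch_model(struct mm_model *mm, bool *tlb_wait)
{
	if (!*tlb_wait)
		return false;
	*tlb_wait = false;

	model_preempt_disable();
	while (flush_pending(mm))
		;	/* cpu_relax() in the kernel */
	/* kernel: cpumask_set_cpu(), update_mm(), __tlb_flush_mm() if needed */
	model_preempt_enable();
	return true;
}
```

In this model an attach that races with flush_begin() sees a nonzero
upper half and defers; once flush_end() has run, the deferred update
goes through with preemption held off for the whole sequence.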
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.