From: Ingo Molnar <mingo@kernel.org>
To: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Borislav Petkov <bp@alien8.de>, Andy Lutomirski <luto@kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
"x86@kernel.org" <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Dave Hansen <dave.hansen@intel.com>
Subject: Re: [PATCH v2] x86/mm/tlb: Remove flush_tlb_info from the stack
Date: Thu, 25 Apr 2019 21:48:37 +0200 [thread overview]
Message-ID: <20190425194837.GB58719@gmail.com> (raw)
In-Reply-To: <82E44F4A-E52D-4666-95B5-C6248A14A442@vmware.com>
* Nadav Amit <namit@vmware.com> wrote:
> > On Apr 25, 2019, at 12:29 PM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Nadav Amit <namit@vmware.com> wrote:
> >
> >> Move flush_tlb_info variables off the stack. This allows to align
> >> flush_tlb_info to cache-line and avoid potentially unnecessary cache
> >> line movements. It also allows to have a fixed virtual-to-physical
> >> translation of the variables, which reduces TLB misses.
> >>
> >> Use per-CPU struct for flush_tlb_mm_range() and
> >> flush_tlb_kernel_range(). Add debug assertions to ensure there are
> >> no nested TLB flushes that might overwrite the per-CPU data. For
> >> arch_tlbbatch_flush() use a const struct.
> >>
> >> Results when running a microbenchmarks that performs 10^6 MADV_DONTEED
> >> operations and touching a page, in which 3 additional threads run a
> >> busy-wait loop (5 runs, PTI and retpolines are turned off):
> >>
> >> base off-stack
> >> ---- ---------
> >> avg (usec/op) 1.629 1.570 (-3%)
> >> stddev 0.014 0.009
> >>
> >> Cc: Peter Zijlstra <peterz@infradead.org>
> >> Cc: Andy Lutomirski <luto@kernel.org>
> >> Cc: Dave Hansen <dave.hansen@intel.com>
> >> Cc: Borislav Petkov <bp@alien8.de>
> >> Cc: Thomas Gleixner <tglx@linutronix.de>
> >> Signed-off-by: Nadav Amit <namit@vmware.com>
> >>
> >> ---
> >>
> >> v1->v2:
> >> - Initialize all flush_tlb_info fields [Andy]
> >> ---
> >> arch/x86/mm/tlb.c | 100 ++++++++++++++++++++++++++++++++++------------
> >> 1 file changed, 74 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> >> index 487b8474c01c..aac191eb2b90 100644
> >> --- a/arch/x86/mm/tlb.c
> >> +++ b/arch/x86/mm/tlb.c
> >> @@ -634,7 +634,7 @@ static void flush_tlb_func_common(const struct flush_tlb_info *f,
> >> this_cpu_write(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen, mm_tlb_gen);
> >> }
> >>
> >> -static void flush_tlb_func_local(void *info, enum tlb_flush_reason reason)
> >> +static void flush_tlb_func_local(const void *info, enum tlb_flush_reason reason)
> >> {
> >> const struct flush_tlb_info *f = info;
> >>
> >> @@ -722,43 +722,81 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
> >> */
> >> unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
> >>
> >> +static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
> >> +
> >> +#ifdef CONFIG_DEBUG_VM
> >> +static DEFINE_PER_CPU(unsigned int, flush_tlb_info_idx);
> >> +#endif
> >> +
> >> +static inline struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm,
> >> + unsigned long start, unsigned long end,
> >> + unsigned int stride_shift, bool freed_tables,
> >> + u64 new_tlb_gen)
> >> +{
> >> + struct flush_tlb_info *info = this_cpu_ptr(&flush_tlb_info);
> >> +
> >> +#ifdef CONFIG_DEBUG_VM
> >> + /*
> >> + * Ensure that the following code is non-reentrant and flush_tlb_info
> >> + * is not overwritten. This means no TLB flushing is initiated by
> >> + * interrupt handlers and machine-check exception handlers.
> >> + */
> >> + BUG_ON(this_cpu_inc_return(flush_tlb_info_idx) != 1);
> >> +#endif
> >
> > isn't this effectively VM_BUG_ON()?
>
> Not exactly. When CONFIG_DEBUG_VM is off we get
>
> #define VM_BUG_ON(cond) BUILD_BUG_ON_INVALID(cond)
>
> This will cause the build to fail since flush_tlb_info_idx is not defined in
> when CONFIG_DEBUG_VM is off.
Ugh, so VM_BUG_ON() should really be named VM_BUILD_BUG_ON()?
Anyway, agreed.
> >> +static inline void put_flush_tlb_info(void)
> >> +{
> >> +#ifdef CONFIG_DEBUG_VM
> >> + /* Complete reentrency prevention checks */
> >> + barrier();
> >> + this_cpu_dec(flush_tlb_info_idx);
> >> +#endif
> >
> > In principle this_cpu_dec() should imply a compiler barrier?
>
> this_cpu_dec() is eventually expanded to the macro of percpu_add_op(). And
> the inline assembly does not have a “memory” clobber, so I don’t think so.
I think that's a bug and PeterZ is fixing those.
Thanks,
Ingo
next prev parent reply other threads:[~2019-04-25 19:48 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-25 18:08 [PATCH v2] x86/mm/tlb: Remove flush_tlb_info from the stack Nadav Amit
2019-04-25 18:50 ` Andy Lutomirski
2019-04-25 19:13 ` Nadav Amit
2019-04-25 19:29 ` Ingo Molnar
2019-04-25 19:42 ` Nadav Amit
2019-04-25 19:48 ` Ingo Molnar [this message]
2019-04-25 21:20 ` Nadav Amit
2019-04-26 7:53 ` Peter Zijlstra
2019-04-26 8:37 ` Nadav Amit
2019-04-26 11:46 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190425194837.GB58719@gmail.com \
--to=mingo@kernel.org \
--cc=bp@alien8.de \
--cc=dave.hansen@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mingo@redhat.com \
--cc=namit@vmware.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.