From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC PATCH V3 3/6] arm: mm: implement get_user_pages_fast
Date: Wed, 12 Mar 2014 16:55:11 +0000
Message-ID: <20140312165510.GA9950@mudshark.cambridge.arm.com>
In-Reply-To: <20140312163200.GJ27965@twins.programming.kicks-ass.net>

On Wed, Mar 12, 2014 at 04:32:00PM +0000, Peter Zijlstra wrote:
> On Wed, Mar 12, 2014 at 01:40:20PM +0000, Steve Capper wrote:
> > +void pmdp_splitting_flush(struct vm_area_struct *vma, unsigned long address,
> > +			  pmd_t *pmdp)
> > +{
> > +	pmd_t pmd = pmd_mksplitting(*pmdp);
> > +	VM_BUG_ON(address & ~PMD_MASK);
> > +	set_pmd_at(vma->vm_mm, address, pmdp, pmd);
> > +
> > +	/* dummy IPI to serialise against fast_gup */
> > +	smp_call_function(thp_splitting_flush_sync, NULL, 1);
> > +}
>
> do you really need to IPI the entire machine? Wouldn't the mm's TLB
> invalidate mask be sufficient?

Are you thinking of using mm_cpumask(vma->vm_mm)? That's rarely cleared on
ARM, so it tends to identify everywhere the task has ever run, regardless of
TLB state. The reason is that the mask is also used for cache flushing
(which is further overloaded for VIVT and VIPT w/ software maintenance
broadcast).
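
To illustrate why a rarely cleared mask is a poor IPI filter, here is a
minimal userspace sketch. It is not kernel code: struct sketch_mm,
sketch_switch_in() and sketch_ipi_targets() are hypothetical names, and a
plain 32-bit word stands in for the real cpumask. It only assumes the
behaviour described above, namely that bits get set when the task runs on a
CPU and are not cleared afterwards.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8u	/* hypothetical CPU count for the sketch */

/* Hypothetical stand-in for an mm: one bit per CPU the task has run on. */
struct sketch_mm {
	uint32_t cpumask;
};

/* Models a context switch onto 'cpu': the bit is set and never cleared. */
void sketch_switch_in(struct sketch_mm *mm, unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	mm->cpumask |= UINT32_C(1) << cpu;
}

/* Counts the CPUs that an IPI aimed at this mask would interrupt. */
unsigned int sketch_ipi_targets(const struct sketch_mm *mm)
{
	unsigned int n = 0;
	uint32_t m;

	/* Clear the lowest set bit on each iteration. */
	for (m = mm->cpumask; m; m &= m - 1)
		n++;
	return n;
}
```

In this model a task that has migrated across every CPU once ends up with a
full mask, so targeting the mask saves nothing over a machine-wide IPI.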

I had a patch improving this a bit (below), but I didn't see any significant
improvement, so I didn't pursue it further. What we probably want to try is
nuking the mask on a h/w broadcast TLBI operation with ARMv7, but that will
mean adding horrible checks to tlbflush.h.

Will

--->8
commit fd24d6170839b200cc2916c83847ca46e889f1ca
Author: Will Deacon <will.deacon@arm.com>
Date:   Thu Jul 25 16:38:34 2013 +0100

    ARM: mm: use mm_cpumask to keep track of dirty TLBs on v7

    Signed-off-by: Will Deacon <will.deacon@arm.com>

diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index def9e570199f..f2a1cb7edfca 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -202,6 +202,7 @@
 #ifndef __ASSEMBLY__
 
 #include <linux/sched.h>
+#include <asm/smp_plat.h>
 
 struct cpu_tlb_fns {
 	void (*flush_user_range)(unsigned long, unsigned long, struct vm_area_struct *);
@@ -401,6 +402,17 @@ static inline void __flush_tlb_mm(struct mm_struct *mm)
 {
 	const unsigned int __tlb_flag = __cpu_tlb_flags;
 
+	if (!cache_ops_need_broadcast()) {
+		int cpu = get_cpu();
+		if (cpumask_equal(mm_cpumask(mm), cpumask_of(cpu))) {
+			cpumask_clear_cpu(cpu, mm_cpumask(mm));
+			local_flush_tlb_mm(mm);
+			put_cpu();
+			return;
+		}
+		put_cpu();
+	}
+
 	if (tlb_flag(TLB_WB))
 		dsb(ishst);
 
@@ -459,6 +471,17 @@ __flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr)
 {
 	const unsigned int __tlb_flag = __cpu_tlb_flags;
 
+	if (!cache_ops_need_broadcast()) {
+		int cpu = get_cpu();
+		if (cpumask_equal(mm_cpumask(vma->vm_mm), cpumask_of(cpu))) {
+			cpumask_clear_cpu(cpu, mm_cpumask(vma->vm_mm));
+			local_flush_tlb_page(vma, uaddr);
+			put_cpu();
+			return;
+		}
+		put_cpu();
+	}
+
 	uaddr = (uaddr & PAGE_MASK) | ASID(vma->vm_mm);
 
 	if (tlb_flag(TLB_WB))
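
Both hunks add the same decision in front of the existing flush. A minimal
userspace sketch of that decision follows; it is not kernel code. The enum,
sketch_flush_mm() and the 32-bit word used as the mask are hypothetical, and
the cache_ops_need_broadcast argument is a plain boolean standing in for the
kernel helper of the same name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Which path the sketch takes; both names are hypothetical. */
enum sketch_flush {
	SKETCH_FLUSH_LOCAL,		/* local flush only, then return early */
	SKETCH_FLUSH_FALLTHROUGH	/* continue into the existing flush code */
};

/*
 * Models the check added by the patch: when cache ops do not need software
 * broadcast and the mask contains exactly the current CPU, clear that bit
 * and flush locally. In every other case the mask is left untouched and the
 * caller falls through to the existing path.
 */
enum sketch_flush sketch_flush_mm(uint32_t *mm_cpumask, unsigned int cpu,
				  bool cache_ops_need_broadcast)
{
	uint32_t self;

	assert(cpu < 32);
	self = UINT32_C(1) << cpu;

	if (!cache_ops_need_broadcast && *mm_cpumask == self) {
		*mm_cpumask &= ~self;	/* mask no longer names this CPU */
		return SKETCH_FLUSH_LOCAL;
	}
	return SKETCH_FLUSH_FALLTHROUGH;
}
```

The sketch leaves out get_cpu()/put_cpu(); in the patch they bracket the
check so the task cannot migrate between reading the CPU number and testing
the mask.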