Date: Wed, 27 Aug 2014 09:54:42 +0100
From: Will Deacon
Subject: Re: [PATH V2 1/6] mm: Introduce a general RCU get_user_pages_fast.
Message-ID: <20140827085442.GD16376@arm.com>
References: <1408635812-31584-1-git-send-email-steve.capper@linaro.org>
 <1408635812-31584-2-git-send-email-steve.capper@linaro.org>
In-Reply-To: <1408635812-31584-2-git-send-email-steve.capper@linaro.org>
Sender: owner-linux-mm@kvack.org
To: Steve Capper
Cc: "linux-arm-kernel@lists.infradead.org", Catalin Marinas,
 "linux@arm.linux.org.uk", "linux-arch@vger.kernel.org",
 "linux-mm@kvack.org", "gary.robertson@linaro.org",
 "christoffer.dall@linaro.org", "peterz@infradead.org",
 "anders.roxell@linaro.org", "akpm@linux-foundation.org",
 "dann.frazier@canonical.com", Mark Rutland, "mgorman@suse.de"

Hi Steve,

A few minor comments (took me a while to understand how this works, so I
thought I'd make some noise :)

On Thu, Aug 21, 2014 at 04:43:27PM +0100, Steve Capper wrote:
> get_user_pages_fast attempts to pin user pages by walking the page
> tables directly and avoids taking locks. Thus the walker needs to be
> protected from page table pages being freed from under it, and needs
> to block any THP splits.
>
> One way to achieve this is to have the walker disable interrupts, and
> rely on IPIs from the TLB flushing code blocking before the page table
> pages are freed.
>
> On some platforms we have hardware broadcast of TLB invalidations, thus
> the TLB flushing code doesn't necessarily need to broadcast IPIs; and
> spuriously broadcasting IPIs can hurt system performance if done too
> often.
>
> This problem has been solved on PowerPC and Sparc by batching up page
> table pages belonging to more than one mm_user, then scheduling an
> rcu_sched callback to free the pages. This RCU page table free logic
> has been promoted to core code and is activated when one enables
> HAVE_RCU_TABLE_FREE. Unfortunately, these architectures implement
> their own get_user_pages_fast routines.
>
> The RCU page table free logic coupled with a an IPI broadcast on THP
> split (which is a rare event), allows one to protect a page table
> walker by merely disabling the interrupts during the walk.

Disabling interrupts isn't completely free (it's a self-synchronising
operation on ARM). It would be interesting to see if your futex workload
performance is improved by my simple irq_save optimisation for ARM:

  https://git.kernel.org/cgit/linux/kernel/git/will/linux.git/commit/?h=misc-patches&id=312a70adfa6f22e9d62803dd21400f481253e58b

(I've been struggling to show anything other than tiny improvements from
that patch).

> This patch provides a general RCU implementation of get_user_pages_fast
> that can be used by architectures that perform hardware broadcast of
> TLB invalidations.
>
> It is based heavily on the PowerPC implementation by Nick Piggin.

[...]

> diff --git a/mm/gup.c b/mm/gup.c
> index 91d044b..2f684fa 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -10,6 +10,10 @@
>  #include
>  #include
>
> +#include
> +#include
> +#include
> +
>  #include "internal.h"
>
>  static struct page *no_page_table(struct vm_area_struct *vma,
> @@ -672,3 +676,277 @@ struct page *get_dump_page(unsigned long addr)
>  	return page;
>  }
>  #endif /* CONFIG_ELF_CORE */
> +
> +#ifdef CONFIG_HAVE_RCU_GUP
> +
> +#ifdef __HAVE_ARCH_PTE_SPECIAL

Do we actually require this (pte special) if hugepages are disabled or
not supported?

Will