From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nick Piggin
Subject: Re: [PATCH 06/13] mm: Preemptible mmu_gather
Date: Fri, 9 Apr 2010 13:25:09 +1000
Message-ID: <20100409032509.GH5683@laptop>
References: <20100408191737.296180458@chello.nl> <20100408192722.858079986@chello.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from cantor.suse.de ([195.135.220.2]:52935 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752418Ab0DIDZO (ORCPT ); Thu, 8 Apr 2010 23:25:14 -0400
Content-Disposition: inline
In-Reply-To: <20100408192722.858079986@chello.nl>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Peter Zijlstra
Cc: Andrea Arcangeli, Avi Kivity, Thomas Gleixner, Rik van Riel, Ingo Molnar, akpm@linux-foundation.org, Linus Torvalds, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, Benjamin Herrenschmidt, David Miller, Hugh Dickins, Mel Gorman

On Thu, Apr 08, 2010 at 09:17:43PM +0200, Peter Zijlstra wrote:
> @@ -39,30 +33,48 @@
>  struct mmu_gather {
>  	struct mm_struct	*mm;
>  	unsigned int		nr;	/* set to ~0U means fast mode */
> +	unsigned int		max;	/* nr < max */
>  	unsigned int		need_flush;/* Really unmapped some ptes? */
>  	unsigned int		fullmm; /* non-zero means full mm flush */
> -	struct page *		pages[FREE_PTE_NR];
> +#ifdef HAVE_ARCH_MMU_GATHER
> +	struct arch_mmu_gather	arch;
> +#endif
> +	struct page		**pages;
> +	struct page		*local[8];

Have you done some profiling on this? What I would like to see, if it's
not too much complexity, is to have a small set of pages to handle
common size frees, and then use them up first by default before
attempting to allocate more.

Also, it would be cool to be able to chain allocations to avoid TLB
flushes even on big frees (overridable by arch of course, in case
they're doing some non-preemptible work or you wish to break up lock
hold times). But that might be just getting over-engineered.

>  };
>
> -/* Users of the generic TLB shootdown code must declare this storage space. */
> -DECLARE_PER_CPU(struct mmu_gather, mmu_gathers);
> +static inline void __tlb_alloc_pages(struct mmu_gather *tlb)
> +{
> +	unsigned long addr = __get_free_pages(GFP_ATOMIC, 0);

Slab allocations should be faster, so it's nice to use them in
performance critical code if you don't need the struct page.

Otherwise, looks ok to me.
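
To make the two suggestions in the first comment concrete (use the small
embedded array first, then chain extra batches instead of flushing), here
is a rough userspace sketch. It is not the patch's code: struct gather,
struct gather_batch, BATCH_NR and the gather_*() helpers are all
hypothetical names, malloc() stands in for the kernel allocator, and
gather_flush() only stands in for the real TLB flush and page freeing.

```c
#include <assert.h>
#include <stdlib.h>

#define LOCAL_NR 8   /* embedded slots, matching local[8] in the quoted patch */
#define BATCH_NR 64  /* hypothetical capacity of one chained batch */

/* Hypothetical overflow batch, linked newest-first. */
struct gather_batch {
	struct gather_batch *next;
	unsigned int nr;
	void *pages[BATCH_NR];
};

/* Hypothetical stand-in for struct mmu_gather. */
struct gather {
	unsigned int local_nr;      /* slots used in local[] */
	void *local[LOCAL_NR];
	struct gather_batch *head;  /* most recently allocated batch */
	unsigned int batches;       /* batches currently chained */
	unsigned int flushes;       /* how many flushes were needed */
};

static void gather_init(struct gather *g)
{
	g->local_nr = 0;
	g->head = NULL;
	g->batches = 0;
	g->flushes = 0;
}

/* Stand-in for the TLB flush plus page freeing: drop every batch. */
static void gather_flush(struct gather *g)
{
	while (g->head) {
		struct gather_batch *b = g->head;

		g->head = b->next;
		free(b);
	}
	g->local_nr = 0;
	g->batches = 0;
	g->flushes++;
}

static void gather_add(struct gather *g, void *page)
{
	struct gather_batch *b;

	/* Common case: small frees never touch the allocator. */
	if (g->local_nr < LOCAL_NR) {
		g->local[g->local_nr++] = page;
		return;
	}

	/* Local slots are full: chain a new batch rather than flush. */
	b = g->head;
	if (!b || b->nr == BATCH_NR) {
		b = malloc(sizeof(*b));
		if (!b) {
			/* Allocation failed: flush and reuse local[]. */
			gather_flush(g);
			g->local[g->local_nr++] = page;
			return;
		}
		b->nr = 0;
		b->next = g->head;
		g->head = b;
		g->batches++;
	}
	assert(b->nr < BATCH_NR);
	b->pages[b->nr++] = page;
}
```

With this layout a free of eight pages or fewer does no allocation at
all, and a larger free only flushes when it is asked to or when an
allocation fails. An arch that wants shorter lock hold times could cap
the number of chained batches and flush earlier.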