From: Linus Torvalds
Date: Wed, 9 Oct 2013 20:14:51 -0700
Subject: Re: [PATCH v8 0/9] rwsem performance optimizations
To: Peter Zijlstra
Cc: Ingo Molnar, Tim Chen, Ingo Molnar, Andrew Morton, Andrea Arcangeli,
    Alex Shi, Andi Kleen, Michel Lespinasse, Davidlohr Bueso,
    Matthew R Wilcox, Dave Hansen, Rik van Riel, Peter Hurley,
    "Paul E.McKenney", Jason Low, Waiman Long,
    Linux Kernel Mailing List, linux-mm

On Wed, Oct 9, 2013 at 12:28 AM, Peter Zijlstra wrote:
>
> The workload that I got the report from was a virus scanner, it would
> spawn nr_cpus threads and {mmap file, scan content, munmap} through your
> filesystem.

So I suspect we could make the mmap_sem write area *much* smaller for
the normal cases.

Look at do_mmap_pgoff(), for example: it is run entirely under
mmap_sem, but 99% of what it does doesn't actually need the lock.

The part that really needs the lock is

        addr = get_unmapped_area(file, addr, len, pgoff, flags);
        addr = mmap_region(file, addr, len, vm_flags, pgoff);

but we hold it over all the other stuff too.
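
To make the idea of shrinking the locked region concrete, here is a minimal user-space sketch. It is not kernel code: a pthread mutex stands in for the write side of mmap_sem, and every name in it (struct addr_space, prepare_len(), pick_and_insert(), map_region(), the 1 GiB limit) is hypothetical and chosen only for illustration. The point is that argument checking and rounding touch no shared state, so they can run before the lock is taken, and the lock covers only the step that picks an address and records the region.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_REGIONS 64
#define PAGE 4096UL

/* Hypothetical stand-in for an address space guarded by a single lock. */
struct addr_space {
    pthread_mutex_t lock;               /* plays the role of mmap_sem (write side only) */
    unsigned long next_free;            /* next unused address */
    unsigned long start[MAX_REGIONS];
    unsigned long len[MAX_REGIONS];
    size_t nr_regions;
};

void addr_space_init(struct addr_space *as)
{
    pthread_mutex_init(&as->lock, NULL);
    as->next_free = PAGE;
    as->nr_regions = 0;
}

/* Argument checking and page rounding: no shared state, so no lock needed. */
int prepare_len(unsigned long len, unsigned long *rounded)
{
    if (len == 0 || len > (1UL << 30))  /* illustrative limit, not a real one */
        return -1;
    *rounded = (len + PAGE - 1) & ~(PAGE - 1);
    return 0;
}

/* Picks an address and records the region. Caller must hold as->lock. */
unsigned long pick_and_insert(struct addr_space *as, unsigned long len)
{
    unsigned long addr;

    assert(len % PAGE == 0);
    if (as->nr_regions == MAX_REGIONS)
        return 0;
    addr = as->next_free;
    as->start[as->nr_regions] = addr;
    as->len[as->nr_regions] = len;
    as->nr_regions++;
    as->next_free = addr + len;
    return addr;
}

/* Returns the chosen address, or 0 on failure. */
unsigned long map_region(struct addr_space *as, unsigned long len)
{
    unsigned long rounded, addr;

    if (prepare_len(len, &rounded) != 0)    /* done outside the lock */
        return 0;

    pthread_mutex_lock(&as->lock);          /* lock covers only the shared update */
    addr = pick_and_insert(as, rounded);
    pthread_mutex_unlock(&as->lock);
    return addr;
}
```

A caller would run addr_space_init() once and then call map_region(&as, len) from any number of threads; only pick_and_insert() is serialized.
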

In fact, even if we moved the mmap_sem down into do_mmap(), and moved
code around a bit to only hold it over those functions, it would still
cover unnecessarily much. For example, while merging is common, not
merging is pretty common too, and we do that

        vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);

allocation under the lock. We could easily do things like preallocate
it outside the lock.

Right now mmap_sem covers pretty much the whole system call (we do do
some security checks outside of it).

I think the main issue is that nobody has ever cared deeply enough to
see how far this could be pushed. I suspect there is some low-hanging
fruit for anybody who is willing to handle the pain..

             Linus
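
The preallocation idea mentioned above (allocating the vma before taking mmap_sem rather than under it) can be sketched in user space as follows. This is an illustration under assumptions, not the kernel's code: calloc() stands in for kmem_cache_zalloc(), a pthread mutex stands in for the write side of mmap_sem, and struct region, struct region_list and region_add() are hypothetical names. The node is allocated before the lock is taken; if the new range merges with the newest existing one, the spare node is simply freed after the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vma: a half-open range [start, end). */
struct region {
    unsigned long start, end;
    struct region *next;
};

struct region_list {
    pthread_mutex_t lock;   /* plays the role of mmap_sem (write side only) */
    struct region *head;    /* most recently added region first */
    size_t count;
};

void region_list_init(struct region_list *rl)
{
    pthread_mutex_init(&rl->lock, NULL);
    rl->head = NULL;
    rl->count = 0;
}

/*
 * Add [start, end). If it begins exactly where the newest region ends,
 * that region is extended (the "merge" case); otherwise a new node is linked in.
 * Returns 0 on success, -1 if the allocation fails.
 */
int region_add(struct region_list *rl, unsigned long start, unsigned long end)
{
    /* Allocate before taking the lock, so the allocator never runs under it. */
    struct region *spare = calloc(1, sizeof(*spare));

    if (!spare)
        return -1;

    pthread_mutex_lock(&rl->lock);
    if (rl->head && rl->head->end == start) {
        rl->head->end = end;            /* merged: the spare node is not needed */
    } else {
        spare->start = start;
        spare->end = end;
        spare->next = rl->head;
        rl->head = spare;
        rl->count++;
        spare = NULL;                   /* ownership moved to the list */
    }
    pthread_mutex_unlock(&rl->lock);

    free(spare);                        /* no-op when the node was consumed */
    return 0;
}
```

The trade-off in this sketch is one wasted allocation whenever a merge happens, in exchange for never allocating while the lock is held.
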