From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:32944 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751263AbdJ0BYS (ORCPT ); Thu, 26 Oct 2017 21:24:18 -0400
Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id v9R1O7Mh068601 for ; Thu, 26 Oct 2017 21:24:18 -0400
Received: from e11.ny.us.ibm.com (e11.ny.us.ibm.com [129.33.205.201]) by mx0b-001b2d01.pphosted.com with ESMTP id 2dustab9h2-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Thu, 26 Oct 2017 21:24:17 -0400
Received: from localhost by e11.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 26 Oct 2017 21:24:17 -0400
Date: Thu, 26 Oct 2017 18:24:13 -0700
From: "Paul E. McKenney"
To: NeilBrown
Cc: Alexander Viro , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Josh Triplett
Subject: Re: [PATCH] VFS: use synchronize_rcu_expedited() in namespace_unlock()
Reply-To: paulmck@linux.vnet.ibm.com
References: <87y3nyd4pu.fsf@notabene.neil.brown.name> <20171026122743.GX3659@linux.vnet.ibm.com> <87po99ctbf.fsf@notabene.neil.brown.name>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87po99ctbf.fsf@notabene.neil.brown.name>
Message-Id: <20171027012413.GC3659@linux.vnet.ibm.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Fri, Oct 27, 2017 at 11:45:08AM +1100, NeilBrown wrote:
> On Thu, Oct 26 2017, Paul E. McKenney wrote:
>
> > On Thu, Oct 26, 2017 at 01:26:37PM +1100, NeilBrown wrote:
> >>
> >> The synchronize_rcu() in namespace_unlock() is called every time
> >> a filesystem is unmounted.  If a great many filesystems are mounted,
> >> this can cause a noticeable slow-down in, for example, system shutdown.
> >>
> >> The sequence:
> >>   mkdir -p /tmp/Mtest/{0..5000}
> >>   time for i in /tmp/Mtest/*; do mount -t tmpfs tmpfs $i ; done
> >>   time umount /tmp/Mtest/*
> >>
> >> on a 4-cpu VM can report 8 seconds to mount the tmpfs filesystems, and
> >> 100 seconds to unmount them.
> >>
> >> Boot the same VM with 1 CPU and it takes 18 seconds to mount the
> >> tmpfs filesystems, but only 36 to unmount.
> >>
> >> If we change the synchronize_rcu() to synchronize_rcu_expedited()
> >> the times on a 4-cpu VM are 8 seconds to mount and 0.6 to
> >> unmount.
> >>
> >> I think this 200-fold speed up is worth the slightly higher system
> >> impact of using synchronize_rcu_expedited().
> >>
> >> Signed-off-by: NeilBrown
> >> ---
> >>
> >> Cc: to Paul and Josh in case they'll correct me if using _expedited()
> >> is really bad here.
> >
> > I suspect that filesystem unmount is pretty rare in production real-time
> > workloads, which are the ones that might care.  So I would guess that
> > this is OK.
> >
> > If the real-time guys ever do want to do filesystem unmounts while their
> > real-time applications are running, they might modify this so that it can
> > use synchronize_rcu() instead for real-time builds of the kernel.
>
> Thanks for the confirmation Paul.
>
> >
> > But just for completeness, one way to make this work across the board
> > might be to instead use call_rcu(), with the callback function kicking
> > off a workqueue handler to do the rest of the unmount.  Of course,
> > in saying that, I am ignoring any mutexes that you might be holding
> > across this whole thing, and also ignoring any problems that might arise
> > when returning to userspace with some portion of the unmount operation
> > still pending.  (For example, someone unmounting a filesystem and then
> > immediately remounting that same filesystem.)
>
> I had briefly considered that option, but it doesn't work.
> The purpose of this synchronize_rcu() is to wait for any filename lookup
> which might be locklessly touching the mountpoint to complete.
> It is only after that that the real meat of unmount happens - the
> filesystem is told that the last reference is gone, and it gets to
> flush any saved changes out to disk etc.
> That stuff really has to happen before the umount syscall returns.

Hey, I was hoping!  ;-)

							Thanx, Paul
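
As a rough check on the timings quoted in the patch description, the Python sketch below turns the three umount totals into an average cost per umount and a speedup ratio. It assumes that /tmp/Mtest/{0..5000} expands to 5001 mount points and that each total is dominated by the grace-period wait in namespace_unlock(). The function names are illustrative only and are not part of any kernel interface.

```python
# Umount totals quoted in the patch description, in seconds.
# Assumption: the brace expansion {0..5000} creates 5001 mount points.
MOUNTS = 5001

umount_seconds = {
    "4 CPUs, synchronize_rcu()": 100.0,
    "1 CPU, synchronize_rcu()": 36.0,
    "4 CPUs, synchronize_rcu_expedited()": 0.6,
}


def per_umount_ms(total_seconds, count=MOUNTS):
    """Average wall-clock cost of one umount, in milliseconds."""
    return total_seconds * 1000.0 / count


def speedup(before_seconds, after_seconds):
    """How many times faster the 'after' run is than the 'before' run."""
    return before_seconds / after_seconds


for label, seconds in umount_seconds.items():
    print(f"{label}: {per_umount_ms(seconds):.2f} ms per umount")

ratio = speedup(
    umount_seconds["4 CPUs, synchronize_rcu()"],
    umount_seconds["4 CPUs, synchronize_rcu_expedited()"],
)
print(f"4-CPU speedup from expediting: about {ratio:.0f}x")
```

On these numbers the 4-CPU average falls from roughly 20 ms to about 0.12 ms per umount, a ratio near 167, which the patch description calls a 200-fold speed up. The averages cover the whole umount call, so they are an upper bound on the grace-period wait itself.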