From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 26 Oct 2017 18:24:13 -0700
From: "Paul E. McKenney"
To: NeilBrown
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Josh Triplett
Subject: Re: [PATCH] VFS: use synchronize_rcu_expedited() in namespace_unlock()
Reply-To: paulmck@linux.vnet.ibm.com
References: <87y3nyd4pu.fsf@notabene.neil.brown.name>
 <20171026122743.GX3659@linux.vnet.ibm.com>
 <87po99ctbf.fsf@notabene.neil.brown.name>
In-Reply-To: <87po99ctbf.fsf@notabene.neil.brown.name>
Message-Id: <20171027012413.GC3659@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 27, 2017 at 11:45:08AM +1100, NeilBrown wrote:
> On Thu, Oct 26 2017, Paul E. McKenney wrote:
>
> > On Thu, Oct 26, 2017 at 01:26:37PM +1100, NeilBrown wrote:
> >>
> >> The synchronize_rcu() in namespace_unlock() is called every time
> >> a filesystem is unmounted.  If a great many filesystems are mounted,
> >> this can cause a noticeable slow-down in, for example, system shutdown.
> >>
> >> The sequence:
> >>   mkdir -p /tmp/Mtest/{0..5000}
> >>   time for i in /tmp/Mtest/*; do mount -t tmpfs tmpfs $i ; done
> >>   time umount /tmp/Mtest/*
> >>
> >> on a 4-cpu VM can report 8 seconds to mount the tmpfs filesystems, and
> >> 100 seconds to unmount them.
> >>
> >> Boot the same VM with 1 CPU and it takes 18 seconds to mount the
> >> tmpfs filesystems, but only 36 to unmount.
> >>
> >> If we change the synchronize_rcu() to synchronize_rcu_expedited()
> >> the times on a 4-cpu VM are 8 seconds to mount and 0.6 seconds to
> >> unmount.
> >>
> >> I think this 200-fold speed up is worth the slightly higher system
> >> impact of using synchronize_rcu_expedited().
> >>
> >> Signed-off-by: NeilBrown
> >> ---
> >>
> >> Cc: to Paul and Josh in case they'll correct me if using _expedited()
> >> is really bad here.
> >
> > I suspect that filesystem unmount is pretty rare in production real-time
> > workloads, which are the ones that might care.  So I would guess that
> > this is OK.
> >
> > If the real-time guys ever do want to do filesystem unmounts while their
> > real-time applications are running, they might modify this so that it can
> > use synchronize_rcu() instead for real-time builds of the kernel.
>
> Thanks for the confirmation Paul.
>
> >
> > But just for completeness, one way to make this work across the board
> > might be to instead use call_rcu(), with the callback function kicking
> > off a workqueue handler to do the rest of the unmount.
> > Of course, in saying that, I am ignoring any mutexes that you might be
> > holding across this whole thing, and also ignoring any problems that
> > might arise when returning to userspace with some portion of the unmount
> > operation still pending.  (For example, someone unmounting a filesystem
> > and then immediately remounting that same filesystem.)
>
> I had briefly considered that option, but it doesn't work.
> The purpose of this synchronize_rcu() is to wait for any filename lookup
> which might be locklessly touching the mountpoint to complete.
> It is only after that that the real meat of unmount happens - the
> filesystem is told that the last reference is gone, and it gets to
> flush any saved changes out to disk etc.
> That stuff really has to happen before the umount syscall returns.

Hey, I was hoping!  ;-)

							Thanx, Paul
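
For reference, the speed-up implied by the timings quoted in the patch
description can be worked out directly.  The following is only a rough
sketch: the variable and function names are illustrative and not from the
thread, and the only inputs are the unmount times and the {0..5000} mount
count reported above.

```python
# Unmount timings in seconds, as reported in the quoted patch description.
umount_4cpu_before = 100.0  # synchronize_rcu(), 4-CPU VM
umount_4cpu_after = 0.6     # synchronize_rcu_expedited(), 4-CPU VM

# /tmp/Mtest/{0..5000} expands to 5001 directories, hence 5001 tmpfs mounts.
filesystems = 5001


def speedup(before, after):
    """Return how many times faster 'after' is than 'before'."""
    return before / after


speedup_4cpu = speedup(umount_4cpu_before, umount_4cpu_after)

# Average cost of a single unmount, in milliseconds.
per_umount_before_ms = 1000.0 * umount_4cpu_before / filesystems
per_umount_after_ms = 1000.0 * umount_4cpu_after / filesystems

print(f"speed-up on 4 CPUs: {speedup_4cpu:.0f}x")
print(f"per-unmount cost:   {per_umount_before_ms:.1f} ms -> "
      f"{per_umount_after_ms:.2f} ms")
```

With these inputs the ratio comes out at roughly 167-fold, so the "200-fold"
in the description is a round figure.  The per-unmount cost drops from about
20 ms to about 0.12 ms.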