From: ebiederm@xmission.com (Eric W. Biederman)
Subject: Re: [GIT PULL] Detaching mounts on unlink for 3.15-rc1
Date: Wed, 09 Apr 2014 10:32:14 -0700
Message-ID: <87sipmbe8x.fsf@x220.int.ebiederm.org>
To: Al Viro
Cc: Linus Torvalds, "Serge E. Hallyn", Linux-Fsdevel,
    Kernel Mailing List, Andy Lutomirski, Rob Landley, Miklos Szeredi,
    Christoph Hellwig, Karel Zak, "J. Bruce Fields", Fengguang Wu
In-Reply-To: <20140409023947.GY18016@ZenIV.linux.org.uk> (Al Viro's
    message of "Wed, 9 Apr 2014 03:39:47 +0100")

Al Viro writes:

> On Wed, Apr 09, 2014 at 03:30:27AM +0100, Al Viro wrote:
>
>> > When renaming or unlinking directory entries that are not mountpoints
>> > no additional locks are taken, so no performance differences can
>> > result, and my benchmark reflected that.
>>
>> It also means that d_invalidate() now might trigger fs shutdown. Which
>> has a bloody huge stack footprint, for obvious reasons. And
>> d_invalidate() can be called with a pretty deep stack - walk into the
>> wrong dentry while resolving a deeply nested symlink and there you go...
>
> PS: I thought I actually replied with that point back a month or so ago,
> but having checked sent-mail... Looks like I had not. My deep apologies.
>
> FWIW, I think that overall this thing is a good idea, provided that we
> can live with the semantics changes. The implementation is too
> optimistic, though - at the very least, we want the work done at
> namespace_unlock() held back until we are not too deep in the stack.
> task_work_add() fodder, perhaps?

Hmm. Just to confirm what I am dealing with, I have measured the amount
of stack used by these operations.

For resolving a deeply nested symlink that hits the limit of 8 nested
symlinks, I find 4688 bytes left on the stack, which means that stating
a deeply nested symlink uses roughly 3504 bytes of the 8192-byte x86_64
kernel stack.

For umount I had a little trouble measuring, as the work done by umount
was typically not the largest stack consumer, but for a small ext4
filesystem I found 5152 bytes left on the stack after the umount
operation completed, so umount used roughly 3040 bytes.

3504 + 3040 = 6544 bytes of stack used, or 1648 bytes of stack left
unused. That certainly is not a lot of margin, but it is not
overflowing the kernel stack either.

Is there a case you see where umount uses a lot more kernel stack? Is
your concern an architecture other than x86_64 with different
limitations? I am quite happy to change my code to avoid stack
overflow, but I want to make certain I understand where the stack
usage is coming from so that I actually fix the issue.

Eric
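
For concreteness, one way to obtain "bytes left on the stack" figures
like those above is the kernel's own stack accounting. This is only a
sketch, not necessarily the instrumentation Eric used: it assumes a
kernel built with CONFIG_DEBUG_STACK_USAGE, which makes stack_not_used()
available, and the helper name report_stack_headroom() is invented for
illustration.

/*
 * Hypothetical measurement sketch. With CONFIG_DEBUG_STACK_USAGE
 * enabled, stack_not_used() scans the current task's stack page(s)
 * and returns how many bytes have never been written to, i.e. the
 * low-water mark of free stack so far.
 */
#include <linux/sched.h>
#include <linux/printk.h>

static void report_stack_headroom(const char *where)
{
	/* Bytes of this task's kernel stack never touched so far. */
	pr_info("%s: %lu bytes of stack left\n",
		where, stack_not_used(current));
}

Calling such a helper right after the deepest operation completes (the
end of a path walk, or the return from the umount syscall) and
subtracting the result from THREAD_SIZE (8192 bytes on x86_64 at the
time of this thread) yields "bytes used" numbers of the kind quoted
above.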
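As for Al's task_work_add() suggestion, a deferral along those lines
might look roughly like the sketch below. The structure and function
names (deferred_release, release_detached_mounts,
schedule_mount_release) are illustrative, not the actual VFS patch;
only init_task_work() and task_work_add() are real API, and in the
3.15 era the third argument of task_work_add() was a bool requesting
that the task be notified.

#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/task_work.h>

/* Illustrative container for mounts whose cleanup is being deferred. */
struct deferred_release {
	struct callback_head cb;
	/* ... the detached mounts to drop would hang off here ... */
};

/*
 * Runs via task_work on the way back to userspace, where almost no
 * kernel stack is in use, so the deep fs-shutdown paths are safe to
 * enter from here.
 */
static void release_detached_mounts(struct callback_head *cb)
{
	struct deferred_release *d =
		container_of(cb, struct deferred_release, cb);

	/* ... actually drop the collected mounts here ... */
	kfree(d);
}

static int schedule_mount_release(struct deferred_release *d)
{
	init_task_work(&d->cb, release_detached_mounts);
	/* true => notify the task so the work runs promptly when it
	 * returns to userspace. */
	return task_work_add(current, &d->cb, true);
}

This is the same trick fput() uses to defer the final __fput() work:
the callback fires at the shallowest point of the task's kernel stack,
which directly addresses the "too deep in stack" concern. The cost is
that task_work_add() fails (-ESRCH) for a task that is already
exiting, so a fallback path is still needed for that case.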