From mboxrd@z Thu Jan  1 00:00:00 1970
From: Mike Waychison <mike@waychison.com>
Subject: Re: [RFC][PATCH 1/8] namespace: use global namespace semaphore
Date: Fri, 03 Jun 2005 10:07:28 -0700
Message-ID: <42A08E50.5000700@waychison.com>
References: <E1DckB4-0000Rc-00@dorka.pomaz.szeredi.hu> <Pine.LNX.4.58.0505311052140.9575@wombat.indigo.net.au> <429BD8B7.7070906@waychison.com> <Pine.LNX.4.58.0505311453370.17076@wombat.indigo.net.au> <429C88F9.6010200@waychison.com> <Pine.LNX.4.58.0506011146390.10849@wombat.indigo.net.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Miklos Szeredi <miklos@szeredi.hu>, linux-fsdevel@vger.kernel.org,
	jamie@shareable.org, viro@parcelfarce.linux.theplanet.co.uk
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from 216-239-45-4.google.com ([216.239.45.4]:12983 "EHLO
	216-239-45-4.google.com") by vger.kernel.org with ESMTP
	id S261397AbVFCRIc (ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
	Fri, 3 Jun 2005 13:08:32 -0400
To: Ian Kent <raven@themaw.net>
In-Reply-To: <Pine.LNX.4.58.0506011146390.10849@wombat.indigo.net.au>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

Ian Kent wrote:
> On Tue, 31 May 2005, Mike Waychison wrote:
> 
> 
>>-----BEGIN PGP SIGNED MESSAGE-----
>>Hash: SHA1
>>
>>Ian Kent wrote:
>>
>>>On Mon, 30 May 2005, Mike Waychison wrote:
>>>
>>>>No, the removal of struct namespace shouldn't keep you from walking
>>>>vfsmounts from modules.  In fact, in Miklos' 1/8 patch, he moved
>>>>namespace->sem to a global semaphore, which isn't needed by the end of
>>>>the patch set.  namespace->sem only ever protected the namespace->list
>>>>field.
>>>
>>> 
>>>But my impression was that this semaphore surrounded the mount and umount 
>>>processes. The list updates looked like they were all done when this 
>>>semaphore was held.
>>>
>>
>>namespace->sem did indeed surround the mounting and unmounting
>>processes, however it never was meant to protect the integrety of the
>>vfsmount tree, only namespace->list.  Due to the lock nesting rules of
>>vfs, it did infact exclude any tree operations, however the proper lock
>>to have used was always vfsmount_lock
>>
>>
>>>>If you which to walk a mounttree, you still only ever require grabbing
>>>>vfsmount_lock and use mntget/mntput.
>>>
>>>
>>>Which is not exported.
>>>
>>>So do you think it should be?
>>
>>Yes.  Even better would be for a proper interface to be exported from
>>fs/namespace.c though, so that as a module you don't care about the
>>nitty gritty details of how the tree is held together.  This is likely
>>to get even more complicated with time as Ram's patches get more testing.
> 
> 
> For a start it's probably a good idea to define which of the current 
> functions should be exposed and leave them exposed whether they are in use 
> by other clients or not.

Well, the only valid module use of this stuff is autofs.  Change the 
interface and change the in-kernel uses of it for autofs4 (and autofs3 
if need be).

> 
> 
>>>Maybe lookup_mnt shouldn't be static and should be exported instead?
>>>
>>
>>lookup_mnt is the wrong operation, as it looks for a mountpoint given a
>>vfsmount and dentry.  What you need is a way to walk the tree, and I'm
>>inclined to say that something like next_mnt is what you are looking for.
> 
> 
> As it turns out lookup_mnt would work fine in this special case. The 
> nameidata should always contain the parent of the required mount. I can 
> see your point though, I'll think about it for a while.
> 

If lookup_mnt were used, you'd have to walk the dcache for each mount 
and do a lookup_mnt for each in-memory dentry if d_mounted!=0.  Not only 
is this a huge waste of cycles, but it's also very hairy as safely 
walking the dcache is a) harder than walking vfsmounts and b) over the 
edge in terms of intruding on VFS internals.

> 
>>>>Overall, this means autofs4 stuff will need to be updated to reflect
>>>>these changes, but they are minimal at best.
>>>
>>>
>>>OK. But I'm not sure how I can get at this from within a module.
>>>
>>
>>Right now, the quickest way is to EXPORT_SYMBOL(vfsmount_lock) and grab
>>it instead of namespace->sem.
>>
>>
>>>Perhaps something like updating the nameidata struct passed to 
>>>follow_link, similar to what you did elsewhere in your autofsng 
>>>patches, needs to be added to this lot.
>>
>>I'm lost to what you mean.  The only place autofsng touches nameidata in
>>the VFS is setting ->mnt to a non-stale mount before a ->follow_link.
>>That symlinks even work without this change is almost coincidence.
> 
> 
> That was exactly what I was saying.
> 

Ah.  Well, I don't know how having the right vfsmount at ->follow_link 
time will help you expire anything (that's the only use of vfsmount tree 
walking, isn't it?).  If anything, the ->follow_link'able dentry would 
be covered and the callback wouldn't happen.

Mike Waychison