* VFS + path walktrough
@ 2008-05-05 12:40 Enrico Weigelt
  2008-05-05 13:06 ` Enrico Weigelt
  0 siblings, 1 reply; 19+ messages in thread
From: Enrico Weigelt @ 2008-05-05 12:40 UTC (permalink / raw)
  To: linux kernel list


Hi folks,

Could anyone please enlighten me as to what exactly happens when
an app opens a file with a longer pathname?

Let's say we open /a/b/c/d and /a is mounted with some network
filesystem (e.g. 9P). Who exactly does the walkthrough from b to d?
The individual filesystem or the VFS?

The point is: the 9P protocol can work with whole pathnames, so
the client doesn't have to do the walkthrough manually - this
can heavily reduce traffic and latency. I'd like the 9P fs driver
to directly use this, if VFS can send the whole pathname at once.


cu
-- 
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service - http://www.metux.de/
---------------------------------------------------------------------
 Please visit the OpenSource QM Taskforce:
 	http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
	http://patches.metux.de/
---------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 12:40 VFS + path walktrough Enrico Weigelt
@ 2008-05-05 13:06 ` Enrico Weigelt
  2008-05-05 13:13   ` Al Viro
  0 siblings, 1 reply; 19+ messages in thread
From: Enrico Weigelt @ 2008-05-05 13:06 UTC (permalink / raw)
  To: linux kernel list

* Enrico Weigelt <weigelt@metux.de> wrote:

> Let's say we open /a/b/c/d and /a is mounted with some network
> filesystem (e.g. 9P). Who exactly does the walkthrough from b to d?
> The individual filesystem or the VFS?
> 
> The point is: the 9P protocol can work with whole pathnames, so
> the client doesn't have to do the walkthrough manually - this
> can heavily reduce traffic and latency. I'd like the 9P fs driver
> to directly use this, if VFS can send the whole pathname at once.

I've dug a bit into the source and found that it goes down to
link_path_walk(). It seems to split the pathname into components
and walk through them one by one.
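
A minimal user-space sketch of that component-by-component splitting,
assuming nothing about the real link_path_walk() beyond "split on
slashes and visit each name in turn". The names split_path() and
struct component are hypothetical and exist only for illustration;
the kernel code additionally handles ".", "..", symlinks and mounts.

```c
#include <assert.h>
#include <stddef.h>

/* One path component: a pointer into the original string plus a length. */
struct component {
    const char *name;
    size_t len;
};

/*
 * Split `path` on '/' and store up to `max` components in `out`.
 * Repeated and trailing slashes are skipped.  Returns the total number
 * of components found, which may be larger than `max`.
 */
size_t split_path(const char *path, struct component *out, size_t max)
{
    size_t n = 0;

    assert(path != NULL);
    while (*path) {
        const char *start;

        while (*path == '/')            /* skip separators */
            path++;
        if (!*path)
            break;
        start = path;
        while (*path && *path != '/')   /* scan one component */
            path++;
        if (n < max) {
            out[n].name = start;
            out[n].len = (size_t)(path - start);
        }
        n++;
    }
    return n;
}
```

For "/a/b/c/d" this yields the four components a, b, c and d, each of
which would cost a separate lookup on a cold cache.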

We could just add another call vector to struct file_operations
as a replacement for link_path_walk(); if it's NULL, the original
function is used. This way a filesystem can do the walkthrough
on its own, but doesn't need to.


What do you think about this ?


cu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 13:06 ` Enrico Weigelt
@ 2008-05-05 13:13   ` Al Viro
  2008-05-05 13:43     ` Enrico Weigelt
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2008-05-05 13:13 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: linux kernel list

On Mon, May 05, 2008 at 03:06:23PM +0200, Enrico Weigelt wrote:

> We could just add another call vector to struct file_operations
> as a replacement for link_path_walk(); if it's NULL, the original
> function is used. This way a filesystem can do the walkthrough
> on its own, but doesn't need to.
> 
> 
> What do you think about this ?

That you have quite forgotten about mounts.  And we *REALLY* don't
want to shift the entire logic of link_path_walk() into filesystems -
this is insane.  Even the "let's follow that symlink" part alone, not to
mention mountpoint handling, populating the dcache, etc.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 13:13   ` Al Viro
@ 2008-05-05 13:43     ` Enrico Weigelt
  2008-05-05 15:35       ` Al Viro
  0 siblings, 1 reply; 19+ messages in thread
From: Enrico Weigelt @ 2008-05-05 13:43 UTC (permalink / raw)
  To: linux kernel list

* Al Viro <viro@ZenIV.linux.org.uk> wrote:
> On Mon, May 05, 2008 at 03:06:23PM +0200, Enrico Weigelt wrote:
> 
> > We could just add another call vector to struct file_operations
> > as a replacement for link_path_walk(); if it's NULL, the original
> > function is used. This way a filesystem can do the walkthrough
> > on its own, but doesn't need to.
> > 
> > 
> > What do you think about this ?
> 
> That you have quite forgotten about mounts.  

Hmm, I thought this would be done before the link_path_walk()
call happens ;-o

> And we *REALLY* don't want to shift the entire logics of 
> link_path_walk() into filesystems - this is insane. 

Only for those filesystems that *really* want to do it by
themselves and set the appropriate call vector. All other
filesystems will just leave it blank (they don't even have to be
touched), so the old way remains for them.

To get around mountpoint issues, we could at least do it only
when a special mount option is given, and add a big fat warning
that mountpoints within these mounts won't work. So these fast
lookups will only happen when:

#1: the fs explicitly supports it
#2: it is mounted with a special option

And if you use that option, you'll simply lose the ability
to use mountpoints within this specific mount. This won't
affect any situation other than #1 && #2; IMHO this is better
than no chance of fast lookups at all. Of course, a cleaner
approach would be better, but it's IMHO not critical.

BTW: there are (or have been) certain speed improvements for
specific situations at the cost of losing other standard features,
e.g. fast bridging.


cu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 13:43     ` Enrico Weigelt
@ 2008-05-05 15:35       ` Al Viro
  2008-05-05 16:43         ` Miklos Szeredi
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2008-05-05 15:35 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: linux kernel list

On Mon, May 05, 2008 at 03:43:15PM +0200, Enrico Weigelt wrote:
> * Al Viro <viro@ZenIV.linux.org.uk> wrote:
> > That you have quite forgotten about mounts.  
> 
> Hmm, I thought this would be done before the link_path_walk()
> call happens ;-o

How on earth...?  You don't know where pathname resolution will
get you, so how could you possibly handle mountpoint transitions prior
to it?

> And if you use that option, you'll simply lose the ability
> to use mountpoints within this specific mount. This won't
> affect any situation other than #1 && #2; IMHO this is better
> than no chance of fast lookups at all. Of course, a cleaner
> approach would be better, but it's IMHO not critical.

This is crap.  First of all, the logic is already overcomplicated.
_Then_ we have the problem of populating the dcache for intermediates.

Besides, that's not what that thing is for - it's to allow local
caching (which we do) with revalidation of several components
at once.  _After_ the VFS has decided that nothing interesting is in
the part of the path it has cached.  Then the protocol allows doing a
bulk Walk, verifying that all cached intermediates still match
reality, all in one roundtrip.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 15:35       ` Al Viro
@ 2008-05-05 16:43         ` Miklos Szeredi
  2008-05-05 17:03           ` Miklos Szeredi
  2008-05-05 17:14           ` Al Viro
  0 siblings, 2 replies; 19+ messages in thread
From: Miklos Szeredi @ 2008-05-05 16:43 UTC (permalink / raw)
  To: viro; +Cc: weigelt, linux-kernel

> > * Al Viro <viro@ZenIV.linux.org.uk> wrote:
> > > That you have quite forgotten about mounts.
> >
> > Hmm, I thought this would be done before the link_path_walk()
> > call happens ;-o
>
> How on earth...?  You don't know where pathname resolution will
> get you, so how could you possibly handle mountpoint transitions prior
> to it?

One way this could be done cleanly is to pass the rest of the path
(as hint) to the filesystem in its lookup function.  Most filesystems
would just ignore it, but those which have the capabilities can use it
to do the lookup in one go, and internally cache the result.  The VFS
doesn't need to know _anything_ about all this.  If there are
mountpoints, they are already cached, so ->lookup() wouldn't be called
at all, only ->d_revalidate(), which is a different issue.

Miklos

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 16:43         ` Miklos Szeredi
@ 2008-05-05 17:03           ` Miklos Szeredi
  2008-05-05 17:14           ` Al Viro
  1 sibling, 0 replies; 19+ messages in thread
From: Miklos Szeredi @ 2008-05-05 17:03 UTC (permalink / raw)
  To: miklos; +Cc: viro, weigelt, linux-kernel

> > > * Al Viro <viro@ZenIV.linux.org.uk> wrote:
> > > > That you have quite forgotten about mounts.
> > >
> > > Hmm, I thought this would be done before the link_path_walk()
> > > call happens ;-o
> >
> > How on earth...?  You don't know where pathname resolution will
> > get you, so how could you possibly handle mountpoint transitions prior
> > to it?
> 
> One way this could be done cleanly, is to pass the rest of the path
> (as hint) to the filesystem in its lookup function.  Most filesystems
> would just ignore it, but those which have the capabilities can use it
> to do the lookup in one go, and internally

Better, the filesystem can just populate the dcache with the result.
The entry being looked up is locked, so no one can get at it, and so it
should be quite safe to build a tree below.

Miklos

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 16:43         ` Miklos Szeredi
  2008-05-05 17:03           ` Miklos Szeredi
@ 2008-05-05 17:14           ` Al Viro
  2008-05-05 17:33             ` Miklos Szeredi
  1 sibling, 1 reply; 19+ messages in thread
From: Al Viro @ 2008-05-05 17:14 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: weigelt, linux-kernel

On Mon, May 05, 2008 at 06:43:57PM +0200, Miklos Szeredi wrote:

> One way this could be done cleanly, is to pass the rest of the path
> (as hint) to the filesystem in its lookup function.  Most filesystems
> would just ignore it, but those which have the capabilities can use it
> to do the lookup in one go, and internally cache the result.  The VFS
> doesn't need to know _anything_ about all this.  If there are
> mountpoints, they are already cached, so ->lookup() wouldn't be called
> at all, only ->d_revalidate(), which is a different issue.

This is still wrong.  We would not just pass the pathname to the
filesystem (note that you still need to deal with symlinks), but we
would make that filesystem populate the dentry tree.  Take a look at
9P walk - it does *not* give you anything resembling stat, you just
get qids of intermediates.  Which is bloody useful when you want to do
intelligent revalidation (do a local cached walk, then issue a single
protocol request that will both do a bulk revalidate *and* tell you
where in the path you've got the first invalid one - just compare qids
with what you've got stored locally).  However, it's just about
useless for cutting corners in cold-cache lookup.
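
The qid comparison described above can be sketched in user space as
follows. This is only an illustration under stated assumptions: the
struct layout mirrors the 9P qid fields (type, version, path), while
first_stale() is a hypothetical helper, and the choice to compare only
type and path (ignoring version, which changes whenever the file is
modified) is an assumption about revalidation policy, not something
taken from the 9P client code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 9P qid: the server's unique identification of a file. */
struct qid {
    uint8_t  type;     /* e.g. 0x80 for a directory */
    uint32_t version;  /* bumped when the file is modified */
    uint64_t path;     /* unique among files on the server */
};

/* Assumption: identity is decided by type and path only. */
static int qid_same_file(const struct qid *a, const struct qid *b)
{
    return a->type == b->type && a->path == b->path;
}

/*
 * Compare locally cached qids against those returned by one bulk walk.
 * `nwalked` may be smaller than `ncached` when the server stopped early.
 * Returns the index of the first component that is missing or differs,
 * or `ncached` when every cached component is still valid.
 */
size_t first_stale(const struct qid *cached, size_t ncached,
                   const struct qid *walked, size_t nwalked)
{
    size_t i;

    assert(nwalked <= ncached);
    for (i = 0; i < ncached; i++) {
        if (i >= nwalked || !qid_same_file(&cached[i], &walked[i]))
            return i;
    }
    return ncached;
}
```

Everything before the returned index can stay in the cache; only the
components from that index onwards need a fresh lookup.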

It _is_ a useful thing, no arguments about that.  However, to use it
in a sane way we need to massage the pathname resolution loop, taking
the "simple pass without symlinks or mountpoints" part into a new
helper, turning the current __link_path_walk() into a loop calling that
one and then folding it into callers.  That would also allow killing
the last remnants of recursion in symlink handling for the normal fs
case...

_Then_ we can do saner logic for revalidate, allowing it on such segments.
Which, BTW, would deal with -ESTALE in a saner way, rather than "repeat
full pathname resolution from the very beginning".  And that's where
9P multi-step walk(5) would do very nicely, indeed.

And fuck the "hints" of all kinds, pardon the rudeness.  We already have
more than enough of that crap and it already makes cleaning the logic
up bloody painful.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 17:14           ` Al Viro
@ 2008-05-05 17:33             ` Miklos Szeredi
  2008-05-05 17:40               ` Al Viro
  0 siblings, 1 reply; 19+ messages in thread
From: Miklos Szeredi @ 2008-05-05 17:33 UTC (permalink / raw)
  To: viro; +Cc: miklos, weigelt, linux-kernel

> > One way this could be done cleanly, is to pass the rest of the path
> > (as hint) to the filesystem in its lookup function.  Most filesystems
> > would just ignore it, but those which have the capabilities can use it
> > to do the lookup in one go, and internally cache the result.  The VFS
> > doesn't need to know _anything_ about all this.  If there are
> > mountpoints, they are already cached, so ->lookup() wouldn't be called
> > at all, only ->d_revalidate(), which is a different issue.
> 
> This is still wrong.  We not just pass the pathname to filesystem (note
> that you still need to deal with symlinks),

Symlinks are easy: filesystem just needs to *stop* the resolution the
moment it finds one.

> but we make that filesystem to populate dentry tree.

Doesn't sound hard: d_alloc() + d_instantiate().

>  Take a look at 9P walk - it does *not* give
> you anything resembling stat, you just get qids of intermediates.

Which is exactly what's needed to populate the dentry tree, no?

>  Which
> is bloody useful when you want to do intelligent revalidation (do local
> cached walk, then issue a single protocol request that will both do
> bulk revalidate *and* tell you where in the path you've got the first
> invalid one - just compare qids with what you've got stored locally).
> However, it's just about useless for cutting corners in cold-cache
> lookup.

Sure, it's useful for that as well.

> It _is_ a useful thing, no arguments about that.  However, to use it
> a sane way we need to massage the pathname resolution loop, taking
> the "simple pass without symlinks or mountpoints" part into a new
> helper, turning the current __link_path_walk() into a loop calling that
> one and then folding it into callers.  Would also allow to kill the
> last remnants of recursion in symlink handling for normal fs case...
> 
> _Then_ we can do saner logics for revalidate, allowing it on such segments.
> Which, BTW, would deal with -ESTALE in a saner way, rather than "repeat
> full pathname resolution from the very beginning".  And that's where
> 9P multi-step walk(5) would do very nicely, indeed.

Sounds wonderful.  Really.

> And fuck the "hints" of all kinds, pardon the rudeness.

Don't mind at all.  In fact I usually enjoy it.

>  We already have more than enough of that crap and it already makes
> cleaning the logics up bloody painful.

Separate i_op for it is fine by me as well.

Not that I care very much.  I have plans for such a bulk lookup
interface in fuse, but that's far in the future.

Miklos


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 17:33             ` Miklos Szeredi
@ 2008-05-05 17:40               ` Al Viro
  2008-05-05 18:03                 ` Miklos Szeredi
  2008-05-05 18:23                 ` Enrico Weigelt
  0 siblings, 2 replies; 19+ messages in thread
From: Al Viro @ 2008-05-05 17:40 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: weigelt, linux-kernel

On Mon, May 05, 2008 at 07:33:00PM +0200, Miklos Szeredi wrote:

> > This is still wrong.  We not just pass the pathname to filesystem (note
> > that you still need to deal with symlinks),
> 
> Symlinks are easy: filesystem just needs to *stop* the resolution the
> moment it finds one.

That assumes you see types of objects as you do multi-step walk...

> >  Take a look at 9P walk - it does *not* give
> > you anything resembling stat, you just get qids of intermediates.
> 
> Which is exactly what's needed to populate the dentry tree, no?

No - you need inodes as well (i.e. as the absolute least you want
mode and ownership).  Which is to say, you need to issue stat on
each component in such situation anyway.  Not a win...

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 17:40               ` Al Viro
@ 2008-05-05 18:03                 ` Miklos Szeredi
  2008-05-05 18:31                   ` Miklos Szeredi
  2008-05-05 18:50                   ` Enrico Weigelt
  2008-05-05 18:23                 ` Enrico Weigelt
  1 sibling, 2 replies; 19+ messages in thread
From: Miklos Szeredi @ 2008-05-05 18:03 UTC (permalink / raw)
  To: viro; +Cc: miklos, weigelt, linux-kernel

> > > This is still wrong.  We not just pass the pathname to filesystem (note
> > > that you still need to deal with symlinks),
> > 
> > Symlinks are easy: filesystem just needs to *stop* the resolution the
> > moment it finds one.
> 
> That assumes you see types of objects as you do multi-step walk...

Umm, OK.  The 9P server does see the type of objects, so it should be
able to do that.

> > >  Take a look at 9P walk - it does *not* give
> > > you anything resembling stat, you just get qids of intermediates.
> > 
> > Which is exactly what's needed to populate the dentry tree, no?
> 
> No - you need inodes as well (i.e. as the absolute least you want
> mode and ownership).  Which is to say, you need to issue stat on
> each component in such situation anyway.  Not a win...

You're right.  It doesn't sound too good, although it all depends on
how permission checking is done.  If it's done in the server, then
neither the mode nor the ownership is needed for lookup.  The file
type *is* known for all but the last component, and doing a stat for
that one is not a big issue.  All this is modulo the symlink issue, of
course.

Miklos

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 17:40               ` Al Viro
  2008-05-05 18:03                 ` Miklos Szeredi
@ 2008-05-05 18:23                 ` Enrico Weigelt
  2008-05-05 18:34                   ` Al Viro
  1 sibling, 1 reply; 19+ messages in thread
From: Enrico Weigelt @ 2008-05-05 18:23 UTC (permalink / raw)
  To: linux kernel list

* Al Viro <viro@ZenIV.linux.org.uk> wrote:

> > Symlinks are easy: filesystem just needs to *stop* the resolution the
> > moment it finds one.
> 
> That assumes you see types of objects as you do multi-step walk...

I've just read the spec for walk again:

Assuming the server doesn't resolve symlinks itself, the walk
will fail right at the symlink. So we can take a closer look
there and try stat()'ing it (which adds one more request). If the
failure point *is* a symlink, we need to handle it properly.

Would it be very complicated to give the link target back to
the VFS and let the lookup start again (with the new name)?

> No - you need inodes as well (i.e. as the absolute least you want
> mode and ownership).  Which is to say, you need to issue stat on
> each component in such situation anyway.  Not a win...

Naive question: is it really *necessary* to have all the 
intermediate dirs in dcache ?


cu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 18:03                 ` Miklos Szeredi
@ 2008-05-05 18:31                   ` Miklos Szeredi
  2008-05-05 20:16                     ` Trond Myklebust
  2008-05-05 18:50                   ` Enrico Weigelt
  1 sibling, 1 reply; 19+ messages in thread
From: Miklos Szeredi @ 2008-05-05 18:31 UTC (permalink / raw)
  To: miklos; +Cc: viro, miklos, weigelt, linux-kernel

> > > >  Take a look at 9P walk - it does *not* give
> > > > you anything resembling stat, you just get qids of intermediates.
> > > 
> > > Which is exactly what's needed to populate the dentry tree, no?
> > 
> > No - you need inodes as well (i.e. as the absolute least you want
> > mode and ownership).  Which is to say, you need to issue stat on
> > each component in such situation anyway.  Not a win...

And actually even that *could* be a win, if the network latency is
large.  Because by doing the lookup first, the stats can be performed
in parallel.  So a path with an arbitrary number of components could
be resolved in just 2 RTTs.
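
As a rough worked example of that arithmetic, here is a small model of
the round-trip counts. Both functions are hypothetical and rest on
assumptions not stated in this thread: that the per-component case pays
one lookup plus one stat per name, strictly in sequence, and that in
the bulk case the whole path fits in a single walk request and all
stats can be sent back to back so that their replies overlap.

```c
#include <assert.h>

/* Assumed baseline: one lookup and one stat per component, in sequence. */
int rtts_per_component(int ncomponents)
{
    assert(ncomponents >= 0);
    return 2 * ncomponents;
}

/*
 * Assumed bulk case: one multi-component walk, then all stats issued
 * in parallel, which together cost a single additional round trip.
 */
int rtts_bulk_walk(int ncomponents)
{
    assert(ncomponents >= 0);
    return ncomponents == 0 ? 0 : 2;
}
```

Under these assumptions a four-component path drops from eight round
trips to two, and the saving grows linearly with path depth.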

Miklos

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 18:23                 ` Enrico Weigelt
@ 2008-05-05 18:34                   ` Al Viro
  2008-05-05 19:02                     ` Enrico Weigelt
  0 siblings, 1 reply; 19+ messages in thread
From: Al Viro @ 2008-05-05 18:34 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: linux kernel list

On Mon, May 05, 2008 at 08:23:09PM +0200, Enrico Weigelt wrote:

> Assuming the server doesn't resolve symlinks itself,

Um...  What the hell are you talking about?  How _can_ server resolve
symlinks, when result of symlink resolution depends on where the damn
thing is mounted on client and even how deeply the process trying to
do lookup happens to be chrooted?

It wouldn't work even for relative symlinks - remember that we might
bloody well have something bound on the middle of the path in question.
Without any knowledge by fs.

> Naive question: is it really *necessary* to have all the 
> intermediate dirs in dcache ?

The answer's "yes".

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 18:03                 ` Miklos Szeredi
  2008-05-05 18:31                   ` Miklos Szeredi
@ 2008-05-05 18:50                   ` Enrico Weigelt
  1 sibling, 0 replies; 19+ messages in thread
From: Enrico Weigelt @ 2008-05-05 18:50 UTC (permalink / raw)
  To: linux kernel list

* Miklos Szeredi <miklos@szeredi.hu> wrote:

> You're right. It doesn't sound too good, although it all depends
> on how permission checking is done.  If it's done in the server,
> then neither the mode nor the ownership is needed for lookup.

It really should be done in the server. But this adds another
issue (no idea if the current 9p driver already handles this):
We need one link for each user accessing the filesystem.


cu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 18:34                   ` Al Viro
@ 2008-05-05 19:02                     ` Enrico Weigelt
  2008-05-05 19:09                       ` Al Viro
  0 siblings, 1 reply; 19+ messages in thread
From: Enrico Weigelt @ 2008-05-05 19:02 UTC (permalink / raw)
  To: linux kernel list

* Al Viro <viro@ZenIV.linux.org.uk> wrote:

> How _can_ server resolve symlinks, when result of symlink 
> resolution depends on where the damn thing is mounted on client 
> and even how deeply the process trying to do lookup happens 
> to be chrooted?

In the same way as, e.g., HTTP servers do. Of course this fails
if the symlink isn't resolvable within the server's fs.

Several years ago, I saw exactly this behaviour with Samba.
Whether this is what you might expect is another story ;-P

> > Naive question: is it really *necessary* to have all the 
> > intermediate dirs in dcache ?
> 
> The answer's "yes".

What exactly are they needed for ? 
Which information is needed ?
Can we perhaps fake them (at least we know - on success - the
intermediate components are dirs) ?


cu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 19:02                     ` Enrico Weigelt
@ 2008-05-05 19:09                       ` Al Viro
  0 siblings, 0 replies; 19+ messages in thread
From: Al Viro @ 2008-05-05 19:09 UTC (permalink / raw)
  To: Enrico Weigelt; +Cc: linux kernel list

On Mon, May 05, 2008 at 09:02:43PM +0200, Enrico Weigelt wrote:
> * Al Viro <viro@ZenIV.linux.org.uk> wrote:
> 
> > How _can_ server resolve symlinks, when result of symlink 
> > resolution depends on where the damn thing is mounted on client 
> > and even how deeply the process trying to do lookup happens 
> > to be chrooted?
> 
> In the same way as, e.g., HTTP servers do. Of course this fails
> if the symlink isn't resolvable within the server's fs.

Umm...  You know, it might make more sense if you
	* explained what you are really trying to do
	* short of that, perhaps figured out what the hell symlinks and
bindings _are_.

Again, _no_ symlink is resolvable by server alone, simply because
server can not know if target of that symlink is overmounted from
the point of view of whoever is doing lookup.  Note that it *does*
depend on who's doing that and where in the namespace we are seeing
that sucker (the latter kills the "we want per-user connection"
variants).
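
A toy user-space illustration of the dependence on the looker's root,
and nothing more: resolve_abs_symlink() is a hypothetical helper that
only prefixes an absolute symlink target with the root directory of
the process doing the lookup. It deliberately ignores mounts, binds
and "..", all of which make the real answer depend even more on the
client's namespace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the path an absolute symlink `target` names for a process whose
 * root directory is `root` (as seen from the outermost namespace).
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
int resolve_abs_symlink(const char *root, const char *target,
                        char *out, size_t outlen)
{
    int n;

    assert(root != NULL && target != NULL && target[0] == '/');
    if (strcmp(root, "/") == 0)     /* an unchrooted process adds no prefix */
        root = "";
    n = snprintf(out, outlen, "%s%s", root, target);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

The same link target "/etc/passwd" names "/etc/passwd" for one process
and "/srv/jail/etc/passwd" for a process chrooted into "/srv/jail"
(a made-up directory), and the server has no way of telling the two
apart.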

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 18:31                   ` Miklos Szeredi
@ 2008-05-05 20:16                     ` Trond Myklebust
  2008-05-05 20:35                       ` Miklos Szeredi
  0 siblings, 1 reply; 19+ messages in thread
From: Trond Myklebust @ 2008-05-05 20:16 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: viro, weigelt, linux-kernel

On Mon, 2008-05-05 at 20:31 +0200, Miklos Szeredi wrote:
> > > > >  Take a look at 9P walk - it does *not* give
> > > > > you anything resembling stat, you just get qids of intermediates.
> > > > 
> > > > Which is exactly what's needed to populate the dentry tree, no?
> > > 
> > > No - you need inodes as well (i.e. as the absolute least you want
> > > mode and ownership).  Which is to say, you need to issue stat on
> > > each component in such situation anyway.  Not a win...
> 
> And actually even that *could* be a win, if the network latency is
> large.  Because by doing the lookup first, the stats can be performed
> in parallel.  So a path with an arbitrary number of components could
> be resolved in just 2 RTTs.

...and NFSv4 could do it in a single RPC call (assuming no symlinks or
submounts).

Cheers
  Trond


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: VFS + path walktrough
  2008-05-05 20:16                     ` Trond Myklebust
@ 2008-05-05 20:35                       ` Miklos Szeredi
  0 siblings, 0 replies; 19+ messages in thread
From: Miklos Szeredi @ 2008-05-05 20:35 UTC (permalink / raw)
  To: trond.myklebust; +Cc: miklos, viro, weigelt, linux-kernel

> On Mon, 2008-05-05 at 20:31 +0200, Miklos Szeredi wrote:
> > > > > >  Take a look at 9P walk - it does *not* give
> > > > > > you anything resembling stat, you just get qids of intermediates.
> > > > > 
> > > > > Which is exactly what's needed to populate the dentry tree, no?
> > > > 
> > > > No - you need inodes as well (i.e. as the absolute least you want
> > > > mode and ownership).  Which is to say, you need to issue stat on
> > > > each component in such situation anyway.  Not a win...
> > 
> > And actually even that *could* be a win, if the network latency is
> > large.  Because by doing the lookup first, the stats can be performed
> > in parallel.  So a path with an arbitrary number of components could
> > be resolved in just 2 RTTs.
> 
> ...and NFSv4 could do it in a single RPC call (assuming no symlinks or
> submounts).

And just to show how utterly trivially this could be done, here's a
patch (totally untested).

Hack?  Hell, yes.

Miklos

---
 fs/namei.c         |   20 ++++++++++++--------
 include/linux/fs.h |    1 +
 2 files changed, 13 insertions(+), 8 deletions(-)

Index: linux-2.6/fs/namei.c
===================================================================
--- linux-2.6.orig/fs/namei.c	2008-05-05 12:03:24.000000000 +0200
+++ linux-2.6/fs/namei.c	2008-05-05 22:25:52.000000000 +0200
@@ -519,14 +519,18 @@ static struct dentry * real_lookup(struc
 	 */
 	result = d_lookup(parent, name);
 	if (!result) {
-		struct dentry * dentry = d_alloc(parent, name);
-		result = ERR_PTR(-ENOMEM);
-		if (dentry) {
-			result = dir->i_op->lookup(dir, dentry, nd);
-			if (result)
-				dput(dentry);
-			else
-				result = dentry;
+		if (dir->i_op->lookup_path) {
+			result = dir->i_op->lookup_path(dir, name);
+		} else  {
+			struct dentry * dentry = d_alloc(parent, name);
+			result = ERR_PTR(-ENOMEM);
+			if (dentry) {
+				result = dir->i_op->lookup(dir, dentry, nd);
+				if (result)
+					dput(dentry);
+				else
+					result = dentry;
+			}
 		}
 		mutex_unlock(&dir->i_mutex);
 		return result;
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h	2008-05-05 12:03:24.000000000 +0200
+++ linux-2.6/include/linux/fs.h	2008-05-05 22:26:59.000000000 +0200
@@ -1251,6 +1251,7 @@ struct file_operations {
 struct inode_operations {
 	int (*create) (struct inode *,struct dentry *,int, struct nameidata *);
 	struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
+	struct dentry * (*lookup_path) (struct inode *, struct qstr *);
 	int (*link) (struct dentry *,struct inode *,struct dentry *);
 	int (*unlink) (struct inode *,struct dentry *);
 	int (*symlink) (struct inode *,struct dentry *,const char *);






^ permalink raw reply	[flat|nested] 19+ messages in thread
