linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <npiggin@suse.de>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH][RFC] %pd - for printing dentry name
Date: Thu, 4 Feb 2010 18:40:57 +1100	[thread overview]
Message-ID: <20100204074057.GA13318@laptop> (raw)
In-Reply-To: <20100204060237.GD30031@ZenIV.linux.org.uk>

On Thu, Feb 04, 2010 at 06:02:37AM +0000, Al Viro wrote:
> On Tue, Feb 02, 2010 at 05:36:31PM +1100, Nick Piggin wrote:
> 
> > I ended up having to use a seqlock to do name comparison without locks
> > (and without holding references for that matter, just RCU).  However
> > name comparison is obviously a lot more critical because you can't
> > ignore races, so you might be able to do something simpler.
> 
> Umm...  Your ->d_seq, you mean?  IIRC, that stuff landed too close to
> do_filp_open() queue back then *and* had potential headache from hell
> with vfsmount side of that business.
> 
> I've reread it now.  Comments:

OK. It's at the end of a big patch series and the path walk stuff
was proof of concept quality at best. In particular I was hoping
for a better way to share code.

That said, we can move pieces around if you are interested to
cherry pick things. I should rediff and resend the series.

> 	* mntput() is blocking.  When refcount goes to 0, we get
> dput() on root, possibly followed by actual fs shutdown.  Very
> much not an RCU fodder, even though most of the calls will be OK.
> We can do a variant that would do atomic_dec_and_bail() instead ;-)
> I.e. decrement atomically if greater than 1, bail out otherwise.

Good point. atomic_dec_and_bail seems like it would work nicely.


> 	* why do you bail out on LOOKUP_PARENT?  For that matter, why
> do you do that so late?

Ah no good reason. Probably I was just trying to get some basic
testable cases working. I indeed hope to handle more cases and
bail out more gracefully (ie. attempt to resume walking from
last successful lookup rather than start the entire path again).


> 	* how does that interact with d_materialise_unique()?  Sure,
> you bail out on NFS anyway, but it's still not nice to leave relying just
> on that.

Yeah it does need some more thinking about these cases. In fact,
it's even missing a lot of basic rcu_assign_pointer / rcu_deref
cases that are needed if we are to make lookups totally lock free.

For harder cases, I'm hoping we can get away with using ->d_seq
to avoid thinking too hard about ordering.

Like I said, it's very much "known buggy" / proof of concept stage.
But if you're happy with the basic idea, then I'm keen to get to
work on it.

Do you have opinions on the rest of the vfs scalability series?


> 	* why can we access dentry->d_inode->i_op contents?  Or dentry->d_op
> one, for that matter...

Good point. Require an rcu_barrier in unregister_filesystem I guess.

 
> BTW, I'd love to take the entire "last component" part of link_path_walk()
> out into a separate function and away from link_path_walk(), leaving
> basically just the LOOKUP_PARENT case in there.  Price: trailing symlinks
> need to be handled by an iterative loop in do_follow_link().  And
> that actually ends up an improvement both in stack depth and in overall
> code cleanup.  Nothing like the horrors in do_filp_open(), TYVM (said
> horrors had mostly gone away in #untested, BTW, and I'm going to move
> that series to for-next shortly).  However, we are still several prereqs
> away from link_path_walk() split, so that'll have to wait a bit.  In
> any case, LOOKUP_PARENT is very much worth the first-class treatment...

I'll have to take a look at what you're doing.

Thanks,
Nick


  reply	other threads:[~2010-02-04  7:41 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-02-01 22:25 [PATCH][RFC] %pd - for printing dentry name Al Viro
2010-02-01 22:34 ` Al Viro
2010-02-01 22:37 ` Linus Torvalds
2010-02-01 23:18   ` Al Viro
2010-02-02  1:06     ` Al Viro
2010-02-02  5:55       ` Eric W. Biederman
2010-02-02 17:01         ` Al Viro
2010-02-02 18:10           ` Olivier Galibert
2010-02-02 19:19           ` Eric W. Biederman
2010-02-03  3:04             ` Al Viro
2010-02-04  4:53               ` Al Viro
2010-02-02  4:22     ` Linus Torvalds
2010-02-02  5:00       ` Al Viro
2010-02-02  6:36         ` Nick Piggin
2010-02-04  6:02           ` Al Viro
2010-02-04  7:40             ` Nick Piggin [this message]
2010-02-02  6:53     ` Paul E. McKenney
2010-02-02  7:09       ` Al Viro
2010-02-02 13:32         ` Matthew Wilcox
2010-02-02 15:56           ` Linus Torvalds
2010-02-02 16:13             ` Matthew Wilcox
2010-02-02 16:43           ` Al Viro
2010-02-03 10:52         ` Paul E. McKenney
2010-02-03  2:49       ` Paul E. McKenney
2010-02-04 15:29         ` Linus Torvalds
2010-02-04 16:02           ` Paul E. McKenney
2010-02-04 17:13             ` Linus Torvalds
2010-02-04 17:36               ` Al Viro
2010-02-07 16:34                 ` Paul E. McKenney
2010-02-01 22:45 ` Joe Perches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100204074057.GA13318@laptop \
    --to=npiggin@suse.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).