From: Jens Axboe <axboe@kernel.dk>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK
Date: Fri, 11 Dec 2020 09:05:26 -0700 [thread overview]
Message-ID: <89f96b42-9d58-cd46-e157-758e91269d89@kernel.dk> (raw)
In-Reply-To: <20201211024553.GW3579531@ZenIV.linux.org.uk>
On 12/10/20 7:45 PM, Al Viro wrote:
> On Thu, Dec 10, 2020 at 02:06:39PM -0700, Jens Axboe wrote:
>> On 12/10/20 1:53 PM, Linus Torvalds wrote:
>>> On Thu, Dec 10, 2020 at 12:01 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> io_uring always punts opens to async context, since there's no control
>>>> over whether the lookup blocks or not. Add LOOKUP_NONBLOCK to support
>>>> just doing the fast RCU based lookups, which we know will not block. If
>>>> we can do a cached path resolution of the filename, then we don't have
>>>> to always punt lookups for a worker.
>>>
>>> Ok, this looks much better to me just from the name change.
>>>
>>> Half of the patch is admittedly just to make sure it now returns the
>>> right error from unlazy_walk (rather than knowing it's always
>>> -ECHILD), and that could be its own thing, but I'm not sure it's even
>>> worth splitting up. The only reason to do it would be to perhaps make
>>> it really clear which part is the actual change, and which is just
>>> that error handling cleanup.
>>>
>>> So it looks fine to me, but I will leave this all to Al.
>>
>> I did consider doing a prep patch just making the error handling clearer
>> and get rid of the -ECHILD assumption, since it's pretty odd and not
>> even something I'd expect to see in there. Al, do you want a prep patch
>> to do that to make the change simpler/cleaner?
>
> No, I do not. Why bother returning anything other than -ECHILD, when
> you can just have path_init() treat you flag sans LOOKUP_RCU as "fail
> with -EAGAIN now" and be done with that?
>
> What's the point propagating that thing when we are going to call the
> non-RCU variant next if we get -ECHILD?
Let's at least make it consistent - there is already one spot in there
that passes the return value back (see below).
> And that still doesn't answer the questions about the difference between
> ->d_revalidate() and ->get_link() (for the latter you keep the call in
> RCU mode, for the former you generate that -EAGAIN crap). Or between
> ->d_revalidate() and ->permission(), for that matter.
I believe these are moot with the updated patch from the other email.
> Finally, I really wonder what is that for; if you are in conditions when
> you really don't want to risk going to sleep, you do *NOT* want to
> do mnt_want_write(). Or ->open(). Or truncate(). Or, for Cthulhu
> sake, IMA hash calculation.
I just want to do the RCU side lookup, that is all. That's my fast path.
If that doesn't work, then we'll go through the motions of pushing this
to a context that allows blocking open.
> So how hard are your "we don't want to block here" requirements? Because
> the stuff you do after complete_walk() can easily be much longer than
> everything else.
Ideally it'd extend a bit beyond the RCU lookup, as things like proc
resolution will still fail with the proposed patch. But that's not a
huge deal to me, I consider the dentry lookup to be Good Enough.
commit bbfc4b98da8c5d9a64ae202952aa52ae6bb54dbd
Author: Jens Axboe <axboe@kernel.dk>
Date: Thu Dec 10 14:10:37 2020 -0700
fs: make unlazy_walk() error handling consistent
Most callers check for non-zero return, and assume it's -ECHILD (which
it always will be). One caller uses the actual error return. Clean this
up and make it fully consistent, by having unlazy_walk() return a bool
instead.
No functional changes in this patch.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/fs/namei.c b/fs/namei.c
index 03d0e11e4f36..d7952f863e79 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -679,7 +679,7 @@ static bool legitimize_root(struct nameidata *nd)
* Nothing should touch nameidata between unlazy_walk() failure and
* terminate_walk().
*/
-static int unlazy_walk(struct nameidata *nd)
+static bool unlazy_walk(struct nameidata *nd)
{
struct dentry *parent = nd->path.dentry;
@@ -694,14 +694,14 @@ static int unlazy_walk(struct nameidata *nd)
goto out;
rcu_read_unlock();
BUG_ON(nd->inode != parent->d_inode);
- return 0;
+ return false;
out1:
nd->path.mnt = NULL;
nd->path.dentry = NULL;
out:
rcu_read_unlock();
- return -ECHILD;
+ return true;
}
/**
@@ -3151,9 +3151,8 @@ static const char *open_last_lookups(struct nameidata *nd,
} else {
/* create side of things */
if (nd->flags & LOOKUP_RCU) {
- error = unlazy_walk(nd);
- if (unlikely(error))
- return ERR_PTR(error);
+ if (unlazy_walk(nd))
+ return ERR_PTR(-ECHILD);
}
audit_inode(nd->name, dir, AUDIT_INODE_PARENT);
/* trailing slashes? */
--
Jens Axboe
next prev parent reply other threads:[~2020-12-11 17:16 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-10 20:01 [PATCHSET 0/2] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK Jens Axboe
2020-12-10 20:01 ` [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK Jens Axboe
2020-12-10 20:53 ` Linus Torvalds
2020-12-10 21:06 ` Jens Axboe
2020-12-11 2:45 ` Al Viro
2020-12-11 16:05 ` Jens Axboe [this message]
2020-12-11 17:20 ` Al Viro
2020-12-11 17:35 ` Linus Torvalds
2020-12-11 18:50 ` Jens Axboe
2020-12-11 21:51 ` Al Viro
2020-12-11 23:47 ` Jens Axboe
2020-12-11 17:33 ` Matthew Wilcox
2020-12-11 18:55 ` Jens Axboe
2020-12-11 2:35 ` Al Viro
2020-12-11 15:57 ` Jens Axboe
2020-12-11 17:21 ` Linus Torvalds
2020-12-11 17:29 ` Al Viro
2020-12-11 17:38 ` Al Viro
2020-12-11 17:44 ` Linus Torvalds
2020-12-11 21:46 ` Jens Axboe
2020-12-10 20:01 ` [PATCH 2/2] fs: expose LOOKUP_NONBLOCK through openat2() RESOLVE_NONBLOCK Jens Axboe
2020-12-10 22:29 ` Dave Chinner
2020-12-10 23:12 ` Jens Axboe
2020-12-10 23:29 ` Linus Torvalds
2020-12-11 0:58 ` Dave Chinner
2020-12-11 1:01 ` Linus Torvalds
2020-12-11 3:45 ` Dave Chinner
2020-12-11 18:07 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=89f96b42-9d58-cd46-e157-758e91269d89@kernel.dk \
--to=axboe@kernel.dk \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).