From: Jens Axboe <axboe@kernel.dk>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org, torvalds@linux-foundation.org
Subject: Re: [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK
Date: Fri, 11 Dec 2020 08:57:26 -0700 [thread overview]
Message-ID: <bef3f905-f6b7-1134-7ca9-ff9385d6bf86@kernel.dk> (raw)
In-Reply-To: <20201211023555.GV3579531@ZenIV.linux.org.uk>
On 12/10/20 7:35 PM, Al Viro wrote:
> On Thu, Dec 10, 2020 at 01:01:13PM -0700, Jens Axboe wrote:
>> io_uring always punts opens to async context, since there's no control
>> over whether the lookup blocks or not. Add LOOKUP_NONBLOCK to support
>> just doing the fast RCU based lookups, which we know will not block. If
>> we can do a cached path resolution of the filename, then we don't have
>> to always punt lookups for a worker.
>>
>> During path resolution, we always do LOOKUP_RCU first. If that fails and
>> we terminate LOOKUP_RCU, then fail a LOOKUP_NONBLOCK attempt as well.
>
> In effect you are adding a mode where
> * unlazy would fail, except when done from complete_walk()
> * ->d_revalidate() wouldn't be attempted at all (not even with LOOKUP_RCU)
> * ... but ->get_link() in RCU mode would
> * ... and so would everything done after complete_walk() in
> do_open(), very much including the joys like mnt_want_write() (i.e. waiting for
> frozen fs to thaw), handling O_TRUNC, calling ->open() itself...
>
> So this "not punting lookups for a worker" looks fishy as hell - if you care
> about blocking operations, you haven't really won anything.
>
> And why exactly is the RCU case of ->d_revalidate() worth buggering off (it
> really can't block - it's called under rcu_read_lock() and it does *not*
> drop it)?
>
> _IF_ for some theoretical exercise you want to do "lookup without dropping
> out of RCU", just add a flag that has unlazy_walk() fail. With -ECHILD.
> Strip it away in complete_walk() and have path_init() with that flag
> and without LOOKUP_RCU fail with -EAGAIN. All there is to it.
Thanks Al, that makes for an easier implementation. I like that suggestion,
boils it down to just three hunks (see below).
For io_uring, the concept is just to perform the fast path inline. The
RCU lookup serves that purpose nicely - if we fail that, then it's expected
to take the latency hit of going async.
> It still leaves you with fuckloads of blocking operations (and that's
> "blocking" with "until admin thaws the damn filesystem several hours
> down the road") after complete_walk(), though.
But that's true (and expected) for any open that isn't non-blocking.
diff --git a/fs/namei.c b/fs/namei.c
index d7952f863e79..d49c72e34c6e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -686,6 +686,8 @@ static bool unlazy_walk(struct nameidata *nd)
BUG_ON(!(nd->flags & LOOKUP_RCU));
nd->flags &= ~LOOKUP_RCU;
+ if (nd->flags & LOOKUP_NONBLOCK)
+ goto out1;
if (unlikely(!legitimize_links(nd)))
goto out1;
if (unlikely(!legitimize_path(nd, &nd->path, nd->seq)))
@@ -792,6 +794,7 @@ static int complete_walk(struct nameidata *nd)
*/
if (!(nd->flags & (LOOKUP_ROOT | LOOKUP_IS_SCOPED)))
nd->root.mnt = NULL;
+ nd->flags &= ~LOOKUP_NONBLOCK;
if (unlikely(unlazy_walk(nd)))
return -ECHILD;
}
@@ -2209,6 +2212,9 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
if (!*s)
flags &= ~LOOKUP_RCU;
+ /* LOOKUP_NONBLOCK requires RCU, ask caller to retry */
+ if ((flags & (LOOKUP_RCU | LOOKUP_NONBLOCK)) == LOOKUP_NONBLOCK)
+ return ERR_PTR(-EAGAIN);
if (flags & LOOKUP_RCU)
rcu_read_lock();
--
Jens Axboe
next prev parent reply other threads:[~2020-12-11 17:07 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-10 20:01 [PATCHSET 0/2] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK Jens Axboe
2020-12-10 20:01 ` [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK Jens Axboe
2020-12-10 20:53 ` Linus Torvalds
2020-12-10 21:06 ` Jens Axboe
2020-12-11 2:45 ` Al Viro
2020-12-11 16:05 ` Jens Axboe
2020-12-11 17:20 ` Al Viro
2020-12-11 17:35 ` Linus Torvalds
2020-12-11 18:50 ` Jens Axboe
2020-12-11 21:51 ` Al Viro
2020-12-11 23:47 ` Jens Axboe
2020-12-11 17:33 ` Matthew Wilcox
2020-12-11 18:55 ` Jens Axboe
2020-12-11 2:35 ` Al Viro
2020-12-11 15:57 ` Jens Axboe [this message]
2020-12-11 17:21 ` Linus Torvalds
2020-12-11 17:29 ` Al Viro
2020-12-11 17:38 ` Al Viro
2020-12-11 17:44 ` Linus Torvalds
2020-12-11 21:46 ` Jens Axboe
2020-12-10 20:01 ` [PATCH 2/2] fs: expose LOOKUP_NONBLOCK through openat2() RESOLVE_NONBLOCK Jens Axboe
2020-12-10 22:29 ` Dave Chinner
2020-12-10 23:12 ` Jens Axboe
2020-12-10 23:29 ` Linus Torvalds
2020-12-11 0:58 ` Dave Chinner
2020-12-11 1:01 ` Linus Torvalds
2020-12-11 3:45 ` Dave Chinner
2020-12-11 18:07 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bef3f905-f6b7-1134-7ca9-ff9385d6bf86@kernel.dk \
--to=axboe@kernel.dk \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).