From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Christian Brauner <brauner@kernel.org>
Cc: "Ahelenia Ziemiańska" <nabijaczleweli@nabijaczleweli.xyz>,
"David Howells" <dhowells@redhat.com>,
"Alexander Viro" <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Pending splice(file -> FIFO) excludes all other FIFO operations forever (was: ... always blocks read(FIFO), regardless of O_NONBLOCK on read side?)
Date: Fri, 7 Jul 2023 13:57:06 -0600 [thread overview]
Message-ID: <42294e31-9cc8-3c5a-c28f-cfa3854fbe69@kernel.dk> (raw)
In-Reply-To: <CAHk-=whHXogGiPkGFwQQBtn364M4caVNcBTs7hLNfa_X67ouzA@mail.gmail.com>
On 7/7/23 1:10?PM, Linus Torvalds wrote:
> On Fri, 7 Jul 2023 at 10:21, Christian Brauner <brauner@kernel.org> wrote:
>>
>> Forgot to say, fwiw, I've been running this through the LTP splice,
>> pipe, and ipc tests without issues. A hanging reader can be signaled
>> away cleanly with this.
>
> So that patch still has a couple of "wait for this" cases remaining.
>
> In particular, when we do a read, and we do have pipe buffers, both
> the read() system call and a number of internal splice functions will
> go "Ahh, I have data", and then do pipe_buf_confirm() and read it.
>
> Which then results in pipe_buf_confirm() blocking. It now blocks
> interruptibly, which is much nicer, but several of these users *could*
> just do a non-blocking confirmation instead, and wait for pipe
> readability.
>
> HOWEVER, that's slightly less trivial than you'd expect, because the
> "wait for readability" needs to be done without the pipe lock held -
> so you can't actually check the pipe buffer state at that point (since
> you need the pipe lock to look up the buffer).
>
> That's true even of "trivial" cases like actual user-space "read()
> with O_NONBLOCK and poll()" situations.
>
> Now, the solution to all this is *fairly* straightforward:
>
> (a) don't use "!pipe_empty()" for a readability check.
>
> We already have "pipe_readable()", but it's hidden in fs/pipe.c,
> so all the splice() code ended up writing the "does this pipe have
> data" using "!pipe_empty()" instead.
>
> (b) make "pipe_buf_confirm()" take a "non-blocking" boolean argument,
> and if it is non-blocking but hits one of those blocked pages, set
> "pipe->not_ready", and return -EAGAIN.
>
> This is ok, because "pipe_buf_confirm()" is always under the pipe
> lock, and we'll just clear "pipe->not_ready" under the pipe lock after
> finalizing all those pages (and before waking up readers)
>
> (c) make "pipe_wait_readable()" and "poll()" know about this all, so
> that we wait properly for a pipe that was not ready to become ready
>
> This all makes *most* users deal properly with these blocking events.
> In particular, things like splice_to_socket() can now do the whole
> proper "wait without holding the pipe lock" sequence, even when the
> pipe is not empty, just in this blocked state.
>
> This *may* also make all the cases Jens had with io_uring and splicing
> JustWork(tm).
Exactly! I was reading this thread with excitement just now, would be
nice to get rid of that kludge.
> NOTE! NOTE! NOTE! Once more, this "feels right to me", and I'd argue
> that the basic approach is fairly straightfoward. The patch is also
> not horrendous. It all makes a fair amount of sense. BUT! I haven't
> tested this, and like the previous patch, I really would want people
> to think about this a lot.
>
> Comments? Jens?
I'll take a closer look at this, but won't be until Monday most likely.
But the approach seems sane, and going in a more idiomatic direction
than before. So seems promising.
--
Jens Axboe
next prev parent reply other threads:[~2023-07-07 19:57 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-26 1:12 Pending splice(file -> FIFO) always blocks read(FIFO), regardless of O_NONBLOCK on read side? Ahelenia Ziemiańska
2023-06-26 9:32 ` Christian Brauner
2023-06-26 11:59 ` Pending splice(file -> FIFO) excludes all other FIFO operations forever (was: ... always blocks read(FIFO), regardless of O_NONBLOCK on read side?) Ahelenia Ziemiańska
2023-06-26 15:56 ` Christian Brauner
2023-06-26 16:14 ` Ahelenia Ziemiańska
2023-07-06 21:56 ` Linus Torvalds
2023-07-07 17:21 ` Christian Brauner
2023-07-07 19:10 ` Linus Torvalds
2023-07-07 19:57 ` Jens Axboe [this message]
2023-07-07 22:41 ` Ahelenia Ziemiańska
2023-07-07 22:57 ` Linus Torvalds
2023-07-08 0:30 ` Ahelenia Ziemiańska
2023-07-08 20:06 ` Linus Torvalds
2023-07-09 1:03 ` Ahelenia Ziemiańska
2023-07-09 22:33 ` Ahelenia Ziemiańska
2023-07-10 13:22 ` Ahelenia Ziemiańska
2023-07-08 0:00 ` Matthew Wilcox
2023-07-08 0:07 ` Linus Torvalds
2023-07-08 0:21 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=42294e31-9cc8-3c5a-c28f-cfa3854fbe69@kernel.dk \
--to=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=dhowells@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=nabijaczleweli@nabijaczleweli.xyz \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).