From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL] pipe: nonblocking rw for io_uring
Date: Mon, 24 Apr 2023 16:07:21 -0600
Message-ID: <2e7d4f63-7ddd-e4a6-e7eb-fd2a305d442e@kernel.dk>
In-Reply-To: <CAHk-=wgGzwaz2yGO9_PFv4O1ke_uHg25Ab0UndK+G9vJ9V4=hw@mail.gmail.com>

On 4/24/23 3:58 PM, Linus Torvalds wrote:
> On Mon, Apr 24, 2023 at 2:37 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> And I completely refuse to add that trylock hack to paper that over.
>> The pipe lock is *not* meant for IO.
>
> If you want to paper it over, do it other ways.
>
> I'd love to just magically fix splice, but hey, that might not be possible.

Don't think it is... At least not trivially.

> But possible fixes papering this over might be to make splice "poison"
> a pipe, and make io_uring fall back on io workers only on pipes that
> do splice. Make any normal pipe read/write load sane.
>
> And no, don't worry about races. If you have the same pipe used for
> io_uring IO *and* somebody else then doing splice on it and racing,
> just take the loss and tell people that they might hit a slow case if
> they do stupid things.
>
> Basically, the patch might look like something like
>
> - do_pipe() sets FMODE_NOWAIT by default when creating a pipe
>
> - splice then clears FMODE_NOWAIT on pipes as they are used
>
> and now io_uring sees whether the pipe is playing nice or not.
>
> As far as I can tell, something like that would make the
> 'pipe_buf_confirm()' part unnecessary too, since that's only relevant
> for splice.
>
> A fancier version might be to only do that "splice then clears
> FMODE_NOWAIT" thing if the other side of the splice has not set
> FMODE_NOWAIT.
>
> Honestly, if the problem is "pipe IO is slow", then splice should not
> be the thing you optimize for.

I think that'd be an acceptable approach, and would at least fix the
pure pipe case, which I suspect is 99.9% of them, if not more. And yes,
it'd mean that we don't need to do the ->confirm() change either, as the
pipe is already tainted at that point.

I'll respin a v2, post it, and send it in later this merge window.

--
Jens Axboe
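
For readers following the thread, the scheme discussed above (pipes start out with FMODE_NOWAIT set, splice clears it, io_uring checks it) can be modeled in userspace roughly as below. This is only a toy sketch, not the kernel patch: `struct model_file`, `MODEL_FMODE_NOWAIT`, `model_do_pipe()`, `model_splice()` and `model_can_nowait()` are all hypothetical names made up for illustration, and clearing the flag only on the end that splice touched is an assumption of this model.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's FMODE_NOWAIT bit; the value is arbitrary. */
#define MODEL_FMODE_NOWAIT (1u << 0)

/* Toy model of one end of a pipe: only the mode bits matter here. */
struct model_file {
    unsigned int f_mode;
};

/* Step 1: creating a pipe sets the NOWAIT bit by default on both ends. */
static void model_do_pipe(struct model_file ends[2])
{
    ends[0].f_mode = MODEL_FMODE_NOWAIT;
    ends[1].f_mode = MODEL_FMODE_NOWAIT;
}

/* Step 2: splice "taints" the pipe end it is used on by clearing the bit. */
static void model_splice(struct model_file *pipe_end)
{
    pipe_end->f_mode &= ~MODEL_FMODE_NOWAIT;
}

/*
 * Step 3: the check an io_uring-like caller would make. True means a
 * nonblocking attempt is reasonable; false means fall back to an io worker.
 */
static bool model_can_nowait(const struct model_file *f)
{
    return (f->f_mode & MODEL_FMODE_NOWAIT) != 0;
}
```

In this model, a fresh pipe from model_do_pipe() reports model_can_nowait() as true on both ends. After model_splice() on one end, that end reports false, which is the point at which the caller would take the slow path.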