* [PATCH v1 1/1] [io_uring] require RWF_HIPRI for iopoll reads and writes
From: Stefan Bühler @ 2019-05-01 11:52 UTC
To: Jens Axboe, linux-block, linux-fsdevel
This makes the mapping RWF_HIPRI <-> IOCB_HIPRI <-> iopoll more
consistent; it also allows supporting iopoll operations without
IORING_SETUP_IOPOLL in the future.
Signed-off-by: Stefan Bühler <source@stbuehler.de>
---
fs/io_uring.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 17eae94a54fc..6a480f04b0f3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -881,14 +881,16 @@ static int io_prep_rw(struct io_kiocb *req, const struct sqe_submit *s,
 		kiocb->ki_flags |= IOCB_NOWAIT;
 
 	if (ctx->flags & IORING_SETUP_IOPOLL) {
+		/* require O_DIRECT (or DAX) and RWF_HIPRI */
 		if (!(kiocb->ki_flags & IOCB_DIRECT) ||
+		    !(kiocb->ki_flags & IOCB_HIPRI) ||
 		    !kiocb->ki_filp->f_op->iopoll)
 			return -EOPNOTSUPP;
 
 		req->error = 0;
-		kiocb->ki_flags |= IOCB_HIPRI;
 		kiocb->ki_complete = io_complete_rw_iopoll;
 	} else {
+		/* RWF_HIPRI not allowed */
		if (kiocb->ki_flags & IOCB_HIPRI)
 			return -EINVAL;
 		kiocb->ki_complete = io_complete_rw;
--
2.20.1
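
For illustration, a minimal userspace sketch of what the proposed behaviour would look like to an application: on a ring created with IORING_SETUP_IOPOLL, RWF_HIPRI would have to be set explicitly on each read/write SQE (via sqe->rw_flags) instead of the kernel adding IOCB_HIPRI implicitly. This assumes liburing, a glibc that exposes RWF_HIPRI via <fcntl.h> with _GNU_SOURCE (otherwise <linux/fs.h>), and an O_DIRECT-capable file; the device path and error handling are illustrative only.

/* Hypothetical example only: RWF_HIPRI set explicitly on SQEs submitted
 * to an IORING_SETUP_IOPOLL ring, as the patch above would require. */
#define _GNU_SOURCE
#include <fcntl.h>		/* O_DIRECT, RWF_HIPRI */
#include <stdio.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	struct iovec iov;
	void *buf;
	int fd;

	/* iopoll needs O_DIRECT (or DAX) and a suitably aligned buffer */
	fd = open("/dev/nullb0", O_RDONLY | O_DIRECT);	/* device path is illustrative */
	if (fd < 0 || posix_memalign(&buf, 4096, 4096))
		return 1;
	iov.iov_base = buf;
	iov.iov_len = 4096;

	if (io_uring_queue_init(8, &ring, IORING_SETUP_IOPOLL) < 0)
		return 1;

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_readv(sqe, fd, &iov, 1, 0);
	sqe->rw_flags = RWF_HIPRI;	/* without this the patched kernel returns -EOPNOTSUPP */

	io_uring_submit(&ring);

	/* on an iopoll ring this call actively reaps the completion */
	if (io_uring_wait_cqe(&ring, &cqe) == 0) {
		printf("read returned %d\n", cqe->res);
		io_uring_cqe_seen(&ring, cqe);
	}

	io_uring_queue_exit(&ring);
	return 0;
}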
* Re: [PATCH v1 1/1] [io_uring] require RWF_HIPRI for iopoll reads and writes
From: Jens Axboe @ 2019-05-01 12:43 UTC
To: Stefan Bühler, linux-block, linux-fsdevel
On 5/1/19 5:52 AM, Stefan Bühler wrote:
> This makes the mapping RWF_HIPRI <-> IOCB_HIPRI <-> iopoll more
> consistent; it also allows supporting iopoll operations without
> IORING_SETUP_IOPOLL in the future.
I don't want to make this change now. Additionally, it's never
going to be possible to support polled IO mixed with non-polled
IO on an io_uring instance, as that makes the wait part of IO
impossible to support without adding tracking of requests.
As we can never mix them, it doesn't make a lot of sense to
require RWF_HIPRI for polled IO.
--
Jens Axboe
* Re: [PATCH v1 1/1] [io_uring] require RWF_HIPRI for iopoll reads and writes
From: Stefan Bühler @ 2019-05-01 13:40 UTC
To: Jens Axboe, linux-block, linux-fsdevel
Hi,
On 01.05.19 14:43, Jens Axboe wrote:
> On 5/1/19 5:52 AM, Stefan Bühler wrote:
>> This makes the mapping RWF_HIPRI <-> IOCB_HIPRI <-> iopoll more
>> consistent; it also allows supporting iopoll operations without
>> IORING_SETUP_IOPOLL in the future.
>
> I don't want to make this change now. Additionally, it's never
> going to be possible to support polled IO mixed with non-polled
> IO on an io_uring instance, as that makes the wait part of IO
> impossible to support without adding tracking of requests.
>
> As we can never mix them, it doesn't make a lot of sense to
> request RWF_HIPRI for polled IO.
I'm not just new to memory ordering, I'm also new to kernel internals :)
To me it looks like iopoll is basically a busy-loop interface; it helps
make things move forward more quickly, while they still might (or
might not) finish on their own.
And io_do_iopoll simply loops over all requests and runs a single
iteration for them, or, if there is only one request
("!poll_multi_file"), it tells it to spin internally.
While there are multiple requests it can't spin in a single request
anyway, and I don't see why it couldn't also check for completion of
non-polled requests after looping over the polled requests (whether by
only checking the CQ tail or actively tracking (why would that be bad?)
the requests some other way). This only means that as long as there are
non-polled requests pending it mustn't spin in a single request.
And if there are no polled requests at all it could use io_cqring_wait.
So I don't see why it would be impossible to mix polled and non-polled
IO requests.
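Roughly, the mixed-mode loop described above could look like the following untested pseudo-code. All helper names here are hypothetical and do not correspond to actual functions in fs/io_uring.c; it only sketches the idea of reaping polled requests in a busy loop while letting interrupt-driven completions post to the CQ ring on their own.

/* Hypothetical pseudo-code sketch; helper names are made up. */
static int io_wait_cqes_mixed(struct ring_ctx *ctx, unsigned int min_events)
{
	while (cqes_available(ctx) < min_events) {
		bool spin;

		if (no_polled_requests_inflight(ctx)) {
			/* nothing to busy-poll: fall back to a sleeping wait */
			return wait_for_cqes(ctx, min_events);
		}

		/*
		 * Spinning inside a single request is only safe when it is
		 * the sole request of any kind in flight; otherwise do a
		 * non-spinning pass so other completions aren't starved.
		 */
		spin = exactly_one_request_inflight(ctx);

		/* one iteration of ->iopoll() for each polled request */
		poll_each_polled_request_once(ctx, spin);

		/*
		 * Interrupt-driven (non-polled) requests complete on their
		 * own and post to the CQ ring; the loop condition picks
		 * them up on the next pass.
		 */
		cond_resched();
	}
	return 0;
}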
Any hints what I'm missing here?
(Even if it turns out to be impossible I still think requiring RWF_HIPRI
would be the right way, but well...)
cheers,
Stefan
* Re: [PATCH v1 1/1] [io_uring] require RWF_HIPRI for iopoll reads and writes
From: Jens Axboe @ 2019-05-01 14:12 UTC
To: Stefan Bühler, linux-block, linux-fsdevel
On 5/1/19 7:40 AM, Stefan Bühler wrote:
> Hi,
>
> On 01.05.19 14:43, Jens Axboe wrote:
>> On 5/1/19 5:52 AM, Stefan Bühler wrote:
>>> This makes the mapping RWF_HIPRI <-> IOCB_HIPRI <-> iopoll more
>>> consistent; it also allows supporting iopoll operations without
>>> IORING_SETUP_IOPOLL in the future.
>>
>> I don't want to make this change now. Additionally, it's never
>> going to be possible to support polled IO mixed with non-polled
>> IO on an io_uring instance, as that makes the wait part of IO
>> impossible to support without adding tracking of requests.
>>
>> As we can never mix them, it doesn't make a lot of sense to
>> require RWF_HIPRI for polled IO.
>
> I'm not just new to memory ordering, I'm also new to kernel internals :)
>
> To me it looks like iopoll is basically a busy-loop interface; it helps
> make things move forward more quickly, while they still might (or
> might not) finish on their own.
Right, the key there is that they might not. For NVMe and anything else
that has been updated, polled IO will not finish on its own. It must be
actively found and reaped.
> And io_do_iopoll simply loops over all requests and runs a single
> iteration for them, or, if there is only one request
> ("!poll_multi_file"), it tells it to spin internally.
Correct
> While there are multiple requests it can't spin in a single request
> anyway, and I don't see why it couldn't also check for completion of
> non-polled requests after looping over the polled requests (whether by
> only checking the CQ tail or actively tracking (why would that be bad?)
> the requests some other way). This only means that as long as there are
> non-polled requests pending it mustn't spin in a single request.
>
> And if there are no polled requests at all it could use io_cqring_wait.
>
> So I don't see why it would be impossible to mix polled and non-polled
> IO requests.
It's not technically impossible, but it would be less efficient to do
so. Adding per-request accounting would cost cycles, and the logic of
how to handle waiting/polling for a mixed workload would be interesting.
Hence it's simpler to just disallow mixing polled and non-polled IO.
There is no real benefit to allowing them to be mixed and matched.
--
Jens Axboe