* [PATCH 1/2] io_uring/poll: flag request with REQ_F_POLL_TRIGGERED if it went through poll
From: Jens Axboe @ 2025-11-05 19:30 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe
If a request targeting a pollable file cannot get executed without
needing to wait for a POLLIN/POLLOUT trigger, then flag it with
REQ_F_POLL_TRIGGERED. Only supported for non-multishot requests, as it's
fully expected that multishot will always end up going through poll.
No functional changes in this patch, just in preparation for using this
information elsewhere.
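As a rough illustration of what this enables for later patches, checking
whether a request had to wait for poll reduces to a simple flag test.
Sketch only; the helper name below is hypothetical and not part of this
patch:

	/* illustrative helper only, not part of this patch */
	static inline bool io_req_went_through_poll(const struct io_kiocb *req)
	{
		return req->flags & REQ_F_POLL_TRIGGERED;
	}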
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
include/linux/io_uring_types.h | 3 +++
io_uring/poll.c | 1 +
2 files changed, 4 insertions(+)
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 92780764d5fa..ef1af730193a 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -521,6 +521,7 @@ enum {
REQ_F_HAS_METADATA_BIT,
REQ_F_IMPORT_BUFFER_BIT,
REQ_F_SQE_COPIED_BIT,
+ REQ_F_POLL_TRIGGERED_BIT,
/* not a real bit, just to check we're not overflowing the space */
__REQ_F_LAST_BIT,
@@ -612,6 +613,8 @@ enum {
REQ_F_IMPORT_BUFFER = IO_REQ_FLAG(REQ_F_IMPORT_BUFFER_BIT),
/* ->sqe_copy() has been called, if necessary */
REQ_F_SQE_COPIED = IO_REQ_FLAG(REQ_F_SQE_COPIED_BIT),
+ /* poll was triggered at least once for this request */
+ REQ_F_POLL_TRIGGERED = IO_REQ_FLAG(REQ_F_POLL_TRIGGERED_BIT),
};
struct io_tw_req {
diff --git a/io_uring/poll.c b/io_uring/poll.c
index 8aa4e3a31e73..d6f7cddf36d0 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -419,6 +419,7 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
req->flags &= ~REQ_F_DOUBLE_POLL;
else
req->flags &= ~REQ_F_SINGLE_POLL;
+ req->flags |= REQ_F_POLL_TRIGGERED;
}
__io_poll_execute(req, mask);
}
--
2.51.0
* [PATCH 2/2] io_uring/net: add IORING_CQE_F_SOCK_FULL if a send needed to poll arm
From: Jens Axboe @ 2025-11-05 19:30 UTC (permalink / raw)
To: io-uring; +Cc: Jens Axboe
If a send/sendmsg/sendzc/sendmsgzc needed to wait for space in the
socket to complete, add IORING_CQE_F_SOCK_FULL to cqe->flags to tell
the application about it. This indicates that the socket was full when
the operation was first attempted.
This adds IORING_CQE_F_SOCK_FULL as a new CQE flag. It borrows the value
of IORING_CQE_F_SOCK_NONEMPTY, which is a flag only used on the receive
side. Hence there can be no confusion about which of the two meanings
applies to a given CQE, as the application must know which kind of
operation the completion refers to.
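As an illustrative sketch only (not part of this patch), a liburing-based
send path could consume the new flag roughly like this. The fallback
define mirrors the uapi value added below for builds against older
headers, and the function name and error handling are just for
demonstration:

	#include <liburing.h>
	#include <stdio.h>

	#ifndef IORING_CQE_F_SOCK_FULL
	/* shares the IORING_CQE_F_SOCK_NONEMPTY value; valid on sends only */
	#define IORING_CQE_F_SOCK_FULL	(1U << 2)
	#endif

	static int send_and_check(struct io_uring *ring, int sockfd,
				  const void *buf, size_t len)
	{
		struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
		struct io_uring_cqe *cqe;
		int res;

		io_uring_prep_send(sqe, sockfd, buf, len, 0);
		io_uring_submit(ring);
		io_uring_wait_cqe(ring, &cqe);

		res = cqe->res;
		/* the send had to poll for POLLOUT, i.e. the socket was full */
		if (res >= 0 && (cqe->flags & IORING_CQE_F_SOCK_FULL))
			fprintf(stderr, "socket was full, consider throttling\n");

		io_uring_cqe_seen(ring, cqe);
		return res;
	}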
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
include/uapi/linux/io_uring.h | 4 ++++
io_uring/net.c | 19 +++++++++++++------
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index e96080db3e4d..3d921cbb84f8 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -495,6 +495,9 @@ struct io_uring_cqe {
* IORING_CQE_F_BUFFER If set, the upper 16 bits are the buffer ID
* IORING_CQE_F_MORE If set, parent SQE will generate more CQE entries
* IORING_CQE_F_SOCK_NONEMPTY If set, more data to read after socket recv
+ * IORING_CQE_F_SOCK_FULL If set, the socket was full when this send or
+ * sendmsg was attempted. Hence it had to wait for POLLOUT
+ * before being able to complete.
* IORING_CQE_F_NOTIF Set for notification CQEs. Can be used to distinct
* them from sends.
* IORING_CQE_F_BUF_MORE If set, the buffer ID set in the completion will get
@@ -518,6 +521,7 @@ struct io_uring_cqe {
#define IORING_CQE_F_BUFFER (1U << 0)
#define IORING_CQE_F_MORE (1U << 1)
#define IORING_CQE_F_SOCK_NONEMPTY (1U << 2)
+#define IORING_CQE_F_SOCK_FULL IORING_CQE_F_SOCK_NONEMPTY
#define IORING_CQE_F_NOTIF (1U << 3)
#define IORING_CQE_F_BUF_MORE (1U << 4)
#define IORING_CQE_F_SKIP (1U << 5)
diff --git a/io_uring/net.c b/io_uring/net.c
index a95cc9ca2a4d..6a834237fd5c 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -530,11 +530,21 @@ static inline bool io_send_finish(struct io_kiocb *req,
/* Otherwise stop bundle and use the current result. */
finish:
+ if (req->flags & REQ_F_POLL_TRIGGERED)
+ cflags |= IORING_CQE_F_SOCK_FULL;
io_req_set_res(req, sel->val, cflags);
sel->val = IOU_COMPLETE;
return true;
}
+static int io_send_complete(struct io_kiocb *req, int ret, unsigned cflags)
+{
+ if (req->flags & REQ_F_POLL_TRIGGERED)
+ cflags |= IORING_CQE_F_SOCK_FULL;
+ io_req_set_res(req, ret, cflags);
+ return IOU_COMPLETE;
+}
+
int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
{
struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
@@ -580,8 +590,7 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
ret += sr->done_io;
else if (sr->done_io)
ret = sr->done_io;
- io_req_set_res(req, ret, 0);
- return IOU_COMPLETE;
+ return io_send_complete(req, ret, 0);
}
static int io_send_select_buffer(struct io_kiocb *req, unsigned int issue_flags,
@@ -1516,8 +1525,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags)
zc->notif = NULL;
io_req_msg_cleanup(req, 0);
}
- io_req_set_res(req, ret, IORING_CQE_F_MORE);
- return IOU_COMPLETE;
+ return io_send_complete(req, ret, IORING_CQE_F_MORE);
}
int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags)
@@ -1586,8 +1594,7 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags)
sr->notif = NULL;
io_req_msg_cleanup(req, 0);
}
- io_req_set_res(req, ret, IORING_CQE_F_MORE);
- return IOU_COMPLETE;
+ return io_send_complete(req, ret, IORING_CQE_F_MORE);
}
void io_sendrecv_fail(struct io_kiocb *req)
--
2.51.0
* Re: [PATCHSET 0/2] Add support for IORING_CQE_F_SOCK_FULL
From: Pavel Begunkov @ 2025-11-06 12:19 UTC (permalink / raw)
To: Jens Axboe, io-uring
On 11/5/25 19:30, Jens Axboe wrote:
> Hi,
>
> It can be useful for userspace to know if a send request had to go
> through poll to complete, as that generally means that the socket was
> out of space. On the send side, this is pretty trivial to support - we
> just need to check if the request needed to go through poll to complete.
>
> This reuses the IORING_CQE_F_SOCK_NONEMPTY flag value, which is only
> valid for recv operations. As IORING_CQE_F_SOCK_FULL only applies on
> sends, there's no need for separate values for this flag.
>
> Based on an earlier patchset, which utilized REQ_F_POLL_ARMED instead
> and handled patch 1 a bit differently.
FWIW, same comments as last time. REQ_F_POLL_TRIGGERED is not set in
the right place. And, with how TCP manages wait queues, you won't be
able to use it well for any throttling, as the user will get the
flagged CQE a long time later, when the queue is already half empty.
--
Pavel Begunkov