* [BUG] io_uring: possible CQE32 overflow flush inconsistency in __io_cqring_overflow_flush()
@ 2026-06-20 6:13 Cyber_black
2026-06-20 6:17 ` gregkh
0 siblings, 1 reply; 6+ messages in thread
From: Cyber_black @ 2026-06-20 6:13 UTC
To: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, axboe@kernel.dk,
stable@vger.kernel.org, gregkh@linuxfoundation.org,
gabriel@krisman.be
Hi Gabriel,
Thank you for your response.
I found this bug while doing independent research. I was reading the Linux kernel code from Linus Torvalds' main repository (git.kernel.org) and the io_uring subsystem caught my attention. In particular, the use of shared memory for optimization purposes stood out – especially since this very feature has been exploited in the past to develop rootkits targeting io_uring.
So I first studied its architecture and then read the code in depth. The bug emerged during that review.
Regarding a trigger scenario (PoC – Proof of Concept): unfortunately, I don't have one. My system does not support io_uring (it returns ENOSYS, likely due to enterprise compatibility settings), so I couldn't run the liburing test suite. However, the fix itself is straightforward and the logic is clear.
As for the target version: this issue exists in the mainline kernel. It is not in a stable release yet, as I found it directly in Linus' main tree.
Regarding the patch format – I generated a clean patch using git format-patch and sent it separately.
* Re: [BUG] io_uring: possible CQE32 overflow flush inconsistency in __io_cqring_overflow_flush()
2026-06-20 6:13 [BUG] io_uring: possible CQE32 overflow flush inconsistency in __io_cqring_overflow_flush() Cyber_black
@ 2026-06-20 6:17 ` gregkh
0 siblings, 0 replies; 6+ messages in thread
From: gregkh @ 2026-06-20 6:17 UTC
To: Cyber_black
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org,
axboe@kernel.dk, stable@vger.kernel.org, gabriel@krisman.be
On Sat, Jun 20, 2026 at 06:13:50AM +0000, Cyber_black wrote:
>
>
> Hi Gabriel,
>
> Thank you for your response.
>
> I found this bug while doing independent research. I was reading the Linux kernel code from Linus Torvalds' main repository (git.kernel.org) and the io_uring subsystem caught my attention. In particular, the use of shared memory for optimization purposes stood out – especially since this very feature has been exploited in the past to develop rootkits targeting io_uring.
>
> So I first studied its architecture and then read the code in depth. The bug emerged during that review.
>
> Regarding a trigger scenario (PoC – Proof of Concept): unfortunately, I don't have one. My system does not support io_uring (it returns ENOSYS, likely due to enterprise compatibility settings), so I couldn't run the liburing test suite. However, the fix itself is straightforward and the logic is clear.
That's not how any of this works, please always test your changes. If
you can't even build/boot them, don't expect others to do it for you.
thanks,
greg k-h
* [BUG] io_uring: possible CQE32 overflow flush inconsistency in __io_cqring_overflow_flush()
@ 2026-06-19 8:05 Cyber_black
2026-06-19 16:07 ` Gabriel Krisman Bertazi
0 siblings, 1 reply; 6+ messages in thread
From: Cyber_black @ 2026-06-19 8:05 UTC
To: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, axboe@kernel.dk,
stable@vger.kernel.org, gregkh@linuxfoundation.org
On Fri, Jun 19, 2026 at 04:49:32AM +0000, Greg KH wrote:
> Please turn this into a real patch that you have tested to verify it
> resolves the issue so you get full credit for the fix.
Hi Greg,
Apologies for the previous mail's format. The patch compiles cleanly
on arm64. My current environment does not support io_uring (ENOSYS)
so I was unable to run the liburing suite, but the fix itself is
straightforward.
From 522b70bdd3ac64c64dd21842cb5901e59a1fb058 Mon Sep 17 00:00:00 2001
From: Eneshan Erdogan Karaca <cyberblackk@proton.me>
Date: Fri, 19 Jun 2026 07:59:58 +0000
Subject: [PATCH] io_uring: fix cqe_size/is_cqe32 inconsistency in overflow
flush
When IORING_SETUP_CQE32 is set, Block A doubles cqe_size to handle
32-byte CQEs. Block B then resets is_cqe32 to false so that
io_get_cqe_overflow() uses its own ctx flag check internally, but
fails to reset cqe_size. This leaves cqe_size=32 while a 16-byte
slot is allocated, causing memcpy() to write beyond the allocated
CQE slot.
Fix this by also resetting cqe_size when is_cqe32 is cleared.
Signed-off-by: Eneshan Erdogan Karaca <cyberblackk@proton.me>
---
io_uring/io_uring.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 1ea2fca34a36..f9690291633a 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -543,8 +543,10 @@ static void __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool dying)
is_cqe32 = true;
cqe_size <<= 1;
}
- if (ctx->flags & IORING_SETUP_CQE32)
+ if (ctx->flags & IORING_SETUP_CQE32) {
is_cqe32 = false;
+ cqe_size = sizeof(struct io_uring_cqe);
+ }
if (!dying) {
if (!io_get_cqe_overflow(ctx, &cqe, true, is_cqe32))
--
2.34.1
Thanks,
Eneshan Erdogan Karaca
* Re: [BUG] io_uring: possible CQE32 overflow flush inconsistency in __io_cqring_overflow_flush()
2026-06-19 8:05 Cyber_black
@ 2026-06-19 16:07 ` Gabriel Krisman Bertazi
0 siblings, 0 replies; 6+ messages in thread
From: Gabriel Krisman Bertazi @ 2026-06-19 16:07 UTC
To: Cyber_black, io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, axboe@kernel.dk,
stable@vger.kernel.org, gregkh@linuxfoundation.org
Cyber_black <Cyberblackk@proton.me> writes:
> On Fri, Jun 19, 2026 at 04:49:32AM +0000, Greg KH wrote:
>> Please turn this into a real patch that you have tested to verify it
>> resolves the issue so you get full credit for the fix.
>
> Hi Greg,
>
> Apologies for the previous mail's format. The patch compiles cleanly
> on arm64. My current environment does not support io_uring (ENOSYS)
> so I was unable to run the liburing suite, but the fix itself is
> straightforward.
What's the context, was this sent against stable? The issue exists
in mainline.
> From 522b70bdd3ac64c64dd21842cb5901e59a1fb058 Mon Sep 17 00:00:00 2001
> From: Eneshan Erdogan Karaca <cyberblackk@proton.me>
> Date: Fri, 19 Jun 2026 07:59:58 +0000
> Subject: [PATCH] io_uring: fix cqe_size/is_cqe32 inconsistency in overflow
> flush
Ideally, send it as a patch to the list with [PATCH] so it doesn't vanish under a [BUG]
tag.
>
> When IORING_SETUP_CQE32 is set, Block A doubles cqe_size to handle
> 32-byte CQEs. Block B then resets is_cqe32 to false so that
> io_get_cqe_overflow() uses its own ctx flag check internally, but
> fails to reset cqe_size. This leaves cqe_size=32 while a 16-byte
> slot is allocated, causing memcpy() to write beyond the allocated
> CQE slot.
How was this found? Do you have a syzbot or a trigger? The fix looks
good but the patch appears corrupted, with a bunch of NBSP.
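One way to check a patch for this kind of damage before sending is to scan it for non-breaking spaces. The following is a minimal sketch in C, assuming the mail body is UTF-8 so that U+00A0 appears as the byte pair 0xC2 0xA0; the function name count_nbsp is hypothetical and not part of any existing tool.

```c
#include <stddef.h>

/*
 * Count UTF-8 encoded non-breaking spaces (U+00A0, bytes 0xC2 0xA0)
 * in a buffer. Assumes UTF-8 input; a Latin-1 NBSP (a lone 0xA0 byte)
 * is deliberately not counted by this sketch.
 */
size_t count_nbsp(const unsigned char *buf, size_t len)
{
	size_t count = 0;

	for (size_t i = 0; i + 1 < len; i++) {
		if (buf[i] == 0xC2 && buf[i + 1] == 0xA0) {
			count++;
			i++;	/* skip the second byte of the pair */
		}
	}
	return count;
}
```

A nonzero result for the contents of a patch file suggests that tabs or spaces were replaced somewhere between git format-patch and the mail client, in which case the patch is unlikely to apply.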
>
> Fix this by also resetting cqe_size when is_cqe32 is cleared.
>
> Signed-off-by: Eneshan Erdogan Karaca <cyberblackk@proton.me>
> ---
> io_uring/io_uring.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 1ea2fca34a36..f9690291633a 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -543,8 +543,10 @@ static void __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool dying)
> is_cqe32 = true;
> cqe_size <<= 1;
> }
> - if (ctx->flags & IORING_SETUP_CQE32)
> + if (ctx->flags & IORING_SETUP_CQE32) {
> is_cqe32 = false;
> + cqe_size = sizeof(struct io_uring_cqe);
> + }
> if (!dying) {
> if (!io_get_cqe_overflow(ctx, &cqe, true, is_cqe32))
> --
> 2.34.1
>
> Thanks,
> Eneshan Erdogan Karaca
--
Gabriel Krisman Bertazi
* [BUG] io_uring: possible CQE32 overflow flush inconsistency in __io_cqring_overflow_flush()
@ 2026-06-19 4:49 Cyber_black
2026-06-19 6:00 ` Greg KH
0 siblings, 1 reply; 6+ messages in thread
From: Cyber_black @ 2026-06-19 4:49 UTC
To: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, axboe@kernel.dk,
stable@vger.kernel.org
Hi,
I believe there is a bug in __io_cqring_overflow_flush() in io_uring/io_uring.c
where `is_cqe32` and `cqe_size` are left in an inconsistent state when
IORING_SETUP_CQE32 is set, potentially leading to an out-of-bounds write into
the CQ ring.
AFFECTED FILE
=============
io_uring/io_uring.c
Function: __io_cqring_overflow_flush()
KERNEL VERSION
==============
Observed in current upstream (v6.8+). Please confirm against your tree.
Found File
==============
https://github.com/torvalds/linux/blob/master/io_uring/io_uring.c
DESCRIPTION
===========
Inside the flush loop, `cqe_size` and `is_cqe32` are both initialized and then
conditionally updated:
    size_t cqe_size = sizeof(struct io_uring_cqe);  /* 16 bytes */
    bool is_cqe32 = false;
    /* Block A */
    if (ocqe->cqe.flags & IORING_CQE_F_32 ||
        ctx->flags & IORING_SETUP_CQE32) {
            is_cqe32 = true;
            cqe_size <<= 1;  /* cqe_size = 32 bytes */
    }
    /* Block B */
    if (ctx->flags & IORING_SETUP_CQE32)
            is_cqe32 = false;  /* only is_cqe32 reset, cqe_size NOT reset */
    if (!dying) {
            if (!io_get_cqe_overflow(ctx, &cqe, true, is_cqe32))
                    break;
            memcpy(cqe, &ocqe->cqe, cqe_size);
    }
When IORING_SETUP_CQE32 is set, Block A correctly doubles cqe_size to 32 and
sets is_cqe32 = true. Block B then resets is_cqe32 back to false, but leaves
cqe_size at 32.
This means:
- io_get_cqe_overflow() is called with is_cqe32 = false
→ it returns a pointer to a 16-byte CQE slot in the ring
- memcpy() then copies cqe_size = 32 bytes into that 16-byte slot
→ 16 bytes past the end of the allocated CQE slot are overwritten
The destination `cqe` points directly into the shared CQ ring memory
(ctx->rings->cqes[]), so the excess bytes corrupt the adjacent CQE entry.
If the corrupted slot is the last one in the ring, the overflow writes past
the array and corrupts other fields in struct io_rings (e.g., sq_flags, cq_flags).
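The variable state described above can be sketched as a small userspace model in C. It mirrors only Block A and Block B as quoted in this report. The names model_flush_state, struct flush_state and the MODEL_* constants are hypothetical, and the bit values are placeholders rather than the kernel's. The model shows which (is_cqe32, cqe_size) pair each flag combination produces. It does not model how large a slot io_get_cqe_overflow() actually hands back, so on its own it does not demonstrate an overflow.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values for this model only; not taken from the uapi header. */
#define MODEL_SETUP_CQE32	(1u << 0)	/* stands in for IORING_SETUP_CQE32 */
#define MODEL_CQE_F_32		(1u << 1)	/* stands in for IORING_CQE_F_32 */
#define MODEL_CQE_SIZE		16u		/* regular CQE size, per the report */

struct flush_state {
	bool is_cqe32;
	size_t cqe_size;
};

/* Reproduces Block A followed by Block B as quoted in the report. */
struct flush_state model_flush_state(unsigned int ctx_flags,
				     unsigned int cqe_flags)
{
	struct flush_state st = { .is_cqe32 = false, .cqe_size = MODEL_CQE_SIZE };

	/* Block A: either flag marks the entry as 32 bytes and doubles the size */
	if ((cqe_flags & MODEL_CQE_F_32) || (ctx_flags & MODEL_SETUP_CQE32)) {
		st.is_cqe32 = true;
		st.cqe_size <<= 1;
	}
	/* Block B: the ring-wide flag clears is_cqe32 but leaves cqe_size alone */
	if (ctx_flags & MODEL_SETUP_CQE32)
		st.is_cqe32 = false;

	return st;
}
```

Calling model_flush_state(MODEL_SETUP_CQE32, 0) yields is_cqe32 = false together with cqe_size = 32, which is the combination this report is concerned about. Without the ring-wide flag the two fields stay in step.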
IMPACT
======
On a ring configured with IORING_SETUP_CQE32, flushing the overflow list
causes silent corruption of adjacent CQE entries (or adjacent ring metadata).
This can manifest as:
- Userspace receiving garbled CQE data (wrong user_data, res, flags)
- Link chains or multishot requests making decisions based on corrupt
completions
- Unpredictable kernel behavior if ring metadata is overwritten
- Potential data integrity issues in applications relying on io_uring with CQE32
STEPS TO REPRODUCE
==================
1. Create an io_uring instance with IORING_SETUP_CQE32.
2. Submit enough requests to fill the CQ ring and trigger overflow
(i.e., force entries onto ctx->cq_overflow_list).
3. Call io_uring_enter() or close the ring to trigger
__io_cqring_overflow_flush().
4. Observe that the CQE following the flushed entry (or ring metadata) is
silently overwritten. This can be verified by reading the CQ ring from
userspace.
SUSPECTED ROOT CAUSE
====================
Block B appears to have been added to handle IORING_SETUP_CQE_MIXED, where the
ctx-level CQE32 flag should not be passed down to io_get_cqe_overflow() (since
in mixed mode the slot size is determined per-entry by the flag, not globally).
However, Block B resets only is_cqe32 and not cqe_size, creating the
inconsistency.
PROPOSED FIX
============
If Block B is intentional (i.e. io_get_cqe_overflow already handles CQE32 slot
sizing internally when IORING_SETUP_CQE32 is set), then cqe_size must also be
reset:
    if (ctx->flags & IORING_SETUP_CQE32) {
            is_cqe32 = false;
            cqe_size = sizeof(struct io_uring_cqe);  /* undo Block A */
    }
Alternatively, if Block B is dead/incorrect code, it should be removed entirely
and io_get_cqe_overflow() called with is_cqe32 = true when appropriate.
The correct fix depends on the intended semantics of is_cqe32 vs ctx flag
inside io_get_cqe_overflow(), which the maintainer is best placed to confirm.
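The two alternatives can be compared with a userspace sketch in C. As before, this models only the two local variables. The function names, struct flush_state and the MODEL_* constants are hypothetical, the bit values are placeholders, and nothing here reflects what io_get_cqe_overflow() does internally.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values for this model only; not taken from the uapi header. */
#define MODEL_SETUP_CQE32	(1u << 0)	/* stands in for IORING_SETUP_CQE32 */
#define MODEL_CQE_F_32		(1u << 1)	/* stands in for IORING_CQE_F_32 */
#define MODEL_CQE_SIZE		16u		/* regular CQE size, per the report */

struct flush_state {
	bool is_cqe32;
	size_t cqe_size;
};

/* Option 1: keep Block B, but also undo the doubling done by Block A. */
struct flush_state model_fix_reset_size(unsigned int ctx_flags,
					unsigned int cqe_flags)
{
	struct flush_state st = { .is_cqe32 = false, .cqe_size = MODEL_CQE_SIZE };

	if ((cqe_flags & MODEL_CQE_F_32) || (ctx_flags & MODEL_SETUP_CQE32)) {
		st.is_cqe32 = true;
		st.cqe_size <<= 1;
	}
	if (ctx_flags & MODEL_SETUP_CQE32) {
		st.is_cqe32 = false;
		st.cqe_size = MODEL_CQE_SIZE;
	}
	return st;
}

/* Option 2: drop Block B entirely and keep whatever Block A decided. */
struct flush_state model_fix_drop_block_b(unsigned int ctx_flags,
					  unsigned int cqe_flags)
{
	struct flush_state st = { .is_cqe32 = false, .cqe_size = MODEL_CQE_SIZE };

	if ((cqe_flags & MODEL_CQE_F_32) || (ctx_flags & MODEL_SETUP_CQE32)) {
		st.is_cqe32 = true;
		st.cqe_size <<= 1;
	}
	return st;
}

/* Consistency condition used in this report: size is doubled iff is_cqe32. */
bool model_state_consistent(struct flush_state st)
{
	size_t expected = st.is_cqe32 ? 2 * MODEL_CQE_SIZE : MODEL_CQE_SIZE;

	return st.cqe_size == expected;
}
```

Both options keep the two variables in step for every flag combination, but they disagree on how many bytes are copied on a ring set up with the ring-wide flag: 16 with option 1 and 32 with option 2. Which of those is right depends on the slot size io_get_cqe_overflow() provides in that mode.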
RELEVANT CODE (verbatim)
========================
--- a/io_uring/io_uring.c (v6.8)
__io_cqring_overflow_flush(), lines ~541-552:
    if (ocqe->cqe.flags & IORING_CQE_F_32 ||
        ctx->flags & IORING_SETUP_CQE32) {
            is_cqe32 = true;
            cqe_size <<= 1;
    }
    if (ctx->flags & IORING_SETUP_CQE32)
            is_cqe32 = false;  /* BUG: cqe_size not restored */
    if (!dying) {
            if (!io_get_cqe_overflow(ctx, &cqe, true, is_cqe32))
                    break;
            memcpy(cqe, &ocqe->cqe, cqe_size);  /* OOB if slot < cqe_size */
    }
Thanks for looking into this.
Best Regards
Eneshan Erdoğan Karaca.
* Re: [BUG] io_uring: possible CQE32 overflow flush inconsistency in __io_cqring_overflow_flush()
2026-06-19 4:49 Cyber_black
@ 2026-06-19 6:00 ` Greg KH
0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2026-06-19 6:00 UTC
To: Cyber_black
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org,
axboe@kernel.dk, stable@vger.kernel.org
On Fri, Jun 19, 2026 at 04:49:32AM +0000, Cyber_black wrote:
>
> Hi,
>
> I believe there is a bug in __io_cqring_overflow_flush() in io_uring/io_uring.c
> where `is_cqe32` and `cqe_size` are left in an inconsistent state when
> IORING_SETUP_CQE32 is set, potentially leading to an out-of-bounds write into
> the CQ ring.
>
> AFFECTED FILE
> =============
> io_uring/io_uring.c
> Function: __io_cqring_overflow_flush()
>
> KERNEL VERSION
> ==============
> Observed in current upstream (v6.8+). Please confirm against your tree.
Huh? Was this written by a LLM?
> PROPOSED FIX
> ============
> If Block B is intentional (i.e. io_get_cqe_overflow already handles CQE32 slot
> sizing internally when IORING_SETUP_CQE32 is set), then cqe_size must also be
> reset:
>
> if (ctx->flags & IORING_SETUP_CQE32) {
>
> is_cqe32 = false;
> cqe_size = sizeof(struct io_uring_cqe); /* undo Block A */
> }
>
> Alternatively, if Block B is dead/incorrect code, it should be removed entirely
> and io_get_cqe_overflow() called with is_cqe32 = true when appropriate.
>
> The correct fix depends on the intended semantics of is_cqe32 vs ctx flag
> inside io_get_cqe_overflow(), which the maintainer is best placed to confirm.
Please turn this into a real patch that you have tested to verify it
resolves the issue so you get full credit for the fix.
thanks,
greg k-h