linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
       [not found] <20250827114339.367080-1-chunzhennn@qq.com>
@ 2025-08-27 11:44 ` Qingyue Zhang
  2025-08-27 14:30   ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Qingyue Zhang @ 2025-08-27 11:44 UTC (permalink / raw)
  To: axboe; +Cc: io-uring, linux-kernel, Qingyue Zhang, Suoxing Zhang

In io_kbuf_inc_commit(), buf points to a user-mapped memory region,
which means buf->len might be changed between importing and committing.
Add a check to avoid infinite loop when sum of buf->len is less than
len.

Co-developed-by: Suoxing Zhang <aftern00n@qq.com>
Signed-off-by: Suoxing Zhang <aftern00n@qq.com>
Signed-off-by: Qingyue Zhang <chunzhennn@qq.com>
---
 io_uring/kbuf.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 81a13338dfab..80ffe6755598 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -34,11 +34,12 @@ struct io_provide_buf {
 
 static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
 {
+	struct io_uring_buf *buf, *buf_start;
+
+	buf_start = buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
 	while (len) {
-		struct io_uring_buf *buf;
 		u32 this_len;
 
-		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
 		this_len = min_t(u32, len, buf->len);
 		buf->len -= this_len;
 		if (buf->len) {
@@ -47,6 +48,10 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
 		}
 		bl->head++;
 		len -= this_len;
+
+		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
+		if (unlikely(buf == buf_start))
+			break;
 	}
 	return true;
 }
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-27 11:44 ` [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit() Qingyue Zhang
@ 2025-08-27 14:30   ` Jens Axboe
  2025-08-27 21:45     ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2025-08-27 14:30 UTC (permalink / raw)
  To: Qingyue Zhang; +Cc: io-uring, linux-kernel, Suoxing Zhang

On 8/27/25 5:44 AM, Qingyue Zhang wrote:
> In io_kbuf_inc_commit(), buf points to a user-mapped memory region,
> which means buf->len might be changed between importing and committing.
> Add a check to avoid infinite loop when sum of buf->len is less than
> len.
> 
> Co-developed-by: Suoxing Zhang <aftern00n@qq.com>
> Signed-off-by: Suoxing Zhang <aftern00n@qq.com>
> Signed-off-by: Qingyue Zhang <chunzhennn@qq.com>
> ---
>  io_uring/kbuf.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
> index 81a13338dfab..80ffe6755598 100644
> --- a/io_uring/kbuf.c
> +++ b/io_uring/kbuf.c
> @@ -34,11 +34,12 @@ struct io_provide_buf {
>  
>  static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>  {
> +	struct io_uring_buf *buf, *buf_start;
> +
> +	buf_start = buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
>  	while (len) {
> -		struct io_uring_buf *buf;
>  		u32 this_len;
>  
> -		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
>  		this_len = min_t(u32, len, buf->len);
>  		buf->len -= this_len;
>  		if (buf->len) {
> @@ -47,6 +48,10 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>  		}
>  		bl->head++;
>  		len -= this_len;
> +
> +		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
> +		if (unlikely(buf == buf_start))
> +			break;
>  	}
>  	return true;
>  }

Maybe I'm dense, but I don't follow this one. 'len' is passed in, and
the only thing that should cause things to loop more than it should
would be if we do:

len -= this_len;

and this_len > len;

Yes, buf->len is user mapped, perhaps we just need to do:

diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index f2d2cc319faa..569f4d957051 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -36,15 +36,18 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
 {
 	while (len) {
 		struct io_uring_buf *buf;
-		u32 this_len;
+		u32 buf_len, this_len;
 
 		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
-		this_len = min_t(int, len, buf->len);
-		buf->len -= this_len;
-		if (buf->len) {
+		buf_len = READ_ONCE(buf->len);
+		this_len = min_t(int, len, buf_len);
+		buf_len -= this_len;
+		if (buf_len) {
 			buf->addr += this_len;
+			buf->len = buf_len;
 			return false;
 		}
+		buf->len = 0;
 		bl->head++;
 		len -= this_len;
 	}

so that we operate on a local variable, and just set buf->len
appropriate for each buffer.

?

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-27 14:30   ` Jens Axboe
@ 2025-08-27 21:45     ` Jens Axboe
  2025-08-27 21:59       ` Keith Busch
  2025-08-28  1:36       ` Qingyue Zhang
  0 siblings, 2 replies; 12+ messages in thread
From: Jens Axboe @ 2025-08-27 21:45 UTC (permalink / raw)
  To: Qingyue Zhang; +Cc: io-uring, linux-kernel, Suoxing Zhang

On 8/27/25 8:30 AM, Jens Axboe wrote:
> On 8/27/25 5:44 AM, Qingyue Zhang wrote:
>> In io_kbuf_inc_commit(), buf points to a user-mapped memory region,
>> which means buf->len might be changed between importing and committing.
>> Add a check to avoid infinite loop when sum of buf->len is less than
>> len.
>>
>> Co-developed-by: Suoxing Zhang <aftern00n@qq.com>
>> Signed-off-by: Suoxing Zhang <aftern00n@qq.com>
>> Signed-off-by: Qingyue Zhang <chunzhennn@qq.com>
>> ---
>>  io_uring/kbuf.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
>> index 81a13338dfab..80ffe6755598 100644
>> --- a/io_uring/kbuf.c
>> +++ b/io_uring/kbuf.c
>> @@ -34,11 +34,12 @@ struct io_provide_buf {
>>  
>>  static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>>  {
>> +	struct io_uring_buf *buf, *buf_start;
>> +
>> +	buf_start = buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
>>  	while (len) {
>> -		struct io_uring_buf *buf;
>>  		u32 this_len;
>>  
>> -		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
>>  		this_len = min_t(u32, len, buf->len);
>>  		buf->len -= this_len;
>>  		if (buf->len) {
>> @@ -47,6 +48,10 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>>  		}
>>  		bl->head++;
>>  		len -= this_len;
>> +
>> +		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
>> +		if (unlikely(buf == buf_start))
>> +			break;
>>  	}
>>  	return true;
>>  }
> 
> Maybe I'm dense, but I don't follow this one. 'len' is passed in, and
> the only thing that should cause things to loop more than it should
> would be if we do:
> 
> len -= this_len;
> 
> and this_len > len;
> 
> Yes, buf->len is user mapped, perhaps we just need to do:
> 
> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
> index f2d2cc319faa..569f4d957051 100644
> --- a/io_uring/kbuf.c
> +++ b/io_uring/kbuf.c
> @@ -36,15 +36,18 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>  {
>  	while (len) {
>  		struct io_uring_buf *buf;
> -		u32 this_len;
> +		u32 buf_len, this_len;
>  
>  		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
> -		this_len = min_t(int, len, buf->len);
> -		buf->len -= this_len;
> -		if (buf->len) {
> +		buf_len = READ_ONCE(buf->len);
> +		this_len = min_t(int, len, buf_len);
> +		buf_len -= this_len;
> +		if (buf_len) {
>  			buf->addr += this_len;
> +			buf->len = buf_len;
>  			return false;
>  		}
> +		buf->len = 0;
>  		bl->head++;
>  		len -= this_len;
>  	}
> 
> so that we operate on a local variable, and just set buf->len
> appropriate for each buffer.

I took a closer look and there's another spot where we should be
using READ_ONCE() to get the buffer length. How about something like
the below rather than the loop work-around?


commit 7f472373b2855087ae2df9dc6a923f3016a1ed21
Author: Jens Axboe <axboe@kernel.dk>
Date:   Wed Aug 27 15:27:30 2025 -0600

    io_uring/kbuf: always use READ_ONCE() to read ring provided buffer lengths
    
    Since the buffers are mapped from userspace, it is prudent to use
    READ_ONCE() to read the value into a local variable, and use that for
    any other actions taken. Having a stable read of the buffer length
    avoids worrying about it changing after checking, or being read multiple
    times.
    
    Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
    Link: https://lore.kernel.org/io-uring/tencent_000C02641F6250C856D0C26228DE29A3D30A@qq.com/
    Reported-by: Qingyue Zhang <chunzhennn@qq.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 81a13338dfab..394037d3f2f6 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -36,15 +36,18 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
 {
 	while (len) {
 		struct io_uring_buf *buf;
-		u32 this_len;
+		u32 buf_len, this_len;
 
 		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
-		this_len = min_t(u32, len, buf->len);
-		buf->len -= this_len;
-		if (buf->len) {
+		buf_len = READ_ONCE(buf->len);
+		this_len = min_t(u32, len, buf_len);
+		buf_len -= this_len;
+		if (buf_len) {
 			buf->addr += this_len;
+			buf->len = buf_len;
 			return false;
 		}
+		buf->len = 0;
 		bl->head++;
 		len -= this_len;
 	}
@@ -159,6 +162,7 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
 	__u16 tail, head = bl->head;
 	struct io_uring_buf *buf;
 	void __user *ret;
+	u32 buf_len;
 
 	tail = smp_load_acquire(&br->tail);
 	if (unlikely(tail == head))
@@ -168,8 +172,9 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
 		req->flags |= REQ_F_BL_EMPTY;
 
 	buf = io_ring_head_to_buf(br, head, bl->mask);
-	if (*len == 0 || *len > buf->len)
-		*len = buf->len;
+	buf_len = READ_ONCE(buf->len);
+	if (*len == 0 || *len > buf_len)
+		*len = buf_len;
 	req->flags |= REQ_F_BUFFER_RING | REQ_F_BUFFERS_COMMIT;
 	req->buf_list = bl;
 	req->buf_index = buf->bid;
@@ -265,7 +270,7 @@ static int io_ring_buffers_peek(struct io_kiocb *req, struct buf_sel_arg *arg,
 
 	req->buf_index = buf->bid;
 	do {
-		u32 len = buf->len;
+		u32 len = READ_ONCE(buf->len);
 
 		/* truncate end piece, if needed, for non partial buffers */
 		if (len > arg->max_len) {

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-27 21:45     ` Jens Axboe
@ 2025-08-27 21:59       ` Keith Busch
  2025-08-27 22:23         ` Jens Axboe
  2025-08-28  1:36       ` Qingyue Zhang
  1 sibling, 1 reply; 12+ messages in thread
From: Keith Busch @ 2025-08-27 21:59 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Qingyue Zhang, io-uring, linux-kernel, Suoxing Zhang

On Wed, Aug 27, 2025 at 03:45:28PM -0600, Jens Axboe wrote:
> > +		buf_len = READ_ONCE(buf->len);
> > +		this_len = min_t(int, len, buf_len);
> > +		buf_len -= this_len;
> > +		if (buf_len) {
> >  			buf->addr += this_len;
> > +			buf->len = buf_len;
> >  			return false;
> >  		}
> > +		buf->len = 0;

Purely for symmetry, assigning buf->len ought to be a WRITE_ONCE.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-27 21:59       ` Keith Busch
@ 2025-08-27 22:23         ` Jens Axboe
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2025-08-27 22:23 UTC (permalink / raw)
  To: Keith Busch; +Cc: Qingyue Zhang, io-uring, linux-kernel, Suoxing Zhang

On 8/27/25 3:59 PM, Keith Busch wrote:
> On Wed, Aug 27, 2025 at 03:45:28PM -0600, Jens Axboe wrote:
>>> +		buf_len = READ_ONCE(buf->len);
>>> +		this_len = min_t(int, len, buf_len);
>>> +		buf_len -= this_len;
>>> +		if (buf_len) {
>>>  			buf->addr += this_len;
>>> +			buf->len = buf_len;
>>>  			return false;
>>>  		}
>>> +		buf->len = 0;
> 
> Purely for symmetry, assigning buf->len ought to be a WRITE_ONCE.

I did think about that, perhaps I should've mentioned it in the commit
message. While the reader side is important for the reasons stated, the
updating of buf->len isn't really as only the serialized kernel side
will do it. Hence the WRITE_ONCE() should not be needed on the write
side, outside of perhaps documenting that this is a shared buffer.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-27 21:45     ` Jens Axboe
  2025-08-27 21:59       ` Keith Busch
@ 2025-08-28  1:36       ` Qingyue Zhang
  2025-08-28  2:08         ` Jens Axboe
  1 sibling, 1 reply; 12+ messages in thread
From: Qingyue Zhang @ 2025-08-28  1:36 UTC (permalink / raw)
  To: axboe; +Cc: aftern00n, chunzhennn, io-uring, linux-kernel

On 2025-08-27 21:45 UTC, Jens Axboe wrote:
> I took a closer look and there's another spot where we should be
> using READ_ONCE() to get the buffer length. How about something like
> the below rather than the loop work-around?
> 
> 
> commit 7f472373b2855087ae2df9dc6a923f3016a1ed21
> Author: Jens Axboe <axboe@kernel.dk>
> Date:   Wed Aug 27 15:27:30 2025 -0600
> 
>     io_uring/kbuf: always use READ_ONCE() to read ring provided buffer lengths
>     
>     Since the buffers are mapped from userspace, it is prudent to use
>     READ_ONCE() to read the value into a local variable, and use that for
>     any other actions taken. Having a stable read of the buffer length
>     avoids worrying about it changing after checking, or being read multiple
>     times.
>     
>     Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
>     Link: https://lore.kernel.org/io-uring/tencent_000C02641F6250C856D0C26228DE29A3D30A@qq.com/
>     Reported-by: Qingyue Zhang <chunzhennn@qq.com>
>     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
> index 81a13338dfab..394037d3f2f6 100644
> --- a/io_uring/kbuf.c
> +++ b/io_uring/kbuf.c
> @@ -36,15 +36,18 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>  {
>  	while (len) {
>  		struct io_uring_buf *buf;
> -		u32 this_len;
> +		u32 buf_len, this_len;
>  
>  		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
> -		this_len = min_t(u32, len, buf->len);
> -		buf->len -= this_len;
> -		if (buf->len) {
> +		buf_len = READ_ONCE(buf->len);
> +		this_len = min_t(u32, len, buf_len);
> +		buf_len -= this_len;
> +		if (buf_len) {
>  			buf->addr += this_len;
> +			buf->len = buf_len;
>  			return false;
>  		}
> +		buf->len = 0;
>  		bl->head++;
>  		len -= this_len;
>  	}
> @@ -159,6 +162,7 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
>  	__u16 tail, head = bl->head;
>  	struct io_uring_buf *buf;
>  	void __user *ret;
> +	u32 buf_len;
>  
>  	tail = smp_load_acquire(&br->tail);
>  	if (unlikely(tail == head))
> @@ -168,8 +172,9 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
>  		req->flags |= REQ_F_BL_EMPTY;
>  
>  	buf = io_ring_head_to_buf(br, head, bl->mask);
> -	if (*len == 0 || *len > buf->len)
> -		*len = buf->len;
> +	buf_len = READ_ONCE(buf->len);
> +	if (*len == 0 || *len > buf_len)
> +		*len = buf_len;
>  	req->flags |= REQ_F_BUFFER_RING | REQ_F_BUFFERS_COMMIT;
>  	req->buf_list = bl;
>  	req->buf_index = buf->bid;
> @@ -265,7 +270,7 @@ static int io_ring_buffers_peek(struct io_kiocb *req, struct buf_sel_arg *arg,
>  
>  	req->buf_index = buf->bid;
>  	do {
> -		u32 len = buf->len;
> +		u32 len = READ_ONCE(buf->len);
>  
>  		/* truncate end piece, if needed, for non partial buffers */
>  		if (len > arg->max_len) {

I'm afraid this doesn't solve the problem. the value of buf->len
could be changed before the function is called. Maybe we shouldn't
trust it at all?

Here is a PoC that can still trigger infinite loop:

#include<liburing.h>
#include<liburing/io_uring.h>
#include<netinet/in.h>
#include<stdint.h>
#include<sys/socket.h>
#include<arpa/inet.h>
#include<stdlib.h>
int main(){
    struct io_uring ring;
    struct io_uring_buf* buf_info;

    posix_memalign((void**)&buf_info, 4096, 4096);
    char* buf = malloc(0x1000);
    struct io_uring_buf_reg reg = {
        .ring_addr = (uint64_t)buf_info,
        .ring_entries = 2
    };
    buf_info[0].addr = (uint64_t)buf_info;
    buf_info[0].len = 0x1000;
    buf_info[0].bid = 0;
    buf_info[0].resv = 1; // tail
    io_uring_queue_init(0x10, &ring, IORING_SETUP_NO_SQARRAY);
    io_uring_register_buf_ring(&ring, &reg, IOU_PBUF_RING_INC);

    int fds[2];
    socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
    void* send_buf = calloc(1, 32);
    send(fds[0], send_buf, 32, MSG_DONTWAIT);

    struct io_uring_sqe* sqe = io_uring_get_sqe(&ring);
    io_uring_prep_recv(sqe, fds[1], NULL, 0, 0);
    sqe->flags |=  1<<IOSQE_BUFFER_SELECT_BIT;
    io_uring_submit(&ring);
    struct io_uring_cqe* cqe;
    io_uring_wait_cqe(&ring, &cqe);
}


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-28  1:36       ` Qingyue Zhang
@ 2025-08-28  2:08         ` Jens Axboe
  2025-08-28  2:49           ` Qingyue Zhang
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2025-08-28  2:08 UTC (permalink / raw)
  To: Qingyue Zhang; +Cc: aftern00n, io-uring, linux-kernel

On 8/27/25 7:36 PM, Qingyue Zhang wrote:
> On 2025-08-27 21:45 UTC, Jens Axboe wrote:
>> I took a closer look and there's another spot where we should be
>> using READ_ONCE() to get the buffer length. How about something like
>> the below rather than the loop work-around?
>>
>>
>> commit 7f472373b2855087ae2df9dc6a923f3016a1ed21
>> Author: Jens Axboe <axboe@kernel.dk>
>> Date:   Wed Aug 27 15:27:30 2025 -0600
>>
>>     io_uring/kbuf: always use READ_ONCE() to read ring provided buffer lengths
>>     
>>     Since the buffers are mapped from userspace, it is prudent to use
>>     READ_ONCE() to read the value into a local variable, and use that for
>>     any other actions taken. Having a stable read of the buffer length
>>     avoids worrying about it changing after checking, or being read multiple
>>     times.
>>     
>>     Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers")
>>     Link: https://lore.kernel.org/io-uring/tencent_000C02641F6250C856D0C26228DE29A3D30A@qq.com/
>>     Reported-by: Qingyue Zhang <chunzhennn@qq.com>
>>     Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
>> index 81a13338dfab..394037d3f2f6 100644
>> --- a/io_uring/kbuf.c
>> +++ b/io_uring/kbuf.c
>> @@ -36,15 +36,18 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>>  {
>>  	while (len) {
>>  		struct io_uring_buf *buf;
>> -		u32 this_len;
>> +		u32 buf_len, this_len;
>>  
>>  		buf = io_ring_head_to_buf(bl->buf_ring, bl->head, bl->mask);
>> -		this_len = min_t(u32, len, buf->len);
>> -		buf->len -= this_len;
>> -		if (buf->len) {
>> +		buf_len = READ_ONCE(buf->len);
>> +		this_len = min_t(u32, len, buf_len);
>> +		buf_len -= this_len;
>> +		if (buf_len) {
>>  			buf->addr += this_len;
>> +			buf->len = buf_len;
>>  			return false;
>>  		}
>> +		buf->len = 0;
>>  		bl->head++;
>>  		len -= this_len;
>>  	}
>> @@ -159,6 +162,7 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
>>  	__u16 tail, head = bl->head;
>>  	struct io_uring_buf *buf;
>>  	void __user *ret;
>> +	u32 buf_len;
>>  
>>  	tail = smp_load_acquire(&br->tail);
>>  	if (unlikely(tail == head))
>> @@ -168,8 +172,9 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
>>  		req->flags |= REQ_F_BL_EMPTY;
>>  
>>  	buf = io_ring_head_to_buf(br, head, bl->mask);
>> -	if (*len == 0 || *len > buf->len)
>> -		*len = buf->len;
>> +	buf_len = READ_ONCE(buf->len);
>> +	if (*len == 0 || *len > buf_len)
>> +		*len = buf_len;
>>  	req->flags |= REQ_F_BUFFER_RING | REQ_F_BUFFERS_COMMIT;
>>  	req->buf_list = bl;
>>  	req->buf_index = buf->bid;
>> @@ -265,7 +270,7 @@ static int io_ring_buffers_peek(struct io_kiocb *req, struct buf_sel_arg *arg,
>>  
>>  	req->buf_index = buf->bid;
>>  	do {
>> -		u32 len = buf->len;
>> +		u32 len = READ_ONCE(buf->len);
>>  
>>  		/* truncate end piece, if needed, for non partial buffers */
>>  		if (len > arg->max_len) {
> 
> I'm afraid this doesn't solve the problem. the value of buf->len
> could be changed before the function is called. Maybe we shouldn't
> trust it at all?
> 
> Here is a PoC that can still trigger infinite loop:
> 
> #include<liburing.h>
> #include<liburing/io_uring.h>
> #include<netinet/in.h>
> #include<stdint.h>
> #include<sys/socket.h>
> #include<arpa/inet.h>
> #include<stdlib.h>
> int main(){
>     struct io_uring ring;
>     struct io_uring_buf* buf_info;
> 
>     posix_memalign((void**)&buf_info, 4096, 4096);
>     char* buf = malloc(0x1000);
>     struct io_uring_buf_reg reg = {
>         .ring_addr = (uint64_t)buf_info,
>         .ring_entries = 2
>     };
>     buf_info[0].addr = (uint64_t)buf_info;
>     buf_info[0].len = 0x1000;
>     buf_info[0].bid = 0;
>     buf_info[0].resv = 1; // tail
>     io_uring_queue_init(0x10, &ring, IORING_SETUP_NO_SQARRAY);
>     io_uring_register_buf_ring(&ring, &reg, IOU_PBUF_RING_INC);
> 
>     int fds[2];
>     socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
>     void* send_buf = calloc(1, 32);
>     send(fds[0], send_buf, 32, MSG_DONTWAIT);
> 
>     struct io_uring_sqe* sqe = io_uring_get_sqe(&ring);
>     io_uring_prep_recv(sqe, fds[1], NULL, 0, 0);
>     sqe->flags |=  1<<IOSQE_BUFFER_SELECT_BIT;
>     io_uring_submit(&ring);
>     struct io_uring_cqe* cqe;
>     io_uring_wait_cqe(&ring, &cqe);
> }

Gotcha, yes the READ_ONCE() will ensure we only read it once and it
can't get changed in between in that loop, but this one receives into
the buffer ring.

I don't think there's anything wrong with the looping and stopping at
the other end is of course a safe guard, but couldn't we just abort the
loop if we see a 0 sized buffer? At that point we know the buffer is
invalid, or the kernel is buggy, and it'd be saner to stop at that
point. Something ala:


diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 394037d3f2f6..19a8bde5e1e1 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -42,7 +42,8 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
 		buf_len = READ_ONCE(buf->len);
 		this_len = min_t(u32, len, buf_len);
 		buf_len -= this_len;
-		if (buf_len) {
+		/* Stop looping for invalid buffer length of 0 */
+		if (buf_len || !this_len) {
 			buf->addr += this_len;
 			buf->len = buf_len;
 			return false;

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-28  2:08         ` Jens Axboe
@ 2025-08-28  2:49           ` Qingyue Zhang
  2025-08-28  2:50             ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Qingyue Zhang @ 2025-08-28  2:49 UTC (permalink / raw)
  To: axboe; +Cc: aftern00n, chunzhennn, io-uring, linux-kernel

On Wed, 27 Aug 2025 20:08:05 -0600, Jens Axboe wrote:
> I don't think there's anything wrong with the looping and stopping at
> the other end is of course a safe guard, but couldn't we just abort the
> loop if we see a 0 sized buffer? At that point we know the buffer is
> invalid, or the kernel is buggy, and it'd be saner to stop at that
> point. Something ala:
> 
> 
> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
> index 394037d3f2f6..19a8bde5e1e1 100644
> --- a/io_uring/kbuf.c
> +++ b/io_uring/kbuf.c
> @@ -42,7 +42,8 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>  		buf_len = READ_ONCE(buf->len);
>  		this_len = min_t(u32, len, buf_len);
>  		buf_len -= this_len;
> -		if (buf_len) {
> +		/* Stop looping for invalid buffer length of 0 */
> +		if (buf_len || !this_len) {
>  			buf->addr += this_len;
>  			buf->len = buf_len;
>  			return false;

Good idea, it looks nice to me.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-28  2:49           ` Qingyue Zhang
@ 2025-08-28  2:50             ` Jens Axboe
  2025-08-28  2:58               ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2025-08-28  2:50 UTC (permalink / raw)
  To: Qingyue Zhang; +Cc: aftern00n, io-uring, linux-kernel

On 8/27/25 8:49 PM, Qingyue Zhang wrote:
> On Wed, 27 Aug 2025 20:08:05 -0600, Jens Axboe wrote:
>> I don't think there's anything wrong with the looping and stopping at
>> the other end is of course a safe guard, but couldn't we just abort the
>> loop if we see a 0 sized buffer? At that point we know the buffer is
>> invalid, or the kernel is buggy, and it'd be saner to stop at that
>> point. Something ala:
>>
>>
>> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
>> index 394037d3f2f6..19a8bde5e1e1 100644
>> --- a/io_uring/kbuf.c
>> +++ b/io_uring/kbuf.c
>> @@ -42,7 +42,8 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>>  		buf_len = READ_ONCE(buf->len);
>>  		this_len = min_t(u32, len, buf_len);
>>  		buf_len -= this_len;
>> -		if (buf_len) {
>> +		/* Stop looping for invalid buffer length of 0 */
>> +		if (buf_len || !this_len) {
>>  			buf->addr += this_len;
>>  			buf->len = buf_len;
>>  			return false;
> 
> Good idea, it looks nice to me.

I'll send it out, I amended the commit message a bit too.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-28  2:50             ` Jens Axboe
@ 2025-08-28  2:58               ` Jens Axboe
  2025-08-28  3:27                 ` Qingyue Zhang
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2025-08-28  2:58 UTC (permalink / raw)
  To: Qingyue Zhang; +Cc: aftern00n, io-uring, linux-kernel

On 8/27/25 8:50 PM, Jens Axboe wrote:
> On 8/27/25 8:49 PM, Qingyue Zhang wrote:
>> On Wed, 27 Aug 2025 20:08:05 -0600, Jens Axboe wrote:
>>> I don't think there's anything wrong with the looping and stopping at
>>> the other end is of course a safe guard, but couldn't we just abort the
>>> loop if we see a 0 sized buffer? At that point we know the buffer is
>>> invalid, or the kernel is buggy, and it'd be saner to stop at that
>>> point. Something ala:
>>>
>>>
>>> diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
>>> index 394037d3f2f6..19a8bde5e1e1 100644
>>> --- a/io_uring/kbuf.c
>>> +++ b/io_uring/kbuf.c
>>> @@ -42,7 +42,8 @@ static bool io_kbuf_inc_commit(struct io_buffer_list *bl, int len)
>>>  		buf_len = READ_ONCE(buf->len);
>>>  		this_len = min_t(u32, len, buf_len);
>>>  		buf_len -= this_len;
>>> -		if (buf_len) {
>>> +		/* Stop looping for invalid buffer length of 0 */
>>> +		if (buf_len || !this_len) {
>>>  			buf->addr += this_len;
>>>  			buf->len = buf_len;
>>>  			return false;
>>
>> Good idea, it looks nice to me.
> 
> I'll send it out, I amended the commit message a bit too.

Oh, and let me know if you prefer any tags or whatever changed in that
patch.

Additionally, would you mind if I use your reproducer for a test case?
I turned it into the below. Will need a bit more to be a test case, but
it's pretty much there. Functionally it should be the same, it's just
using more idiomatic liburing rather than the lower level helpers.

And finally, thanks for the report and test case! Very useful.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <liburing.h>

#define BGID		0
#define RENTRIES	2
#define RMASK		(RENTRIES - 1)

int main(int argc, char *argv[])
{
	struct io_uring_buf_ring *br;
	struct io_uring_sqe *sqe;
	struct io_uring ring;
	void *send_buf;
	int ret, fds[2];

	io_uring_queue_init(1, &ring, IORING_SETUP_NO_SQARRAY);

	br = io_uring_setup_buf_ring(&ring, RENTRIES, BGID, IOU_PBUF_RING_INC, &ret);
	if (!br) {
		if (ret == -EINVAL)
			return 77;
		fprintf(stderr, "buf ring setup: %d\n", ret);
		return 1;
	}

	/*
	 * Use the buffer ring itself as the provided buffer address. Once the
	 * recv completes, it will have zero filled the buffer addr/len.
	 */
	io_uring_buf_ring_add(br, br, 4096, 0, RMASK, 0);
	io_uring_buf_ring_advance(br, 1);

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) < 0) {
		perror("socketpair");
		return 1;
	}

	/*
	 * Send zeroes, overwriting the buffer ring
	 */
	send_buf = calloc(1, 32);
	ret = send(fds[0], send_buf, 32, MSG_DONTWAIT);
	if (ret < 0) {
		perror("send");
		return 1;
	}

	/*
	 * Do recv, picking the first buffer. When buffer is picked,
	 * it's still fully valid. By the time it needs to get committed,
	 * it will have invalid addr/len fields (all zeroes from the recv)
	 */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_recv_multishot(sqe, fds[1], NULL, 0, 0);
	sqe->flags |= IOSQE_BUFFER_SELECT;
	sqe->buf_index = BGID;
	io_uring_submit_and_wait(&ring, 2);

	io_uring_free_buf_ring(&ring, br, RENTRIES, BGID);
	io_uring_queue_exit(&ring);
	return 0;
}

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-28  2:58               ` Jens Axboe
@ 2025-08-28  3:27                 ` Qingyue Zhang
  2025-08-28 11:50                   ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Qingyue Zhang @ 2025-08-28  3:27 UTC (permalink / raw)
  To: axboe; +Cc: aftern00n, io-uring, linux-kernel

Thanks for taking care of this report!

Regarding tags, it would be nice to add a 
'Reported-by: Suoxing Zhang <aftern00n@qq.com>'
tag too, as this report and reproducer are
developed by both of us.

And absolutely, please feel free to use our 
reproducer for a test case! I'm glad it can 
be useful. Your version with idiomatic 
liburing looks great.

This is my first contribution to the Linux 
Kernel, and I really appreciate your patience 
and quick responses throughout this process!


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit()
  2025-08-28  3:27                 ` Qingyue Zhang
@ 2025-08-28 11:50                   ` Jens Axboe
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2025-08-28 11:50 UTC (permalink / raw)
  To: Qingyue Zhang; +Cc: aftern00n, io-uring, linux-kernel

On 8/27/25 9:27 PM, Qingyue Zhang wrote:
> Thanks for taking care of this report!
> 
> Regarding tags, it would be nice to add a 
> 'Reported-by: Suoxing Zhang <aftern00n@qq.com>'
> tag too, as this report and reproducer are
> developed by both of us.

Done!

> And absolutely, please feel free to use our 
> reproducer for a test case! I'm glad it can 
> be useful. Your version with idiomatic 
> liburing looks great.

Perfect, will do.

> This is my first contribution to the Linux 
> Kernel, and I really appreciate your patience 
> and quick responses throughout this process!

Let's hope it's the first of many!

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-08-28 11:50 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20250827114339.367080-1-chunzhennn@qq.com>
2025-08-27 11:44 ` [PATCH 2/2] io_uring/kbuf: fix infinite loop in io_kbuf_inc_commit() Qingyue Zhang
2025-08-27 14:30   ` Jens Axboe
2025-08-27 21:45     ` Jens Axboe
2025-08-27 21:59       ` Keith Busch
2025-08-27 22:23         ` Jens Axboe
2025-08-28  1:36       ` Qingyue Zhang
2025-08-28  2:08         ` Jens Axboe
2025-08-28  2:49           ` Qingyue Zhang
2025-08-28  2:50             ` Jens Axboe
2025-08-28  2:58               ` Jens Axboe
2025-08-28  3:27                 ` Qingyue Zhang
2025-08-28 11:50                   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).