public inbox for stable@vger.kernel.org
* [PATCH stable-5.15 0/5] io_uring backports
@ 2022-10-16 21:42 Pavel Begunkov
  2022-10-16 21:42 ` [PATCH stable-5.15 1/5] io_uring/af_unix: defer registered files gc to io_uring release Pavel Begunkov
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Pavel Begunkov @ 2022-10-16 21:42 UTC
  To: stable; +Cc: Jens Axboe, asml.silence

io_uring backports for stable 5.15

Pavel Begunkov (5):
  io_uring/af_unix: defer registered files gc to io_uring release
  io_uring: correct pinned_vm accounting
  io_uring/rw: fix short rw error handling
  io_uring/rw: fix error'ed retry return values
  io_uring/rw: fix unexpected link breakage

 fs/io_uring.c          | 40 ++++++++++++++++++++++++----------------
 include/linux/skbuff.h |  2 ++
 net/unix/garbage.c     | 20 ++++++++++++++++++++
 3 files changed, 46 insertions(+), 16 deletions(-)

-- 
2.38.0



* [PATCH stable-5.15 1/5] io_uring/af_unix: defer registered files gc to io_uring release
  2022-10-16 21:42 [PATCH stable-5.15 0/5] io_uring backports Pavel Begunkov
@ 2022-10-16 21:42 ` Pavel Begunkov
  2022-10-16 21:42 ` [PATCH stable-5.15 2/5] io_uring: correct pinned_vm accounting Pavel Begunkov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2022-10-16 21:42 UTC
  To: stable; +Cc: Jens Axboe, asml.silence

[ upstream commit 0091bfc81741b8d3aeb3b7ab8636f911b2de6e80 ]

Instead of putting io_uring's registered files in unix_gc(), we want
io_uring to do it itself. The trick is to consider io_uring registered
files for cycle detection without actually putting them down. Because
io_uring can't register other ring instances, this removes all
references to the ring file, which triggers the ->release path and
cleans up via io_ring_ctx_free().

Cc: stable@vger.kernel.org
Fixes: 6b06314c47e1 ("io_uring: add file set registration")
Reported-and-tested-by: David Bouman <dbouman03@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
[axboe: add kerneldoc comment to skb, fold in skb leak fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c          |  1 +
 include/linux/skbuff.h |  2 ++
 net/unix/garbage.c     | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ed6abd74f386..1b0e83fdea7a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8065,6 +8065,7 @@ static int __io_sqe_files_scm(struct io_ring_ctx *ctx, int nr, int offset)
 	}
 
 	skb->sk = sk;
+	skb->scm_io_uring = 1;
 
 	nr_files = 0;
 	fpl->user = get_uid(current_user());
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index cfb889f66c70..19e595cab23a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -725,6 +725,7 @@ typedef unsigned char *sk_buff_data_t;
  *	@csum_level: indicates the number of consecutive checksums found in
  *		the packet minus one that have been verified as
  *		CHECKSUM_UNNECESSARY (max 3)
+ *	@scm_io_uring: SKB holds io_uring registered files
  *	@dst_pending_confirm: need to confirm neighbour
  *	@decrypted: Decrypted SKB
  *	@slow_gro: state present at GRO time, slower prepare step required
@@ -910,6 +911,7 @@ struct sk_buff {
 	__u8			decrypted:1;
 #endif
 	__u8			slow_gro:1;
+	__u8			scm_io_uring:1;
 
 #ifdef CONFIG_NET_SCHED
 	__u16			tc_index;	/* traffic control index */
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index d45d5366115a..dc2763540393 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -204,6 +204,7 @@ void wait_for_unix_gc(void)
 /* The external entry point: unix_gc() */
 void unix_gc(void)
 {
+	struct sk_buff *next_skb, *skb;
 	struct unix_sock *u;
 	struct unix_sock *next;
 	struct sk_buff_head hitlist;
@@ -297,11 +298,30 @@ void unix_gc(void)
 
 	spin_unlock(&unix_gc_lock);
 
+	/* We need io_uring to clean its registered files, ignore all io_uring
+	 * originated skbs. It's fine as io_uring doesn't keep references to
+	 * other io_uring instances and so killing all other files in the cycle
+	 * will put all io_uring references forcing it to go through normal
+	 * release.path eventually putting registered files.
+	 */
+	skb_queue_walk_safe(&hitlist, skb, next_skb) {
+		if (skb->scm_io_uring) {
+			__skb_unlink(skb, &hitlist);
+			skb_queue_tail(&skb->sk->sk_receive_queue, skb);
+		}
+	}
+
 	/* Here we are. Hitlist is filled. Die. */
 	__skb_queue_purge(&hitlist);
 
 	spin_lock(&unix_gc_lock);
 
+	/* There could be io_uring registered files, just push them back to
+	 * the inflight list
+	 */
+	list_for_each_entry_safe(u, next, &gc_candidates, link)
+		list_move_tail(&u->link, &gc_inflight_list);
+
 	/* All candidates should have been detached by now. */
 	BUG_ON(!list_empty(&gc_candidates));
 
-- 
2.38.0
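
The hitlist filtering in patch 1/5 can be modelled in user space. The
sketch below is only an illustration under stated assumptions: toy_skb,
toy_queue and toy_spare_io_uring() are hypothetical names, a single
"keep" queue stands in for each skb's own sk_receive_queue, and no
locking is modelled. It shows flagged entries being moved aside before
the remaining hitlist is purged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: only the fields the sketch needs. */
struct toy_skb {
	struct toy_skb *next;
	bool scm_io_uring;	/* mirrors the new sk_buff bit */
	int id;
};

/* Singly linked queue with head and tail pointers. */
struct toy_queue {
	struct toy_skb *head;
	struct toy_skb *tail;
};

/* Append skb to the end of q. */
static void toy_queue_tail(struct toy_queue *q, struct toy_skb *skb)
{
	skb->next = NULL;
	if (q->tail)
		q->tail->next = skb;
	else
		q->head = skb;
	q->tail = skb;
}

/*
 * Move every io_uring-flagged entry from hitlist to keep, preserving
 * order, and return how many entries remain on hitlist to be purged.
 */
static size_t toy_spare_io_uring(struct toy_queue *hitlist,
				 struct toy_queue *keep)
{
	struct toy_skb *skb = hitlist->head, *next;
	struct toy_queue purge = { NULL, NULL };
	size_t left = 0;

	while (skb) {
		next = skb->next;	/* saved first: skb is relinked below */
		if (skb->scm_io_uring) {
			toy_queue_tail(keep, skb);
		} else {
			toy_queue_tail(&purge, skb);
			left++;
		}
		skb = next;
	}
	*hitlist = purge;
	return left;
}
```

Saving the next pointer before relinking plays the same role as the
"safe" walk (skb_queue_walk_safe) used in the patch.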



* [PATCH stable-5.15 2/5] io_uring: correct pinned_vm accounting
  2022-10-16 21:42 [PATCH stable-5.15 0/5] io_uring backports Pavel Begunkov
  2022-10-16 21:42 ` [PATCH stable-5.15 1/5] io_uring/af_unix: defer registered files gc to io_uring release Pavel Begunkov
@ 2022-10-16 21:42 ` Pavel Begunkov
  2022-10-16 21:42 ` [PATCH stable-5.15 3/5] io_uring/rw: fix short rw error handling Pavel Begunkov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2022-10-16 21:42 UTC
  To: stable; +Cc: Jens Axboe, asml.silence

[ upstream commit 42b6419d0aba47c5d8644cdc0b68502254671de5 ]

->mm_account should be released only after we free all registered
buffers, otherwise __io_sqe_buffers_unregister() will see a NULL
->mm_account and skip locked_vm accounting.

Cc: <Stable@vger.kernel.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6d798f65ed4ab8db3664c4d3397d4af16ca98846.1664849932.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1b0e83fdea7a..2da7a490d1c7 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9302,11 +9302,6 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx)
 {
 	io_sq_thread_finish(ctx);
 
-	if (ctx->mm_account) {
-		mmdrop(ctx->mm_account);
-		ctx->mm_account = NULL;
-	}
-
 	/* __io_rsrc_put_work() may need uring_lock to progress, wait w/o it */
 	io_wait_rsrc_data(ctx->buf_data);
 	io_wait_rsrc_data(ctx->file_data);
@@ -9342,6 +9337,11 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx)
 #endif
 	WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list));
 
+	if (ctx->mm_account) {
+		mmdrop(ctx->mm_account);
+		ctx->mm_account = NULL;
+	}
+
 	io_mem_free(ctx->rings);
 	io_mem_free(ctx->sq_sqes);
 
-- 
2.38.0
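
The ordering problem fixed by patch 2/5 can be reproduced with a small
user-space model. This is a sketch, not the kernel code: toy_mm,
toy_ctx and the helper names are hypothetical, and a plain long stands
in for the atomic counter. It only shows why dropping ->mm_account
before unregistering buffers leaves the accounting stale.

```c
#include <stddef.h>

/* Hypothetical stand-in for the mm being charged. */
struct toy_mm {
	long pinned_vm;
};

struct toy_ctx {
	struct toy_mm *mm_account;	/* NULL once the reference is dropped */
	long nr_pinned_pages;		/* pages held by registered buffers */
};

/* Unaccount registered buffers; skipped when there is no mm to charge. */
static void toy_buffers_unregister(struct toy_ctx *ctx)
{
	if (ctx->mm_account)
		ctx->mm_account->pinned_vm -= ctx->nr_pinned_pages;
	ctx->nr_pinned_pages = 0;
}

/* Models mmdrop() followed by clearing the pointer. */
static void toy_drop_mm(struct toy_ctx *ctx)
{
	ctx->mm_account = NULL;
}

/* Old order: the reference goes first, so the unaccounting is skipped. */
static void toy_ctx_free_old(struct toy_ctx *ctx)
{
	toy_drop_mm(ctx);
	toy_buffers_unregister(ctx);
}

/* Fixed order: unaccount while the mm is still reachable. */
static void toy_ctx_free_fixed(struct toy_ctx *ctx)
{
	toy_buffers_unregister(ctx);
	toy_drop_mm(ctx);
}
```

With 16 pages charged, toy_ctx_free_old() leaves the counter at 16
while toy_ctx_free_fixed() brings it back to 0.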



* [PATCH stable-5.15 3/5] io_uring/rw: fix short rw error handling
  2022-10-16 21:42 [PATCH stable-5.15 0/5] io_uring backports Pavel Begunkov
  2022-10-16 21:42 ` [PATCH stable-5.15 1/5] io_uring/af_unix: defer registered files gc to io_uring release Pavel Begunkov
  2022-10-16 21:42 ` [PATCH stable-5.15 2/5] io_uring: correct pinned_vm accounting Pavel Begunkov
@ 2022-10-16 21:42 ` Pavel Begunkov
  2022-10-16 21:42 ` [PATCH stable-5.15 4/5] io_uring/rw: fix error'ed retry return values Pavel Begunkov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2022-10-16 21:42 UTC
  To: stable; +Cc: Jens Axboe, asml.silence

[ upstream commit 89473c1a9205760c4fa6d158058da7b594a815f0 ]

We have a couple of problems. The first is reports of unexpected link
breakage for reads when cqe->res indicates that the IO was done in full.
The cause is partial IO with retries.

TL;DR: we compare the result in __io_complete_rw_common() against
req->cqe.res, but req->cqe.res doesn't store the full length but rather
the length left to be done. So, when we pass the full corrected result
via kiocb_done() -> __io_complete_rw_common(), it fails.

The second problem is that we don't try to correct res in
io_complete_rw(), which might be a problem, for instance, for O_DIRECT
when a prefix of the data was cached in the page cache. We also
definitely don't want to pass a corrected result into io_rw_done().

The fix here is to leave __io_complete_rw_common() alone, always pass
the uncorrected result into it, and fix it up as the last step just
before actually finishing the I/O.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2da7a490d1c7..c0d1948fb5a6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2701,6 +2701,20 @@ static bool __io_complete_rw_common(struct io_kiocb *req, long res)
 	return false;
 }
 
+static inline unsigned io_fixup_rw_res(struct io_kiocb *req, unsigned res)
+{
+	struct io_async_rw *io = req->async_data;
+
+	/* add previously done IO, if any */
+	if (io && io->bytes_done > 0) {
+		if (res < 0)
+			res = io->bytes_done;
+		else
+			res += io->bytes_done;
+	}
+	return res;
+}
+
 static void io_req_task_complete(struct io_kiocb *req, bool *locked)
 {
 	unsigned int cflags = io_put_rw_kbuf(req);
@@ -2724,7 +2738,7 @@ static void __io_complete_rw(struct io_kiocb *req, long res, long res2,
 {
 	if (__io_complete_rw_common(req, res))
 		return;
-	__io_req_complete(req, issue_flags, req->result, io_put_rw_kbuf(req));
+	__io_req_complete(req, issue_flags, io_fixup_rw_res(req, res), io_put_rw_kbuf(req));
 }
 
 static void io_complete_rw(struct kiocb *kiocb, long res, long res2)
@@ -2733,7 +2747,7 @@ static void io_complete_rw(struct kiocb *kiocb, long res, long res2)
 
 	if (__io_complete_rw_common(req, res))
 		return;
-	req->result = res;
+	req->result = io_fixup_rw_res(req, res);
 	req->io_task_work.func = io_req_task_complete;
 	io_req_task_work_add(req);
 }
@@ -2979,15 +2993,6 @@ static void kiocb_done(struct kiocb *kiocb, ssize_t ret,
 		       unsigned int issue_flags)
 {
 	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
-	struct io_async_rw *io = req->async_data;
-
-	/* add previously done IO, if any */
-	if (io && io->bytes_done > 0) {
-		if (ret < 0)
-			ret = io->bytes_done;
-		else
-			ret += io->bytes_done;
-	}
 
 	if (req->flags & REQ_F_CUR_POS)
 		req->file->f_pos = kiocb->ki_pos;
@@ -3004,6 +3009,7 @@ static void kiocb_done(struct kiocb *kiocb, ssize_t ret,
 			unsigned int cflags = io_put_rw_kbuf(req);
 			struct io_ring_ctx *ctx = req->ctx;
 
+			ret = io_fixup_rw_res(req, ret);
 			req_set_fail(req);
 			if (!(issue_flags & IO_URING_F_NONBLOCK)) {
 				mutex_lock(&ctx->uring_lock);
-- 
2.38.0
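
A worked example may help with the result fix-up in patch 3/5. The
sketch below copies the shape of io_fixup_rw_res() into user space,
with two assumptions of its own: toy_async_rw is a hypothetical
stand-in for struct io_async_rw, and the types are signed long
throughout so that negative error codes stay negative.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-request async state. */
struct toy_async_rw {
	long bytes_done;	/* bytes completed by earlier attempts */
};

/* Fold earlier progress into the result of the final attempt. */
static long toy_fixup_rw_res(const struct toy_async_rw *io, long res)
{
	/* add previously done IO, if any */
	if (io && io->bytes_done > 0) {
		if (res < 0)
			res = io->bytes_done;	/* report partial progress */
		else
			res += io->bytes_done;
	}
	return res;
}
```

For an 8192-byte read whose first attempt completed 4096 bytes, a
retry returning 4096 is reported as 8192, and a retry failing with
-EIO is reported as 4096. Without any earlier progress the error code
is passed through unchanged.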



* [PATCH stable-5.15 4/5] io_uring/rw: fix error'ed retry return values
  2022-10-16 21:42 [PATCH stable-5.15 0/5] io_uring backports Pavel Begunkov
                   ` (2 preceding siblings ...)
  2022-10-16 21:42 ` [PATCH stable-5.15 3/5] io_uring/rw: fix short rw error handling Pavel Begunkov
@ 2022-10-16 21:42 ` Pavel Begunkov
  2022-10-16 21:42 ` [PATCH stable-5.15 5/5] io_uring/rw: fix unexpected link breakage Pavel Begunkov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2022-10-16 21:42 UTC
  To: stable; +Cc: Jens Axboe, asml.silence

[ upstream commit 62bb0647b14646fa6c9aa25ecdf67ad18f13523c ]

Kernel test robot reports that, after a recent change, we test an
unsigned value for negativity in io_fixup_rw_res(). This masks error
codes and messes up the return value when I/O is retried and fails
with an error.

Fixes: 4d9cb92ca41dd ("io_uring/rw: fix short rw error handling")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9754a0970af1861e7865f9014f735c70dc60bf79.1663071587.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c0d1948fb5a6..f667197c8bb9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2701,7 +2701,7 @@ static bool __io_complete_rw_common(struct io_kiocb *req, long res)
 	return false;
 }
 
-static inline unsigned io_fixup_rw_res(struct io_kiocb *req, unsigned res)
+static inline int io_fixup_rw_res(struct io_kiocb *req, unsigned res)
 {
 	struct io_async_rw *io = req->async_data;
 
-- 
2.38.0
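
The class of bug described in patch 4/5, testing an unsigned value for
negativity, can be shown in isolation. This is a generic C
illustration, not the kernel helper: both function names are
hypothetical. The unsigned variant widens to long long before
comparing, which gives the same answer as writing "res < 0" on the
unsigned value directly but avoids compiler type-limit warnings.

```c
#include <errno.h>
#include <stdbool.h>

/* Error check on a signed result: negative values are errors. */
static bool toy_failed_signed(int res)
{
	return res < 0;
}

/*
 * The same check after the value has passed through an unsigned
 * parameter. Widening an unsigned int keeps it non-negative, so the
 * test can never succeed and an error code looks like a large length.
 */
static bool toy_failed_unsigned(unsigned int res)
{
	return (long long)res < 0;
}
```

toy_failed_signed(-EIO) is true, while
toy_failed_unsigned((unsigned int)-EIO) is false: the error is
silently treated as a byte count.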



* [PATCH stable-5.15 5/5] io_uring/rw: fix unexpected link breakage
  2022-10-16 21:42 [PATCH stable-5.15 0/5] io_uring backports Pavel Begunkov
                   ` (3 preceding siblings ...)
  2022-10-16 21:42 ` [PATCH stable-5.15 4/5] io_uring/rw: fix error'ed retry return values Pavel Begunkov
@ 2022-10-16 21:42 ` Pavel Begunkov
  2022-10-17  9:37 ` [PATCH stable-5.15 0/5] io_uring backports Greg KH
  2022-10-17  9:44 ` Greg KH
  6 siblings, 0 replies; 10+ messages in thread
From: Pavel Begunkov @ 2022-10-16 21:42 UTC
  To: stable; +Cc: Jens Axboe, asml.silence

[ upstream commit bf68b5b34311ee57ed40749a1257a30b46127556 ]

req->cqe.res is set in io_read() to the number of bytes left to be done,
which is used to decide whether to fail a read or not. However,
io_read() may retry without returning, in which case we stash the
previous result into ->bytes_done but forget to update cqe.res. Then we
ask a read to do strictly less than cqe.res but expect the return to be
exactly cqe.res.

Fix the bug by updating cqe.res for retries.

Cc: stable@vger.kernel.org
Reported-and-Tested-by: Beld Zhang <beldzhang@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3a1088440c7be98e5800267af922a67da0ef9f13.1664235732.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f667197c8bb9..837dd96228b2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3606,6 +3606,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 			return -EAGAIN;
 		}
 
+		req->result = iov_iter_count(iter);
 		/*
 		 * Now retry read with the IOCB_WAITQ parts set in the iocb. If
 		 * we get -EIOCBQUEUED, then we'll get a notification when the
-- 
2.38.0
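
The stale-expectation problem in patch 5/5 can be sketched as follows.
Everything here is a simplified model with hypothetical names:
toy_req.expected plays the role of the field the commit message calls
cqe.res (req->result in this 5.15 diff), and the "refresh" flag selects
between the old and the fixed behaviour on retry.

```c
#include <stdbool.h>

/* Hypothetical request state for the sketch. */
struct toy_req {
	long expected;		/* bytes the completion check expects */
	long bytes_done;	/* progress stashed by earlier attempts */
};

/* Completion check: any result other than the expectation is short. */
static bool toy_short_rw(const struct toy_req *req, long res)
{
	return res != req->expected;
}

/* Record a partial read before retrying it. */
static void toy_prepare_retry(struct toy_req *req, long partial,
			      bool refresh)
{
	req->bytes_done += partial;
	if (refresh)
		req->expected -= partial;	/* bytes still left to read */
}
```

For an 8192-byte read that first returns 4096, the retry can return at
most 4096. Without the refresh the check still expects 8192 and flags
the request as short, which is what breaks the link; with the refresh
the expectation becomes 4096 and the retry passes.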



* Re: [PATCH stable-5.15 0/5] io_uring backports
  2022-10-16 21:42 [PATCH stable-5.15 0/5] io_uring backports Pavel Begunkov
                   ` (4 preceding siblings ...)
  2022-10-16 21:42 ` [PATCH stable-5.15 5/5] io_uring/rw: fix unexpected link breakage Pavel Begunkov
@ 2022-10-17  9:37 ` Greg KH
  2022-10-17 13:08   ` Pavel Begunkov
  2022-10-17  9:44 ` Greg KH
  6 siblings, 1 reply; 10+ messages in thread
From: Greg KH @ 2022-10-17  9:37 UTC
  To: Pavel Begunkov; +Cc: stable, Jens Axboe

On Sun, Oct 16, 2022 at 10:42:53PM +0100, Pavel Begunkov wrote:
> io_uring backports for stable 5.15

How about 5.19?

If you wait a week or so, 5.19 will be end-of-life, so maybe we can just
not worry about it :)

thanks,

greg k-h


* Re: [PATCH stable-5.15 0/5] io_uring backports
  2022-10-16 21:42 [PATCH stable-5.15 0/5] io_uring backports Pavel Begunkov
                   ` (5 preceding siblings ...)
  2022-10-17  9:37 ` [PATCH stable-5.15 0/5] io_uring backports Greg KH
@ 2022-10-17  9:44 ` Greg KH
  6 siblings, 0 replies; 10+ messages in thread
From: Greg KH @ 2022-10-17  9:44 UTC
  To: Pavel Begunkov; +Cc: stable, Jens Axboe

On Sun, Oct 16, 2022 at 10:42:53PM +0100, Pavel Begunkov wrote:
> io_uring backports for stable 5.15

Now queued up, thanks.

greg k-h


* Re: [PATCH stable-5.15 0/5] io_uring backports
  2022-10-17  9:37 ` [PATCH stable-5.15 0/5] io_uring backports Greg KH
@ 2022-10-17 13:08   ` Pavel Begunkov
  2022-10-17 14:24     ` Greg KH
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Begunkov @ 2022-10-17 13:08 UTC
  To: Greg KH; +Cc: stable, Jens Axboe

On 10/17/22 10:37, Greg KH wrote:
> On Sun, Oct 16, 2022 at 10:42:53PM +0100, Pavel Begunkov wrote:
>> io_uring backports for stable 5.15
> 
> How about 5.19?

Looks like there were no conflicts for 5.19 and you applied them
yesterday.

And thanks for queuing all backports!

> If you wait a week or so, 5.19 will be end-of-life, so maybe we can just
> not worry about it :)

-- 
Pavel Begunkov


* Re: [PATCH stable-5.15 0/5] io_uring backports
  2022-10-17 13:08   ` Pavel Begunkov
@ 2022-10-17 14:24     ` Greg KH
  0 siblings, 0 replies; 10+ messages in thread
From: Greg KH @ 2022-10-17 14:24 UTC
  To: Pavel Begunkov; +Cc: stable, Jens Axboe

On Mon, Oct 17, 2022 at 02:08:57PM +0100, Pavel Begunkov wrote:
> On 10/17/22 10:37, Greg KH wrote:
> > On Sun, Oct 16, 2022 at 10:42:53PM +0100, Pavel Begunkov wrote:
> > > io_uring backports for stable 5.15
> > 
> > How about 5.19?
> 
> Looks like there were no conflicts for 5.19 and you applied them
> yesterday.

Ah, you are right, sorry about that, too many patches floating around
right now...

greg k-h
