From: Ming Lei <ming.lei@redhat.com>
To: Keith Busch <kbusch@meta.com>
Cc: asml.silence@gmail.com, axboe@kernel.dk,
linux-block@vger.kernel.org, io-uring@vger.kernel.org,
bernd@bsbernd.com, Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCHv2 4/6] ublk: zc register/unregister bvec
Date: Wed, 12 Feb 2025 10:49:15 +0800 [thread overview]
Message-ID: <Z6wMK5WxvS_MzLh3@fedora> (raw)
In-Reply-To: <20250211005646.222452-5-kbusch@meta.com>
On Mon, Feb 10, 2025 at 04:56:44PM -0800, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Provide new operations for the user to request mapping an active request
> to an io uring instance's buf_table. The user has to provide the index
> it wants to install the buffer.
>
> A reference count is taken on the request to ensure it can't be
> completed while it is active in a ring's buf_table.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> drivers/block/ublk_drv.c | 145 +++++++++++++++++++++++++---------
> include/uapi/linux/ublk_cmd.h | 4 +
> 2 files changed, 113 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 529085181f355..ccfda7b2c24da 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -51,6 +51,9 @@
> /* private ioctl command mirror */
> #define UBLK_CMD_DEL_DEV_ASYNC _IOC_NR(UBLK_U_CMD_DEL_DEV_ASYNC)
>
> +#define UBLK_IO_REGISTER_IO_BUF _IOC_NR(UBLK_U_IO_REGISTER_IO_BUF)
> +#define UBLK_IO_UNREGISTER_IO_BUF _IOC_NR(UBLK_U_IO_UNREGISTER_IO_BUF)
UBLK_IO_REGISTER_IO_BUF command may be completed, and buffer isn't used
by RW_FIXED yet in the following cases:
- application doesn't submit any RW_FIXED consumer OP
- io_uring_enter() only issued UBLK_IO_REGISTER_IO_BUF, and the other
OPs can't be issued because of out of resource
...
Then io_uring_enter() returns, and the application is panic or killed,
how to avoid buffer leak?
It need to deal with in io_uring cancel code for calling ->release() if
the kbuffer node isn't released.
UBLK_IO_UNREGISTER_IO_BUF still need to call ->release() if the node
buffer isn't used.
> +
> /* All UBLK_F_* have to be included into UBLK_F_ALL */
> #define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY \
> | UBLK_F_URING_CMD_COMP_IN_TASK \
> @@ -76,6 +79,9 @@ struct ublk_rq_data {
> struct llist_node node;
>
> struct kref ref;
> +
> +#define UBLK_ZC_REGISTERED 0
> + unsigned long flags;
> };
>
> struct ublk_uring_cmd_pdu {
> @@ -201,7 +207,7 @@ static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq,
> int tag);
> static inline bool ublk_dev_is_user_copy(const struct ublk_device *ub)
> {
> - return ub->dev_info.flags & UBLK_F_USER_COPY;
> + return ub->dev_info.flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY);
> }
>
> static inline bool ublk_dev_is_zoned(const struct ublk_device *ub)
> @@ -581,7 +587,7 @@ static void ublk_apply_params(struct ublk_device *ub)
>
> static inline bool ublk_support_user_copy(const struct ublk_queue *ubq)
> {
> - return ubq->flags & UBLK_F_USER_COPY;
> + return ubq->flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY);
> }
>
> static inline bool ublk_need_req_ref(const struct ublk_queue *ubq)
> @@ -1747,6 +1753,102 @@ static inline void ublk_prep_cancel(struct io_uring_cmd *cmd,
> io_uring_cmd_mark_cancelable(cmd, issue_flags);
> }
>
> +static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
> + struct ublk_queue *ubq, int tag, size_t offset)
> +{
> + struct request *req;
> +
> + if (!ublk_need_req_ref(ubq))
> + return NULL;
> +
> + req = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], tag);
> + if (!req)
> + return NULL;
> +
> + if (!ublk_get_req_ref(ubq, req))
> + return NULL;
> +
> + if (unlikely(!blk_mq_request_started(req) || req->tag != tag))
> + goto fail_put;
> +
> + if (!ublk_rq_has_data(req))
> + goto fail_put;
> +
> + if (offset > blk_rq_bytes(req))
> + goto fail_put;
> +
> + return req;
> +fail_put:
> + ublk_put_req_ref(ubq, req);
> + return NULL;
> +}
> +
> +static void ublk_io_release(void *priv)
> +{
> + struct request *rq = priv;
> + struct ublk_queue *ubq = rq->mq_hctx->driver_data;
> +
> + ublk_put_req_ref(ubq, rq);
> +}
It isn't enough to just get & put request reference here between registering
buffer and freeing the registered node buf, because the same reference can be
dropped from ublk_commit_completion() which is from queueing
UBLK_IO_COMMIT_AND_FETCH_REQ, and buggy app may queue this command multiple
times for freeing the request.
One solution is to not allow request completion until the ->release() is
returned.
Thanks,
Ming
next prev parent reply other threads:[~2025-02-12 2:49 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-11 0:56 [PATCHv2 0/6] ublk zero-copy support Keith Busch
2025-02-11 0:56 ` [PATCHv2 1/6] io_uring: use node for import Keith Busch
2025-02-11 0:56 ` [PATCHv2 2/6] io_uring: create resource release callback Keith Busch
2025-02-13 1:31 ` Pavel Begunkov
2025-02-13 1:58 ` Keith Busch
2025-02-13 13:06 ` Pavel Begunkov
2025-02-11 0:56 ` [PATCHv2 3/6] io_uring: add support for kernel registered bvecs Keith Busch
2025-02-13 1:33 ` Pavel Begunkov
2025-02-14 3:30 ` Ming Lei
2025-02-14 15:26 ` Keith Busch
2025-02-15 1:34 ` Ming Lei
2025-02-18 20:34 ` Keith Busch
2025-02-11 0:56 ` [PATCHv2 4/6] ublk: zc register/unregister bvec Keith Busch
2025-02-12 2:49 ` Ming Lei [this message]
2025-02-12 4:11 ` Keith Busch
2025-02-12 9:24 ` Ming Lei
2025-02-12 14:59 ` Keith Busch
2025-02-13 2:12 ` Pavel Begunkov
2025-02-11 0:56 ` [PATCHv2 5/6] io_uring: add abstraction for buf_table rsrc data Keith Busch
2025-02-11 0:56 ` [PATCHv2 6/6] io_uring: cache nodes and mapped buffers Keith Busch
2025-02-11 15:17 ` kernel test robot
2025-02-11 16:47 ` Keith Busch
2025-02-12 1:42 ` kernel test robot
2025-02-12 2:29 ` [PATCHv2 0/6] ublk zero-copy support Ming Lei
2025-02-12 15:28 ` Keith Busch
2025-02-12 16:06 ` Pavel Begunkov
2025-02-13 1:52 ` Ming Lei
2025-02-13 15:12 ` lizetao
2025-02-13 16:06 ` Keith Busch
2025-02-14 3:39 ` lizetao
2025-02-14 2:41 ` Ming Lei
2025-02-14 4:21 ` lizetao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z6wMK5WxvS_MzLh3@fedora \
--to=ming.lei@redhat.com \
--cc=asml.silence@gmail.com \
--cc=axboe@kernel.dk \
--cc=bernd@bsbernd.com \
--cc=io-uring@vger.kernel.org \
--cc=kbusch@kernel.org \
--cc=kbusch@meta.com \
--cc=linux-block@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.