From: Ziyang Zhang <ZiyangZhang@linux.alibaba.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
xiaoguang.wang@linux.alibaba.com
Subject: Re: [PATCH V2 2/2] ublk_drv: add support for UBLK_IO_NEED_GET_DATA
Date: Wed, 27 Jul 2022 11:09:13 +0800 [thread overview]
Message-ID: <60bc9e53-605e-ee1d-9bd2-020693768339@linux.alibaba.com> (raw)
In-Reply-To: <YuCdYsWw9+k7nCAF@T590>
On 2022/7/27 10:05, Ming Lei wrote:
> On Tue, Jul 26, 2022 at 07:44:05PM +0800, ZiyangZhang wrote:
>> UBLK_IO_NEED_GET_DATA is one ublk IO command. It is designed for a user
>> application who wants to allocate IO buffer and set IO buffer address
>> only after it receives an IO request from ublksrv. This is a reasonable
>> scenario because these users may use a RPC framework as one IO backend
>> to handle IO requests passed from ublksrv. And a RPC framework may
>> allocate its own buffer(or memory pool).
>>
>> This new feature (UBLK_F_NEED_GET_DATA) is optional for ublk users.
>> Related userspace code has been added in ublksrv[1] as one pull request.
>>
>> Test cases for this feature are added in ublksrv and all the tests pass.
>> The performance result shows that this new feature does bring additional
>> latency because one IO is issued back to ublk_drv once again to copy data
>> from bio vectors to user-provided data buffer. UBLK_IO_NEED_GET_DATA is
>> suitable for bigger block size such as 512B or 1MB.
>>
>> [1] https://github.com/ming1/ubdsrv
>>
>> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
>> ---
>> drivers/block/ublk_drv.c | 88 ++++++++++++++++++++++++++++++++++++----
>> 1 file changed, 79 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
>> index 255b2de46a24..c2d2cd5ab25c 100644
>> --- a/drivers/block/ublk_drv.c
>> +++ b/drivers/block/ublk_drv.c
>> @@ -47,7 +47,9 @@
>> #define UBLK_MINORS (1U << MINORBITS)
>>
>> /* All UBLK_F_* have to be included into UBLK_F_ALL */
>> -#define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY | UBLK_F_URING_CMD_COMP_IN_TASK)
>> +#define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY \
>> + | UBLK_F_URING_CMD_COMP_IN_TASK \
>> + | UBLK_F_NEED_GET_DATA)
>>
>> struct ublk_rq_data {
>> struct callback_head work;
>> @@ -86,6 +88,15 @@ struct ublk_uring_cmd_pdu {
>> */
>> #define UBLK_IO_FLAG_ABORTED 0x04
>>
>> +/*
>> + * UBLK_IO_FLAG_NEED_GET_DATA is set because IO command requires
>> + * get data buffer address from ublksrv.
>> + *
>> + * Then, bio data could be copied into this data buffer for a WRITE request
>> + * after the IO command is issued again and UBLK_IO_FLAG_NEED_GET_DATA is unset.
>> + */
>> +#define UBLK_IO_FLAG_NEED_GET_DATA 0x08
>> +
>> struct ublk_io {
>> /* userspace buffer address from io cmd */
>> __u64 addr;
>> @@ -163,11 +174,19 @@ static struct miscdevice ublk_misc;
>> static inline bool ublk_can_use_task_work(const struct ublk_queue *ubq)
>> {
>> if (IS_BUILTIN(CONFIG_BLK_DEV_UBLK) &&
>> + !(ubq->flags & UBLK_F_NEED_GET_DATA) &&
>> !(ubq->flags & UBLK_F_URING_CMD_COMP_IN_TASK))
>> return true;
>> return false;
>
> NEED_GET_DATA isn't related with using task work.
>
> That said if use task work return true, you use task work to handle
> GET_DATA, otherwise use io_uring_cmd_complete_in_task() to handle
> it, then either we use task work or use io_uring_cmd_complete_in_task,
> not mix the two.
>
> BTW, in my test, it is observed reliably that task work can get better iops.
Ok, I will check whether UBLK_F_NEED_GET_DATA works with task_work_add().
I find that checking flags of ublk_io must stay in current ubq process.
Otherwise there may be potential data race while aborting IOs.
So whatever I use NEED_GET_DATA or not, task work is necessary.
Using task_work_add() or io_uring_cmd_complete_in_task() depends on:
(1) the module is built-in or not
(2) the user requires the feature: UBLK_F_URING_CMD_COMP_IN_TASK
In the future, in our case, we may choose
UBLK_F_URING_CMD_COMP_IN_TASK | UBLK_F_NEED_GET_DATA as an option.
>
>> }
>>
>> +static inline bool ublk_need_get_data(const struct ublk_queue *ubq)
>> +{
>> + if (ubq->flags & UBLK_F_NEED_GET_DATA)
>> + return true;
>> + return false;
>> +}
>> +
>> static struct ublk_device *ublk_get_device(struct ublk_device *ub)
>> {
>> if (kobject_get_unless_zero(&ub->cdev_dev.kobj))
>> @@ -509,6 +528,21 @@ static void __ublk_fail_req(struct ublk_io *io, struct request *req)
>> }
>> }
>>
>> +static void ubq_complete_io_cmd(struct ublk_io *io, int res)
>> +{
>> + /* mark this cmd owned by ublksrv */
>> + io->flags |= UBLK_IO_FLAG_OWNED_BY_SRV;
>> +
>> + /*
>> + * clear ACTIVE since we are done with this sqe/cmd slot
>> + * We can only accept io cmd in case of being not active.
>> + */
>> + io->flags &= ~UBLK_IO_FLAG_ACTIVE;
>> +
>> + /* tell ublksrv one io request is coming */
>> + io_uring_cmd_done(io->cmd, res, 0);
>> +}
>> +
>> #define UBLK_REQUEUE_DELAY_MS 3
>>
>> static inline void __ublk_rq_task_work(struct request *req)
>> @@ -531,6 +565,20 @@ static inline void __ublk_rq_task_work(struct request *req)
>> return;
>> }
>>
>> + if (ublk_need_get_data(ubq) &&
>> + (req_op(req) == REQ_OP_WRITE ||
>> + req_op(req) == REQ_OP_FLUSH) &&
>> + !(io->flags & UBLK_IO_FLAG_NEED_GET_DATA)) {
>> +
>> + io->flags |= UBLK_IO_FLAG_NEED_GET_DATA;
>> +
>> + pr_devel("%s: ublk_need_get_data. op %d, qid %d tag %d io_flags %x\n",
>> + __func__, io->cmd->cmd_op, ubq->q_id, req->tag, io->flags);
>> +
>> + ubq_complete_io_cmd(io, UBLK_IO_RES_NEED_GET_DATA);
>> + return;
>> + }
>> +
>> mapped_bytes = ublk_map_io(ubq, req, io);
>>
>> /* partially mapped, update io descriptor */
>> @@ -553,17 +601,13 @@ static inline void __ublk_rq_task_work(struct request *req)
>> mapped_bytes >> 9;
>> }
>>
>> - /* mark this cmd owned by ublksrv */
>> - io->flags |= UBLK_IO_FLAG_OWNED_BY_SRV;
>> -
>> /*
>> - * clear ACTIVE since we are done with this sqe/cmd slot
>> - * We can only accept io cmd in case of being not active.
>> + * Anyway, we have handled UBLK_IO_NEED_GET_DATA for WRITE/FLUSH requests,
>> + * or we did nothing for other types requests.
>> */
>> - io->flags &= ~UBLK_IO_FLAG_ACTIVE;
>> + io->flags &= ~UBLK_IO_FLAG_NEED_GET_DATA;
>>
>> - /* tell ublksrv one io request is coming */
>> - io_uring_cmd_done(io->cmd, UBLK_IO_RES_OK, 0);
>> + ubq_complete_io_cmd(io, UBLK_IO_RES_OK);
>> }
>>
>> static void ublk_rq_task_work_cb(struct io_uring_cmd *cmd)
>> @@ -846,6 +890,16 @@ static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
>> mutex_unlock(&ub->mutex);
>> }
>>
>> +static void ublk_handle_need_get_data(struct ublk_device *ub, int q_id,
>> + int tag, struct io_uring_cmd *cmd)
>> +{
>> + struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
>> + struct request *req = blk_mq_tag_to_rq(ub->tag_set.tags[q_id], tag);
>> +
>> + pdu->req = req;
>> + io_uring_cmd_complete_in_task(cmd, ublk_rq_task_work_cb);
>> +}
>
> Here you can choose to use task work or io_uring_cmd_complete_in_task() by
> checking ublk_can_use_task_work(), just like what ublk_queue_rq() does.
OK, I will add a check on ublk_can_use_task_work().
>
>> +
>> static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
>> {
>> struct ublksrv_io_cmd *ub_cmd = (struct ublksrv_io_cmd *)cmd->cmd;
>> @@ -884,6 +938,14 @@ static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
>> goto out;
>> }
>>
>> + /*
>> + * ensure that the user issues UBLK_IO_NEED_GET_DATA
>> + * iff the driver have set the UBLK_IO_FLAG_NEED_GET_DATA.
>> + */
>> + if ((!!(io->flags & UBLK_IO_FLAG_NEED_GET_DATA))
>> + ^ (cmd_op == UBLK_IO_NEED_GET_DATA))
>> + goto out;
>> +
>> switch (cmd_op) {
>> case UBLK_IO_FETCH_REQ:
>> /* UBLK_IO_FETCH_REQ is only allowed before queue is setup */
>> @@ -917,6 +979,14 @@ static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
>> io->cmd = cmd;
>> ublk_commit_completion(ub, ub_cmd);
>> break;
>> + case UBLK_IO_NEED_GET_DATA:
>> + if (!(io->flags & UBLK_IO_FLAG_OWNED_BY_SRV))
>> + goto out;
>> + io->addr = ub_cmd->addr;
>> + io->cmd = cmd;
>> + io->flags |= UBLK_IO_FLAG_ACTIVE;
>> + ublk_handle_need_get_data(ub, ub_cmd->q_id, ub_cmd->tag, cmd);
>
> You still need to clear UBLK_IO_FLAG_OWNED_BY_SRV here.
If ublk_drv gets one UBLK_IO_NEED_GET_DATA command, it should immediately call
io_uring_cmd_complete_in_task() and handling NEED_GET_DATA logic in the task work.
Since UBLK_IO_FLAG_OWNED_BY_SRV means the "slot" is owned by nbd driver or ublk driver,
I don't think UBLK_IO_FLAG_OWNED_BY_SRV should be cleared.
BTW, UBLK_IO_FLAG_OWNED_BY_SRV is set in the task work finally.
>
> In future, UBLK_IO_FLAG_OWNED_BY_SRV can be removed actually.
Agree. UBLK_IO_FLAG_OWNED_BY_SRV should be integrated with UBLK_IO_FLAG_ACTIVE.
Now I am trying to implement crash recovery mechanism and I concern about memory order
while operating these IO flags. IMO, too many IO flags looks dangerous
and I made mistakes on flags while developing UBLK_NEED_GET_DATA. :(
Thanks,
Zhang
next prev parent reply other threads:[~2022-07-27 3:09 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-26 11:44 [PATCH V2 0/2] ublk: add support for UBLK_IO_NEED_GET_DATA ZiyangZhang
2022-07-26 11:44 ` [PATCH V2 1/2] ublk_cmd.h: add one new ublk command: UBLK_IO_NEED_GET_DATA ZiyangZhang
2022-07-26 11:44 ` [PATCH V2 2/2] ublk_drv: add support for UBLK_IO_NEED_GET_DATA ZiyangZhang
2022-07-27 2:05 ` Ming Lei
2022-07-27 3:09 ` Ziyang Zhang [this message]
2022-07-27 15:48 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=60bc9e53-605e-ee1d-9bd2-020693768339@linux.alibaba.com \
--to=ziyangzhang@linux.alibaba.com \
--cc=axboe@kernel.dk \
--cc=linux-block@vger.kernel.org \
--cc=ming.lei@redhat.com \
--cc=xiaoguang.wang@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox