From: Jens Axboe <axboe@kernel.dk>
To: Kanchan Joshi <joshi.k@samsung.com>,
	martin.petersen@oracle.com, kbusch@kernel.org, hch@lst.de,
	brauner@kernel.org
Cc: asml.silence@gmail.com, dw@davidwei.uk, io-uring@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	gost.dev@samsung.com, Anuj Gupta <anuj20.g@samsung.com>,
	Nitesh Shetty <nj.shetty@samsung.com>
Subject: Re: [PATCH 08/10] io_uring/rw: add support to send meta along with read/write
Date: Fri, 26 Apr 2024 08:25:56 -0600
Message-ID: <f3489d0c-2d27-4e27-ae49-df2e9dad2e00@kernel.dk>
In-Reply-To: <20240425183943.6319-9-joshi.k@samsung.com>

> diff --git a/io_uring/rw.c b/io_uring/rw.c
> index 3134a6ece1be..b2c9ac91d5e5 100644
> --- a/io_uring/rw.c
> +++ b/io_uring/rw.c
> @@ -587,6 +623,8 @@ static int kiocb_done(struct io_kiocb *req, ssize_t ret,
>  
>  		req->flags &= ~REQ_F_REISSUE;
>  		iov_iter_restore(&io->iter, &io->iter_state);
> +		if (unlikely(rw->kiocb.ki_flags & IOCB_USE_META))
> +			iov_iter_restore(&io->meta.iter, &io->iter_meta_state);
>  		return -EAGAIN;
>  	}
>  	return IOU_ISSUE_SKIP_COMPLETE;

This puzzles me a bit: why is the restore now dependent on
IOCB_USE_META?

> @@ -768,7 +806,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
>  	if (!(req->flags & REQ_F_FIXED_FILE))
>  		req->flags |= io_file_get_flags(file);
>  
> -	kiocb->ki_flags = file->f_iocb_flags;
> +	kiocb->ki_flags |= file->f_iocb_flags;
>  	ret = kiocb_set_rw_flags(kiocb, rw->flags);
>  	if (unlikely(ret))
>  		return ret;
> @@ -787,7 +825,8 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
>  		if (!(kiocb->ki_flags & IOCB_DIRECT) || !file->f_op->iopoll)
>  			return -EOPNOTSUPP;
>  
> -		kiocb->private = NULL;
> +		if (likely(!(kiocb->ki_flags & IOCB_USE_META)))
> +			kiocb->private = NULL;
>  		kiocb->ki_flags |= IOCB_HIPRI;
>  		kiocb->ki_complete = io_complete_rw_iopoll;
>  		req->iopoll_completed = 0;

Why don't we just set ->private generically earlier, eg like we do for
the ki_flags, rather than have it be a branch in here?
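
As a rough illustration of the above, here is a userspace mock (not
kernel code, untested against the real tree). Everything prefixed
mock_/MOCK_ is made up for the example, and it assumes IOCB_USE_META is
already set in ki_flags by the time the init path runs:

```c
#include <stddef.h>

/* Illustrative flag values only; these are not the kernel's definitions. */
#define MOCK_IOCB_USE_META (1u << 0)
#define MOCK_IOCB_HIPRI    (1u << 1)

/* Stand-in for the two struct kiocb fields discussed here. */
struct mock_kiocb {
	unsigned int ki_flags;
	void *private;
};

/*
 * Mock of the init path. 'meta' stands in for &io->meta and 'iopoll'
 * for the IORING_SETUP_IOPOLL case.
 */
static void mock_rw_init(struct mock_kiocb *kiocb, unsigned int file_flags,
			 void *meta, int iopoll)
{
	kiocb->ki_flags |= file_flags;

	/* set ->private once, generically, right next to the flags */
	kiocb->private = (kiocb->ki_flags & MOCK_IOCB_USE_META) ? meta : NULL;

	if (iopoll)
		kiocb->ki_flags |= MOCK_IOCB_HIPRI; /* no ->private branch here */
}
```

With that, the iopoll block would not need to know about meta at all.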

> @@ -853,7 +892,8 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
>  	} else if (ret == -EIOCBQUEUED) {
>  		return IOU_ISSUE_SKIP_COMPLETE;
>  	} else if (ret == req->cqe.res || ret <= 0 || !force_nonblock ||
> -		   (req->flags & REQ_F_NOWAIT) || !need_complete_io(req)) {
> +		   (req->flags & REQ_F_NOWAIT) || !need_complete_io(req) ||
> +		   (kiocb->ki_flags & IOCB_USE_META)) {
>  		/* read all, failed, already did sync or don't want to retry */
>  		goto done;
>  	}

Would it be cleaner to stuff that IOCB_USE_META check into
need_complete_io(), as that would more closely describe why the check
is there in the first place? With a comment.
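
Something along these lines, sketched as a userspace mock rather than
a real patch. The mock_ struct and flag values are placeholders for
req->flags, kiocb->ki_flags and the S_ISBLK() test, and the comment
wording is only a guess at the intent:

```c
#include <stdbool.h>

/* Illustrative flag values only; these are not the kernel's definitions. */
#define MOCK_REQ_F_ISREG   (1u << 0)
#define MOCK_IOCB_USE_META (1u << 1)

struct mock_req {
	unsigned int flags;    /* stands in for req->flags */
	unsigned int ki_flags; /* stands in for rw->kiocb.ki_flags */
	bool is_blk;           /* stands in for the S_ISBLK() check on the inode */
};

static bool mock_need_complete_io(const struct mock_req *req)
{
	/*
	 * Assumed rationale: a short transfer is never retried when a
	 * meta buffer is attached, so there is nothing left to complete.
	 */
	if (req->ki_flags & MOCK_IOCB_USE_META)
		return false;

	return (req->flags & MOCK_REQ_F_ISREG) || req->is_blk;
}
```

The read and write paths could then keep their existing
need_complete_io() calls without growing an extra condition each.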

> @@ -864,6 +904,12 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
>  	 * manually if we need to.
>  	 */
>  	iov_iter_restore(&io->iter, &io->iter_state);
> +	if (unlikely(kiocb->ki_flags & IOCB_USE_META)) {
> +		/* don't handle partial completion for read + meta */
> +		if (ret > 0)
> +			goto done;
> +		iov_iter_restore(&io->meta.iter, &io->iter_meta_state);
> +	}

It also seems a bit odd that we need this check here. Surely if this
is needed, the other "don't do retry IOs" conditions would need the same?

> @@ -1053,7 +1099,8 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
>  		if (ret2 == -EAGAIN && (req->ctx->flags & IORING_SETUP_IOPOLL))
>  			goto ret_eagain;
>  
> -		if (ret2 != req->cqe.res && ret2 >= 0 && need_complete_io(req)) {
> +		if (ret2 != req->cqe.res && ret2 >= 0 && need_complete_io(req)
> +				&& !(kiocb->ki_flags & IOCB_USE_META)) {
>  			trace_io_uring_short_write(req->ctx, kiocb->ki_pos - ret2,
>  						req->cqe.res, ret2);

Same here. It would be nice to integrate this more cleanly rather than
have a bunch of "oh we also need this extra check here" conditions.

> @@ -1074,12 +1121,33 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
>  	} else {
>  ret_eagain:
>  		iov_iter_restore(&io->iter, &io->iter_state);
> +		if (unlikely(kiocb->ki_flags & IOCB_USE_META))
> +			iov_iter_restore(&io->meta.iter, &io->iter_meta_state);
>  		if (kiocb->ki_flags & IOCB_WRITE)
>  			io_req_end_write(req);
>  		return -EAGAIN;
>  	}
>  }

Same question here on the (now) conditional restore.

> +int io_rw_meta(struct io_kiocb *req, unsigned int issue_flags)
> +{
> +	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
> +	struct io_async_rw *io = req->async_data;
> +	struct kiocb *kiocb = &rw->kiocb;
> +	int ret;
> +
> +	if (!(req->file->f_flags & O_DIRECT))
> +		return -EOPNOTSUPP;

Why isn't this just caught at init time when IOCB_DIRECT is checked?

> +	kiocb->private = &io->meta;
> +	if (req->opcode == IORING_OP_READ_META)
> +		ret = io_read(req, issue_flags);
> +	else
> +		ret = io_write(req, issue_flags);
> +
> +	return ret;
> +}

kiocb->private is a bit of an odd beast, and ownership isn't clear at
all. It would make the most sense if the owner of the kiocb (eg io_uring
in this case) owned it, but take a look at eg ocfs2 and see what they do
with it... I think this would blow up as a result.

Outside of that, and with the O_DIRECT check fixed, this should
just be two separate functions, one for read and one for write.
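
Roughly the shape I mean, as a userspace mock. All mock_ names are
hypothetical, the read/write bodies are stubs, and where exactly the
O_DIRECT check would live at init time is an assumption:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the request state touched by io_rw_meta(). */
struct mock_req {
	int o_direct;  /* nonzero if the file was opened with O_DIRECT */
	int meta;      /* stands in for io->meta */
	void *private; /* stands in for kiocb->private */
	int last_op;   /* records which stub ran: 'r' or 'w' */
};

/* Init time: reject non-O_DIRECT files once, with the other flag checks. */
static int mock_rw_meta_init(const struct mock_req *req)
{
	return req->o_direct ? 0 : -EOPNOTSUPP;
}

/* Stubs for the existing io_read()/io_write() issue paths. */
static int mock_io_read(struct mock_req *req)  { req->last_op = 'r'; return 0; }
static int mock_io_write(struct mock_req *req) { req->last_op = 'w'; return 0; }

/* One issue handler per opcode, so no opcode test at issue time. */
static int mock_io_read_meta(struct mock_req *req)
{
	req->private = &req->meta;
	return mock_io_read(req);
}

static int mock_io_write_meta(struct mock_req *req)
{
	req->private = &req->meta;
	return mock_io_write(req);
}
```

Each opcode would then point at its own handler in the opcode table.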

-- 
Jens Axboe



