From: Uday Shankar <ushankar@purestorage.com>
To: Jens Axboe <axboe@kernel.dk>, Kanchan Joshi <joshi.k@samsung.com>,
Anuj Gupta <anuj20.g@samsung.com>, Christoph Hellwig <hch@lst.de>
Cc: linux-block@vger.kernel.org, Xinyu Zhang <xizhang@purestorage.com>
Subject: Re: [PATCH] block: fix sanity checks in blk_rq_map_user_bvec
Date: Wed, 23 Oct 2024 16:50:41 -0600
Message-ID: <Zxl9wS2j5mUkye9o@dev-ushankar.dev.purestorage.com>
In-Reply-To: <20241023211519.4177873-1-ushankar@purestorage.com>
On Wed, Oct 23, 2024 at 03:15:19PM -0600, Uday Shankar wrote:
> From: Xinyu Zhang <xizhang@purestorage.com>
>
> blk_rq_map_user_bvec contains a check bytes + bv->bv_len > nr_iter which
> causes unnecessary failures in NVMe passthrough I/O, reproducible as
> follows:
>
> - register a 2 page, page-aligned buffer against a ring
> - use that buffer to do a 1 page io_uring NVMe passthrough read
>
> The second (i = 1) iteration of the loop in blk_rq_map_user_bvec will
> then have nr_iter == 1 page, bytes == 1 page, bv->bv_len == 1 page, so
> the check bytes + bv->bv_len > nr_iter will succeed, causing the I/O to
> fail. This failure is unnecessary, as when the check succeeds, it means
> we've checked the entire buffer that will be used by the request - i.e.
> blk_rq_map_user_bvec should complete successfully. Therefore, terminate
> the loop early and return successfully when the check bytes + bv->bv_len
> > nr_iter succeeds.
For anyone interested, here are the details on how to reproduce the
issue described above:
# cat test.c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <liburing.h>
#include <stdlib.h>
#include <assert.h>
#include <linux/nvme_ioctl.h>

int main(int argc, char *argv[]) {
    struct io_uring ring;

    assert(io_uring_queue_init(1, &ring, IORING_SETUP_SQE128 | IORING_SETUP_CQE32) == 0);

    void *buf = memalign(4096, 2 * 4096);
    printf("buf %p\n", buf);

    struct iovec iov = {
        .iov_base = buf,
        .iov_len = 2 * 4096,
    };
    assert(io_uring_register_buffers(&ring, &iov, 1) == 0);

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    assert(sqe != NULL);
    int fd = open("/dev/ng0n1", O_RDONLY);
    assert(fd > 0);

    sqe->fd = fd;
    sqe->opcode = IORING_OP_URING_CMD;
    sqe->cmd_op = NVME_URING_CMD_IO;
    sqe->buf_index = 0;
    sqe->flags = 0;
    sqe->uring_cmd_flags = IORING_URING_CMD_FIXED;

    struct nvme_passthru_cmd *cmd = &sqe->cmd;
    cmd->opcode = 2; // read
    cmd->nsid = 1;
    cmd->data_len = 1 * 4096;
    cmd->addr = buf;

    struct io_uring_cqe *cqe;
    assert(io_uring_submit(&ring) == 1);
    assert(io_uring_wait_cqe(&ring, &cqe) == 0);
    printf("res %d\n", cqe->res);
    return 0;
}
# gcc -o test -luring test.c
test.c: In function ‘main’:
test.c:15:17: warning: implicit declaration of function ‘memalign’ [-Wimplicit-function-declaration]
   15 |     void *buf = memalign(4096, 2 * 4096);
      |                 ^~~~~~~~
test.c:15:17: warning: initialization of ‘void *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
test.c:36:37: warning: initialization of ‘struct nvme_passthru_cmd *’ from incompatible pointer type ‘__u8 (*)[0]’ {aka ‘unsigned char (*)[]’} [-Wincompatible-pointer-types]
   36 |     struct nvme_passthru_cmd *cmd = &sqe->cmd;
      |                                     ^
test.c:40:15: warning: assignment to ‘__u64’ {aka ‘long long unsigned int’} from ‘void *’ makes integer from pointer without a cast [-Wint-conversion]
   40 |     cmd->addr = buf;
      |               ^
# ./test
buf 0x406000
res -22
Thread overview: 9+ messages
2024-10-23 21:15 [PATCH] block: fix sanity checks in blk_rq_map_user_bvec Uday Shankar
2024-10-23 22:31 ` Jens Axboe
2024-10-23 22:46 ` Uday Shankar
2024-10-23 22:50 ` Uday Shankar [this message]
2024-10-23 22:54 ` Bart Van Assche
2024-10-24 0:42 ` Chaitanya Kulkarni
2024-10-23 23:03 ` Jens Axboe
2024-10-24 4:56 ` Christoph Hellwig
2024-10-24 6:05 ` Christoph Hellwig