From: Minwoo Im <minwoo.im.dev@gmail.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@fb.com>,
Sagi Grimberg <sagi@grimberg.me>
Subject: Re: [RFC] nvme: set block size during namespace validation
Date: Thu, 24 Dec 2020 01:16:50 +0900
Message-ID: <20201223161650.GA13354@localhost.localdomain>
In-Reply-To: <20201223154904.GA5967@lst.de>
Hello,
On 20-12-23 16:49:04, Christoph Hellwig wrote:
> set_blocksize just sets the block size used for buffer heads and should
> not be called by the driver. blkdev_get updates the block size, so
> you must already have the fd re-reading the partition table open?
> I'm not entirely sure how we can work around this except by avoiding
> buffer head I/O in the partition reread code. Note that this affects
> all block drivers where the block size could change at runtime.
Thank you, Christoph, for your comments.
Agreed. BLKRRPART leads to block_read_full_page(), which uses buffer
heads for I/O.
Yes, __blkdev_get() sets i_blkbits of the block device inode via
set_init_blocksize(). And yes again: nvme-cli has already opened the block
device fd and issues BLKRRPART on that fd. Also, __blkdev_get() only
updates i_blkbits (the block size) when bdev->bd_openers == 0, that is,
on the first open of the block device.
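The first-open-only behaviour described above can be modelled in user space. This is only a sketch: struct model_bdev, model_open(), model_release() and size_to_bits() are hypothetical names, and the model ignores how the kernel actually derives the initial value. It only shows why a block size change is not picked up while an fd is still open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in; not the kernel's struct block_device. */
struct model_bdev {
	int openers;            /* plays the role of bdev->bd_openers */
	unsigned int queue_lbs; /* logical block size the driver reports */
	unsigned int blkbits;   /* plays the role of the inode's i_blkbits */
};

/* Smallest shift such that (1 << shift) >= size. */
static unsigned int size_to_bits(unsigned int size)
{
	unsigned int bits = 0;

	while ((1u << bits) < size)
		bits++;
	return bits;
}

/* The cached block size is refreshed only when nobody else has it open. */
static void model_open(struct model_bdev *b)
{
	assert(b != NULL);
	if (b->openers == 0)
		b->blkbits = size_to_bits(b->queue_lbs);
	b->openers++;
}

static void model_release(struct model_bdev *b)
{
	assert(b != NULL && b->openers > 0);
	b->openers--;
}
```

In this model, if queue_lbs changes from 512 to 4096 while one opener remains, a second model_open() leaves blkbits at 9. It becomes 12 only after every opener has released the device and it is opened again.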
Then how about having the NVMe driver guard against the underflow that
occurs when request->__data_len is smaller than the logical block size, like:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ce1b61519441..030353d203bf 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -803,7 +803,11 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	cmnd->rw.opcode = op;
 	cmnd->rw.nsid = cpu_to_le32(ns->head->ns_id);
 	cmnd->rw.slba = cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req)));
-	cmnd->rw.length = cpu_to_le16((blk_rq_bytes(req) >> ns->lba_shift) - 1);
+
+	if (unlikely(blk_rq_bytes(req) < (1 << ns->lba_shift)))
+		cmnd->rw.length = 0;
+	else
+		cmnd->rw.length = cpu_to_le16((blk_rq_bytes(req) >> ns->lba_shift) - 1);
 
 	if (req_op(req) == REQ_OP_WRITE && ctrl->nr_streams)
 		nvme_assign_write_stream(ctrl, req, &control, &dsmgmt);
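For illustration, the arithmetic behind the underflow can be reproduced in user space. This is a minimal sketch, not driver code: nlb_unguarded() and nlb_guarded() are hypothetical helpers that mirror the old and new expressions in the hunk above, with plain integers standing in for blk_rq_bytes(req) and ns->lba_shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the original length computation. */
static uint16_t nlb_unguarded(uint32_t bytes, unsigned int lba_shift)
{
	assert(lba_shift < 32);
	/* When bytes < block size the shift yields 0, and 0 - 1 wraps. */
	return (uint16_t)((bytes >> lba_shift) - 1);
}

/* Hypothetical stand-in for the guarded computation in the patch. */
static uint16_t nlb_guarded(uint32_t bytes, unsigned int lba_shift)
{
	assert(lba_shift < 32);
	if (bytes < (UINT32_C(1) << lba_shift))
		return 0; /* clamp instead of wrapping */
	return (uint16_t)((bytes >> lba_shift) - 1);
}
```

With a 512-byte request on a namespace whose lba_shift is 12, nlb_unguarded() wraps to 0xFFFF while nlb_guarded() returns 0. Since the length field is a zero-based block count, 0 still describes one full logical block, so the guard avoids the wrapped length but does not make the transfer match the 512-byte request.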
Thanks,