From: Keith Busch <kbusch@meta.com>
To: <axboe@kernel.dk>, <hch@lst.de>, <linux-block@vger.kernel.org>,
<linux-nvme@lists.infradead.org>, <linux-fsdevel@vger.kernel.org>,
<io-uring@vger.kernel.org>
Cc: <sagi@grimberg.me>, <asml.silence@gmail.com>,
Keith Busch <kbusch@kernel.org>
Subject: [PATCHv11 07/10] block: expose write streams for block device nodes
Date: Thu, 5 Dec 2024 17:53:05 -0800
Message-ID: <20241206015308.3342386-8-kbusch@meta.com>
In-Reply-To: <20241206015308.3342386-1-kbusch@meta.com>

From: Christoph Hellwig <hch@lst.de>

Export the maximum number of write streams through statx, pass the
per-kiocb write stream down to the bio, and map temperature hints to
write streams when no stream is set explicitly (which is a bit
questionable, but this shows how it is done).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
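
Illustration only, not part of the patch: the bdev_statx() hunk follows
the usual statx convention, where a field is filled in and its bit set
in result_mask only when the caller requested it and the device actually
reports a non-zero value. A minimal userspace sketch of that gating is
below. Every name prefixed demo_/DEMO_ is hypothetical, and the bit
value is a placeholder, not the kernel's STATX_WRITE_STREAM definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit for illustration; NOT the kernel's STATX_WRITE_STREAM. */
#define DEMO_STATX_WRITE_STREAM 0x1u

/* Hypothetical stand-in for the two struct kstat fields the hunk touches. */
struct demo_kstat {
	uint32_t result_mask;
	uint16_t write_stream_max;
};

/*
 * Models the gating in the bdev_statx() hunk: report the stream count
 * only if it was requested and the device has at least one stream.
 * max_streams stands in for bdev_max_write_streams(bdev).
 */
static void demo_fill_write_stream(struct demo_kstat *stat,
				   uint32_t request_mask,
				   uint16_t max_streams)
{
	assert(stat != NULL);

	if ((request_mask & DEMO_STATX_WRITE_STREAM) && max_streams) {
		stat->write_stream_max = max_streams;
		stat->result_mask |= DEMO_STATX_WRITE_STREAM;
	}
}
```

A caller in this model has to check result_mask before trusting
write_stream_max: a cleared bit means "not reported", not "zero streams".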
block/bdev.c | 6 ++++++
block/fops.c | 23 +++++++++++++++++++++++
2 files changed, 29 insertions(+)
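
Illustration only, not part of the patch: the stream selection added to
blkdev_direct_IO() can be modelled in userspace as a small pure
function. This is a sketch under stated assumptions. pick_write_stream()
is a hypothetical helper, and its parameters stand in for
iocb->ki_write_stream, the inode's i_write_hint, and
bdev_max_write_streams(bdev) respectively.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the write-path logic in this patch.
 *
 * explicit_stream: models iocb->ki_write_stream (0 means "not set")
 * write_hint:      models the inode's i_write_hint as a plain integer
 * max_streams:     models bdev_max_write_streams(bdev) (0 means none)
 *
 * Returns 0 and stores the stream to use in *out (0 means no stream),
 * or -EINVAL if an explicit stream exceeds what the device supports.
 */
static int pick_write_stream(uint16_t explicit_stream,
			     unsigned int write_hint,
			     uint16_t max_streams, uint16_t *out)
{
	assert(out != NULL);

	if (explicit_stream) {
		/* An explicit request is validated, never silently dropped. */
		if (explicit_stream > max_streams)
			return -EINVAL;
		*out = explicit_stream;
		return 0;
	}

	/* Fallback: reuse the temperature hint as the stream if it fits. */
	if (max_streams && write_hint <= max_streams) {
		*out = (uint16_t)write_hint;
		return 0;
	}

	/* No streams on the device, or the hint is out of range. */
	*out = 0;
	return 0;
}
```

The asymmetry is the point of the sketch: an out-of-range explicit
stream fails the write with -EINVAL, while an out-of-range hint is
ignored and the write proceeds without a stream.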
diff --git a/block/bdev.c b/block/bdev.c
index 738e3c8457e7f..c23245f1fdfe3 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1296,6 +1296,12 @@ void bdev_statx(struct path *path, struct kstat *stat,
 		stat->result_mask |= STATX_DIOALIGN;
 	}
 
+	if ((request_mask & STATX_WRITE_STREAM) &&
+	    bdev_max_write_streams(bdev)) {
+		stat->write_stream_max = bdev_max_write_streams(bdev);
+		stat->result_mask |= STATX_WRITE_STREAM;
+	}
+
 	if (request_mask & STATX_WRITE_ATOMIC && bdev_can_atomic_write(bdev)) {
 		struct request_queue *bd_queue = bdev->bd_queue;
 
diff --git a/block/fops.c b/block/fops.c
index 6d5c4fc5a2168..f16aa39bf5bad 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -73,6 +73,7 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
 	}
 	bio.bi_iter.bi_sector = pos >> SECTOR_SHIFT;
 	bio.bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+	bio.bi_write_stream = iocb->ki_write_stream;
 	bio.bi_ioprio = iocb->ki_ioprio;
 	if (iocb->ki_flags & IOCB_ATOMIC)
 		bio.bi_opf |= REQ_ATOMIC;
@@ -206,6 +207,7 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 	for (;;) {
 		bio->bi_iter.bi_sector = pos >> SECTOR_SHIFT;
 		bio->bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+		bio->bi_write_stream = iocb->ki_write_stream;
 		bio->bi_private = dio;
 		bio->bi_end_io = blkdev_bio_end_io;
 		bio->bi_ioprio = iocb->ki_ioprio;
@@ -333,6 +335,7 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
 	dio->iocb = iocb;
 	bio->bi_iter.bi_sector = pos >> SECTOR_SHIFT;
 	bio->bi_write_hint = file_inode(iocb->ki_filp)->i_write_hint;
+	bio->bi_write_stream = iocb->ki_write_stream;
 	bio->bi_end_io = blkdev_bio_end_io_async;
 	bio->bi_ioprio = iocb->ki_ioprio;
 
@@ -398,6 +401,26 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 	if (blkdev_dio_invalid(bdev, iocb, iter))
 		return -EINVAL;
 
+	if (iov_iter_rw(iter) == WRITE) {
+		u16 max_write_streams = bdev_max_write_streams(bdev);
+
+		if (iocb->ki_write_stream) {
+			if (iocb->ki_write_stream > max_write_streams)
+				return -EINVAL;
+		} else if (max_write_streams) {
+			enum rw_hint write_hint =
+				file_inode(iocb->ki_filp)->i_write_hint;
+
+			/*
+			 * Just use the write hint as write stream for block
+			 * device writes. This assumes no file system is
+			 * mounted that would use the streams differently.
+			 */
+			if (write_hint <= max_write_streams)
+				iocb->ki_write_stream = write_hint;
+		}
+	}
+
 	nr_pages = bio_iov_vecs_to_alloc(iter, BIO_MAX_VECS + 1);
 	if (likely(nr_pages <= BIO_MAX_VECS)) {
 		if (is_sync_kiocb(iocb))
--
2.43.5

Thread overview: 26+ messages
2024-12-06 1:52 [PATCHv11 00/10] block write streams with nvme fdp Keith Busch
2024-12-06 1:52 ` [PATCHv11 01/10] fs: add a write stream field to the kiocb Keith Busch
2024-12-06 1:53 ` [PATCHv11 02/10] io_uring: protection information enhancements Keith Busch
2024-12-06 9:49 ` Anuj Gupta
2024-12-06 1:53 ` [PATCHv11 03/10] io_uring: add write stream attribute Keith Busch
2024-12-06 9:55 ` Anuj Gupta
2024-12-06 12:44 ` Kanchan Joshi
2024-12-06 16:53 ` Keith Busch
2024-12-06 1:53 ` [PATCHv11 04/10] block: add a bi_write_stream field Keith Busch
2024-12-06 1:53 ` [PATCHv11 05/10] block: introduce max_write_streams queue limit Keith Busch
2024-12-06 1:53 ` [PATCHv11 06/10] block: introduce a write_stream_granularity " Keith Busch
2024-12-06 1:53 ` Keith Busch [this message]
2024-12-06 4:13 ` [PATCHv11 07/10] block: expose write streams for block device nodes kernel test robot
2024-12-06 6:43 ` kernel test robot
2024-12-06 9:11 ` Nitesh Shetty
2024-12-06 1:53 ` [PATCHv11 08/10] nvme: add a nvme_get_log_lsi helper Keith Busch
2024-12-06 1:53 ` [PATCHv11 09/10] nvme: register fdp queue limits Keith Busch
2024-12-06 5:26 ` kernel test robot
2024-12-06 1:53 ` [PATCHv11 10/10] nvme: use fdp streams if write stream is provided Keith Busch
2024-12-06 13:18 ` kernel test robot
2024-12-06 2:18 ` [PATCHv11 00/10] block write streams with nvme fdp Keith Busch
2024-12-09 12:51 ` Christoph Hellwig
2024-12-09 15:57 ` Keith Busch
2024-12-09 17:14 ` [EXT] " Pierre Labat
2024-12-09 17:25 ` Keith Busch
2024-12-09 17:35 ` Pierre Labat