* [RFC] bdev: use bdev_io_min() for statx DIO min IO
@ 2024-06-28 21:23 Luis Chamberlain
2024-06-29 6:25 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Luis Chamberlain @ 2024-06-28 21:23 UTC (permalink / raw)
To: p.raghav, hare, kbusch, david, neilb
Cc: mcgrof, gost.dev, linux-block, linux-mm, patches
We currently rely on the block device logical block size for the
offset alignment. While this *works* it doesn't work with performance
in mind. That's exactly what the minimum_io_size attribute is for.
This would for example enhance performance for DIO on 4k IU drives which
have for example an LBA format of 512 bytes for both HDDs and NVMe.
Another use case is to ensure that DIO will be used with 16k IOs on
existing market 16k IU drives with an LBA format of 4k or 512 bytes.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
block/bdev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/bdev.c b/block/bdev.c
index 1b4af2cc3b1e..5d0874aa8661 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1282,7 +1282,7 @@ void bdev_statx(struct inode *backing_inode, struct kstat *stat,
if (request_mask & STATX_DIOALIGN) {
stat->dio_mem_align = bdev_dma_alignment(bdev) + 1;
- stat->dio_offset_align = bdev_logical_block_size(bdev);
+ stat->dio_offset_align = (unsigned int) bdev_io_min(bdev);
stat->result_mask |= STATX_DIOALIGN;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RFC] bdev: use bdev_io_min() for statx DIO min IO
2024-06-28 21:23 [RFC] bdev: use bdev_io_min() for statx DIO min IO Luis Chamberlain
@ 2024-06-29 6:25 ` Christoph Hellwig
2024-06-30 3:24 ` Luis Chamberlain
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2024-06-29 6:25 UTC (permalink / raw)
To: Luis Chamberlain
Cc: p.raghav, hare, kbusch, david, neilb, gost.dev, linux-block,
linux-mm, patches
On Fri, Jun 28, 2024 at 02:23:50PM -0700, Luis Chamberlain wrote:
> We currently rely on the block device logical block size for the
> offset alignment. While this *works* it doesn't work with performance
> in mind. That's exactly what the minimum_io_size attribute is for.
>
> This would for example enhance performance for DIO on 4k IU drives which
> have for example an LBA format of 512 bytes for both HDDs and NVMe.
> Another use case is to ensure that DIO will be used with 16k IOs on
> existing market 16k IU drives with an LBA format of 4k or 512 bytes.
The minimum_io_size clearly is the minimum I/O size, not the minimal
nice to have one. Changing this will break existing setups.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] bdev: use bdev_io_min() for statx DIO min IO
2024-06-29 6:25 ` Christoph Hellwig
@ 2024-06-30 3:24 ` Luis Chamberlain
2024-06-30 5:54 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Luis Chamberlain @ 2024-06-30 3:24 UTC (permalink / raw)
To: Christoph Hellwig, martin.petersen, ebiggers
Cc: p.raghav, hare, kbusch, david, neilb, gost.dev, linux-block,
linux-mm, patches
On Fri, Jun 28, 2024 at 11:25:34PM -0700, Christoph Hellwig wrote:
> On Fri, Jun 28, 2024 at 02:23:50PM -0700, Luis Chamberlain wrote:
> > We currently rely on the block device logical block size for the
> > offset alignment. While this *works* it doesn't work with performance
> > in mind. That's exactly what the minimum_io_size attribute is for.
> >
> > This would for example enhance performance for DIO on 4k IU drives which
> > have for example an LBA format of 512 bytes for both HDDs and NVMe.
> > Another use case is to ensure that DIO will be used with 16k IOs on
> > existing market 16k IU drives with an LBA format of 4k or 512 bytes.
>
> The minimum_io_size clearly is the minimum I/O size, not the minimal
> nice to have one.
I may have misread the below documentation then, because it seems to
suggest this is a performance parameter, not a real minimum. Do we need
to update it?
What: /sys/block/<disk>/queue/minimum_io_size
Date: April 2009
Contact: Martin K. Petersen <martin.petersen@oracle.com>
Description:
[RO] Storage devices may report a granularity or preferred
minimum I/O size which is the smallest request the device can
perform without incurring a performance penalty. For disk
drives this is often the physical block size. For RAID arrays
it is often the stripe chunk size. A properly aligned multiple
of minimum_io_size is the preferred request size for workloads
where a high number of I/O operations is desired.
If this is not the right place, do we need to use a new topology entry
for the IU? Today the NVMe drive uses it for the NPWG which these days
is the IU.
> Changing this will break existing setups.
My impression was that STATX_DIOALIGN was rather new, so any new
enhancements due to the difficulties in getting DIO alignment right,
this was the right place to do so.
Let me know if you have other suggestions.
Luis
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] bdev: use bdev_io_min() for statx DIO min IO
2024-06-30 3:24 ` Luis Chamberlain
@ 2024-06-30 5:54 ` Christoph Hellwig
2024-06-30 20:42 ` Luis Chamberlain
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2024-06-30 5:54 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Christoph Hellwig, martin.petersen, ebiggers, p.raghav, hare,
kbusch, david, neilb, gost.dev, linux-block, linux-mm, patches
On Sat, Jun 29, 2024 at 08:24:00PM -0700, Luis Chamberlain wrote:
> > The minimum_io_size clearly is the minimum I/O size, not the minimal
> > nice to have one.
>
> I may have misread the below documentation then, because it seems to
> suggest this is a performance parameter, not a real minimum. Do we need
> to update it?
queue_limits.min_io is corretly described and a performance hint.
The statx dio_offset_align is actual minimum I/O size and alignment and
not in any way related to the performance hint in minimum_io_size.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] bdev: use bdev_io_min() for statx DIO min IO
2024-06-30 5:54 ` Christoph Hellwig
@ 2024-06-30 20:42 ` Luis Chamberlain
2024-06-30 22:35 ` Dave Chinner
0 siblings, 1 reply; 6+ messages in thread
From: Luis Chamberlain @ 2024-06-30 20:42 UTC (permalink / raw)
To: Christoph Hellwig
Cc: martin.petersen, ebiggers, p.raghav, hare, kbusch, david, neilb,
gost.dev, linux-block, linux-mm, patches
On Sat, Jun 29, 2024 at 10:54:19PM -0700, Christoph Hellwig wrote:
> On Sat, Jun 29, 2024 at 08:24:00PM -0700, Luis Chamberlain wrote:
> > > The minimum_io_size clearly is the minimum I/O size, not the minimal
> > > nice to have one.
> >
> > I may have misread the below documentation then, because it seems to
> > suggest this is a performance parameter, not a real minimum. Do we need
> > to update it?
>
> queue_limits.min_io is corretly described and a performance hint.
OK, great!
> The statx dio_offset_align is actual minimum I/O size and alignment and
> not in any way related to the performance hint in minimum_io_size.
Oh, darn, I just read again 825cf206ed510 ("statx: add direct I/O
alignment information") and the block layer change through commit
2d985f8c6b91b ("vfs: support STATX_DIOALIGN on block devices") and
no where do I see any mention of it being a min. Should we clarify
that?
And should we add a respective value for performance? I suspect
userspace will want to work with optimal values, not ones which could
for instance incur read-modify-write. Altough we have BLKIOMIN to get
the optimal performance min IO and BLKIOOPT to get the optimal size it
is not terribly clear to me that users know they should prefer to align
to BLKIOMIN and use that for an DIO size for writes when possible.
Luis
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] bdev: use bdev_io_min() for statx DIO min IO
2024-06-30 20:42 ` Luis Chamberlain
@ 2024-06-30 22:35 ` Dave Chinner
0 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2024-06-30 22:35 UTC (permalink / raw)
To: Luis Chamberlain
Cc: Christoph Hellwig, martin.petersen, ebiggers, p.raghav, hare,
kbusch, neilb, gost.dev, linux-block, linux-mm, patches
On Sun, Jun 30, 2024 at 01:42:45PM -0700, Luis Chamberlain wrote:
> On Sat, Jun 29, 2024 at 10:54:19PM -0700, Christoph Hellwig wrote:
> > On Sat, Jun 29, 2024 at 08:24:00PM -0700, Luis Chamberlain wrote:
> > > > The minimum_io_size clearly is the minimum I/O size, not the minimal
> > > > nice to have one.
> > >
> > > I may have misread the below documentation then, because it seems to
> > > suggest this is a performance parameter, not a real minimum. Do we need
> > > to update it?
> >
> > queue_limits.min_io is corretly described and a performance hint.
>
> OK, great!
>
> > The statx dio_offset_align is actual minimum I/O size and alignment and
> > not in any way related to the performance hint in minimum_io_size.
>
> Oh, darn, I just read again 825cf206ed510 ("statx: add direct I/O
> alignment information") and the block layer change through commit
> 2d985f8c6b91b ("vfs: support STATX_DIOALIGN on block devices") and
> no where do I see any mention of it being a min. Should we clarify
> that?
dio_offset_align is an _alignment_ parameter, not a "size"
parameter. In no way does it define either the absolute or "best
performance" minimum IO size - it just defines the minimum valid
alignment for the file offset that the filesystem/device supports.
It is implied that an IO of dio_offset_align bytes in length will be
supported, because that is the minimum length IO that meets the
offset alignment requirements defined by dio_offset_align.
> And should we add a respective value for performance?
We could, but we already have:
__u32 stx_blksize; /* Block size for filesystem I/O */
which is defined as:
stx_blksize
The "preferred" block size for efficient filesystem
I/O. (Writing to a file in smaller chunks may cause
an inefficient read-modify-rewrite.)
> I suspect
> userspace will want to work with optimal values, not ones which
> could for instance incur read-modify-write.
Yup, that's the very definition of what stx_blksize should contain.
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-06-30 22:35 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-28 21:23 [RFC] bdev: use bdev_io_min() for statx DIO min IO Luis Chamberlain
2024-06-29 6:25 ` Christoph Hellwig
2024-06-30 3:24 ` Luis Chamberlain
2024-06-30 5:54 ` Christoph Hellwig
2024-06-30 20:42 ` Luis Chamberlain
2024-06-30 22:35 ` Dave Chinner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).