* [PATCH] dm-integrity: align the outgoing bio in integrity_recheck
@ 2024-03-21 16:48 Mikulas Patocka
2024-03-22 1:41 ` Ming Lei
0 siblings, 1 reply; 7+ messages in thread
From: Mikulas Patocka @ 2024-03-21 16:48 UTC (permalink / raw)
To: Mike Snitzer, Ming Lei, Benjamin Marzinski; +Cc: dm-devel
It may be possible to set up dm-integrity with smaller sector size than
the logical sector size of the underlying device. In this situation,
dm-integrity guarantees that the outgoing bios have the same alignment as
incoming bios (so, if you create a filesystem with 4k block size,
dm-integrity would send 4k-aligned bios to the underlying device).
This guarantee was broken when integrity_recheck was implemented.
integrity_recheck sends bio that is aligned to ic->sectors_per_block. So
if we set up integrity with 512-byte sector size on a device with logical
block size 4k, we would be sending unaligned bio. This triggered a bug in
one of our internal tests.
This commit fixes it - it determines what's the actual alignment of the
incoming bio and then makes sure that the outgoing bio in
integrity_recheck has the same alignment.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure")
Cc: stable@vger.kernel.org
---
drivers/md/dm-integrity.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
Index: linux-2.6/drivers/md/dm-integrity.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-integrity.c 2024-03-21 14:25:45.000000000 +0100
+++ linux-2.6/drivers/md/dm-integrity.c 2024-03-21 17:47:39.000000000 +0100
@@ -1699,7 +1699,6 @@ static noinline void integrity_recheck(s
struct bio_vec bv;
sector_t sector, logical_sector, area, offset;
struct page *page;
- void *buffer;
get_area_and_offset(ic, dio->range.logical_sector, &area, &offset);
dio->metadata_block = get_metadata_sector_and_offset(ic, area, offset,
@@ -1708,13 +1707,14 @@ static noinline void integrity_recheck(s
logical_sector = dio->range.logical_sector;
page = mempool_alloc(&ic->recheck_pool, GFP_NOIO);
- buffer = page_to_virt(page);
__bio_for_each_segment(bv, bio, iter, dio->bio_details.bi_iter) {
unsigned pos = 0;
do {
+ sector_t alignment;
char *mem;
+ char *buffer = page_to_virt(page);
int r;
struct dm_io_request io_req;
struct dm_io_region io_loc;
@@ -1727,6 +1727,14 @@ static noinline void integrity_recheck(s
io_loc.sector = sector;
io_loc.count = ic->sectors_per_block;
+ /* Align the bio to logical block size */
+ alignment = dio->range.logical_sector | bio_sectors(bio) | (PAGE_SIZE >> SECTOR_SHIFT);
+ alignment &= -alignment;
+ io_loc.sector = round_down(io_loc.sector, alignment);
+ io_loc.count += sector - io_loc.sector;
+ buffer += (sector - io_loc.sector) << SECTOR_SHIFT;
+ io_loc.count = round_up(io_loc.count, alignment);
+
r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
dio->bi_status = errno_to_blk_status(r);
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] dm-integrity: align the outgoing bio in integrity_recheck 2024-03-21 16:48 [PATCH] dm-integrity: align the outgoing bio in integrity_recheck Mikulas Patocka @ 2024-03-22 1:41 ` Ming Lei 2024-03-22 10:30 ` Mikulas Patocka 0 siblings, 1 reply; 7+ messages in thread From: Ming Lei @ 2024-03-22 1:41 UTC (permalink / raw) To: Mikulas Patocka; +Cc: Mike Snitzer, Benjamin Marzinski, dm-devel, ming.lei On Thu, Mar 21, 2024 at 05:48:45PM +0100, Mikulas Patocka wrote: > It may be possible to set up dm-integrity with smaller sector size than > the logical sector size of the underlying device. In this situation, > dm-integrity guarantees that the outgoing bios have the same alignment as > incoming bios (so, if you create a filesystem with 4k block size, > dm-integrity would send 4k-aligned bios to the underlying device). > > This guarantee was broken when integrity_recheck was implemented. > integrity_recheck sends bio that is aligned to ic->sectors_per_block. So > if we set up integrity with 512-byte sector size on a device with logical > block size 4k, we would be sending unaligned bio. This triggered a bug in > one of our internal tests. > > This commit fixes it - it determines what's the actual alignment of the > incoming bio and then makes sure that the outgoing bio in > integrity_recheck has the same alignment. > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > Fixes: c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure") > Cc: stable@vger.kernel.org > > --- > drivers/md/dm-integrity.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > Index: linux-2.6/drivers/md/dm-integrity.c > =================================================================== > --- linux-2.6.orig/drivers/md/dm-integrity.c 2024-03-21 14:25:45.000000000 +0100 > +++ linux-2.6/drivers/md/dm-integrity.c 2024-03-21 17:47:39.000000000 +0100 > @@ -1699,7 +1699,6 @@ static noinline void integrity_recheck(s > struct bio_vec bv; > sector_t sector, logical_sector, area, offset; > struct page *page; > - void *buffer; > > get_area_and_offset(ic, dio->range.logical_sector, &area, &offset); > dio->metadata_block = get_metadata_sector_and_offset(ic, area, offset, > @@ -1708,13 +1707,14 @@ static noinline void integrity_recheck(s > logical_sector = dio->range.logical_sector; > > page = mempool_alloc(&ic->recheck_pool, GFP_NOIO); > - buffer = page_to_virt(page); > > __bio_for_each_segment(bv, bio, iter, dio->bio_details.bi_iter) { > unsigned pos = 0; > > do { > + sector_t alignment; > char *mem; > + char *buffer = page_to_virt(page); > int r; > struct dm_io_request io_req; > struct dm_io_region io_loc; > @@ -1727,6 +1727,14 @@ static noinline void integrity_recheck(s > io_loc.sector = sector; > io_loc.count = ic->sectors_per_block; > > + /* Align the bio to logical block size */ > + alignment = dio->range.logical_sector | bio_sectors(bio) | (PAGE_SIZE >> SECTOR_SHIFT); > + alignment &= -alignment; The above is less readable, :-( > + io_loc.sector = round_down(io_loc.sector, alignment); > + io_loc.count += sector - io_loc.sector; > + buffer += (sector - io_loc.sector) << SECTOR_SHIFT; > + io_loc.count = round_up(io_loc.count, alignment); I feel the above code isn't very reliable, what we need actually is to make sure that io's sector & size is aligned with dm's bdev_logical_block_size(bdev). Yeah, so far the max logical block size is 4k, but it may be increased in future and you can see the recent lsfmm proposal, so can we force it to be aligned with bdev_logical_block_size(bdev) here? Also can the above change work efficiently in case of 64K PAGE_SIZE? Thanks, Ming ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dm-integrity: align the outgoing bio in integrity_recheck 2024-03-22 1:41 ` Ming Lei @ 2024-03-22 10:30 ` Mikulas Patocka 2024-03-22 12:03 ` Ming Lei 0 siblings, 1 reply; 7+ messages in thread From: Mikulas Patocka @ 2024-03-22 10:30 UTC (permalink / raw) To: Ming Lei; +Cc: Mike Snitzer, Benjamin Marzinski, dm-devel On Fri, 22 Mar 2024, Ming Lei wrote: > On Thu, Mar 21, 2024 at 05:48:45PM +0100, Mikulas Patocka wrote: > > It may be possible to set up dm-integrity with smaller sector size than > > the logical sector size of the underlying device. In this situation, > > dm-integrity guarantees that the outgoing bios have the same alignment as > > incoming bios (so, if you create a filesystem with 4k block size, > > dm-integrity would send 4k-aligned bios to the underlying device). > > > > This guarantee was broken when integrity_recheck was implemented. > > integrity_recheck sends bio that is aligned to ic->sectors_per_block. So > > if we set up integrity with 512-byte sector size on a device with logical > > block size 4k, we would be sending unaligned bio. This triggered a bug in > > one of our internal tests. > > > > This commit fixes it - it determines what's the actual alignment of the > > incoming bio and then makes sure that the outgoing bio in > > integrity_recheck has the same alignment. > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > Fixes: c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure") > > Cc: stable@vger.kernel.org > > > > --- > > drivers/md/dm-integrity.c | 12 ++++++++++-- > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > Index: linux-2.6/drivers/md/dm-integrity.c > > =================================================================== > > --- linux-2.6.orig/drivers/md/dm-integrity.c 2024-03-21 14:25:45.000000000 +0100 > > +++ linux-2.6/drivers/md/dm-integrity.c 2024-03-21 17:47:39.000000000 +0100 > > @@ -1699,7 +1699,6 @@ static noinline void integrity_recheck(s > > struct bio_vec bv; > > sector_t sector, logical_sector, area, offset; > > struct page *page; > > - void *buffer; > > > > get_area_and_offset(ic, dio->range.logical_sector, &area, &offset); > > dio->metadata_block = get_metadata_sector_and_offset(ic, area, offset, > > @@ -1708,13 +1707,14 @@ static noinline void integrity_recheck(s > > logical_sector = dio->range.logical_sector; > > > > page = mempool_alloc(&ic->recheck_pool, GFP_NOIO); > > - buffer = page_to_virt(page); > > > > __bio_for_each_segment(bv, bio, iter, dio->bio_details.bi_iter) { > > unsigned pos = 0; > > > > do { > > + sector_t alignment; > > char *mem; > > + char *buffer = page_to_virt(page); > > int r; > > struct dm_io_request io_req; > > struct dm_io_region io_loc; > > @@ -1727,6 +1727,14 @@ static noinline void integrity_recheck(s > > io_loc.sector = sector; > > io_loc.count = ic->sectors_per_block; > > > > + /* Align the bio to logical block size */ > > + alignment = dio->range.logical_sector | bio_sectors(bio) | (PAGE_SIZE >> SECTOR_SHIFT); > > + alignment &= -alignment; > > The above is less readable, :-( It isolates the lowest bit from dio->range.logical_sector, bio_sectors(bio) and (PAGE_SIZE >> SECTOR_SHIFT). See for example this https://www.felixcloutier.com/x86/blsi > > + io_loc.sector = round_down(io_loc.sector, alignment); > > + io_loc.count += sector - io_loc.sector; > > + buffer += (sector - io_loc.sector) << SECTOR_SHIFT; > > + io_loc.count = round_up(io_loc.count, alignment); > > I feel the above code isn't very reliable, what we need actually is to > make sure that io's sector & size is aligned with dm's > bdev_logical_block_size(bdev). I thought about using bdev_logical_block_size. But it may be wrong if the device stack is reconfigured. So, I concluded that taking the alignment from the bio would be better. > Yeah, so far the max logical block size is 4k, but it may be increased > in future and you can see the recent lsfmm proposal, so can we force it to be > aligned with bdev_logical_block_size(bdev) here? > > Also can the above change work efficiently in case of 64K PAGE_SIZE? It doesn't work efficiently at all - this piece of code is only run in a pathological case where the user writes into a buffer while reading it (or when he reads multiple blocks into the same buffer), so I optimized it for size, not for performance. But yes, it works with 64K PAGE_SIZE. > Thanks, > Ming Mikulas ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dm-integrity: align the outgoing bio in integrity_recheck 2024-03-22 10:30 ` Mikulas Patocka @ 2024-03-22 12:03 ` Ming Lei 2024-03-22 12:32 ` Mikulas Patocka 2024-03-22 15:53 ` Mike Snitzer 0 siblings, 2 replies; 7+ messages in thread From: Ming Lei @ 2024-03-22 12:03 UTC (permalink / raw) To: Mikulas Patocka; +Cc: Mike Snitzer, Benjamin Marzinski, dm-devel, ming.lei On Fri, Mar 22, 2024 at 11:30:33AM +0100, Mikulas Patocka wrote: > > > On Fri, 22 Mar 2024, Ming Lei wrote: > > > On Thu, Mar 21, 2024 at 05:48:45PM +0100, Mikulas Patocka wrote: > > > It may be possible to set up dm-integrity with smaller sector size than > > > the logical sector size of the underlying device. In this situation, > > > dm-integrity guarantees that the outgoing bios have the same alignment as > > > incoming bios (so, if you create a filesystem with 4k block size, > > > dm-integrity would send 4k-aligned bios to the underlying device). > > > > > > This guarantee was broken when integrity_recheck was implemented. > > > integrity_recheck sends bio that is aligned to ic->sectors_per_block. So > > > if we set up integrity with 512-byte sector size on a device with logical > > > block size 4k, we would be sending unaligned bio. This triggered a bug in > > > one of our internal tests. > > > > > > This commit fixes it - it determines what's the actual alignment of the > > > incoming bio and then makes sure that the outgoing bio in > > > integrity_recheck has the same alignment. > > > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > > Fixes: c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure") > > > Cc: stable@vger.kernel.org > > > > > > --- > > > drivers/md/dm-integrity.c | 12 ++++++++++-- > > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > > > Index: linux-2.6/drivers/md/dm-integrity.c > > > =================================================================== > > > --- linux-2.6.orig/drivers/md/dm-integrity.c 2024-03-21 14:25:45.000000000 +0100 > > > +++ linux-2.6/drivers/md/dm-integrity.c 2024-03-21 17:47:39.000000000 +0100 > > > @@ -1699,7 +1699,6 @@ static noinline void integrity_recheck(s > > > struct bio_vec bv; > > > sector_t sector, logical_sector, area, offset; > > > struct page *page; > > > - void *buffer; > > > > > > get_area_and_offset(ic, dio->range.logical_sector, &area, &offset); > > > dio->metadata_block = get_metadata_sector_and_offset(ic, area, offset, > > > @@ -1708,13 +1707,14 @@ static noinline void integrity_recheck(s > > > logical_sector = dio->range.logical_sector; > > > > > > page = mempool_alloc(&ic->recheck_pool, GFP_NOIO); > > > - buffer = page_to_virt(page); > > > > > > __bio_for_each_segment(bv, bio, iter, dio->bio_details.bi_iter) { > > > unsigned pos = 0; > > > > > > do { > > > + sector_t alignment; > > > char *mem; > > > + char *buffer = page_to_virt(page); > > > int r; > > > struct dm_io_request io_req; > > > struct dm_io_region io_loc; > > > @@ -1727,6 +1727,14 @@ static noinline void integrity_recheck(s > > > io_loc.sector = sector; > > > io_loc.count = ic->sectors_per_block; > > > > > > + /* Align the bio to logical block size */ > > > + alignment = dio->range.logical_sector | bio_sectors(bio) | (PAGE_SIZE >> SECTOR_SHIFT); > > > + alignment &= -alignment; > > > > The above is less readable, :-( > > It isolates the lowest bit from dio->range.logical_sector, > bio_sectors(bio) and (PAGE_SIZE >> SECTOR_SHIFT). > > See for example this https://www.felixcloutier.com/x86/blsi Fine, but I have to say such usage isn't popular. > > > > + io_loc.sector = round_down(io_loc.sector, alignment); > > > + io_loc.count += sector - io_loc.sector; > > > + buffer += (sector - io_loc.sector) << SECTOR_SHIFT; > > > + io_loc.count = round_up(io_loc.count, alignment); > > > > I feel the above code isn't very reliable, what we need actually is to > > make sure that io's sector & size is aligned with dm's > > bdev_logical_block_size(bdev). > > I thought about using bdev_logical_block_size. But it may be wrong if the > device stack is reconfigured. So, I concluded that taking the alignment > from the bio would be better. If logical block becomes mismatched by reconfiguration, the whole DM stack can't work: - at the beginning, DM is over NVMe(512 bs), DM & NVMe lbs is 512 - later, nvme is reconfigured and its lbs becomes 4k, but DM's lbs can't be updated - then unaligned IO is submitted to NVMe So DM _never_ works with mis-matched logical block size because of reconfigure, and same with MD. > > > Yeah, so far the max logical block size is 4k, but it may be increased > > in future and you can see the recent lsfmm proposal, so can we force it to be > > aligned with bdev_logical_block_size(bdev) here? > > > > Also can the above change work efficiently in case of 64K PAGE_SIZE? > > It doesn't work efficiently at all - this piece of code is only run in a > pathological case where the user writes into a buffer while reading it (or > when he reads multiple blocks into the same buffer), so I optimized it for > size, not for performance. > > But yes, it works with 64K PAGE_SIZE. Fine, but I still think PAGE_SIZE is hard to follow than logical block size. Thanks, Ming ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dm-integrity: align the outgoing bio in integrity_recheck 2024-03-22 12:03 ` Ming Lei @ 2024-03-22 12:32 ` Mikulas Patocka 2024-03-22 12:58 ` Ming Lei 2024-03-22 15:53 ` Mike Snitzer 1 sibling, 1 reply; 7+ messages in thread From: Mikulas Patocka @ 2024-03-22 12:32 UTC (permalink / raw) To: Ming Lei; +Cc: Mike Snitzer, Benjamin Marzinski, dm-devel On Fri, 22 Mar 2024, Ming Lei wrote: > If logical block becomes mismatched by reconfiguration, the whole DM stack can't work: > > - at the beginning, DM is over NVMe(512 bs), DM & NVMe lbs is 512 > - later, nvme is reconfigured and its lbs becomes 4k, but DM's lbs can't > be updated > - then unaligned IO is submitted to NVMe > > So DM _never_ works with mis-matched logical block size because of > reconfigure, and same with MD. It can work. If a filesystem has 4k block size and it is used on a block device with misconfigured with 512-byte sector size, it will work. The buffer cache also prefers using page-sized blocks rather than logical block size, so it will work too. > > But yes, it works with 64K PAGE_SIZE. > > Fine, but I still think PAGE_SIZE is hard to follow than logical block > size. > > Thanks, > Ming The temporary buffer has one page, so I restricted the alignment to PAGE_SIZE. logical block size is always <= PAGE_SIZE. Mikulas ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dm-integrity: align the outgoing bio in integrity_recheck 2024-03-22 12:32 ` Mikulas Patocka @ 2024-03-22 12:58 ` Ming Lei 0 siblings, 0 replies; 7+ messages in thread From: Ming Lei @ 2024-03-22 12:58 UTC (permalink / raw) To: Mikulas Patocka; +Cc: Mike Snitzer, Benjamin Marzinski, dm-devel On Fri, Mar 22, 2024 at 01:32:30PM +0100, Mikulas Patocka wrote: > > > On Fri, 22 Mar 2024, Ming Lei wrote: > > > If logical block becomes mismatched by reconfiguration, the whole DM stack can't work: > > > > - at the beginning, DM is over NVMe(512 bs), DM & NVMe lbs is 512 > > - later, nvme is reconfigured and its lbs becomes 4k, but DM's lbs can't > > be updated > > - then unaligned IO is submitted to NVMe > > > > So DM _never_ works with mis-matched logical block size because of > > reconfigure, and same with MD. > > It can work. If a filesystem has 4k block size and it is used on a block Here 'work' means it supports any allowed block size, instead of 4KB only or if balabala... The DM device advertises its lbs as 512, application & FS should be free to read/write 512 aligned data, right? Thanks, Ming ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: dm-integrity: align the outgoing bio in integrity_recheck 2024-03-22 12:03 ` Ming Lei 2024-03-22 12:32 ` Mikulas Patocka @ 2024-03-22 15:53 ` Mike Snitzer 1 sibling, 0 replies; 7+ messages in thread From: Mike Snitzer @ 2024-03-22 15:53 UTC (permalink / raw) To: Ming Lei; +Cc: Mikulas Patocka, Benjamin Marzinski, dm-devel On Fri, Mar 22 2024 at 8:03P -0400, Ming Lei <ming.lei@redhat.com> wrote: > On Fri, Mar 22, 2024 at 11:30:33AM +0100, Mikulas Patocka wrote: > > > > > > On Fri, 22 Mar 2024, Ming Lei wrote: > > > > > On Thu, Mar 21, 2024 at 05:48:45PM +0100, Mikulas Patocka wrote: > > > > It may be possible to set up dm-integrity with smaller sector size than > > > > the logical sector size of the underlying device. In this situation, > > > > dm-integrity guarantees that the outgoing bios have the same alignment as > > > > incoming bios (so, if you create a filesystem with 4k block size, > > > > dm-integrity would send 4k-aligned bios to the underlying device). > > > > > > > > This guarantee was broken when integrity_recheck was implemented. > > > > integrity_recheck sends bio that is aligned to ic->sectors_per_block. So > > > > if we set up integrity with 512-byte sector size on a device with logical > > > > block size 4k, we would be sending unaligned bio. This triggered a bug in > > > > one of our internal tests. > > > > > > > > This commit fixes it - it determines what's the actual alignment of the > > > > incoming bio and then makes sure that the outgoing bio in > > > > integrity_recheck has the same alignment. > > > > > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > > > Fixes: c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure") > > > > Cc: stable@vger.kernel.org > > > > > > > > --- > > > > drivers/md/dm-integrity.c | 12 ++++++++++-- > > > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > > > > > Index: linux-2.6/drivers/md/dm-integrity.c > > > > =================================================================== > > > > --- linux-2.6.orig/drivers/md/dm-integrity.c 2024-03-21 14:25:45.000000000 +0100 > > > > +++ linux-2.6/drivers/md/dm-integrity.c 2024-03-21 17:47:39.000000000 +0100 > > > > @@ -1699,7 +1699,6 @@ static noinline void integrity_recheck(s > > > > struct bio_vec bv; > > > > sector_t sector, logical_sector, area, offset; > > > > struct page *page; > > > > - void *buffer; > > > > > > > > get_area_and_offset(ic, dio->range.logical_sector, &area, &offset); > > > > dio->metadata_block = get_metadata_sector_and_offset(ic, area, offset, > > > > @@ -1708,13 +1707,14 @@ static noinline void integrity_recheck(s > > > > logical_sector = dio->range.logical_sector; > > > > > > > > page = mempool_alloc(&ic->recheck_pool, GFP_NOIO); > > > > - buffer = page_to_virt(page); > > > > > > > > __bio_for_each_segment(bv, bio, iter, dio->bio_details.bi_iter) { > > > > unsigned pos = 0; > > > > > > > > do { > > > > + sector_t alignment; > > > > char *mem; > > > > + char *buffer = page_to_virt(page); > > > > int r; > > > > struct dm_io_request io_req; > > > > struct dm_io_region io_loc; > > > > @@ -1727,6 +1727,14 @@ static noinline void integrity_recheck(s > > > > io_loc.sector = sector; > > > > io_loc.count = ic->sectors_per_block; > > > > > > > > + /* Align the bio to logical block size */ > > > > + alignment = dio->range.logical_sector | bio_sectors(bio) | (PAGE_SIZE >> SECTOR_SHIFT); > > > > + alignment &= -alignment; > > > > > > The above is less readable, :-( > > > > It isolates the lowest bit from dio->range.logical_sector, > > bio_sectors(bio) and (PAGE_SIZE >> SECTOR_SHIFT). > > > > See for example this https://www.felixcloutier.com/x86/blsi > > Fine, but I have to say such usage isn't popular. Yeah, at a minimum it should have a comment explaining the optimization of combining and then getting lsbit. The non-ffs() optimized gcd() uses the same but comments it: /* Isolate lsbit of r */ r &= -r; > > > > + io_loc.sector = round_down(io_loc.sector, alignment); > > > > + io_loc.count += sector - io_loc.sector; > > > > + buffer += (sector - io_loc.sector) << SECTOR_SHIFT; > > > > + io_loc.count = round_up(io_loc.count, alignment); > > > > > > I feel the above code isn't very reliable, what we need actually is to > > > make sure that io's sector & size is aligned with dm's > > > bdev_logical_block_size(bdev). > > > > I thought about using bdev_logical_block_size. But it may be wrong if the > > device stack is reconfigured. So, I concluded that taking the alignment > > from the bio would be better. > > If logical block becomes mismatched by reconfiguration, the whole DM stack can't work: > > - at the beginning, DM is over NVMe(512 bs), DM & NVMe lbs is 512 > - later, nvme is reconfigured and its lbs becomes 4k, but DM's lbs can't > be updated > - then unaligned IO is submitted to NVMe > > So DM _never_ works with mis-matched logical block size because of > reconfigure, and same with MD. At some point we need to trust the queue_limits and DM takes considerable pain to validate the alignment when a dm-table is (re)loaded. But we could get into problems with deep(er) device stacks where an underlying DM device is reloaded but the upper level devices' queue_limits aren't restacked. Thankfully, in practice that generally doesn't occur! If it were to become a prevalent issue DM would need to grow validation that DM devices aren't changing their logic_block_size and overall alignment during runtime. > > > Yeah, so far the max logical block size is 4k, but it may be increased > > > in future and you can see the recent lsfmm proposal, so can we force it to be > > > aligned with bdev_logical_block_size(bdev) here? > > > > > > Also can the above change work efficiently in case of 64K PAGE_SIZE? > > > > It doesn't work efficiently at all - this piece of code is only run in a > > pathological case where the user writes into a buffer while reading it (or > > when he reads multiple blocks into the same buffer), so I optimized it for > > size, not for performance. > > > > But yes, it works with 64K PAGE_SIZE. > > Fine, but I still think PAGE_SIZE is hard to follow than logical block > size. Thanks for your review. I shared many of your review concerns (the math isn't apporachable, and why not just use logical_block_size in queue_limits?). That said, I'm OK with the code as-is because it has been tested to fix the reported misalignment issue. BUT, I would like to see follow-on cleanup in a separate commit, at a minimum there should be some helpful comments (to address the math and assumptions made, e.g. this recheck code is not fast path). Mike ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-03-22 15:53 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-03-21 16:48 [PATCH] dm-integrity: align the outgoing bio in integrity_recheck Mikulas Patocka 2024-03-22 1:41 ` Ming Lei 2024-03-22 10:30 ` Mikulas Patocka 2024-03-22 12:03 ` Ming Lei 2024-03-22 12:32 ` Mikulas Patocka 2024-03-22 12:58 ` Ming Lei 2024-03-22 15:53 ` Mike Snitzer
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.