* [PATCH] dm-table:fix zone block_device not aligned with zonesize
@ 2024-07-04 15:15 Li Dong
2024-07-04 15:29 ` Mikulas Patocka
0 siblings, 1 reply; 6+ messages in thread
From: Li Dong @ 2024-07-04 15:15 UTC (permalink / raw)
To: Alasdair Kergon, Mike Snitzer, Mikulas Patocka,
open list:DEVICE-MAPPER (LVM), open list
Cc: opensource.kernel, lidong
For zone block devices, device_area_is_invalid may return an incorrect
value.
Failure log:
[ 19.337657]: device-mapper: table: 254:56: len=836960256 not aligned to
h/w zone size 3244032 of sde
[ 19.337665]: device-mapper: core: Cannot calculate initial queue limits
[ 19.337667]: device-mapper: ioctl: unable to set up device queue for
new table.
Actually, the device's zone length is aligned to the zonesize.
Fixes: 5dea271b6d87 ("dm table: pass correct dev area size to device_area_is_valid")
Signed-off-by: Li Dong <lidong@vivo.com>
---
drivers/md/dm-table.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 33b7a1844ed4..0bddae0bee3c 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -257,7 +257,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev,
if (bdev_is_zoned(bdev)) {
unsigned int zone_sectors = bdev_zone_sectors(bdev);
- if (start & (zone_sectors - 1)) {
+ if (start % zone_sectors) {
DMERR("%s: start=%llu not aligned to h/w zone size %u of %pg",
dm_device_name(ti->table->md),
(unsigned long long)start,
@@ -274,7 +274,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev,
* devices do not end up with a smaller zone in the middle of
* the sector range.
*/
- if (len & (zone_sectors - 1)) {
+ if (len % zone_sectors) {
DMERR("%s: len=%llu not aligned to h/w zone size %u of %pg",
dm_device_name(ti->table->md),
(unsigned long long)len,
--
2.31.1.windows.1
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH] dm-table:fix zone block_device not aligned with zonesize 2024-07-04 15:15 [PATCH] dm-table:fix zone block_device not aligned with zonesize Li Dong @ 2024-07-04 15:29 ` Mikulas Patocka 2024-07-04 19:00 ` Non-power-of-2 zone size (was: [PATCH] dm-table:fix zone block_device not aligned with zonesize) Mikulas Patocka 2024-07-05 1:08 ` [PATCH] dm-table:fix zone block_device not aligned with zonesize Damien Le Moal 0 siblings, 2 replies; 6+ messages in thread From: Mikulas Patocka @ 2024-07-04 15:29 UTC (permalink / raw) To: Li Dong Cc: Alasdair Kergon, Mike Snitzer, open list:DEVICE-MAPPER (LVM), open list, opensource.kernel On Thu, 4 Jul 2024, Li Dong wrote: > For zone block devices, device_area_is_invalid may return an incorrect > value. > > Failure log: > [ 19.337657]: device-mapper: table: 254:56: len=836960256 not aligned to > h/w zone size 3244032 of sde > [ 19.337665]: device-mapper: core: Cannot calculate initial queue limits > [ 19.337667]: device-mapper: ioctl: unable to set up device queue for > new table. > > Actually, the device's zone length is aligned to the zonesize. > > Fixes: 5dea271b6d87 ("dm table: pass correct dev area size to device_area_is_valid") > Signed-off-by: Li Dong <lidong@vivo.com> > --- > drivers/md/dm-table.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c > index 33b7a1844ed4..0bddae0bee3c 100644 > --- a/drivers/md/dm-table.c > +++ b/drivers/md/dm-table.c > @@ -257,7 +257,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev, > if (bdev_is_zoned(bdev)) { > unsigned int zone_sectors = bdev_zone_sectors(bdev); > > - if (start & (zone_sectors - 1)) { > + if (start % zone_sectors) { > DMERR("%s: start=%llu not aligned to h/w zone size %u of %pg", > dm_device_name(ti->table->md), > (unsigned long long)start, > @@ -274,7 +274,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev, > * devices do not end up with a smaller zone in the middle of > * the sector range. > */ > - if (len & (zone_sectors - 1)) { > + if (len % zone_sectors) { > DMERR("%s: len=%llu not aligned to h/w zone size %u of %pg", > dm_device_name(ti->table->md), > (unsigned long long)len, > -- > 2.31.1.windows.1 Hi This probably won't compile on 32-bit architectures because the operators for 64-bit divide and modulo don't work there. Please, rework the patch using dm_sector_div64. Mikulas ^ permalink raw reply [flat|nested] 6+ messages in thread
* Non-power-of-2 zone size (was: [PATCH] dm-table:fix zone block_device not aligned with zonesize) 2024-07-04 15:29 ` Mikulas Patocka @ 2024-07-04 19:00 ` Mikulas Patocka 2024-07-04 21:43 ` Pankaj Raghav (Samsung) 2024-07-05 0:13 ` Non-power-of-2 zone size Damien Le Moal 2024-07-05 1:08 ` [PATCH] dm-table:fix zone block_device not aligned with zonesize Damien Le Moal 1 sibling, 2 replies; 6+ messages in thread From: Mikulas Patocka @ 2024-07-04 19:00 UTC (permalink / raw) To: Li Dong, Damien Le Moal, Naohiro Aota, Johannes Thumshirn, Jens Axboe Cc: Alasdair Kergon, Mike Snitzer, open list:DEVICE-MAPPER (LVM), open list, opensource.kernel, linux-fsdevel, linux-block > On Thu, 4 Jul 2024, Li Dong wrote: > > > For zone block devices, device_area_is_invalid may return an incorrect > > value. > > > > Failure log: > > [ 19.337657]: device-mapper: table: 254:56: len=836960256 not aligned to > > h/w zone size 3244032 of sde > > [ 19.337665]: device-mapper: core: Cannot calculate initial queue limits > > [ 19.337667]: device-mapper: ioctl: unable to set up device queue for > > new table. > > > > Actually, the device's zone length is aligned to the zonesize. > > > > Fixes: 5dea271b6d87 ("dm table: pass correct dev area size to device_area_is_valid") > > Signed-off-by: Li Dong <lidong@vivo.com> > > --- > > drivers/md/dm-table.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c > > index 33b7a1844ed4..0bddae0bee3c 100644 > > --- a/drivers/md/dm-table.c > > +++ b/drivers/md/dm-table.c > > @@ -257,7 +257,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev, > > if (bdev_is_zoned(bdev)) { > > unsigned int zone_sectors = bdev_zone_sectors(bdev); > > > > - if (start & (zone_sectors - 1)) { > > + if (start % zone_sectors) { > > DMERR("%s: start=%llu not aligned to h/w zone size %u of %pg", > > dm_device_name(ti->table->md), > > (unsigned long long)start, > > @@ -274,7 +274,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev, > > * devices do not end up with a smaller zone in the middle of > > * the sector range. > > */ > > - if (len & (zone_sectors - 1)) { > > + if (len % zone_sectors) { > > DMERR("%s: len=%llu not aligned to h/w zone size %u of %pg", > > dm_device_name(ti->table->md), > > (unsigned long long)len, > > -- > > 2.31.1.windows.1 I grepped the kernel for bdev_zone_sectors and there are more assumptions that bdev_zone_sectors is a power of 2. drivers/md/dm-zone.c: sector_t mask = bdev_zone_sectors(disk->part0) - 1 drivers/nvme/target/zns.c: if (get_capacity(bd_disk) & (bdev_zone_sectors(ns->bdev) - 1)) drivers/nvme/target/zns.c: if (sect & (bdev_zone_sectors(req->ns->bdev) - 1)) { fs/zonefs/super.c: sbi->s_zone_sectors_shift = ilog2(bdev_zone_sectors(sb->s_bdev)); fs/btrfs/zoned.c: return (sector_t)zone_number << ilog2(bdev_zone_sectors(bdev)); fs/btrfs/zoned.c: zone_info->zone_size_shift = ilog2(zone_info->zone_size); include/linux/blkdev.h: return sector & (bdev_zone_sectors(bdev) - 1); fs/f2fs/super.c: if (nr_sectors & (zone_sectors - 1)) So, if we want to support non-power-of-2 zone size, we need some systematic fix. Now it appears that Linux doesn't even attempt to support disks non-power-of-2 zone size. I added Damien Le Moal so that he can help with testing disks with non-power-of-2 zone size (if WD is actually making them). Mikulas ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Non-power-of-2 zone size (was: [PATCH] dm-table:fix zone block_device not aligned with zonesize) 2024-07-04 19:00 ` Non-power-of-2 zone size (was: [PATCH] dm-table:fix zone block_device not aligned with zonesize) Mikulas Patocka @ 2024-07-04 21:43 ` Pankaj Raghav (Samsung) 2024-07-05 0:13 ` Non-power-of-2 zone size Damien Le Moal 1 sibling, 0 replies; 6+ messages in thread From: Pankaj Raghav (Samsung) @ 2024-07-04 21:43 UTC (permalink / raw) To: Mikulas Patocka Cc: Li Dong, Damien Le Moal, Naohiro Aota, Johannes Thumshirn, Jens Axboe, Alasdair Kergon, Mike Snitzer, open list:DEVICE-MAPPER (LVM), open list, opensource.kernel, linux-fsdevel, linux-block > I grepped the kernel for bdev_zone_sectors and there are more assumptions > that bdev_zone_sectors is a power of 2. > > drivers/md/dm-zone.c: sector_t mask = bdev_zone_sectors(disk->part0) - 1 > drivers/nvme/target/zns.c: if (get_capacity(bd_disk) & (bdev_zone_sectors(ns->bdev) - 1)) > drivers/nvme/target/zns.c: if (sect & (bdev_zone_sectors(req->ns->bdev) - 1)) { > fs/zonefs/super.c: sbi->s_zone_sectors_shift = ilog2(bdev_zone_sectors(sb->s_bdev)); > fs/btrfs/zoned.c: return (sector_t)zone_number << ilog2(bdev_zone_sectors(bdev)); > fs/btrfs/zoned.c: zone_info->zone_size_shift = ilog2(zone_info->zone_size); > include/linux/blkdev.h: return sector & (bdev_zone_sectors(bdev) - 1); > fs/f2fs/super.c: if (nr_sectors & (zone_sectors - 1)) > > So, if we want to support non-power-of-2 zone size, we need some > systematic fix. Now it appears that Linux doesn't even attempt to support > disks non-power-of-2 zone size. FYI, I sent the patches to add non-power-of-2 zone size support to the block layer [0] a long time back. Sadly, it was not finally merged upstream. Android wanted this support, so for now it is in the Android tree. -- Pankaj Raghav [0] https://lore.kernel.org/linux-block/20220923173618.6899-1-p.raghav@samsung.com/ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Non-power-of-2 zone size 2024-07-04 19:00 ` Non-power-of-2 zone size (was: [PATCH] dm-table:fix zone block_device not aligned with zonesize) Mikulas Patocka 2024-07-04 21:43 ` Pankaj Raghav (Samsung) @ 2024-07-05 0:13 ` Damien Le Moal 1 sibling, 0 replies; 6+ messages in thread From: Damien Le Moal @ 2024-07-05 0:13 UTC (permalink / raw) To: Mikulas Patocka, Li Dong, Naohiro Aota, Johannes Thumshirn, Jens Axboe Cc: Alasdair Kergon, Mike Snitzer, open list:DEVICE-MAPPER (LVM), open list, opensource.kernel, linux-fsdevel, linux-block, Pankaj Raghav (Samsung) On 7/5/24 04:00, Mikulas Patocka wrote: > > >> On Thu, 4 Jul 2024, Li Dong wrote: >> >>> For zone block devices, device_area_is_invalid may return an incorrect >>> value. >>> >>> Failure log: >>> [ 19.337657]: device-mapper: table: 254:56: len=836960256 not aligned to >>> h/w zone size 3244032 of sde >>> [ 19.337665]: device-mapper: core: Cannot calculate initial queue limits >>> [ 19.337667]: device-mapper: ioctl: unable to set up device queue for >>> new table. >>> >>> Actually, the device's zone length is aligned to the zonesize. >>> >>> Fixes: 5dea271b6d87 ("dm table: pass correct dev area size to device_area_is_valid") >>> Signed-off-by: Li Dong <lidong@vivo.com> >>> --- >>> drivers/md/dm-table.c | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c >>> index 33b7a1844ed4..0bddae0bee3c 100644 >>> --- a/drivers/md/dm-table.c >>> +++ b/drivers/md/dm-table.c >>> @@ -257,7 +257,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev, >>> if (bdev_is_zoned(bdev)) { >>> unsigned int zone_sectors = bdev_zone_sectors(bdev); >>> >>> - if (start & (zone_sectors - 1)) { >>> + if (start % zone_sectors) { >>> DMERR("%s: start=%llu not aligned to h/w zone size %u of %pg", >>> dm_device_name(ti->table->md), >>> (unsigned long long)start, >>> @@ -274,7 +274,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev, >>> * devices do not end up with a smaller zone in the middle of >>> * the sector range. >>> */ >>> - if (len & (zone_sectors - 1)) { >>> + if (len % zone_sectors) { >>> DMERR("%s: len=%llu not aligned to h/w zone size %u of %pg", >>> dm_device_name(ti->table->md), >>> (unsigned long long)len, >>> -- >>> 2.31.1.windows.1 > > I grepped the kernel for bdev_zone_sectors and there are more assumptions > that bdev_zone_sectors is a power of 2. > > drivers/md/dm-zone.c: sector_t mask = bdev_zone_sectors(disk->part0) - 1 > drivers/nvme/target/zns.c: if (get_capacity(bd_disk) & (bdev_zone_sectors(ns->bdev) - 1)) > drivers/nvme/target/zns.c: if (sect & (bdev_zone_sectors(req->ns->bdev) - 1)) { > fs/zonefs/super.c: sbi->s_zone_sectors_shift = ilog2(bdev_zone_sectors(sb->s_bdev)); > fs/btrfs/zoned.c: return (sector_t)zone_number << ilog2(bdev_zone_sectors(bdev)); > fs/btrfs/zoned.c: zone_info->zone_size_shift = ilog2(zone_info->zone_size); > include/linux/blkdev.h: return sector & (bdev_zone_sectors(bdev) - 1); > fs/f2fs/super.c: if (nr_sectors & (zone_sectors - 1)) > > So, if we want to support non-power-of-2 zone size, we need some > systematic fix. Now it appears that Linux doesn't even attempt to support > disks non-power-of-2 zone size. Correct. We currently do not support zoned devices with a zone size that is not a power of 2 number of LBAs. Such drives are rejected by the block layer. So I am surprised that Li could even reach a DM error. It means that his kernel was already patched to accept zoned drives with a zone size that is not a power of 2. > I added Damien Le Moal so that he can help with testing disks with > non-power-of-2 zone size (if WD is actually making them). No, I do not have such drive. The vast majority of zoned device users today are SMR HDD users, and in that space, no one wants a non power of 2 zone size drive. As far as I know, the main push is from zoned-UFS users for Android, as Pankaj mentioned. As you rightly guessed, and as Pankaj confirmed, supporting non power of 2 zone size drives requires a lot more changes than just the fix Li sent for DM. Pankaj initial patch set that did not make it upstream (but is in Android) would need to be reworked given the extensive changes that happened since then (zone write locking replaced with zone write plugging, queue limits changes, etc). -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] dm-table:fix zone block_device not aligned with zonesize 2024-07-04 15:29 ` Mikulas Patocka 2024-07-04 19:00 ` Non-power-of-2 zone size (was: [PATCH] dm-table:fix zone block_device not aligned with zonesize) Mikulas Patocka @ 2024-07-05 1:08 ` Damien Le Moal 1 sibling, 0 replies; 6+ messages in thread From: Damien Le Moal @ 2024-07-05 1:08 UTC (permalink / raw) To: Mikulas Patocka, Li Dong Cc: Alasdair Kergon, Mike Snitzer, open list:DEVICE-MAPPER (LVM), open list, opensource.kernel On 7/5/24 00:29, Mikulas Patocka wrote: > > > On Thu, 4 Jul 2024, Li Dong wrote: > >> For zone block devices, device_area_is_invalid may return an incorrect >> value. >> >> Failure log: >> [ 19.337657]: device-mapper: table: 254:56: len=836960256 not aligned to >> h/w zone size 3244032 of sde >> [ 19.337665]: device-mapper: core: Cannot calculate initial queue limits >> [ 19.337667]: device-mapper: ioctl: unable to set up device queue for >> new table. >> >> Actually, the device's zone length is aligned to the zonesize. >> >> Fixes: 5dea271b6d87 ("dm table: pass correct dev area size to device_area_is_valid") >> Signed-off-by: Li Dong <lidong@vivo.com> >> --- >> drivers/md/dm-table.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c >> index 33b7a1844ed4..0bddae0bee3c 100644 >> --- a/drivers/md/dm-table.c >> +++ b/drivers/md/dm-table.c >> @@ -257,7 +257,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev, >> if (bdev_is_zoned(bdev)) { >> unsigned int zone_sectors = bdev_zone_sectors(bdev); >> >> - if (start & (zone_sectors - 1)) { >> + if (start % zone_sectors) { >> DMERR("%s: start=%llu not aligned to h/w zone size %u of %pg", >> dm_device_name(ti->table->md), >> (unsigned long long)start, >> @@ -274,7 +274,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev, >> * devices do not end up with a smaller zone in the middle of >> * the sector range. >> */ >> - if (len & (zone_sectors - 1)) { >> + if (len % zone_sectors) { >> DMERR("%s: len=%llu not aligned to h/w zone size %u of %pg", >> dm_device_name(ti->table->md), >> (unsigned long long)len, >> -- >> 2.31.1.windows.1 > > Hi > > This probably won't compile on 32-bit architectures because the operators > for 64-bit divide and modulo don't work there. > > Please, rework the patch using dm_sector_div64. This patch alone will not allow non power of 2 zone size drives to be supported as the block layer will reject such drive in the first place. This patch should be part of a larger series enabling such support. On its own, it is a NACK from me for this change. -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-07-05 1:08 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-07-04 15:15 [PATCH] dm-table:fix zone block_device not aligned with zonesize Li Dong 2024-07-04 15:29 ` Mikulas Patocka 2024-07-04 19:00 ` Non-power-of-2 zone size (was: [PATCH] dm-table:fix zone block_device not aligned with zonesize) Mikulas Patocka 2024-07-04 21:43 ` Pankaj Raghav (Samsung) 2024-07-05 0:13 ` Non-power-of-2 zone size Damien Le Moal 2024-07-05 1:08 ` [PATCH] dm-table:fix zone block_device not aligned with zonesize Damien Le Moal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox