* [PATCH v8 0/1] scsi: sas: fix mkfs.xfs failure due to bogus optimal_io_size
@ 2026-05-19 13:52 Ionut Nechita (Wind River)
2026-05-19 13:52 ` [PATCH v8 1/1] scsi: sas: skip opt_sectors when DMA reports no real optimization hint Ionut Nechita (Wind River)
0 siblings, 1 reply; 5+ messages in thread
From: Ionut Nechita (Wind River) @ 2026-05-19 13:52 UTC (permalink / raw)
To: James.Bottomley, martin.petersen
Cc: linux-scsi, linux-kernel, stable, hch, dlemoal, robin.murphy,
john.g.garry, axboe, m.szyprowski, ahuang12, ionut_n2001,
sunlightlinux, Ionut Nechita
From: Ionut Nechita <ionut.nechita@windriver.com>
v8 (per Christoph Hellwig's review of v7):
- Removed the dma_dev->dma_mask guard — dma_opt_mapping_size() and
dma_max_mapping_size() both return SIZE_MAX when no DMA ops are
present, so the opt >= max early-return already covers this case.
- Added inline comments explaining each conditional in the helper.
v7 (per John Garry's review of v6):
- Dropped the redundant !opt check from the first guard; the
!opt_sectors check later already handles the opt == 0 case.
Now simply: if (opt >= max) return;
- Added Reviewed-by: John Garry <john.g.garry@oracle.com>.
- Rebased onto linux-next (next-20260414).
v6 (per John Garry's review of v5):
- Replaced kerneldoc (/**) with a regular comment — function is static.
- Condensed the comment to a single paragraph.
- Removed WARN_ONCE for opt > max — not the driver's job.
- Combined the !opt and opt == max checks into: if (!opt || opt >= max).
- Apply rounddown_pow_of_two() to min(opt_sectors, max_sectors) instead
of just opt, since max_sectors can be any value.
- Restructured as sas_dma_setup_opt_sectors(struct Scsi_Host *shost)
with the dma_mask check moved inside, removing the need for a
separate dma_dev variable in sas_host_setup().
v5 (per Damien Le Moal's and James Bottomley's review of v4):
- Expanded kdoc, inline comment at opt == max, guard for opt == 0
before rounddown_pow_of_two, trimmed Cc list.
v4 (per Damien Le Moal's review of v3):
- WARN_ONCE for opt > max, min_t overflow protection, reformatted
call site.
v3 (per Christoph Hellwig's review of v2):
- Extracted the opt_sectors logic into a dedicated helper function.
- Added rounddown_pow_of_two().
v2:
- Dropped the dma_opt_mapping_size() change per Robin Murphy's
feedback.
Single patch fixing scsi_transport_sas.c.
Test environment:
- Dell PowerEdge R750
- SAS Controller: Broadcom/LSI mpt3sas (SAS3816, FW 33.15.00.00)
- Disks: SAMSUNG MZILT800HBHQ0D3 (800GB SCSI SAS SSD)
- Kernel: 6.12.0-1-amd64 with intel_iommu=off
- IOMMU: Disabled (DMAR: IOMMU disabled), default domain: Passthrough
Based on linux-next (next-20260519).
Link: https://lore.kernel.org/lkml/20260316203956.64515-1-ionut.nechita@windriver.com/ [v1]
Link: https://lore.kernel.org/all/20260318074314.17372-1-ionut.nechita@windriver.com/ [v2]
Link: https://lore.kernel.org/all/20260318200532.51232-1-ionut.nechita@windriver.com/ [v3]
Link: https://lore.kernel.org/lkml/20260319083954.21056-1-ionut.nechita@windriver.com/ [v4]
Link: https://lore.kernel.org/linux-scsi/20260320081429.42106-1-ionut.nechita@windriver.com/ [v5]
Link: https://lore.kernel.org/linux-scsi/20260326084644.27162-1-ionut.nechita@windriver.com/ [v6]
Link: https://lore.kernel.org/linux-scsi/20260415071849.25693-1-ionut.nechita@windriver.com/ [v7]
Ionut Nechita (Wind River) (1):
scsi: sas: skip opt_sectors when DMA reports no real optimization hint
drivers/scsi/scsi_transport_sas.c | 43 +++++++++++++++++++++++++++----
1 file changed, 38 insertions(+), 5 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v8 1/1] scsi: sas: skip opt_sectors when DMA reports no real optimization hint
2026-05-19 13:52 [PATCH v8 0/1] scsi: sas: fix mkfs.xfs failure due to bogus optimal_io_size Ionut Nechita (Wind River)
@ 2026-05-19 13:52 ` Ionut Nechita (Wind River)
2026-05-25 6:02 ` Christoph Hellwig
2026-06-02 1:43 ` Martin K. Petersen
0 siblings, 2 replies; 5+ messages in thread
From: Ionut Nechita (Wind River) @ 2026-05-19 13:52 UTC (permalink / raw)
To: James.Bottomley, martin.petersen
Cc: linux-scsi, linux-kernel, stable, hch, dlemoal, robin.murphy,
john.g.garry, axboe, m.szyprowski, ahuang12, ionut_n2001,
sunlightlinux, Ionut Nechita

From: Ionut Nechita <ionut.nechita@windriver.com>

sas_host_setup() unconditionally sets shost->opt_sectors from
dma_opt_mapping_size(). When the IOMMU is disabled or in passthrough
mode and no DMA ops provide an opt_mapping_size callback,
dma_opt_mapping_size() returns min(dma_max_mapping_size(), SIZE_MAX)
which equals dma_max_mapping_size() - a hard upper bound, not an
optimization hint.

On a Dell PowerEdge R750 with mpt3sas (Broadcom SAS3816, FW 33.15.00.00)
and intel_iommu=off the following values are observed:

  dma_opt_mapping_size() = dma_max_mapping_size() (no real hint)
  shost->max_sectors     = 32767
  opt_sectors            = min(32767, huge >> 9) = 32767
  optimal_io_size        = 32767 << 9 = 16776704
                         → round_down(16776704, 4096) = 16773120

The SAS disk (SAMSUNG MZILT800HBHQ0D3) does not report an Optimal
Transfer Length in VPD page B0, so sdkp->opt_xfer_blocks remains 0.
sd_revalidate_disk() then uses min_not_zero(0, opt_sectors) = opt_sectors,
propagating the bogus value into the block device's optimal_io_size
(visible as OPT-IO = 16773120 in lsblk --topology).

mkfs.xfs picks up optimal_io_size and minimum_io_size and computes:

  swidth = 16773120 / 4096 = 4095
  sunit  = 8192 / 4096 = 2

Since 4095 % 2 != 0, XFS rejects the geometry:

  SB stripe unit sanity check failed

This makes it impossible to create XFS filesystems (e.g. for
/var/lib/docker) during system bootstrap.

Fix this by introducing a sas_dma_setup_opt_sectors() helper that sets
opt_sectors only when dma_opt_mapping_size() is strictly less than
dma_max_mapping_size(), indicating a genuine DMA optimization
constraint. The helper computes min(opt_sectors, max_sectors) first,
then rounds down to a power of two so that filesystem geometry
calculations always produce clean results. When the two DMA values are
equal, no backend provided a real hint, so opt_sectors stays at 0
("no preference").

Fixes: 4cbfca5f7750 ("scsi: scsi_transport_sas: cap shost opt_sectors according to DMA optimal limit")
Cc: stable@vger.kernel.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Ionut Nechita <ionut.nechita@windriver.com>
---
Changes in v8:
- Remove dma_dev->dma_mask check - dma_opt/max_mapping_size() handle
  the no-DMA case gracefully by returning SIZE_MAX (Christoph Hellwig).
- Add inline comments explaining each conditional (Christoph Hellwig).

Changes in v7:
- Drop redundant !opt check; the !opt_sectors check below already
  handles the opt == 0 case (John Garry).
- Add Reviewed-by from John Garry.

Changes in v6:
- No kerneldoc, short inline comment, removed WARN_ONCE, combined
  checks (!opt || opt >= max), rounddown on min(opt, max_sectors),
  restructured as sas_dma_setup_opt_sectors(shost) (John Garry).

Changes in v5:
- Expanded kdoc, inline comment at opt == max, guard for opt == 0
  before rounddown_pow_of_two, trimmed Cc list (Damien/James/Sashiko).

Changes in v4:
- WARN_ONCE for opt > max, min_t overflow protection, reformatted
  call site (Damien Le Moal).

Changes in v3:
- sas_dma_opt_sectors() helper + rounddown_pow_of_two() (Christoph).

Changes in v2:
- Single patch fixing scsi_transport_sas.c, Fixes: 4cbfca5f7750.

drivers/scsi/scsi_transport_sas.c | 43 +++++++++++++++++++++++++++----
1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 13412702188e4..ebd063b51bd6b 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -27,6 +27,7 @@
 #include <linux/module.h>
 #include <linux/jiffies.h>
 #include <linux/err.h>
+#include <linux/log2.h>
 #include <linux/slab.h>
 #include <linux/string.h>
 #include <linux/blkdev.h>
@@ -222,12 +223,47 @@ static int sas_bsg_initialize(struct Scsi_Host *shost, struct sas_rphy *rphy)
  * SAS host attributes
  */
 
+/*
+ * Set shost->opt_sectors from the DMA optimal mapping size, but only
+ * when dma_opt_mapping_size() is strictly less than dma_max_mapping_size(),
+ * indicating a genuine optimization hint from an IOMMU or DMA backend.
+ * When the two are equal (e.g. IOMMU disabled / passthrough), no real
+ * hint exists, so leave opt_sectors at 0 to avoid bogus optimal_io_size
+ * values that break filesystem geometry (e.g. mkfs.xfs stripe alignment).
+ */
+static void sas_dma_setup_opt_sectors(struct Scsi_Host *shost)
+{
+	struct device *dma_dev = shost->dma_dev;
+	size_t opt, max;
+	unsigned int opt_sectors;
+
+	opt = dma_opt_mapping_size(dma_dev);
+	max = dma_max_mapping_size(dma_dev);
+
+	/* opt >= max means no real hint was provided by the DMA layer */
+	if (opt >= max)
+		return;
+
+	/* Clamp to max_sectors to avoid overflow in sector arithmetic */
+	opt_sectors = min_t(unsigned int, opt >> SECTOR_SHIFT,
+			    shost->max_sectors);
+
+	/* Guard against zero before rounddown_pow_of_two() */
+	if (!opt_sectors)
+		return;
+
+	/*
+	 * Round down to power-of-two so filesystem geometry calculations
+	 * (e.g. XFS stripe width/unit) always produce clean divisors.
+	 */
+	shost->opt_sectors = rounddown_pow_of_two(opt_sectors);
+}
+
 static int sas_host_setup(struct transport_container *tc, struct device *dev,
 			  struct device *cdev)
 {
 	struct Scsi_Host *shost = dev_to_shost(dev);
 	struct sas_host_attrs *sas_host = to_sas_host_attrs(shost);
-	struct device *dma_dev = shost->dma_dev;
 
 	INIT_LIST_HEAD(&sas_host->rphy_list);
 	mutex_init(&sas_host->lock);
@@ -239,10 +275,7 @@ static int sas_host_setup(struct transport_container *tc, struct device *dev,
 		dev_printk(KERN_ERR, dev, "fail to a bsg device %d\n",
 			   shost->host_no);
 
-	if (dma_dev->dma_mask) {
-		shost->opt_sectors = min_t(unsigned int, shost->max_sectors,
-				dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
-	}
+	sas_dma_setup_opt_sectors(shost);
 
 	return 0;
 }
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v8 1/1] scsi: sas: skip opt_sectors when DMA reports no real optimization hint
2026-05-19 13:52 ` [PATCH v8 1/1] scsi: sas: skip opt_sectors when DMA reports no real optimization hint Ionut Nechita (Wind River)
@ 2026-05-25 6:02 ` Christoph Hellwig
2026-06-02 1:43 ` Martin K. Petersen
1 sibling, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2026-05-25 6:02 UTC (permalink / raw)
To: Ionut Nechita (Wind River)
Cc: James.Bottomley, martin.petersen, linux-scsi, linux-kernel, stable,
hch, dlemoal, robin.murphy, john.g.garry, axboe, m.szyprowski,
ahuang12, ionut_n2001, sunlightlinux

On Tue, May 19, 2026 at 04:52:33PM +0300, Ionut Nechita (Wind River) wrote:
> +static void sas_dma_setup_opt_sectors(struct Scsi_Host *shost)
> +{
> +	struct device *dma_dev = shost->dma_dev;
> +	size_t opt, max;
> +	unsigned int opt_sectors;
> +
> +	opt = dma_opt_mapping_size(dma_dev);
> +	max = dma_max_mapping_size(dma_dev);

I'm almost feeling bad for suggesting more changes, but this would
read much cleaner by doing:

	struct device *dma_dev = shost->dma_dev;
	size_t opt = dma_opt_mapping_size(dma_dev);
	size_t max = dma_max_mapping_size(dma_dev);
	unsigned int opt_sectors;

but otherwise this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v8 1/1] scsi: sas: skip opt_sectors when DMA reports no real optimization hint
2026-05-19 13:52 ` [PATCH v8 1/1] scsi: sas: skip opt_sectors when DMA reports no real optimization hint Ionut Nechita (Wind River)
2026-05-25 6:02 ` Christoph Hellwig
@ 2026-06-02 1:43 ` Martin K. Petersen
2026-06-02 8:04 ` Ionut Nechita (Wind River)
1 sibling, 1 reply; 5+ messages in thread
From: Martin K. Petersen @ 2026-06-02 1:43 UTC (permalink / raw)
To: Ionut Nechita (Wind River)
Cc: James.Bottomley, martin.petersen, linux-scsi, linux-kernel, stable,
hch, dlemoal, robin.murphy, john.g.garry, axboe, m.szyprowski,
ahuang12, ionut_n2001, sunlightlinux

Ionut,

> sas_host_setup() unconditionally sets shost->opt_sectors from
> dma_opt_mapping_size().

Applied to 7.2/scsi-staging (with Christoph's suggested tweak). Thanks!

--
Martin K. Petersen

^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v8 1/1] scsi: sas: skip opt_sectors when DMA reports no real optimization hint
2026-06-02 1:43 ` Martin K. Petersen
@ 2026-06-02 8:04 ` Ionut Nechita (Wind River)
0 siblings, 0 replies; 5+ messages in thread
From: Ionut Nechita (Wind River) @ 2026-06-02 8:04 UTC (permalink / raw)
To: martin.petersen; +Cc: hch, James.Bottomley, linux-scsi, Ionut Nechita

From: Ionut Nechita <ionut.nechita@windriver.com>

Martin, Christoph,

Thank you both for the review and for picking this up.
Much appreciated. Happy to respin if anything else comes up during
the staging cycle.

Ionut

^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-06-02 8:04 UTC | newest]

Thread overview: 5+ messages
2026-05-19 13:52 [PATCH v8 0/1] scsi: sas: fix mkfs.xfs failure due to bogus optimal_io_size Ionut Nechita (Wind River)
2026-05-19 13:52 ` [PATCH v8 1/1] scsi: sas: skip opt_sectors when DMA reports no real optimization hint Ionut Nechita (Wind River)
2026-05-25 6:02 ` Christoph Hellwig
2026-06-02 1:43 ` Martin K. Petersen
2026-06-02 8:04 ` Ionut Nechita (Wind River)