* [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166
@ 2026-06-21 10:08 Alvin Lim
2026-06-21 10:21 ` sashiko-bot
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Alvin Lim @ 2026-06-21 10:08 UTC (permalink / raw)
To: linux-ide; +Cc: Damien Le Moal, Niklas Cassel, linux-kernel, Alvin Lim, stable
The ASMedia ASM1166 SATA controller (1b21:1166) advertises 64-bit DMA
support (AHCI CAP.S64A), but on systems with the IOMMU enabled - where it
can be handed DMA addresses above 4 GB - it silently corrupts data in
transit. Reads return different, wrong data on each access. SMART is clean,
there are no SATA link resets and no MCE is raised, so the corruption is
invisible until it surfaces as filesystem metadata errors (XFS EUCLEAN)
or, on Ceph, mass scrub errors across multiple independent filesystems at
once - i.e. host-level, not filesystem-level.
This is the same failure mode already quirked for other controllers that
falsely claim working 64-bit DMA. See commit 105c42566a55 ("ata: ahci:
force 32-bit DMA for JMicron JMB582/JMB585") and commit 20730e9b2778
("ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers").
The ASM1166 currently maps to plain board_ahci with no DMA limit.
Limit the ASM1166 to 32-bit DMA. 32-bit is the guaranteed-correct lower
bound; the only cost is extra SWIOTLB bounce-buffering on transfers above
4 GB, negligible for storage. A future change can widen it to the
controller's true addressable width if characterised. Until this lands the
only workarounds are disabling the IOMMU (amd_iommu=off / intel_iommu=off)
or using an HBA.
Reproduced on an AOOSTAR WTR MAX (AMD Ryzen 7 PRO 8845HS) whose six SATA
bays all hang off one ASM1166: with the IOMMU on, six concurrent
'dd ... | md5sum' of the same large file return six different sums; with
amd_iommu=off they are identical, and a full Ceph deep-scrub of a 5.4 TiB
/ 1.43M-object pool re-reads end-to-end with zero scrub errors.
Add a board_ahci_32bit_dma board type (mirroring board_ahci_43bit_dma)
and point the ASM1166 entry at it.
Fixes: 3bf614106094 ("ata: ahci: add identifiers for ASM2116 series adapters")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4.8
Signed-off-by: Alvin Lim <alvinwylim@gmail.com>
---
drivers/ata/ahci.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 58f512f8952a..895956c2ca15 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -48,6 +48,7 @@ enum {
enum board_ids {
/* board IDs by feature in alphabetical order */
board_ahci,
+ board_ahci_32bit_dma,
board_ahci_43bit_dma,
board_ahci_ign_iferr,
board_ahci_no_debounce_delay,
@@ -132,6 +133,13 @@ static const struct ata_port_info ahci_port_info[] = {
.udma_mask = ATA_UDMA6,
.port_ops = &ahci_ops,
},
+ [board_ahci_32bit_dma] = {
+ AHCI_HFLAGS (AHCI_HFLAG_32BIT_ONLY),
+ .flags = AHCI_FLAG_COMMON,
+ .pio_mask = ATA_PIO4,
+ .udma_mask = ATA_UDMA6,
+ .port_ops = &ahci_ops,
+ },
[board_ahci_43bit_dma] = {
AHCI_HFLAGS (AHCI_HFLAG_43BIT_ONLY),
.flags = AHCI_FLAG_COMMON,
@@ -1559,7 +1567,7 @@ static const struct pci_device_id ahci_pci_tbl[] = {
}, {
/* ASM1166 */
PCI_VDEVICE(ASMEDIA, 0x1166),
- .driver_data = board_ahci,
+ .driver_data = board_ahci_32bit_dma,
}, {
/*
* Samsung SSDs found on some macbooks. NCQ times out if MSI is
base-commit: 322008f87f917e2217eeac386a9410945092eb2e
--
2.47.3
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166 2026-06-21 10:08 [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166 Alvin Lim @ 2026-06-21 10:21 ` sashiko-bot 2026-06-21 12:48 ` David Laight 2026-06-22 11:31 ` Damien Le Moal 2 siblings, 0 replies; 5+ messages in thread From: sashiko-bot @ 2026-06-21 10:21 UTC (permalink / raw) To: Alvin Lim; +Cc: dlemoal, cassel, linux-ide Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider: Pre-existing issues: - [Critical] The 32-bit DMA quirk is applied only to the ASM1166, leaving identical sibling controllers (ASM1164 and ASM1165) exposed to catastrophic silent data corruption. - [Critical] The `AHCI_HFLAG_32BIT_ONLY` quirk clears `HOST_CAP_64` from `hpriv->cap`, preventing the driver from explicitly zeroing the upper 32-bit DMA address registers, which can lead to DMA to corrupted physical addresses. -- commit aa72273e536a2a474b07a2e0e23ffb50c8b94418 Author: Alvin Lim <alvinwylim@gmail.com> ata: ahci: force 32-bit DMA for ASMedia ASM1166 This commit limits the ASMedia ASM1166 SATA controller to 32-bit DMA. It addresses a regression that occurs when the IOMMU is enabled and hands out DMA addresses above 4 GB, resulting in silent data corruption. A new board_ahci_32bit_dma board type is introduced and applied to the ASM1166. > diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c > index 58f512f8952a9..895956c2ca15f 100644 > --- a/drivers/ata/ahci.c > +++ b/drivers/ata/ahci.c [ ... ] > @@ -132,6 +133,13 @@ static const struct ata_port_info ahci_port_info[] = { > .udma_mask = ATA_UDMA6, > .port_ops = &ahci_ops, > }, > + [board_ahci_32bit_dma] = { > + AHCI_HFLAGS (AHCI_HFLAG_32BIT_ONLY), [Severity: Critical] This is a pre-existing issue, but does using AHCI_HFLAG_32BIT_ONLY here expose a regression in ahci_start_fis_rx()? When AHCI_HFLAG_32BIT_ONLY is set, ahci_save_initial_config() clears HOST_CAP_64 from the capability flags. Later, in ahci_start_fis_rx(), the hardware registers PORT_LST_ADDR_HI and PORT_FIS_ADDR_HI are not zeroed out because the flag is cleared: if (hpriv->cap & HOST_CAP_64) { writel((pp->cmd_slot_dma >> 16) >> 16, port_mmio + PORT_LST_ADDR_HI); writel((pp->rx_fis_dma >> 16) >> 16, port_mmio + PORT_FIS_ADDR_HI); } If the hardware is genuinely 64-bit capable and not fully reset, can it retain stale upper 32-bit addresses left by the firmware, causing DMA to target the wrong physical addresses when concatenated with the new lower 32 bits? > + .flags = AHCI_FLAG_COMMON, > + .pio_mask = ATA_PIO4, > + .udma_mask = ATA_UDMA6, > + .port_ops = &ahci_ops, > + }, [ ... ] > @@ -1559,7 +1567,7 @@ static const struct pci_device_id ahci_pci_tbl[] = { > }, { > /* ASM1166 */ > PCI_VDEVICE(ASMEDIA, 0x1166), > - .driver_data = board_ahci, > + .driver_data = board_ahci_32bit_dma, > }, { [Severity: Critical] This is a pre-existing issue, but does this quirk need to be applied to sibling controllers as well? Looking at ahci_pci_tbl, ASM1164 and ASM1165 are from the same silicon series and still map to plain board_ahci: /* ASM1164 */ PCI_VDEVICE(ASMEDIA, 0x1164), .driver_data = board_ahci, }, { /* ASM1165 */ PCI_VDEVICE(ASMEDIA, 0x1165), .driver_data = board_ahci, Does leaving them without the 32-bit DMA limit expose them to the same regression where IOMMU addresses above 4 GB cause data corruption? > /* > * Samsung SSDs found on some macbooks. NCQ times out if MSI is -- Sashiko AI review · https://sashiko.dev/#/patchset/20260621100844.1224301-1-alvinwylim@gmail.com?part=1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166 2026-06-21 10:08 [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166 Alvin Lim 2026-06-21 10:21 ` sashiko-bot @ 2026-06-21 12:48 ` David Laight [not found] ` <CA+CYLR6Rg-3brg9yCMAKJDr7t=mtu4vP0+aMFs+JhLPWtQxOYA@mail.gmail.com> 2026-06-22 11:31 ` Damien Le Moal 2 siblings, 1 reply; 5+ messages in thread From: David Laight @ 2026-06-21 12:48 UTC (permalink / raw) To: Alvin Lim; +Cc: linux-ide, Damien Le Moal, Niklas Cassel, linux-kernel, stable On Sun, 21 Jun 2026 18:08:44 +0800 Alvin Lim <alvinwylim@gmail.com> wrote: > The ASMedia ASM1166 SATA controller (1b21:1166) advertises 64-bit DMA > support (AHCI CAP.S64A), but on systems with the IOMMU enabled - where it > can be handed DMA addresses above 4 GB - it silently corrupts data in > transit. That seems wrong. If any iommu is disabled the sata cotroller will be passed the memory's physical address which is very likely to be over 4G. So the controller seems to support 64bit addresses - as advertised. But with the iommu enabled the addresses the controller needs to use are different from the physical address - so the controller will almost certainly see sub 4G addresses for buffers above 4G. (The iommu probably allocates 32bit PCIe addresses for all buffers so that it doesn't have to worry about targets that only support 32bit addresses.) It seems more likely that the wrong addresses are ending up in the host side of the iommu and using bounce buffers fixes that. Using the wrong addresses is likely to lead to major kernel memory corruptions. Mixing up physical addresses and dma addresses, assuming that memory from dma_alloc_coherent() is physically contiguous, or just losing the high bits of the physical address passed to the iommu seem more likely. David > Reads return different, wrong data on each access. SMART is clean, > there are no SATA link resets and no MCE is raised, so the corruption is > invisible until it surfaces as filesystem metadata errors (XFS EUCLEAN) > or, on Ceph, mass scrub errors across multiple independent filesystems at > once - i.e. host-level, not filesystem-level. > > This is the same failure mode already quirked for other controllers that > falsely claim working 64-bit DMA. See commit 105c42566a55 ("ata: ahci: > force 32-bit DMA for JMicron JMB582/JMB585") and commit 20730e9b2778 > ("ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers"). > The ASM1166 currently maps to plain board_ahci with no DMA limit. > > Limit the ASM1166 to 32-bit DMA. 32-bit is the guaranteed-correct lower > bound; the only cost is extra SWIOTLB bounce-buffering on transfers above > 4 GB, negligible for storage. A future change can widen it to the > controller's true addressable width if characterised. Until this lands the > only workarounds are disabling the IOMMU (amd_iommu=off / intel_iommu=off) > or using an HBA. > > Reproduced on an AOOSTAR WTR MAX (AMD Ryzen 7 PRO 8845HS) whose six SATA > bays all hang off one ASM1166: with the IOMMU on, six concurrent > 'dd ... | md5sum' of the same large file return six different sums; with > amd_iommu=off they are identical, and a full Ceph deep-scrub of a 5.4 TiB > / 1.43M-object pool re-reads end-to-end with zero scrub errors. > > Add a board_ahci_32bit_dma board type (mirroring board_ahci_43bit_dma) > and point the ASM1166 entry at it. > > Fixes: 3bf614106094 ("ata: ahci: add identifiers for ASM2116 series adapters") > Cc: stable@vger.kernel.org > Assisted-by: Claude:claude-opus-4.8 > Signed-off-by: Alvin Lim <alvinwylim@gmail.com> > --- > drivers/ata/ahci.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c > index 58f512f8952a..895956c2ca15 100644 > --- a/drivers/ata/ahci.c > +++ b/drivers/ata/ahci.c > @@ -48,6 +48,7 @@ enum { > enum board_ids { > /* board IDs by feature in alphabetical order */ > board_ahci, > + board_ahci_32bit_dma, > board_ahci_43bit_dma, > board_ahci_ign_iferr, > board_ahci_no_debounce_delay, > @@ -132,6 +133,13 @@ static const struct ata_port_info ahci_port_info[] = { > .udma_mask = ATA_UDMA6, > .port_ops = &ahci_ops, > }, > + [board_ahci_32bit_dma] = { > + AHCI_HFLAGS (AHCI_HFLAG_32BIT_ONLY), > + .flags = AHCI_FLAG_COMMON, > + .pio_mask = ATA_PIO4, > + .udma_mask = ATA_UDMA6, > + .port_ops = &ahci_ops, > + }, > [board_ahci_43bit_dma] = { > AHCI_HFLAGS (AHCI_HFLAG_43BIT_ONLY), > .flags = AHCI_FLAG_COMMON, > @@ -1559,7 +1567,7 @@ static const struct pci_device_id ahci_pci_tbl[] = { > }, { > /* ASM1166 */ > PCI_VDEVICE(ASMEDIA, 0x1166), > - .driver_data = board_ahci, > + .driver_data = board_ahci_32bit_dma, > }, { > /* > * Samsung SSDs found on some macbooks. NCQ times out if MSI is > > base-commit: 322008f87f917e2217eeac386a9410945092eb2e ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <CA+CYLR6Rg-3brg9yCMAKJDr7t=mtu4vP0+aMFs+JhLPWtQxOYA@mail.gmail.com>]
* Re: [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166 [not found] ` <CA+CYLR6Rg-3brg9yCMAKJDr7t=mtu4vP0+aMFs+JhLPWtQxOYA@mail.gmail.com> @ 2026-06-21 21:57 ` David Laight 0 siblings, 0 replies; 5+ messages in thread From: David Laight @ 2026-06-21 21:57 UTC (permalink / raw) To: Alvin Lim; +Cc: linux-ide, Damien Le Moal, Niklas Cassel, linux-kernel, stable On Sun, 21 Jun 2026 22:02:31 +0800 Alvin Lim <alvinwylim@gmail.com> wrote: > > It seems more likely that the wrong addresses are ending up in the host > > side of the iommu and using bounce buffers fixes that. > > Using the wrong addresses is likely to lead to major kernel memory > > corruptions. > > Mixing up physical addresses and dma addresses, assuming that memory > > from dma_alloc_coherent() is physically contiguous, or just losing the > > high bits of the physical address passed to the iommu seem more likely. > > Thank you - this is the right question to ask, and you're correct that > my commit message overstates what I actually established. I described the > controller's internal behaviour as fact when it is an inference from the > failure class, not something I characterised on the silicon. I'll reword > v2 to claim only what I measured. I also did not instrument the IOVAs > actually handed to the device, so I won't argue the allocator internals > with you. > > But I don't think a host-side addressing bug fits the observations, > independent of mechanism, for three reasons: > > 1. The fault is specific to this one controller. On this host the AMD > IOMMU also translates for four NVMe controllers (three of them Ceph > OSDs under identical load) and the NIC - all doing 64-bit DMA through > the same iommu, same kernel, same RAM. Every one of them is clean: the > NVMe OSDs deep-scrub to zero errors. Only the six disks behind the > ASM1166 corrupt. A host-side bug - phys/dma mixup, a > dma_alloc_coherent() contiguity assumption, or high bits lost into the > iommu - is device-agnostic: the mapping path doesn't know which device > sits behind an IOVA, so it would have to corrupt NVMe too. It doesn't. > That selectivity is what points at the controller rather than the > kernel. > > 2. It is load-dependent. Single isolated reads are almost always clean; > only concurrent reads corrupt (six parallel dd of one file return six > different md5s; serial reads agree). A deterministic host-side > addressing bug would not switch on and off with concurrency. > > 3. The fix holds at scale. Running amd_iommu=off, the controller only > ever sees real physical addresses - the top of this box's RAM is > 0x83e2fffff (~33 GiB, a 36-bit address, from /proc/iomem) - and a full > deep-scrub of the 5.4 TiB / 1.43M-object pool re-read end-to-end with > zero scrub errors, plus ~4.7 TiB of backfill writes through the same > path, also at zero. > > On your "works with the iommu off, so it supports 64-bit" inference: > amd_iommu=off does not exercise the 64-bit range on this box. With the > iommu off the controller is handed untranslated physical addresses, and > the highest in the map is 0x83e2fffff (36-bit, from /proc/iomem). The > clean scrub in (3) is therefore strong evidence the controller does > correct DMA up to ~36-bit - which is also how I know it is not truly > 32-bit. It says nothing about the full 64-bit range, because no address > that high is ever presented to it in that configuration. > > A note on what I actually validated: the at-scale proof above is under > amd_iommu=off. The submitted quirk is the IOMMU-preserving equivalent - > it caps the controller's DMA mask at 32 bits and bounces transfers above > 4 GiB via SWIOTLB. 32-bit addresses are a strict subset of the ~36-bit > physical range already shown clean, so the quirk is conservative relative > to what is validated; I have not separately re-run the scrub under the > quirk itself, as this is a production NAS and I won't deliberately > re-induce corruption to characterise it. I'd definitely look at how the iommu code selects the address the target side uses. There is something very fishy going on. With the iommu enabled whether bounce buffers get used should make no difference - the address the device sees is different from the physical address. But with a 32bit dma mask it should always generate low addresses (regardless of the physical address of the buffer). One possibility is the iommu code uses increasing addresses for each transfer (to get uniqueness) but resets the address when idle. That would mean the a single thread test would keep using the same address (so it would stay low) but concurrent requests would keep using forever larger addresses until the target starts aliasing the addresses (it puts in the request TLP). At which point the requests should hit invalid iommu pte (or whatever they are called) and the PCIe read/write should fail. I'm not sure what the device would see - PCIe writes are 'posted' and async, host reads of invalid addresses return ~0 and trigger an AER error (if enabled and if anyone can manage to work out what to do, hint an NMI into the host is not helpful on a 'nebs' rated server). One effect of setting the dma mask to 32bits is to force swiotbl bounces when they aren't needed because the iommu is disabled/absent. I think you are probably using the wrong shaped hammer :-) David > > This is the same shape as the sibling ASMedia ASM1061, quirked to 43-bit > DMA (20730e9b2778), and JMicron JMB585, quirked to 32-bit (105c42566a55). > I submitted the conservative 32-bit lower bound (as JMB585 itself > shipped) rather than assert a width I did not measure; a later change can > widen it if someone characterises the true value. > > If it helps, I'll send a v2 whose commit message states only the measured > facts - corruption isolated to the ASM1166's disks while every other DMA > master on the same IOMMU is clean; load-dependent; eliminated by > constraining the controller's DMA, verified at scale - and leaves the > internal silicon mechanism as inference consistent with the existing > ASM1061/JMB585 quirks rather than asserting it. > > Would that address your concern? > > Thanks, > Alvin > > On Sun, Jun 21, 2026 at 8:48 PM David Laight <david.laight.linux@gmail.com> > wrote: > > > On Sun, 21 Jun 2026 18:08:44 +0800 > > Alvin Lim <alvinwylim@gmail.com> wrote: > > > > > The ASMedia ASM1166 SATA controller (1b21:1166) advertises 64-bit DMA > > > support (AHCI CAP.S64A), but on systems with the IOMMU enabled - where it > > > can be handed DMA addresses above 4 GB - it silently corrupts data in > > > transit. > > > > That seems wrong. > > If any iommu is disabled the sata cotroller will be passed the memory's > > physical address which is very likely to be over 4G. > > So the controller seems to support 64bit addresses - as advertised. > > > > But with the iommu enabled the addresses the controller needs to use are > > different from the physical address - so the controller will almost > > certainly see sub 4G addresses for buffers above 4G. > > (The iommu probably allocates 32bit PCIe addresses for all buffers so that > > it doesn't have to worry about targets that only support 32bit addresses.) > > > > It seems more likely that the wrong addresses are ending up in the host > > side of the iommu and using bounce buffers fixes that. > > Using the wrong addresses is likely to lead to major kernel memory > > corruptions. > > > > Mixing up physical addresses and dma addresses, assuming that memory > > from dma_alloc_coherent() is physically contiguous, or just losing the > > high bits of the physical address passed to the iommu seem more likely. > > > > David > > > > > > > > > Reads return different, wrong data on each access. SMART is clean, > > > there are no SATA link resets and no MCE is raised, so the corruption is > > > invisible until it surfaces as filesystem metadata errors (XFS EUCLEAN) > > > or, on Ceph, mass scrub errors across multiple independent filesystems at > > > once - i.e. host-level, not filesystem-level. > > > > > > This is the same failure mode already quirked for other controllers that > > > falsely claim working 64-bit DMA. See commit 105c42566a55 ("ata: ahci: > > > force 32-bit DMA for JMicron JMB582/JMB585") and commit 20730e9b2778 > > > ("ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers"). > > > The ASM1166 currently maps to plain board_ahci with no DMA limit. > > > > > > Limit the ASM1166 to 32-bit DMA. 32-bit is the guaranteed-correct lower > > > bound; the only cost is extra SWIOTLB bounce-buffering on transfers above > > > 4 GB, negligible for storage. A future change can widen it to the > > > controller's true addressable width if characterised. Until this lands > > the > > > only workarounds are disabling the IOMMU (amd_iommu=off / > > intel_iommu=off) > > > or using an HBA. > > > > > > Reproduced on an AOOSTAR WTR MAX (AMD Ryzen 7 PRO 8845HS) whose six SATA > > > bays all hang off one ASM1166: with the IOMMU on, six concurrent > > > 'dd ... | md5sum' of the same large file return six different sums; with > > > amd_iommu=off they are identical, and a full Ceph deep-scrub of a 5.4 TiB > > > / 1.43M-object pool re-reads end-to-end with zero scrub errors. > > > > > > Add a board_ahci_32bit_dma board type (mirroring board_ahci_43bit_dma) > > > and point the ASM1166 entry at it. > > > > > > Fixes: 3bf614106094 ("ata: ahci: add identifiers for ASM2116 series > > adapters") > > > Cc: stable@vger.kernel.org > > > Assisted-by: Claude:claude-opus-4.8 > > > Signed-off-by: Alvin Lim <alvinwylim@gmail.com> > > > --- > > > drivers/ata/ahci.c | 10 +++++++++- > > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c > > > index 58f512f8952a..895956c2ca15 100644 > > > --- a/drivers/ata/ahci.c > > > +++ b/drivers/ata/ahci.c > > > @@ -48,6 +48,7 @@ enum { > > > enum board_ids { > > > /* board IDs by feature in alphabetical order */ > > > board_ahci, > > > + board_ahci_32bit_dma, > > > board_ahci_43bit_dma, > > > board_ahci_ign_iferr, > > > board_ahci_no_debounce_delay, > > > @@ -132,6 +133,13 @@ static const struct ata_port_info ahci_port_info[] > > = { > > > .udma_mask = ATA_UDMA6, > > > .port_ops = &ahci_ops, > > > }, > > > + [board_ahci_32bit_dma] = { > > > + AHCI_HFLAGS (AHCI_HFLAG_32BIT_ONLY), > > > + .flags = AHCI_FLAG_COMMON, > > > + .pio_mask = ATA_PIO4, > > > + .udma_mask = ATA_UDMA6, > > > + .port_ops = &ahci_ops, > > > + }, > > > [board_ahci_43bit_dma] = { > > > AHCI_HFLAGS (AHCI_HFLAG_43BIT_ONLY), > > > .flags = AHCI_FLAG_COMMON, > > > @@ -1559,7 +1567,7 @@ static const struct pci_device_id ahci_pci_tbl[] = > > { > > > }, { > > > /* ASM1166 */ > > > PCI_VDEVICE(ASMEDIA, 0x1166), > > > - .driver_data = board_ahci, > > > + .driver_data = board_ahci_32bit_dma, > > > }, { > > > /* > > > * Samsung SSDs found on some macbooks. NCQ times out if > > MSI is > > > > > > base-commit: 322008f87f917e2217eeac386a9410945092eb2e > > > > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166 2026-06-21 10:08 [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166 Alvin Lim 2026-06-21 10:21 ` sashiko-bot 2026-06-21 12:48 ` David Laight @ 2026-06-22 11:31 ` Damien Le Moal 2 siblings, 0 replies; 5+ messages in thread From: Damien Le Moal @ 2026-06-22 11:31 UTC (permalink / raw) To: Alvin Lim, linux-ide; +Cc: Niklas Cassel, linux-kernel, stable On 6/21/26 19:08, Alvin Lim wrote: > The ASMedia ASM1166 SATA controller (1b21:1166) advertises 64-bit DMA > support (AHCI CAP.S64A), but on systems with the IOMMU enabled - where it > can be handed DMA addresses above 4 GB - it silently corrupts data in > transit. Reads return different, wrong data on each access. SMART is clean, > there are no SATA link resets and no MCE is raised, so the corruption is > invisible until it surfaces as filesystem metadata errors (XFS EUCLEAN) > or, on Ceph, mass scrub errors across multiple independent filesystems at > once - i.e. host-level, not filesystem-level. > > This is the same failure mode already quirked for other controllers that > falsely claim working 64-bit DMA. See commit 105c42566a55 ("ata: ahci: > force 32-bit DMA for JMicron JMB582/JMB585") and commit 20730e9b2778 > ("ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers"). > The ASM1166 currently maps to plain board_ahci with no DMA limit. Have you tried the same quirk, limiting DMA to 43-bits ? It is very likely that this adapter bug is the same as the 1061. -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-06-22 11:31 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-21 10:08 [PATCH] ata: ahci: force 32-bit DMA for ASMedia ASM1166 Alvin Lim
2026-06-21 10:21 ` sashiko-bot
2026-06-21 12:48 ` David Laight
[not found] ` <CA+CYLR6Rg-3brg9yCMAKJDr7t=mtu4vP0+aMFs+JhLPWtQxOYA@mail.gmail.com>
2026-06-21 21:57 ` David Laight
2026-06-22 11:31 ` Damien Le Moal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox