* SG2042 SATA (DMA?) issues
@ 2026-05-31 4:32 Michael Orlitzky
2026-05-31 15:05 ` Michael Orlitzky
0 siblings, 1 reply; 11+ messages in thread
From: Michael Orlitzky @ 2026-05-31 4:32 UTC (permalink / raw)
To: sophgo
Has anyone tried the recent SG2042 firmware and DMA coherence patches
with the onboard SATA? I am currently in the process of learning an
important lesson and wish I weren't. After a recent update, any ext4
filesystem that I put on a SATA drive quickly becomes corrupt. For
example,
EXT4-fs (sda1): mounted filesystem 54f42dce-97ef-4b4d-8a92-dab5d72a527b r/w with ordered data mode. Quota mode: disabled.
EXT4-fs error (device sda1): ext4_validate_block_bitmap:423: comm ext4lazyinit: bg 103: bad block bitmap checksum
EXT4-fs error (device sda1): ext4_validate_block_bitmap:423: comm ext4lazyinit: bg 255: bad block bitmap checksum
EXT4-fs error (device sda1): ext4_validate_block_bitmap:423: comm ext4lazyinit: bg 350: bad block bitmap checksum
EXT4-fs error (device sda1): ext4_validate_block_bitmap:423: comm ext4lazyinit: bg 886: bad block bitmap checksum
...
Here's a different error, from a different drive:
EXT4-fs (sdb1): ext4_check_descriptors: Block bitmap for group 109 overlaps superblock
EXT4-fs (sdb1): group descriptors corrupted!
These are two new drives, fresh out of the box. Running fsck doesn't
help -- the corruption returns almost immediately. The same happens
after a reformat.
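To compare how fast the errors accumulate on each drive, the kernel log can be tallied per device. The following is only a sketch: the function name count_ext4_errors is made up for illustration, and it assumes the "EXT4-fs error (device ...)" message format shown in the logs above.

```shell
# Hypothetical helper: count "EXT4-fs error" lines per block device.
# Reads kernel log text on stdin, for example:  dmesg | count_ext4_errors
count_ext4_errors() {
    # Extract the device name from "EXT4-fs error (device sda1): ...",
    # then count how many times each device appears.
    sed -n 's/.*EXT4-fs error (device \([^)]*\)).*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Each output line is a device name followed by its error count. Messages without the "error (device ...)" prefix, such as the "group descriptors corrupted!" line from sdb1, are not counted by this pattern.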
I am testing with linux-next-20260529, and the following patches from
the mailing list. OpenSBI, zsbl, EDK2, and fip.bin are all up-to-date.
commit fe7d4d1c81361d02c8156c3989b8b07edebf96bf
Author: Vivian Wang <wangruikang@iscas.ac.cn>
Date:   Mon Mar 9 19:09:38 2026 +0800

    riscv: mm: Define DIRECT_MAP_PHYSMEM_END

commit 8e12c1f0ee16bd901df098d5133a2041a3f5383a
Author: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Date:   Tue Apr 21 10:31:40 2026 -0400

    riscv: errata: Add ERRATA_THEAD_WRITE_ONCE fixup

commit d2c0c75176d2c220b7c13ae1dc2ca029d99f42fe
Author: Inochi Amaoto <inochiama@gmail.com>
Date:   Tue Apr 7 07:26:55 2026 +0800

    riscv: dts: sophgo: sg2042: use hex for CPU unit address

commit f582c404b9e034efa5891db6a5a0751a7488659c
Author: Han Gao <gaohan@iscas.ac.cn>
Date:   Wed Apr 1 01:56:58 2026 +0800

    PCI: Add quirk to disable PCIe port services on Sophgo SG2042

commit 6b2339e1faf20f8e819f729fd0b8b94a0169e6d2
Author: Han Gao <gaohan@iscas.ac.cn>
Date:   Wed Apr 1 01:56:57 2026 +0800

    PCI: Add per-device flag to disable native PCIe port services

commit 7e9fc0eb8ecdc96d31af68920b55f1885683ec88
Author: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Date:   Wed Apr 8 00:01:43 2026 +0800

    riscv: dts: sophgo: reduce SG2042 MSI count to 16
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-05-31 4:32 SG2042 SATA (DMA?) issues Michael Orlitzky
@ 2026-05-31 15:05 ` Michael Orlitzky
2026-06-01 0:31 ` Chen Wang
0 siblings, 1 reply; 11+ messages in thread
From: Michael Orlitzky @ 2026-05-31 15:05 UTC (permalink / raw)
To: sophgo
On 2026-05-31 00:32:46, Michael Orlitzky wrote:
> Has anyone tried the recent SG2042 firmware and DMA coherence
> patches with the onboard SATA? ... I am testing with
> linux-next-20260529, and the following patches from the mailing
> list. OpenSBI, zsbl, EDK2, and fip.bin are all up-to-date.
Just to confirm, downgrading the firmware (zsbl, edk2, fip.bin) to
snapshots from December 2025 and reverting the dma-coherent dts
changes (e728a57834d) does fix the issue. It looks like something may
be seriously wrong, be careful if you have anything important on a
SATA drive.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-05-31 15:05 ` Michael Orlitzky
@ 2026-06-01 0:31 ` Chen Wang
2026-06-01 2:22 ` Michael Orlitzky
0 siblings, 1 reply; 11+ messages in thread
From: Chen Wang @ 2026-06-01 0:31 UTC (permalink / raw)
To: Michael Orlitzky, sophgo, Han Gao, Inochi Amaoto, Han Gao
On 5/31/2026 11:05 PM, Michael Orlitzky wrote:
> On 2026-05-31 00:32:46, Michael Orlitzky wrote:
>> Has anyone tried the recent SG2042 firmware and DMA coherence
>> patches with the onboard SATA? ... I am testing with
>> linux-next-20260529, and the following patches from the mailing
>> list. OpenSBI, zsbl, EDK2, and fip.bin are all up-to-date.
> Just to confirm, downgrading the firmware (zsbl, edk2, fip.bin) to
> snapshots from December 2025 and reverting the dma-coherent dts
> changes (e728a57834d) does fix the issue. It looks like something may
> be seriously wrong, be careful if you have anything important on a
> SATA drive.
The DMA-related changes are described at
https://lore.kernel.org/linux-riscv/20260331171248.973014-1-gaohan@iscas.ac.cn/
Hi Michael,
Please confirm whether you have updated to the firmware mentioned in
that patch email. If the problem persists with the updated firmware, the
patch may need to be reassessed.
Adding Han & Inochi.
Hi Han Gao,
Can you please take a look at the issue reported by Michael and can you
provide more information on this? For a more complete report, please
refer to https://lore.kernel.org/sophgo/ahu57vcS0oOFmCI9@mertle/
Thanks,
Chen
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-06-01 0:31 ` Chen Wang
@ 2026-06-01 2:22 ` Michael Orlitzky
2026-06-01 6:06 ` Chen Wang
0 siblings, 1 reply; 11+ messages in thread
From: Michael Orlitzky @ 2026-06-01 2:22 UTC (permalink / raw)
To: Chen Wang; +Cc: sophgo, Han Gao, Inochi Amaoto, Han Gao
On 2026-06-01 08:31:21, Chen Wang wrote:
>
> Changes related to DMA can be referred to at
> https://lore.kernel.org/linux-riscv/20260331171248.973014-1-gaohan@iscas.ac.cn/
>
> Hi Michael,
>
> please confirm whether you have updated the firmware mentioned in the
> patch email. If there are still issues after updating the firmware, this
> patch may need to be reassessed.
Yes, I have git checkouts of
* edk2-non-osi.git
* edk2-platforms.git
* edk2.git
* opensbi.git
* zsbl.git
from github.com/sophgo and am using the sg2042-dev branch where
applicable. The problem occurs with fip.bin from,
commit b379b0674328be40c9811032193014195944c664
Author: Chao Wei <chao.wei@sophgo.com>
Date:   Mon Mar 30 21:27:39 2026 +0800

    SG2042/Boot: Update fip.bin

    SG2042: Disable PCIe write reorder on single chip
which is one commit ahead of the one linked by Han Gao. All other
firmware components are at their most recent commits.
The SATA drives had been working for over a year without issue. The
first occurrence of the problem in my kernel logs is immediately after
a kernel/firmware update, so to confirm, I looked through the git logs
and reverted each firmware component to an earlier commit:
* SRA1-20.fd (20251209)
* fip.bin (20251209, from the bootloader-riscv repo)
* fw_dynamic.bin (20251106)
* zsbl.bin (20251215)
* mango-milkv-pioneer.dtb (20251215)
and of course, I reverted the kernel patch that adds dma-coherent to
the dts, and replaced sg2042-milkv-pioneer.dtb.
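When swapping .dtb files back and forth, it can help to confirm which device tree the running kernel actually received. A minimal sketch, assuming the live tree is exposed under /sys/firmware/devicetree/base (where each property appears as a file inside its node's directory); the function name list_dma_coherent_nodes is hypothetical:

```shell
# Hypothetical helper: list device tree nodes that carry "dma-coherent".
# Usage: list_dma_coherent_nodes [ROOT]
list_dma_coherent_nodes() {
    # ROOT defaults to the live device tree exported by the kernel
    root=${1:-/sys/firmware/devicetree/base}
    # Print the directory (node) that contains each dma-coherent property
    find "$root" -name dma-coherent -exec dirname {} \; | sort
}
```

With the dma-coherent dts patch reverted, the PCIe controller nodes should be absent from the output; with the patch applied, they should be listed.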
This seems to fix the problem: I mounted and unmounted both partitions,
and copied files to and from them, for a few minutes without error. (I
get an unrelated kernel panic in amdgpu_irq.c with this firmware/kernel,
but was able to test over SSH after removing the GPU.)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-06-01 2:22 ` Michael Orlitzky
@ 2026-06-01 6:06 ` Chen Wang
2026-06-01 6:47 ` Michael Orlitzky
0 siblings, 1 reply; 11+ messages in thread
From: Chen Wang @ 2026-06-01 6:06 UTC (permalink / raw)
To: Michael Orlitzky; +Cc: sophgo, Han Gao, Inochi Amaoto, Han Gao
Thanks for your info. Some more quick questions:
On 6/1/2026 10:22 AM, Michael Orlitzky wrote:
> On 2026-06-01 08:31:21, Chen Wang wrote:
>> Changes related to DMA can be referred to at
>> https://lore.kernel.org/linux-riscv/20260331171248.973014-1-gaohan@iscas.ac.cn/
>>
>> Hi Michael,
>>
>> please confirm whether you have updated the firmware mentioned in the
>> patch email. If there are still issues after updating the firmware, this
>> patch may need to be reassessed.
> Yes, I have git checkouts of
>
> * edk2-non-osi.git
> * edk2-platforms.git
> * edk2.git
> * opensbi.git
> * zsbl.git
>
> from github.com/sophgo and am using the sg2042-dev branch where
> applicable. The problem occurs with fip.bin from,
>
> commit b379b0674328be40c9811032193014195944c664
> Author: Chao Wei <chao.wei@sophgo.com>
> Date: Mon Mar 30 21:27:39 2026 +0800
>
> SG2042/Boot: Update fip.bin
>
> SG2042: Disable PCIe write reorder on single chip
>
> which is one commit ahead of the one linked by Han Gao. All other
> firmware components are at their most recent commits.
>
> The SATA drives had been working for over a year without issue. The
> first occurrence of the problem in my kernel logs is immediately after
> a kernel/firmware update, so to confirm, I looked through the git logs
> and reverted each firmware component to an earlier commit:
>
> * SRA1-20.fd (20251209)
> * fip.bin (20251209, from the bootloader-riscv repo)
> * fw_dynamic.bin (20251106)
> * zsbl.bin (20251215)
> * mango-milkv-pioneer.dtb (20251215)
Do these firmware components correspond to https://github.com/sophgo/edk2-non-osi/commit/017a5aea26a066fd2bf501b7893937183165af36?
>
> and of course, I reverted the kernel patch that adds dma-coherent to
> the dts, and replaced sg2042-milkv-pioneer.dtb.
Just to double-check: if we use
https://github.com/sophgo/edk2-non-osi/commit/017a5aea26a066fd2bf501b7893937183165af36
plus the dts dma-coherent patch, how does SATA behave?
>
> This seems to fix the problem: I mounted and unmounted, copied to and
> from, both partitions for a few minutes without error. (I get an
> unrelated kernel panic in amdgpu_irq.c with this firmware/kernel, but
> was able to test over SSH after removing the GPU).
Thanks,
Chen
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-06-01 6:06 ` Chen Wang
@ 2026-06-01 6:47 ` Michael Orlitzky
2026-06-01 8:00 ` Chen Wang
0 siblings, 1 reply; 11+ messages in thread
From: Michael Orlitzky @ 2026-06-01 6:47 UTC (permalink / raw)
To: Chen Wang; +Cc: sophgo, Han Gao, Inochi Amaoto, Han Gao
On 2026-06-01 14:06:29, Chen Wang wrote:
> >
> > * SRA1-20.fd (20251209)
> > * fip.bin (20251209, from the bootloader-riscv repo)
> > * fw_dynamic.bin (20251106)
> > * zsbl.bin (20251215)
> > * mango-milkv-pioneer.dtb (20251215)
>
> Do these firmware components correspond to https://github.com/sophgo/edk2-non-osi/commit/017a5aea26a066fd2bf501b7893937183165af36?
The dates listed above are before any dma-coherent patches. This set
of firmware files is OK, no filesystem corruption.
> Just want to double-confirm if we use
> https://github.com/sophgo/edk2-non-osi/commit/017a5aea26a066fd2bf501b7893937183165af36
> plus the dts dma-coherent patch, how the SATA works?
The older firmware, without the patch, works. It is the newer firmware
(plus the dma-coherent patch) that appears to cause the problem.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-06-01 6:47 ` Michael Orlitzky
@ 2026-06-01 8:00 ` Chen Wang
2026-06-01 8:12 ` Michael Orlitzky
2026-06-02 12:19 ` Han Gao
0 siblings, 2 replies; 11+ messages in thread
From: Chen Wang @ 2026-06-01 8:00 UTC (permalink / raw)
To: Michael Orlitzky; +Cc: sophgo, Han Gao, Inochi Amaoto, Han Gao
Hi, Mike,
On 6/1/2026 2:47 PM, Michael Orlitzky wrote:
> On 2026-06-01 14:06:29, Chen Wang wrote:
>>> * SRA1-20.fd (20251209)
>>> * fip.bin (20251209, from the bootloader-riscv repo)
>>> * fw_dynamic.bin (20251106)
>>> * zsbl.bin (20251215)
>>> * mango-milkv-pioneer.dtb (20251215)
>> Do these firmware components correspond to https://github.com/sophgo/edk2-non-osi/commit/017a5aea26a066fd2bf501b7893937183165af36?
> The dates listed above are before any dma-coherent patches. This set
> of firmware files is OK, no filesystem corruption.
>
>
>> Just want to double-confirm if we use
>> https://github.com/sophgo/edk2-non-osi/commit/017a5aea26a066fd2bf501b7893937183165af36
>> plus the dts dma-coherent patch, how the SATA works?
> The older firmware, without the patch, works. It is the newer firmware
> (plus the dma-coherent patch) that appears to cause the problem.
I'm a little confused about what you mean by "newer firmware." Do you
mean >= 017a5aea or > 017a5aea?
Because in your last email you mentioned you discovered the problem
starting with b379b067.
The firmware's tree log is as follows:
https://github.com/sophgo/edk2-non-osi/commits/devel-sg2042/
latest commit
.
b379b067
017a5aea <---- dma-coherent patch was submitted together with this commit.
.... older commit
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-06-01 8:00 ` Chen Wang
@ 2026-06-01 8:12 ` Michael Orlitzky
2026-06-02 12:19 ` Han Gao
1 sibling, 0 replies; 11+ messages in thread
From: Michael Orlitzky @ 2026-06-01 8:12 UTC (permalink / raw)
To: Chen Wang; +Cc: sophgo, Han Gao, Inochi Amaoto, Han Gao
On 2026-06-01 16:00:04, Chen Wang wrote:
>
> I'm a little confused about what you mean by "newer firmware." Do you
> mean >= 017a5aea or > 017a5aea?
> ...
> Because in your last email you mentioned you discovered the problem
> starting with b379b067.
I have never tried 017a5aea, only b379b067.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-06-01 8:00 ` Chen Wang
2026-06-01 8:12 ` Michael Orlitzky
@ 2026-06-02 12:19 ` Han Gao
2026-06-02 19:20 ` Michael Orlitzky
2026-06-03 9:14 ` Niklas Cassel
1 sibling, 2 replies; 11+ messages in thread
From: Han Gao @ 2026-06-02 12:19 UTC (permalink / raw)
To: Chen Wang; +Cc: Michael Orlitzky, sophgo, Inochi Amaoto, Han Gao, zhengjingkun
Hi, Michael
Based on the new firmware with DMA coherence, we tested the following cases.
Test method:
mkfs.btrfs /dev/sda1
mount /dev/sda1 /mnt
f3write -e 128 /mnt
sync
f3read /mnt
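For anyone without f3 installed, the write-then-verify part of this method can be approximated with coreutils. This is a rough sketch, not a replacement for f3: the function name write_verify, the chunk file names, and the default of four 64 MB chunks are all assumptions made for illustration. The mkfs and mount steps stay as above.

```shell
# Hypothetical write-then-verify check, loosely modelled on f3write/f3read.
# Usage: write_verify DIR [COUNT] [SIZE_MB]
write_verify() {
    dir=$1; count=${2:-4}; size_mb=${3:-64}
    # Keep the expected checksums outside the filesystem under test
    sums=$(mktemp) || return 2

    i=1
    while [ "$i" -le "$count" ]; do
        # tee lets us hash the data as generated, before it reaches the drive
        head -c "$((size_mb * 1024 * 1024))" /dev/urandom \
            | tee "$dir/chunk.$i" | sha256sum | cut -d' ' -f1 >> "$sums"
        i=$((i + 1))
    done
    sync

    # Drop the page cache (needs root) so the reads really hit the drive;
    # without this step the comparison may only exercise RAM.
    if [ -w /proc/sys/vm/drop_caches ]; then
        { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
    fi

    i=1; bad=0
    while [ "$i" -le "$count" ]; do
        want=$(sed -n "${i}p" "$sums")
        got=$(sha256sum < "$dir/chunk.$i" | cut -d' ' -f1)
        [ "$want" = "$got" ] || { echo "MISMATCH chunk.$i"; bad=1; }
        i=$((i + 1))
    done
    rm -f "$sums"
    return "$bad"
}
```

Something like "write_verify /mnt 16 1024" would write and re-read 16 GB. The function returns non-zero if any chunk reads back differently from what was written.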
case1:
2042pcie - asm2824 - jmb585: failed, csum failed.
case2:
2042pcie - amd b650 bridge(prom 21) - asm1062: pass.
case3:
2042pcie - asm2824 - amd b650 bridge(prom 21) - asm1062: pass
case4:
2042pcie - asm2824 - jmb585 + kernel parameter libata.force=noncq: pass
Based on the test results of the above four cases,
the problem is suspected to lie in the JMB585 chip itself.
The test was conducted by Jingkun Zheng.
Thanks,
Han
On Mon, Jun 1, 2026 at 4:00 PM Chen Wang <unicorn_wang@outlook.com> wrote:
>
> Hi, Mike,
>
> On 6/1/2026 2:47 PM, Michael Orlitzky wrote:
> > On 2026-06-01 14:06:29, Chen Wang wrote:
> >>> * SRA1-20.fd (20251209)
> >>> * fip.bin (20251209, from the bootloader-riscv repo)
> >>> * fw_dynamic.bin (20251106)
> >>> * zsbl.bin (20251215)
> >>> * mango-milkv-pioneer.dtb (20251215)
> >> Do these firmware components correspond to https://github.com/sophgo/edk2-non-osi/commit/017a5aea26a066fd2bf501b7893937183165af36?
> > The dates listed above are before any dma-coherent patches. This set
> > of firmware files is OK, no filesystem corruption.
> >
> >
> >> Just want to double-confirm if we use
> >> https://github.com/sophgo/edk2-non-osi/commit/017a5aea26a066fd2bf501b7893937183165af36
> >> plus the dts dma-coherent patch, how the SATA works?
> > The older firmware, without the patch, works. It is the newer firmware
> > (plus the dma-coherent patch) that appears to cause the problem.
>
> I'm a little confused about what you mean by "newer firmware." Do you
> mean >= 017a5aea or > 017a5aea?
>
> Because in your last email you mentioned you discovered the problem
> starting with b379b067.
>
> The firmware's tree log is as follows:
>
> https://github.com/sophgo/edk2-non-osi/commits/devel-sg2042/
>
> latest commit
>
> .
> b379b067
> 017a5aea <---- dma-coherent patch was submitted together with this commit.
> .... older commit
>
>
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-06-02 12:19 ` Han Gao
@ 2026-06-02 19:20 ` Michael Orlitzky
2026-06-03 9:14 ` Niklas Cassel
1 sibling, 0 replies; 11+ messages in thread
From: Michael Orlitzky @ 2026-06-02 19:20 UTC (permalink / raw)
To: Han Gao; +Cc: Chen Wang, sophgo, Inochi Amaoto, Han Gao, zhengjingkun
On 2026-06-02 20:19:32, Han Gao wrote:
>
> case4:
> 2042pcie - asm2824 - jmb585 + kernel parameter libata.force=noncq: pass
>
> Based on the test results of the above four cases,
> the problem is suspected to lie in the JMB585 chip itself.
After a few hours of testing, I can confirm that adding
libata.force=noncq fixes the issue. Thank you!
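To confirm the workaround is really active on a given boot, the kernel command line can be checked for a noncq entry. A small sketch; the function name cmdline_disables_ncq is hypothetical, and the parsing assumes the documented libata.force syntax of a comma-separated list whose entries may carry a "PORT:" prefix:

```shell
# Hypothetical helper: succeed if a kernel command line disables NCQ
# through libata.force.
# Usage: cmdline_disables_ncq "$(cat /proc/cmdline)"
cmdline_disables_ncq() {
    for arg in $1; do
        case $arg in
            libata.force=*)
                # Split the value on commas and look for a "noncq" entry,
                # ignoring any "PORT:" or "PORT.DEVICE:" prefix.
                for item in $(printf '%s' "${arg#libata.force=}" | tr ',' ' '); do
                    case ${item##*:} in
                        noncq) return 0 ;;
                    esac
                done
                ;;
        esac
    done
    return 1
}
```

As a second check, /sys/block/sda/device/queue_depth is expected to read 1 while NCQ is off, though that is an assumption worth verifying on the hardware in question.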
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: SG2042 SATA (DMA?) issues
2026-06-02 12:19 ` Han Gao
2026-06-02 19:20 ` Michael Orlitzky
@ 2026-06-03 9:14 ` Niklas Cassel
1 sibling, 0 replies; 11+ messages in thread
From: Niklas Cassel @ 2026-06-03 9:14 UTC (permalink / raw)
To: Han Gao
Cc: Chen Wang, Michael Orlitzky, sophgo, Inochi Amaoto, Han Gao,
zhengjingkun, linux-ide, dlemoal
Hello Han,
On Tue, Jun 02, 2026 at 08:19:32PM +0800, Han Gao wrote:
> Hi, Michael
>
> Based on the new firmware with DMA coherence, we tested the following cases.
>
> Test method:
> mkfs.btrfs /dev/sda1
> mount /dev/sda1 /mnt
> f3write -e 128 /mnt
> sync
> f3read /mnt
>
> case1:
> 2042pcie - asm2824 - jmb585: failed, csum failed.
> case2:
> 2042pcie - amd b650 bridge(prom 21) - asm1062: pass.
> case3:
> 2042pcie - asm2824 - amd b650 bridge(prom 21) - asm1062: pass
> case4:
> 2042pcie - asm2824 - jmb585 + kernel parameter libata.force=noncq: pass
>
> Based on the test results of the above four cases,
> the problem is suspected to lie in the JMB585 chip itself.
+linux-ide
Original thread:
https://lore.kernel.org/sophgo/ahu57vcS0oOFmCI9@mertle/
I interpret this as, before you added 'dma-coherent' to your PCIe controller
device tree node:
Sophgo SG2042 PCIe + ASM2824 + JMB585
worked fine, without any libata.force=noncq kernel parameter which disables NCQ.
After adding 'dma-coherent' to your PCIe controller device tree node:
Sophgo SG2042 PCIe + ASM2824 + JMB585
no longer works fine, and you need to disable NCQ to not get filesystem
corruption.
If this was a problem with the JMB585 chip, why did it work fine to run with NCQ
enabled before you did firmware changes + added 'dma-coherent' to your PCIe
controller device tree node?
Disabling NCQ will significantly reduce the drive performance.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 11+ messages in thread