* [bug report] IOMMU reports data translation fault for fio testing
@ 2022-05-13 12:01 John Garry
2022-05-14 2:28 ` Bart Van Assche
0 siblings, 1 reply; 4+ messages in thread
From: John Garry @ 2022-05-13 12:01 UTC (permalink / raw)
To: linux-block, linux-scsi
Cc: chenxiang66@hisilicon.com >> Xiang Chen, liyihang (E)
Hi guys,
My colleague Yihang Li noticed this issue when testing throughput for
hisi SAS arm64 controller on v5.18-rc6:
estuary:/$ bash ./create_fio_task.sh 4k read 128 1
test2 4k
my_runtime 1500
Creat 4k_read_depth128_fiotest file successfully
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
job1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
fio-2.2.10
Starting 10 processes
[ 304.798377] arm-smmu-v3 arm-smmu-v3.3.auto: event 0x10 received:
[ 304.804431] arm-smmu-v3 arm-smmu-v3.3.auto: 0x0000741000000010
[ 304.810404] arm-smmu-v3 arm-smmu-v3.3.auto: 0x000012000000007a
[ 304.816379] arm-smmu-v3 arm-smmu-v3.3.auto: 0x00000000abff6000
[ 304.822354] arm-smmu-v3 arm-smmu-v3.3.auto: 0x00000000abff6000
[ 304.828330] arm-smmu-v3 arm-smmu-v3.3.auto: event 0x10 received:
[ 304.834392] arm-smmu-v3 arm-smmu-v3.3.auto: 0x0000741000000010
[ 304.840368] arm-smmu-v3 arm-smmu-v3.3.auto: 0x0000120000000058
[ 304.846344] arm-smmu-v3 arm-smmu-v3.3.auto: 0x00000000abff6100
[ 304.852320] arm-smmu-v3 arm-smmu-v3.3.auto: 0x00000000abff6000
[ 304.858297] arm-smmu-v3 arm-smmu-v3.3.auto: event 0x10 received:
[ 304.864361] arm-smmu-v3 arm-smmu-v3.3.auto: 0x0000741000000010
[ 304.870337] arm-smmu-v3 arm-smmu-v3.3.auto: 0x000012000000004a
[ 304.876313] arm-smmu-v3 arm-smmu-v3.3.auto: 0x00000000abff62c0
[ 304.882289] arm-smmu-v3 arm-smmu-v3.3.auto: 0x00000000abff6000
Event 0x10 is a translation fault (F_TRANSLATION), meaning the device
accessed an I/O address with no valid mapping, so the DMA mapping is
probably missing or stale.
I don't think it's an IOMMU issue as I tested that separately with a DMA
mapping benchmark driver.
I'm told v5.17-rc7 does not have the issue. Any idea on the possible
cause or if there is a fix in waiting? It could be an issue with the
SCSI hba driver.
I'll bisect in the meantime.
thanks,
John
* Re: [bug report] IOMMU reports data translation fault for fio testing
2022-05-13 12:01 [bug report] IOMMU reports data translation fault for fio testing John Garry
@ 2022-05-14 2:28 ` Bart Van Assche
2022-05-14 9:49 ` John Garry
0 siblings, 1 reply; 4+ messages in thread
From: Bart Van Assche @ 2022-05-14 2:28 UTC (permalink / raw)
To: John Garry, linux-block, linux-scsi
Cc: chenxiang66@hisilicon.com >> Xiang Chen, liyihang (E)
On 5/13/22 05:01, John Garry wrote:
> It could be an issue with the SCSI hba driver.
That seems likely to me.
Thanks,
Bart.
* Re: [bug report] IOMMU reports data translation fault for fio testing
2022-05-14 2:28 ` Bart Van Assche
@ 2022-05-14 9:49 ` John Garry
2022-05-16 10:51 ` John Garry
0 siblings, 1 reply; 4+ messages in thread
From: John Garry @ 2022-05-14 9:49 UTC (permalink / raw)
To: Bart Van Assche, linux-block, linux-scsi
Cc: chenxiang66@hisilicon.com >> Xiang Chen, liyihang (E)
On 14/05/2022 03:28, Bart Van Assche wrote:
> On 5/13/22 05:01, John Garry wrote:
>> It could be an issue with the SCSI hba driver.
>
> That seems likely to me.
Sure, that would be common wisdom. However, the commit just before any
driver-related changes were added for 5.18 is also bad. It could be
pre-existing, but that starts to seem unlikely. Or it could still be an
IOMMU issue - we already have a performance issue there.
This issue can take more than 15 minutes to occur, so is pretty painful
to bisect...
Thanks,
John
* Re: [bug report] IOMMU reports data translation fault for fio testing
2022-05-14 9:49 ` John Garry
@ 2022-05-16 10:51 ` John Garry
0 siblings, 0 replies; 4+ messages in thread
From: John Garry @ 2022-05-16 10:51 UTC (permalink / raw)
To: Bart Van Assche, linux-block, linux-scsi
Cc: chenxiang66@hisilicon.com >> Xiang Chen, liyihang (E)
On 14/05/2022 10:49, John Garry wrote:
>>> It could be an issue with the SCSI hba driver.
>>
>> That seems likely to me.
>
Actually it is an LLDD (low-level device driver) problem. Sometimes it
takes 45 minutes to trigger, though, so it is not nice to bisect.
This looks to be the problematic patch:
author John Garry <john.garry@huawei.com> 2022-02-10 18:43:24 +0800
committer Martin K. Petersen <martin.petersen@oracle.com> 2022-02-11 17:02:50 -0500
commit 26fc0ea74fcb9b76b41f5e9b89728cd1c01559cd (patch)
scsi: libsas: Drop SAS_TASK_AT_INITIATOR
If interested, this looks like the issue:
void hisi_sas_task_deliver(struct hisi_hba *hisi_hba,
 		break;
 	}
-	spin_lock_irqsave(&task->task_state_lock, flags);
-	task->task_state_flags |= SAS_TASK_AT_INITIATOR;
-	spin_unlock_irqrestore(&task->task_state_lock, flags);
-
 	WRITE_ONCE(slot->ready, 1);
Losing the spinlock also loses its barrier semantics, so this is a memory
ordering issue: nothing now orders the earlier writes to the slot before
slot->ready is set.
> Sure, that would be common wisdom. However, the commit just before any
> driver-related changes were added for 5.18 is also bad. It could be
> pre-existing, but that starts to seem unlikely. Or it could still be an
> IOMMU issue - we already have a performance issue there.
>
> This issue can take more than 15 minutes to occur, so is pretty painful
> to bisect...