* QAIC reset failure
@ 2024-01-16 16:58 Baruch Siach
2024-01-22 22:57 ` Jeffrey Hugo
From: Baruch Siach @ 2024-01-16 16:58 UTC
To: Jeffrey Hugo, Carl Vanderlip, Pranjal Ramajor Asha Kanojiya
Cc: linux-arm-msm, dri-devel, Ramon Fried, Orr Mazor
Hi qaic driver maintainers,
I am testing an A100 device on arm64 platform. Kernel version is current
Linus master as of commit 052d534373b7. The driver is unable to reset
the device properly.
[ 137.706765] pci 0000:01:00.0: enabling device (0000 -> 0002)
[ 137.712528] pci 0000:02:00.0: enabling device (0000 -> 0002)
[ 137.718230] qaic 0000:03:00.0: enabling device (0000 -> 0002)
[ 137.725720] [drm] Initialized qaic 0.0.0 20190618 for 0000:03:00.0 on minor 0
[ 137.734326] mhi mhi0: Requested to power ON
[ 137.738520] mhi mhi0: Power on setup success
[ 137.855108] mhi mhi0: Wait for device to enter SBL or Mission mode
[ 137.861578] qaic_timesync mhi0_QAIC_TIMESYNC: 20: Failed to receive START channel command completion
[ 137.870733] qaic_timesync mhi0_QAIC_TIMESYNC: 21: Failed to reset channel, still resetting
[ 137.879063] qaic_timesync mhi0_QAIC_TIMESYNC: 20: Failed to reset channel, still resetting
[ 137.887334] qaic_timesync: probe of mhi0_QAIC_TIMESYNC failed with error -5
[ 137.894866] qaic_timesync mhi0_QAIC_TIMESYNC: 20: Failed to receive START channel command completion
[ 137.904006] qaic_timesync mhi0_QAIC_TIMESYNC: 21: Failed to reset channel, still resetting
[ 137.912263] qaic_timesync mhi0_QAIC_TIMESYNC: 20: Failed to reset channel, still resetting
[ 137.920517] qaic_timesync: probe of mhi0_QAIC_TIMESYNC failed with error -5
[ 140.807091] mhi mhi0: Device failed to enter MHI Ready
[ 143.695094] mhi mhi0: Device failed to enter MHI Ready
This is with firmware from SDK version 1.12.2.0. I also tried version
1.10.0.193 with similar results.
Some more state information from MHI debugfs below.
/sys/kernel/debug/mhi/mhi0/regdump:
Host PM state: SYS ERROR Process Device state: RESET EE: DISABLE
Device EE: PRIMARY BOOTLOADER state: SYS ERROR
MHI_REGLEN: 0x100
MHI_VER: 0x1000000
MHI_CFG: 0x8000000
MHI_CTRL: 0x0
MHI_STATUS: 0xff04
MHI_WAKE_DB: 0x1
BHI_EXECENV: 0x0
BHI_STATUS: 0xa93f0935
BHI_ERRCODE: 0x0
BHI_ERRDBG1: 0xc0300000
BHI_ERRDBG2: 0xb
BHI_ERRDBG3: 0xcabb0
/sys/kernel/debug/mhi/mhi0/states:
PM state: SYS ERROR Process Device: Inactive MHI state: RESET EE: DISABLE wake: true
M0: 2 M2: 0 M3: 0 device wake: 0 pending packets: 0
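Since two firmware versions were tried, it can help to compare the register dumps from each run side by side. The following is a minimal sketch, assuming the regdump text has been saved from debugfs; the helper names `parse_regdump` and `diff_regs` are hypothetical and not part of any kernel tooling. It keeps only the `NAME: 0xVALUE` lines and ignores the free-form state lines.

```python
import re

# Matches lines such as "MHI_STATUS: 0xff04", optionally prefixed by an
# email quote marker ("> "). Free-form state lines are skipped.
REG_LINE = re.compile(r"^\s*>?\s*([A-Z0-9_]+):\s*(0x[0-9a-fA-F]+)\s*$")


def parse_regdump(text):
    """Return a dict mapping register name to integer value."""
    regs = {}
    for line in text.splitlines():
        match = REG_LINE.match(line)
        if match:
            regs[match.group(1)] = int(match.group(2), 16)
    return regs


def diff_regs(first, second):
    """Return {name: (first_value, second_value)} for registers that differ."""
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in names
        if first.get(name) != second.get(name)
    }


# Register dump copied from this report (SDK 1.12.2.0 firmware).
SAMPLE = """\
Host PM state: SYS ERROR Process Device state: RESET EE: DISABLE
Device EE: PRIMARY BOOTLOADER state: SYS ERROR
MHI_REGLEN: 0x100
MHI_VER: 0x1000000
MHI_CFG: 0x8000000
MHI_CTRL: 0x0
MHI_STATUS: 0xff04
MHI_WAKE_DB: 0x1
BHI_EXECENV: 0x0
BHI_STATUS: 0xa93f0935
BHI_ERRCODE: 0x0
BHI_ERRDBG1: 0xc0300000
BHI_ERRDBG2: 0xb
BHI_ERRDBG3: 0xcabb0
"""

if __name__ == "__main__":
    for name, value in parse_regdump(SAMPLE).items():
        print(f"{name:12s} 0x{value:08x}")
```

Calling `diff_regs(parse_regdump(dump_a), parse_regdump(dump_b))` on dumps captured with each firmware version would list only the registers whose values changed between the two runs.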
Any idea?
Thanks,
baruch
--
~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -
* Re: QAIC reset failure
From: Jeffrey Hugo @ 2024-01-22 22:57 UTC
To: Baruch Siach, Carl Vanderlip, Pranjal Ramajor Asha Kanojiya
Cc: linux-arm-msm, dri-devel, Ramon Fried, Orr Mazor
On 1/16/2024 9:58 AM, Baruch Siach wrote:
> Hi qaic driver maintainers,
Sorry, I was on holiday last week and am just now catching up on email and
seeing this.
> I am testing an A100 device on arm64 platform. Kernel version is current
> Linus master as of commit 052d534373b7. The driver is unable to reset
> the device properly.
>
> [ 137.706765] pci 0000:01:00.0: enabling device (0000 -> 0002)
> [ 137.712528] pci 0000:02:00.0: enabling device (0000 -> 0002)
> [ 137.718230] qaic 0000:03:00.0: enabling device (0000 -> 0002)
> [ 137.725720] [drm] Initialized qaic 0.0.0 20190618 for 0000:03:00.0 on minor 0
> [ 137.734326] mhi mhi0: Requested to power ON
> [ 137.738520] mhi mhi0: Power on setup success
> [ 137.855108] mhi mhi0: Wait for device to enter SBL or Mission mode
This all looks good.
> [ 137.861578] qaic_timesync mhi0_QAIC_TIMESYNC: 20: Failed to receive START channel command completion
> [ 137.870733] qaic_timesync mhi0_QAIC_TIMESYNC: 21: Failed to reset channel, still resetting
> [ 137.879063] qaic_timesync mhi0_QAIC_TIMESYNC: 20: Failed to reset channel, still resetting
> [ 137.887334] qaic_timesync: probe of mhi0_QAIC_TIMESYNC failed with error -5
> [ 137.894866] qaic_timesync mhi0_QAIC_TIMESYNC: 20: Failed to receive START channel command completion
> [ 137.904006] qaic_timesync mhi0_QAIC_TIMESYNC: 21: Failed to reset channel, still resetting
> [ 137.912263] qaic_timesync mhi0_QAIC_TIMESYNC: 20: Failed to reset channel, still resetting
> [ 137.920517] qaic_timesync: probe of mhi0_QAIC_TIMESYNC failed with error -5
> [ 140.807091] mhi mhi0: Device failed to enter MHI Ready
> [ 143.695094] mhi mhi0: Device failed to enter MHI Ready
This looks like the device stopped responding to the host, early in
boot. Trying to access channels while the device is not in MHI Ready
state is odd.
> This is with firmware from SDK version 1.12.2.0. I also tried version
> 1.10.0.193 with similar results.
>
> Some more state information from MHI debugfs below.
>
> /sys/kernel/debug/mhi/mhi0/regdump:
> Host PM state: SYS ERROR Process Device state: RESET EE: DISABLE
> Device EE: PRIMARY BOOTLOADER state: SYS ERROR
> MHI_REGLEN: 0x100
> MHI_VER: 0x1000000
> MHI_CFG: 0x8000000
> MHI_CTRL: 0x0
> MHI_STATUS: 0xff04
> MHI_WAKE_DB: 0x1
> BHI_EXECENV: 0x0
> BHI_STATUS: 0xa93f0935
> BHI_ERRCODE: 0x0
> BHI_ERRDBG1: 0xc0300000
> BHI_ERRDBG2: 0xb
> BHI_ERRDBG3: 0xcabb0
This suggests that the device crashed, which is unexpected.
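For readers following along, the MHI_STATUS value in the dump can be decoded by hand. The sketch below assumes the bit layout used by the kernel's MHI headers as best recalled here (READY in bit 0, SYSERR in bit 2, MHI state in bits 15:8, with 0xFF meaning SYS_ERR); these masks and state codes are an assumption to be checked against the tree in use, and `decode_mhi_status` is a hypothetical helper.

```python
# Assumed MHISTATUS field layout; verify against the kernel's MHI headers.
MHISTATUS_READY_MASK = 1 << 0
MHISTATUS_SYSERR_MASK = 1 << 2
MHISTATUS_MHISTATE_MASK = 0xFF00
MHISTATUS_MHISTATE_SHIFT = 8

# Assumed MHI state encodings.
MHI_STATES = {
    0x00: "RESET",
    0x01: "READY",
    0x02: "M0",
    0x03: "M1",
    0x04: "M2",
    0x05: "M3",
    0x06: "M3_FAST",
    0x07: "BHI",
    0xFF: "SYS_ERR",
}


def decode_mhi_status(value):
    """Split an MHI_STATUS register value into its assumed fields."""
    state = (value & MHISTATUS_MHISTATE_MASK) >> MHISTATUS_MHISTATE_SHIFT
    return {
        "ready": bool(value & MHISTATUS_READY_MASK),
        "syserr": bool(value & MHISTATUS_SYSERR_MASK),
        "state": MHI_STATES.get(state, f"UNKNOWN(0x{state:x})"),
    }


if __name__ == "__main__":
    # MHI_STATUS value taken from the regdump in this report.
    print(decode_mhi_status(0xFF04))
```

Under these assumptions, 0xff04 decodes to the SYSERR bit set, READY clear, and the state field reading SYS_ERR, which agrees with the "Device EE: PRIMARY BOOTLOADER state: SYS ERROR" line of the dump.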
> /sys/kernel/debug/mhi/mhi0/states:
> PM state: SYS ERROR Process Device: Inactive MHI state: RESET EE: DISABLE wake: true
> M0: 2 M2: 0 M3: 0 device wake: 0 pending packets: 0
>
> Any idea?
We may need our firmware engineers involved. I think there is already a
thread with some of the POCs involved.
-Jeff