* [Bug 221297] New: AMDGPU SMU driver interface version mismatch on R9700 - fan control broken under load
From: bugzilla-daemon @ 2026-03-29 16:18 UTC
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=221297

            Bug ID: 221297
           Summary: AMDGPU SMU driver interface version mismatch on R9700
                    - fan control broken under load
           Product: Drivers
           Version: 2.5
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: regbob.home@gmail.com
        Regression: No

Hardware: ASUS Turbo Radeon AI Pro R9700 32GB
vBIOS: 115-G287BP00-100
OS: Ubuntu 24.04
Kernel: 6.17.0-19
ROCm: 7.2.1
AMDGPU driver: 6.16.13

BUG:
The GPU fan does not spin up automatically under thermal load. During AI
training the GPU reached 109°C and thermally throttled with the fan physically
stationary throughout.

ROOT CAUSE - SMU interface version mismatch (from dmesg):

amdgpu 0000:2d:00.0: amdgpu: smu driver if version = 0x0000002e (46)
amdgpu 0000:2d:00.0: amdgpu: smu fw if version = 0x00000032 (50)
amdgpu 0000:2d:00.0: amdgpu: smu fw version = 0x00684b00 (104.75.0)
amdgpu 0000:2d:00.0: amdgpu: SMU driver if version not matched
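
For anyone comparing their own logs, here is a minimal Python sketch that pulls the three version fields out of dmesg lines like the ones above and computes the interface gap. The helper names are hypothetical, and the byte layout used in decode_fw_version is an assumption inferred from the fact that it reproduces the "104.75.0" printed in the log; it is not taken from the driver source.

```python
import re

# Sample lines copied from the dmesg excerpt in this report.
DMESG = """\
amdgpu 0000:2d:00.0: amdgpu: smu driver if version = 0x0000002e (46)
amdgpu 0000:2d:00.0: amdgpu: smu fw if version = 0x00000032 (50)
amdgpu 0000:2d:00.0: amdgpu: smu fw version = 0x00684b00 (104.75.0)
amdgpu 0000:2d:00.0: amdgpu: SMU driver if version not matched
"""

# Matches "smu <label> = 0x<hex>" and captures the label and the hex digits.
_LINE = re.compile(
    r"smu (driver if version|fw if version|fw version) = 0x([0-9a-fA-F]+)"
)


def parse_smu_versions(log: str) -> dict[str, int]:
    """Map each version label found in the log to its integer value."""
    return {m.group(1): int(m.group(2), 16) for m in _LINE.finditer(log)}


def decode_fw_version(raw: int) -> str:
    """Assumed layout: bytes 2, 1, 0 hold the three dotted components."""
    return f"{(raw >> 16) & 0xFF}.{(raw >> 8) & 0xFF}.{raw & 0xFF}"


versions = parse_smu_versions(DMESG)
gap = versions["fw if version"] - versions["driver if version"]
print(f"interface gap: {gap}")
print(f"firmware: {decode_fw_version(versions['fw version'])}")
```

On the sample above this reports a gap of 4 (50 minus 46) and a firmware version of 104.75.0.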

The card firmware is 4 interface versions ahead of the version the AMDGPU
driver expects. As a result, the fan control interfaces are unusable:
- rocm-smi --setfan returns 'Not supported on this system'
- sysfs pwm1 node is read-only (-r--r--r--)
- reading fan1_enable fails with 'Invalid argument'
- GPU enters runtime power suspend under load, further suppressing fan response
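
The sysfs symptoms in the list above can be collected in one pass with a sketch like the following. It assumes the usual amdgpu hwmon layout under /sys/class/drm/card*/device/hwmon/hwmon*; the function name and report format are hypothetical, and the card index and hwmon number will differ between systems.

```python
import stat
from pathlib import Path

# Fan-related hwmon attributes mentioned in this report.
FAN_NODES = ("pwm1", "pwm1_enable", "fan1_enable")


def describe_fan_nodes(hwmon_dir: Path) -> dict[str, str]:
    """Return '<mode> <value>' for each fan node, or a short failure note."""
    report = {}
    for name in FAN_NODES:
        node = hwmon_dir / name
        if not node.exists():
            report[name] = "missing"
            continue
        mode = stat.filemode(node.stat().st_mode)  # e.g. "-r--r--r--"
        try:
            value = node.read_text().strip()
        except OSError as exc:
            # A read that fails with EINVAL shows up as "Invalid argument".
            value = f"read failed: {exc.strerror}"
        report[name] = f"{mode} {value}"
    return report


if __name__ == "__main__":
    pattern = "card*/device/hwmon/hwmon*"
    for hwmon in sorted(Path("/sys/class/drm").glob(pattern)):
        print(hwmon)
        for name, state in describe_fan_nodes(hwmon).items():
            print(f"  {name}: {state}")
```

A pwm1 entry whose mode has no write bit, or a fan1_enable entry showing a failed read, would correspond to the second and third symptoms listed above.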

Confirmed on AMDGPU 6.16.6 and 6.16.13 under ROCm 7.2.1.

FIX REQUIRED:
The amdgpu kernel driver needs to be updated to support SMU interface version
50 (0x00000032) as shipped on the R9700 (gfx1201, RDNA 4).

Related ROCm issue: https://github.com/ROCm/ROCm/issues/5908

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

* [Bug 221297] AMDGPU SMU driver interface version mismatch on R9700 - fan control broken under load
From: bugzilla-daemon @ 2026-03-30  8:51 UTC
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=221297

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |ANSWERED

--- Comment #1 from Artem S. Tashkinov (aros@gmx.com) ---
Report here after installing the latest firmware and testing kernels 6.18.20 or
6.19.10:

https://gitlab.freedesktop.org/drm/amd/-/issues

* [Bug 221297] AMDGPU SMU driver interface version mismatch on R9700 - fan control broken under load
From: bugzilla-daemon @ 2026-03-30 13:01 UTC
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=221297

--- Comment #2 from lobsterman (regbob.home@gmail.com) ---
Reporting results of kernel testing as requested.

Hardware: ASUS Turbo Radeon AI Pro R9700 32GB
vBIOS: 115-G287BP00-100
OS: Ubuntu 24.04
ROCm: 7.2.1
AMDGPU driver: 6.16.13
Current kernel: 6.17.0-19

SMU MISMATCH (persists across all tested configurations):

amdgpu 0000:2d:00.0: amdgpu: smu driver if version = 0x0000002e (46)
amdgpu 0000:2d:00.0: amdgpu: smu fw if version = 0x00000032 (50)
amdgpu 0000:2d:00.0: amdgpu: smu fw version = 0x00684b00 (104.75.0)
amdgpu 0000:2d:00.0: amdgpu: SMU driver if version not matched

KERNEL TESTING RESULTS:

Kernel 6.17.0-19:
- SMU mismatch confirmed present
- Fan does not spin under any load
- GPU reached 109°C and thermally throttled during AI training with the fan
  physically stationary

Kernel 6.18.20 mainline:
- AMDGPU 6.16.13 DKMS fails to build against this kernel
- NVIDIA 535 DKMS fails to build against this kernel  
- Kernel panics on boot: VFS: Unable to mount root fs on unknown-block(0,0)
- NVMe storage drivers absent from mainline build
- Testing blocked — system unbootable

Kernel 6.19.10 mainline:
- Same DKMS build failures as 6.18.20
- Kernel panic on boot
- Testing blocked — system unbootable

CONCLUSION:
Mainline kernel testing on 6.18.20 and 6.19.10 is not feasible on this hardware
due to VFS/NVMe boot failures in the minimal mainline builds. The SMU interface
version mismatch (driver if version 46 vs firmware if version 50) persists on
kernel 6.17 with AMDGPU 6.16.13 under ROCm 7.2.1.

A fix in the AMDGPU driver to support SMU interface version 50 is required.
Mainline kernel testing cannot be used to validate this fix on this system.
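
A "VFS: Unable to mount root fs on unknown-block(0,0)" panic generally means the kernel found no usable root device, which is consistent with a missing NVMe driver or a missing initramfs. As a rough way to check the first possibility, the sketch below reads a kernel config text and reports whether a symbol is built in (y), modular (m), or absent (n). The SAMPLE text is made up for illustration and is not the config of the kernels tested here; on Ubuntu the real file is normally /boot/config-<release>, which is also an assumption about this system.

```python
import re


def config_state(config_text: str, symbol: str) -> str:
    """Return 'y', 'm', or 'n' for a Kconfig symbol in a kernel config text."""
    pattern = rf"^{re.escape(symbol)}=([ym])$"
    match = re.search(pattern, config_text, re.MULTILINE)
    return match.group(1) if match else "n"


# Hypothetical config fragment, for illustration only.
SAMPLE = """\
CONFIG_BLK_DEV_NVME=m
# CONFIG_NVME_MULTIPATH is not set
CONFIG_EXT4_FS=y
"""

for sym in ("CONFIG_BLK_DEV_NVME", "CONFIG_NVME_MULTIPATH", "CONFIG_EXT4_FS"):
    print(sym, config_state(SAMPLE, sym))
```

If CONFIG_BLK_DEV_NVME comes back as m, the driver has to be present in the initramfs for an NVMe root filesystem to mount; if it comes back as n, the kernel cannot reach the disk at all.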

* [Bug 221297] AMDGPU SMU driver interface version mismatch on R9700 - fan control broken under load
From: bugzilla-daemon @ 2026-04-01  8:32 UTC
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=221297

--- Comment #3 from Artem S. Tashkinov (aros@gmx.com) ---
This bugzilla is generally NOT monitored by AMD employees.
