* Bug: UVD initialization / clock gating issue on kabini
From: Nils Holland @ 2017-01-22 19:26 UTC
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Hi folks,
while playing around with the amdgpu drm driver, I stumbled upon an
issue. In fact, I have tracked it down to commit
aa4747c00a2dd034c5fdf70ca73b1674ca15beb3 ("drm/amdgpu: refine uvd_4.2
clock gate sequence.").
When I run the latest mainline git kernel, which contains this commit,
on my system, I get the following in dmesg (notice the lines about UVD
not responding at the end):
[ 2.276715] Linux agpgart interface v0.103
[ 2.277418] [drm] Initialized
[ 2.277612] [drm] amdgpu kernel modesetting enabled.
[ 2.278311] [drm] initializing kernel modesetting (KABINI 0x1002:0x9834 0x103C:0x21F7 0x00).
[ 2.278554] [drm] register mmio base: 0xF0C00000
[ 2.278683] [drm] register mmio size: 262144
[ 2.278819] [drm] doorbell mmio base: 0xF0000000
[ 2.278945] [drm] doorbell mmio size: 8388608
[ 2.282482] ATOM BIOS: AMD
[ 2.282640] [drm] GPU post is not needed
[ 2.282767] [drm] Changing default dispclk from 300Mhz to 600Mhz
[ 2.283369] amdgpu 0000:00:01.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
[ 2.283604] amdgpu 0000:00:01.0: GTT: 1024M 0x0000000020000000 - 0x000000005FFFFFFF
[ 2.283823] [drm] Detected VRAM RAM=512M, BAR=256M
[ 2.283951] [drm] RAM width 128bits UNKNOWN
[ 2.284199] [TTM] Zone kernel: Available graphics memory: 425732 kiB
[ 2.284329] [TTM] Zone highmem: Available graphics memory: 1788424 kiB
[ 2.284458] [TTM] Initializing pool allocator
[ 2.284617] [TTM] Initializing DMA pool allocator
[ 2.284796] [drm] amdgpu: 512M of VRAM memory ready
[ 2.284924] [drm] amdgpu: 1024M of GTT memory ready.
[ 2.285069] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 2.346421] [drm] PCIE GART of 1024M enabled (table at 0x0000000000040000).
[ 2.346608] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 2.346717] [drm] Driver supports precise vblank timestamp query.
[ 2.346894] amdgpu 0000:00:01.0: amdgpu: using MSI.
[ 2.347044] [drm] amdgpu: irq initialized.
[ 2.347160] [drm] Internal thermal controller without fan control
[ 2.347269] [drm] amdgpu: dpm initialized
[ 2.401730] [drm] amdgpu atom DIG backlight initialized
[ 2.401854] [drm] AMDGPU Display Connectors
[ 2.401959] [drm] Connector 0:
[ 2.402062] [drm] LVDS-1
[ 2.402165] [drm] HPD1
[ 2.402270] [drm] DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
[ 2.402439] [drm] Encoders:
[ 2.402562] [drm] LCD1: INTERNAL_UNIPHY
[ 2.402667] [drm] Connector 1:
[ 2.402769] [drm] HDMI-A-1
[ 2.402871] [drm] HPD2
[ 2.402974] [drm] DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
[ 2.403142] [drm] Encoders:
[ 2.403244] [drm] DFP1: INTERNAL_UNIPHY
[ 2.403348] [drm] Connector 2:
[ 2.403450] [drm] VGA-1
[ 2.403565] [drm] DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973 0x1973
[ 2.403733] [drm] Encoders:
[ 2.403836] [drm] CRT1: INTERNAL_KLDSCP_DAC1
[ 2.404484] amdgpu 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000020000010, cpu addr 0xffc01010
[ 2.404785] amdgpu 0000:00:01.0: fence driver on ring 1 use gpu addr 0x0000000020000020, cpu addr 0xffc01020
[ 2.405046] amdgpu 0000:00:01.0: fence driver on ring 2 use gpu addr 0x0000000020000030, cpu addr 0xffc01030
[ 2.405343] amdgpu 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000020000040, cpu addr 0xffc01040
[ 2.405644] amdgpu 0000:00:01.0: fence driver on ring 4 use gpu addr 0x0000000020000050, cpu addr 0xffc01050
[ 2.405904] amdgpu 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000020000060, cpu addr 0xffc01060
[ 2.406176] amdgpu 0000:00:01.0: fence driver on ring 6 use gpu addr 0x0000000020000070, cpu addr 0xffc01070
[ 2.406432] amdgpu 0000:00:01.0: fence driver on ring 7 use gpu addr 0x0000000020000080, cpu addr 0xffc01080
[ 2.406706] amdgpu 0000:00:01.0: fence driver on ring 8 use gpu addr 0x0000000020000090, cpu addr 0xffc01090
[ 2.407073] amdgpu 0000:00:01.0: fence driver on ring 9 use gpu addr 0x00000000200000a0, cpu addr 0xffc010a0
[ 2.407329] amdgpu 0000:00:01.0: fence driver on ring 10 use gpu addr 0x00000000200000b0, cpu addr 0xffc010b0
[ 2.407801] [drm] Found UVD firmware Version: 1.64 Family ID: 9
[ 2.409254] amdgpu 0000:00:01.0: fence driver on ring 11 use gpu addr 0x000000000028cd30, cpu addr 0xf8a38d30
[ 2.409646] [drm] Found VCE firmware Version: 50.10 Binary ID: 2
[ 2.409911] amdgpu 0000:00:01.0: fence driver on ring 12 use gpu addr 0x00000000200000d0, cpu addr 0xffc010d0
[ 2.410172] amdgpu 0000:00:01.0: fence driver on ring 13 use gpu addr 0x00000000200000e0, cpu addr 0xffc010e0
[ 2.414503] [drm] ring test on 0 succeeded in 16 usecs
[ 2.414875] [drm] ring test on 1 succeeded in 3 usecs
[ 2.414992] [drm] ring test on 2 succeeded in 3 usecs
[ 2.415107] [drm] ring test on 3 succeeded in 3 usecs
[ 2.415223] [drm] ring test on 4 succeeded in 3 usecs
[ 2.415338] [drm] ring test on 5 succeeded in 3 usecs
[ 2.415452] [drm] ring test on 6 succeeded in 3 usecs
[ 2.415583] [drm] ring test on 7 succeeded in 3 usecs
[ 2.415698] [drm] ring test on 8 succeeded in 3 usecs
[ 2.416076] [drm] ring test on 9 succeeded in 5 usecs
[ 2.416189] [drm] ring test on 10 succeeded in 5 usecs
[ 2.442350] [drm] ring test on 11 succeeded in 1 usecs
[ 2.442459] [drm] UVD initialized successfully.
[ 2.580653] [Firmware Bug]: battery: (dis)charge rate invalid.
[ 2.580899] ACPI: Battery Slot [BAT1] (battery present)
[ 2.673280] [drm] ring test on 12 succeeded in 13 usecs
[ 2.673398] [drm] ring test on 13 succeeded in 2 usecs
[ 2.673500] [drm] VCE initialized successfully.
[ 3.246698] tsc: Refined TSC clocksource calibration: 998.128 MHz
[ 3.246828] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1cc65d64e77, max_idle_ns: 881590512558 ns
[ 3.698778] [drm] fb mappable at 0xE0428000
[ 3.698896] [drm] vram apper at 0xE0000000
[ 3.699007] [drm] size 4325376
[ 3.699116] [drm] fb depth is 24
[ 3.699228] [drm] pitch is 5632
[ 3.699503] fbcon: amdgpudrmfb (fb0) is primary device
[ 3.826793] Console: switching to colour frame buffer device 170x48
[ 3.837201] amdgpu 0000:00:01.0: fb0: amdgpudrmfb frame buffer device
[ 3.843195] [drm] ib test on ring 0 succeeded
[ 3.843355] [drm] ib test on ring 1 succeeded
[ 3.843486] [drm] ib test on ring 2 succeeded
[ 3.843651] [drm] ib test on ring 3 succeeded
[ 3.843779] [drm] ib test on ring 4 succeeded
[ 3.843917] [drm] ib test on ring 5 succeeded
[ 3.844047] [drm] ib test on ring 6 succeeded
[ 3.844175] [drm] ib test on ring 7 succeeded
[ 3.844304] [drm] ib test on ring 8 succeeded
[ 3.844427] [drm] ib test on ring 9 succeeded
[ 3.844576] [drm] ib test on ring 10 succeeded
[ 4.870999] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 5.891351] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 6.911713] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 7.932062] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 8.952409] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 9.974050] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 10.995701] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 12.017354] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 13.039024] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 14.060692] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
[ 14.082168] [drm:uvd_v4_2_start] *ERROR* UVD not responding, giving up!!!
[ 14.084797] clocksource: Switched to clocksource tsc
[ 14.086404] kwatchdog (97) used greatest stack depth: 7260 bytes left
[ 15.086475] [drm:amdgpu_uvd_ring_test_ib] *ERROR* amdgpu: IB test timed out.
[ 15.088263] [drm:amdgpu_ib_ring_tests] *ERROR* amdgpu: failed testing IB on ring 11 (-110).
[ 15.211331] [drm] ib test on ring 12 succeeded
[ 15.213105] [drm:amdgpu_device_init] *ERROR* ib ring test failed (-110).
[ 15.223583] [drm] Initialized amdgpu 3.9.0 20150101 for 0000:00:01.0 on minor 0
After the short delay that happens while the ERROR messages are
printed, the system continues to boot just fine and in general, as
far as I am concerned, works as expected. However, I'm probably not
doing anything that would use the UVD and thus might not notice when
it hasn't been properly initialized. :-)
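The spacing of the error lines above (ten "trying to reset" messages roughly one second apart, then "giving up") is what a nested poll-and-retry loop would produce. A minimal sketch in C, assuming 100 polls of about 10 ms per attempt (inferred from the timestamps, not taken from the driver source); `start_with_retries`, `uvd_ready_fn`, and the two helper callbacks are hypothetical names used only for illustration:

```c
#include <stdbool.h>
#include <stdio.h>

#define OUTER_RETRIES 10   /* matches the ten "trying to reset" lines in the log */
#define INNER_POLLS   100  /* assumption: 100 polls of ~10 ms, about 1 s per attempt */

/* Hypothetical callback standing in for a read of the UVD status register. */
typedef bool (*uvd_ready_fn)(void *ctx);

/*
 * Poll until the block reports ready.
 * Returns 0 on success, -1 after OUTER_RETRIES failed attempts.
 * *failed_out receives the number of failed attempts.
 */
static int start_with_retries(uvd_ready_fn ready, void *ctx, int *failed_out)
{
    int failed = 0;

    for (int i = 0; i < OUTER_RETRIES; ++i) {
        bool ok = false;

        for (int j = 0; j < INNER_POLLS; ++j) {
            if (ready(ctx)) {
                ok = true;
                break;
            }
            /* a real driver would wait about 10 ms here */
        }
        if (ok) {
            *failed_out = failed;
            return 0;
        }
        fprintf(stderr, "UVD not responding, trying to reset the VCPU!!!\n");
        ++failed;
        /* a real driver would soft-reset the VCPU here before retrying */
    }

    fprintf(stderr, "UVD not responding, giving up!!!\n");
    *failed_out = failed;
    return -1;
}

/* Mock: a block that never comes up, as in the failing boot. */
static bool never_ready(void *ctx)
{
    (void)ctx;
    return false;
}

/* Mock: a block that becomes ready after *ctx polls. */
static bool ready_after(void *ctx)
{
    int *remaining = ctx;
    return --(*remaining) <= 0;
}
```

Calling `start_with_retries(never_ready, NULL, &failed)` walks through all ten attempts before giving up, which is the pattern seen in the failing log; `ready_after` models a block that only needs one reset.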
Now, when I revert just the one commit
aa4747c00a2dd034c5fdf70ca73b1674ca15beb3, or try a kernel that doesn't
contain it in the first place (like, some 4.9.x one), the UVD seems to
get initialized just fine. Here are the relevant dmesg lines from that
case:
[ 2.274639] Linux agpgart interface v0.103
[ 2.275342] [drm] Initialized
[ 2.275534] [drm] amdgpu kernel modesetting enabled.
[ 2.276201] [drm] initializing kernel modesetting (KABINI 0x1002:0x9834 0x103C:0x21F7 0x00).
[ 2.276442] [drm] register mmio base: 0xF0C00000
[ 2.276568] [drm] register mmio size: 262144
[ 2.276702] [drm] doorbell mmio base: 0xF0000000
[ 2.276825] [drm] doorbell mmio size: 8388608
[ 2.280335] ATOM BIOS: AMD
[ 2.280493] [drm] GPU post is not needed
[ 2.280620] [drm] Changing default dispclk from 300Mhz to 600Mhz
[ 2.281220] amdgpu 0000:00:01.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
[ 2.281454] amdgpu 0000:00:01.0: GTT: 1024M 0x0000000020000000 - 0x000000005FFFFFFF
[ 2.281674] [drm] Detected VRAM RAM=512M, BAR=256M
[ 2.281801] [drm] RAM width 128bits UNKNOWN
[ 2.282062] [TTM] Zone kernel: Available graphics memory: 425732 kiB
[ 2.282194] [TTM] Zone highmem: Available graphics memory: 1788424 kiB
[ 2.282322] [TTM] Initializing pool allocator
[ 2.282476] [TTM] Initializing DMA pool allocator
[ 2.282657] [drm] amdgpu: 512M of VRAM memory ready
[ 2.282785] [drm] amdgpu: 1024M of GTT memory ready.
[ 2.282930] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 2.344268] [drm] PCIE GART of 1024M enabled (table at 0x0000000000040000).
[ 2.344462] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 2.344572] [drm] Driver supports precise vblank timestamp query.
[ 2.344754] amdgpu 0000:00:01.0: amdgpu: using MSI.
[ 2.344905] [drm] amdgpu: irq initialized.
[ 2.345022] [drm] Internal thermal controller without fan control
[ 2.345131] [drm] amdgpu: dpm initialized
[ 2.398603] [drm] amdgpu atom DIG backlight initialized
[ 2.398727] [drm] AMDGPU Display Connectors
[ 2.398833] [drm] Connector 0:
[ 2.398936] [drm] LVDS-1
[ 2.399039] [drm] HPD1
[ 2.399145] [drm] DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
[ 2.399313] [drm] Encoders:
[ 2.399437] [drm] LCD1: INTERNAL_UNIPHY
[ 2.399542] [drm] Connector 1:
[ 2.399644] [drm] HDMI-A-1
[ 2.399746] [drm] HPD2
[ 2.399849] [drm] DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
[ 2.400017] [drm] Encoders:
[ 2.400120] [drm] DFP1: INTERNAL_UNIPHY
[ 2.400223] [drm] Connector 2:
[ 2.400325] [drm] VGA-1
[ 2.400440] [drm] DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973 0x1973
[ 2.400609] [drm] Encoders:
[ 2.400711] [drm] CRT1: INTERNAL_KLDSCP_DAC1
[ 2.401352] amdgpu 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000020000010, cpu addr 0xffc01010
[ 2.401651] amdgpu 0000:00:01.0: fence driver on ring 1 use gpu addr 0x0000000020000020, cpu addr 0xffc01020
[ 2.401926] amdgpu 0000:00:01.0: fence driver on ring 2 use gpu addr 0x0000000020000030, cpu addr 0xffc01030
[ 2.402216] amdgpu 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000020000040, cpu addr 0xffc01040
[ 2.402516] amdgpu 0000:00:01.0: fence driver on ring 4 use gpu addr 0x0000000020000050, cpu addr 0xffc01050
[ 2.402774] amdgpu 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000020000060, cpu addr 0xffc01060
[ 2.403030] amdgpu 0000:00:01.0: fence driver on ring 6 use gpu addr 0x0000000020000070, cpu addr 0xffc01070
[ 2.403293] amdgpu 0000:00:01.0: fence driver on ring 7 use gpu addr 0x0000000020000080, cpu addr 0xffc01080
[ 2.403573] amdgpu 0000:00:01.0: fence driver on ring 8 use gpu addr 0x0000000020000090, cpu addr 0xffc01090
[ 2.403941] amdgpu 0000:00:01.0: fence driver on ring 9 use gpu addr 0x00000000200000a0, cpu addr 0xffc010a0
[ 2.404198] amdgpu 0000:00:01.0: fence driver on ring 10 use gpu addr 0x00000000200000b0, cpu addr 0xffc010b0
[ 2.404679] [drm] Found UVD firmware Version: 1.64 Family ID: 9
[ 2.406144] amdgpu 0000:00:01.0: fence driver on ring 11 use gpu addr 0x000000000028cd30, cpu addr 0xf8a38d30
[ 2.406543] [drm] Found VCE firmware Version: 50.10 Binary ID: 2
[ 2.406800] amdgpu 0000:00:01.0: fence driver on ring 12 use gpu addr 0x00000000200000d0, cpu addr 0xffc010d0
[ 2.407061] amdgpu 0000:00:01.0: fence driver on ring 13 use gpu addr 0x00000000200000e0, cpu addr 0xffc010e0
[ 2.411298] [drm] ring test on 0 succeeded in 16 usecs
[ 2.411662] [drm] ring test on 1 succeeded in 3 usecs
[ 2.411781] [drm] ring test on 2 succeeded in 3 usecs
[ 2.411896] [drm] ring test on 3 succeeded in 3 usecs
[ 2.412011] [drm] ring test on 4 succeeded in 3 usecs
[ 2.412126] [drm] ring test on 5 succeeded in 3 usecs
[ 2.412240] [drm] ring test on 6 succeeded in 3 usecs
[ 2.412355] [drm] ring test on 7 succeeded in 3 usecs
[ 2.412484] [drm] ring test on 8 succeeded in 3 usecs
[ 2.412864] [drm] ring test on 9 succeeded in 6 usecs
[ 2.412976] [drm] ring test on 10 succeeded in 4 usecs
[ 2.459127] [drm] ring test on 11 succeeded in 1 usecs
[ 2.479259] [drm] UVD initialized successfully.
[ 2.580509] [Firmware Bug]: battery: (dis)charge rate invalid.
[ 2.580741] ACPI: Battery Slot [BAT1] (battery present)
[ 2.710161] [drm] ring test on 12 succeeded in 13 usecs
[ 2.710286] [drm] ring test on 13 succeeded in 1 usecs
[ 2.710387] [drm] VCE initialized successfully.
[ 3.243531] tsc: Refined TSC clocksource calibration: 998.127 MHz
[ 3.243662] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1cc65b93289, max_idle_ns: 881590487074 ns
[ 3.734588] [drm] fb mappable at 0xE0428000
[ 3.734701] [drm] vram apper at 0xE0000000
[ 3.734802] [drm] size 4325376
[ 3.734900] [drm] fb depth is 24
[ 3.734999] [drm] pitch is 5632
[ 3.735303] fbcon: amdgpudrmfb (fb0) is primary device
[ 3.858601] Console: switching to colour frame buffer device 170x48
[ 3.869008] amdgpu 0000:00:01.0: fb0: amdgpudrmfb frame buffer device
[ 3.875082] [drm] ib test on ring 0 succeeded
[ 3.875275] [drm] ib test on ring 1 succeeded
[ 3.875442] [drm] ib test on ring 2 succeeded
[ 3.875577] [drm] ib test on ring 3 succeeded
[ 3.875711] [drm] ib test on ring 4 succeeded
[ 3.875844] [drm] ib test on ring 5 succeeded
[ 3.875977] [drm] ib test on ring 6 succeeded
[ 3.876110] [drm] ib test on ring 7 succeeded
[ 3.876243] [drm] ib test on ring 8 succeeded
[ 3.876363] [drm] ib test on ring 9 succeeded
[ 3.876480] [drm] ib test on ring 10 succeeded
[ 3.904244] [drm] ib test on ring 11 succeeded
[ 4.025142] [drm] ib test on ring 12 succeeded
[ 4.033356] [drm] Initialized amdgpu 3.9.0 20150101 for 0000:00:01.0 on minor 0
So, it seems that this specific commit has introduced a regression, at
least on my Kabini card (or rather, APU), and possibly only on 32 bit
(I couldn't test a 64 bit kernel yet).
I thought I'd report this so someone with a bit more knowledge can
have a look. Actually, just three hours ago, I didn't even know what a
UVD is, now I know that it's a Unified Video Decoder, and ... well, if
I continue at that pace, in a year from now I might even know how to
fix this issue I've discovered ;-), but right now, I guess the help of an
expert would be greatly appreciated! :-)
Greetings
Nils
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: Bug (and probably, fix): UVD initialization / clock gating issue on kabini
From: Nils Holland @ 2017-01-23 0:47 UTC
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On Sun, Jan 22, 2017 at 08:26:10PM +0100, Nils Holland wrote:
> Hi folks,
>
> while playing around with the amdgpu drm driver, I stumbled upon an
> issue. In fact, I have tracked it down to commit
> aa4747c00a2dd034c5fdf70ca73b1674ca15beb3 ("drm/amdgpu: refine uvd_4.2
> clock gate sequence.").
>
> When I run the latest mainline git kernel, which contains this commit,
> on my system, I get the following in dmesg (notice the lines about UVD
> not responding at the end):
> [ 4.870999] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 5.891351] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 6.911713] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 7.932062] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 8.952409] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 9.974050] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 10.995701] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 12.017354] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 13.039024] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 14.060692] [drm:uvd_v4_2_start] *ERROR* UVD not responding, trying to reset the VCPU!!!
> [ 14.082168] [drm:uvd_v4_2_start] *ERROR* UVD not responding, giving up!!!

Replying to myself here, as I believe I may have found a fix: Leaving
commit aa4747c00a2dd034c5fdf70ca73b1674ca15beb3 applied, and then
applying the following patch, which re-adds a little line that the
original patch removed, fixes the problem for me:

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
index 96444e4d862a..350e7dab9b6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
@@ -273,6 +273,8 @@ static int uvd_v4_2_start(struct amdgpu_device *adev)
 
 	uvd_v4_2_mc_resume(adev);
 
+	WREG32(mmUVD_CGC_GATE, 0);
+
 	/* disable interupt */
 	WREG32_P(mmUVD_MASTINT_EN, 0, ~(1 << 1));
 

Hmm ... what does this do? Completely disable clock gating? And if so,
is this a sane thing to do at this point during UVD initialization?
The fact that it was done like that before
aa4747c00a2dd034c5fdf70ca73b1674ca15beb3 might suggest that to be the
case.

If the experts think I'm on the right track here I'll also re-submit
this change as a proper patch. I could probably have done so right
away and then asked for comments / review as part of the patch
submission instead of asking here in this message ... good question if
that would have been the better approach - I guess I'm still learning
all these things. ;-)

Greetings
Nils
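As an illustration of the question above, here is a small mock in C of what the two register writes in the hunk do. It assumes that each set bit in a gate register gates one clock, so that writing 0 ungates everything (this matches the "turn off cg" reading given later in the thread, but the real UVD_CGC_GATE bit layout is not stated here). The bit names and the `mock_*` helpers are hypothetical; the masked write follows the usual read-modify-write behaviour of amdgpu's WREG32_P, where bits set in the mask are preserved:

```c
#include <stdint.h>

/* Hypothetical gate bits; the real register layout is not given in this thread. */
#define GATE_BLOCK_A (1u << 0)
#define GATE_BLOCK_B (1u << 1)
#define GATE_BLOCK_C (1u << 2)

/* Mock register file standing in for MMIO space. */
struct mock_regs {
    uint32_t cgc_gate;
};

/* Counterpart of WREG32(reg, value): overwrite the whole register. */
static void mock_wreg32(uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/*
 * Counterpart of WREG32_P(reg, value, mask): bits set in mask keep their
 * old contents, all other bits are taken from value.
 */
static void mock_wreg32_p(uint32_t *reg, uint32_t value, uint32_t mask)
{
    *reg = (*reg & mask) | (value & ~mask);
}

/* Returns nonzero if the given (hypothetical) block has its clock gated. */
static int block_is_gated(const struct mock_regs *r, uint32_t bit)
{
    return (r->cgc_gate & bit) != 0;
}
```

Under these assumptions, `mock_wreg32(&regs.cgc_gate, 0)` leaves no block gated, while `mock_wreg32_p(&reg, 0, ~(1u << 1))` clears only bit 1, which is how the "disable interupt" line in the hunk touches a single bit of UVD_MASTINT_EN.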
* RE: Bug (and probably, fix): UVD initialization / clock gating issue on kabini
From: Zhu, Rex @ 2017-01-23 10:55 UTC
To: Nils Holland, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

[-- Attachment #1: Type: text/plain, Size: 3711 bytes --]

We fixed this issue on Kv, as uvd pg was enabled on APU.

We need to change the uvd cg mode.
When idle, use hw cg. And when decoding, use sw cg.

So
WREG32(mmUVD_CGC_GATE, 0); // turn off cg.
Then
uvd_v4_2_set_dcm(adev, true); // set sw cg.
The first patch can fix this issue.

The second dpm patch can fix a similar issue which is caused by dpm's power state setting.

Best Regards
Rex

[-- Attachment #2: 0001-drm-amdgpu-change-clock-gating-mode-for-uvd_v4.patch --]
[-- Type: application/octet-stream, Size: 3767 bytes --]

From a44ef01e1596f7be3323a1a53fa783e330046aad Mon Sep 17 00:00:00 2001
From: Rex Zhu <Rex.Zhu@amd.com>
Date: Thu, 12 Jan 2017 21:48:26 +0800
Subject: [PATCH] drm/amdgpu: change clock gating mode for uvd_v4.

use sw cg when decode. and hw cg when idle.

Change-Id: I22370cceef2411af00ad086e1d4e499a4ae84b6c
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Ack-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 42 +++++++++--------------------------
 1 file changed, 10 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
index 96444e4..7fb9137 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
@@ -40,13 +40,14 @@
 #include "smu/smu_7_0_1_sh_mask.h"
 
 static void uvd_v4_2_mc_resume(struct amdgpu_device *adev);
-static void uvd_v4_2_init_cg(struct amdgpu_device *adev);
 static void uvd_v4_2_set_ring_funcs(struct amdgpu_device *adev);
 static void uvd_v4_2_set_irq_funcs(struct amdgpu_device *adev);
 static int uvd_v4_2_start(struct amdgpu_device *adev);
 static void uvd_v4_2_stop(struct amdgpu_device *adev);
 static int uvd_v4_2_set_clockgating_state(void *handle,
 				enum amd_clockgating_state state);
+static void uvd_v4_2_set_dcm(struct amdgpu_device *adev,
+			     bool sw_mode);
 /**
  * uvd_v4_2_ring_get_rptr - get read pointer
  *
@@ -140,7 +141,8 @@ static int uvd_v4_2_sw_fini(void *handle)
 
 	return r;
 }
-
+static void uvd_v4_2_enable_mgcg(struct amdgpu_device *adev,
+				 bool enable);
 /**
  * uvd_v4_2_hw_init - start and test UVD block
  *
@@ -155,8 +157,7 @@ static int uvd_v4_2_hw_init(void *handle)
 	uint32_t tmp;
 	int r;
 
-	uvd_v4_2_init_cg(adev);
-	uvd_v4_2_set_clockgating_state(adev, AMD_CG_STATE_GATE);
+	uvd_v4_2_enable_mgcg(adev, true);
 	amdgpu_asic_set_uvd_clocks(adev, 10000, 10000);
 	r = uvd_v4_2_start(adev);
 	if (r)
@@ -266,11 +267,13 @@ static int uvd_v4_2_start(struct amdgpu_device *adev)
 	struct amdgpu_ring *ring = &adev->uvd.ring;
 	uint32_t rb_bufsz;
 	int i, j, r;
-
 	/* disable byte swapping */
 	u32 lmi_swap_cntl = 0;
 	u32 mp_swap_cntl = 0;
 
+	WREG32(mmUVD_CGC_GATE, 0);
+	uvd_v4_2_set_dcm(adev, true);
+
 	uvd_v4_2_mc_resume(adev);
 
 	/* disable interupt */
@@ -406,6 +409,8 @@ static void uvd_v4_2_stop(struct amdgpu_device *adev)
 
 	/* Unstall UMC and register bus */
 	WREG32_P(mmUVD_LMI_CTRL2, 0, ~(1 << 8));
+
+	uvd_v4_2_set_dcm(adev, false);
 }
 
 /**
@@ -619,19 +624,6 @@ static void uvd_v4_2_set_dcm(struct amdgpu_device *adev,
 	WREG32_UVD_CTX(ixUVD_CGC_CTRL2, tmp2);
 }
 
-static void uvd_v4_2_init_cg(struct amdgpu_device *adev)
-{
-	bool hw_mode = true;
-
-	if (hw_mode) {
-		uvd_v4_2_set_dcm(adev, false);
-	} else {
-		u32 tmp = RREG32(mmUVD_CGC_CTRL);
-		tmp &= ~UVD_CGC_CTRL__DYN_CLOCK_MODE_MASK;
-		WREG32(mmUVD_CGC_CTRL, tmp);
-	}
-}
-
 static bool uvd_v4_2_is_idle(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -685,17 +677,6 @@ static int uvd_v4_2_process_interrupt(struct amdgpu_device *adev,
 static int uvd_v4_2_set_clockgating_state(void *handle,
 					  enum amd_clockgating_state state)
 {
-	bool gate = false;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	if (!(adev->cg_flags & AMD_CG_SUPPORT_UVD_MGCG))
-		return 0;
-
-	if (state == AMD_CG_STATE_GATE)
-		gate = true;
-
-	uvd_v4_2_enable_mgcg(adev, gate);
-
 	return 0;
 }
 
@@ -711,9 +692,6 @@ static int uvd_v4_2_set_powergating_state(void *handle,
 	 */
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	if (!(adev->pg_flags & AMD_PG_SUPPORT_UVD))
-		return 0;
-
 	if (state == AMD_PG_STATE_GATE) {
 		uvd_v4_2_stop(adev);
 		return 0;
-- 
1.9.1

[-- Attachment #3: 0001-drm-amdgpu-fix-dpm-bug-on-Kv.patch --]
[-- Type: application/octet-stream, Size: 3065 bytes --]

From 8ca333e1663afa34c93093d3e07d7972b90b751a Mon Sep 17 00:00:00 2001
From: Rex Zhu <Rex.Zhu@amd.com>
Date: Fri, 20 Jan 2017 14:27:22 +0800
Subject: [PATCH] drm/amdgpu: fix dpm bug on Kv.

1. current_ps/request_ps not update.
2. compare crrent_ps and request_ps, if same,
   don't re-set power state.
   which will lead uvd can't work after power up in some case.

Change-Id: I9c6f09e7bcdbe8bd76b1ac1ff3e441aab6100b08
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/kv_dpm.c | 44 ++++++++++++++++++++++++++++++++++---
 1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
index 90c2af3..6b6476d 100644
--- a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
@@ -1230,6 +1230,7 @@ static void kv_update_current_ps(struct amdgpu_device *adev,
 	pi->current_rps = *rps;
 	pi->current_ps = *new_ps;
 	pi->current_rps.ps_priv = &pi->current_ps;
+	adev->pm.dpm.current_ps = &pi->current_rps;
 }
 
 static void kv_update_requested_ps(struct amdgpu_device *adev,
@@ -1241,6 +1242,7 @@ static void kv_update_requested_ps(struct amdgpu_device *adev,
 	pi->requested_rps = *rps;
 	pi->requested_ps = *new_ps;
 	pi->requested_rps.ps_priv = &pi->requested_ps;
+	adev->pm.dpm.requested_ps = &pi->requested_rps;
 }
 
 static void kv_dpm_enable_bapm(struct amdgpu_device *adev, bool enable)
@@ -3009,7 +3011,6 @@ static int kv_dpm_late_init(void *handle)
 	kv_dpm_powergate_samu(adev, true);
 	kv_dpm_powergate_vce(adev, true);
 	kv_dpm_powergate_uvd(adev, true);
-
 	return 0;
 }
 
@@ -3245,15 +3246,52 @@ static int kv_dpm_set_powergating_state(void *handle,
 	return 0;
 }
 
+static inline bool kv_are_power_levels_equal(const struct kv_pl *kv_cpl1,
+						const struct kv_pl *kv_cpl2)
+{
+	return ((kv_cpl1->sclk == kv_cpl2->sclk) &&
+		  (kv_cpl1->vddc_index == kv_cpl2->vddc_index) &&
+		  (kv_cpl1->ds_divider_index == kv_cpl2->ds_divider_index) &&
+		  (kv_cpl1->force_nbp_state == kv_cpl2->force_nbp_state));
+}
+
 static int kv_check_state_equal(struct amdgpu_device *adev,
 				struct amdgpu_ps *cps,
 				struct amdgpu_ps *rps,
 				bool *equal)
 {
-	if (equal == NULL)
+	struct kv_ps *kv_cps;
+	struct kv_ps *kv_rps;
+	int i;
+
+	if (adev == NULL || cps == NULL || rps == NULL || equal == NULL)
 		return -EINVAL;
 
-	*equal = false;
+	kv_cps = kv_get_ps(cps);
+	kv_rps = kv_get_ps(rps);
+
+	if (kv_cps == NULL) {
+		*equal = false;
+		return 0;
+	}
+
+	if (kv_cps->num_levels != kv_rps->num_levels) {
+		*equal = false;
+		return 0;
+	}
+
+	for (i = 0; i < kv_cps->num_levels; i++) {
+		if (!kv_are_power_levels_equal(&(kv_cps->levels[i]),
+					&(kv_rps->levels[i]))) {
+			*equal = false;
+			return 0;
+		}
+	}
+
+	/* If all performance levels are the same try to use the UVD clocks to break the tie.*/
+	*equal = ((cps->vclk == rps->vclk) && (cps->dclk == rps->dclk));
+	*equal &= ((cps->evclk == rps->evclk) && (cps->ecclk == rps->ecclk));
+
 	return 0;
 }
 
-- 
1.9.1
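The comparison logic in the second patch can be tried outside the kernel. The sketch below is a simplified standalone version in C, not the driver code: it merges the per-level data and the UVD/VCE clocks into one hypothetical `power_state` struct (the patch keeps them in separate kv_ps and amdgpu_ps structures), returns -1 instead of -EINVAL, and uses the illustrative names `MAX_LEVELS`, `levels_equal`, and `states_equal`:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_LEVELS 5 /* assumption: illustrative upper bound only */

/* One performance level, with the same four fields the patch compares. */
struct power_level {
    unsigned sclk;
    unsigned vddc_index;
    unsigned ds_divider_index;
    unsigned force_nbp_state;
};

/* Hypothetical merged state: levels plus the UVD (vclk/dclk) and VCE (evclk/ecclk) clocks. */
struct power_state {
    unsigned num_levels;
    struct power_level levels[MAX_LEVELS];
    unsigned vclk, dclk, evclk, ecclk;
};

static bool levels_equal(const struct power_level *a, const struct power_level *b)
{
    return a->sclk == b->sclk &&
           a->vddc_index == b->vddc_index &&
           a->ds_divider_index == b->ds_divider_index &&
           a->force_nbp_state == b->force_nbp_state;
}

/*
 * Sets *equal to true only if both states have identical levels and
 * identical UVD/VCE clocks. Returns -1 on a NULL argument, 0 otherwise.
 */
static int states_equal(const struct power_state *cur,
                        const struct power_state *req,
                        bool *equal)
{
    if (cur == NULL || req == NULL || equal == NULL)
        return -1;

    *equal = false;

    if (cur->num_levels != req->num_levels)
        return 0;

    for (unsigned i = 0; i < cur->num_levels; i++) {
        if (!levels_equal(&cur->levels[i], &req->levels[i]))
            return 0;
    }

    /* All levels match: the UVD and VCE clocks break the tie. */
    *equal = cur->vclk == req->vclk && cur->dclk == req->dclk &&
             cur->evclk == req->evclk && cur->ecclk == req->ecclk;
    return 0;
}
```

The point of the tie-break is that two states with the same performance levels but different vclk/dclk are reported as different, so a power state change that only affects the UVD clocks is not skipped.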
* Re: Bug (and probably, fix): UVD initialization / clock gating issue on kabini
From: Nils Holland @ 2017-01-23 11:44 UTC
To: Zhu, Rex; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On Mon, Jan 23, 2017 at 10:55:49AM +0000, Zhu, Rex wrote:
> We fixed this issue on Kv, as uvd pg was enabled on APU.
>
> We need to change the uvd cg mode.
> When idle, use hw cg. And when decoding, use sw cg.
>
> So
> WREG32(mmUVD_CGC_GATE, 0); // turn off cg.
> Then
> uvd_v4_2_set_dcm(adev, true); // set sw cg.
> The first patch can fix this issue.
>
> The second dpm patch can fix a similar issue which is caused by dpm's power state setting.

Ah, thanks, that sounds good! I'll give these two patches a try on my
system soon and report back in case I encounter any problems! :-)

Greetings
Nils