* [Bug 216645] New: Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-10-31 13:22 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

            Bug ID: 216645
           Summary: Fence fallback timer expired on ring gfx
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.15.0-43-generic
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: ask4support@email.cz
        Regression: No

Created attachment 303109
  --> https://bugzilla.kernel.org/attachment.cgi?id=303109&action=edit
Kernel log created by the script in the menuetry

Sometimes when I run KDE's System Monitor or Chrome, my laptop freezes and
won't unfreeze until I reboot (after a while I can move the mouse cursor again,
but that's all I can do).
I'm using a Dell G5 SE 5505 with an AMD Ryzen 7 4800H CPU, a Radeon RX Vega 7
iGPU and an AMD Radeon RX 5600M dGPU.

I've searched through existing bugs and found that it might be related to
interrupts. With that in mind, I've compiled a list of kernel parameters that
might be relevant and tested every one of them:

PW = Probably Working, NW = Not Working, NB = Not Booting
PW      pcie_port_pm=off
PW      amdgpu.msi=0
NW      amd_iommu=fullflush
NW      amd_iommu=force_isolation
NW      amd_iommu=off
NW      amd_iommu_intr=legacy
NW      amd_iommu_intr=vapic kvm-amd.avic=1
NW      iommu=off
NW      iommu=force
NW      iommu=noforce
NW      iommu=biomerge
NW      iommu=merge
NW      iommu=nomerge
NW      iommu=forcesac
NW      iommu=soft
NW      iommu=pt
NW      irqfixup
NW      irqpoll
NW      nointremap
NW      pcie_port_pm=force
NW      amdgpu.pcie_gen2=1
NW      amdgpu.pcie_gen2=0
NW      amdgpu.msi=1
NW      amdgpu.lockup_timeout=1000
NW      amdgpu.lockup_timeout=100
NW      amdgpu.aspm=1
NW      amdgpu.aspm=0
NW      amdgpu.bapm=1
NW      amdgpu.bapm=0
NW      amdgpu.ppfeaturemask=0xfff7bff7
NW      amdgpu.ppfeaturemask=0xfff7bdff
NW      amdgpu.ppfeaturemask=0xfff7bbff
NW      amdgpu.ppfeaturemask=0xfff73fff
NW      amdgpu.ppfeaturemask=0xfff3bfff
NW      amdgpu.exp_hw_support=1
NW      amdgpu.exp_hw_support=0
NW      amdgpu.forcelongtraining=0
NW      amdgpu.forcelongtraining=1
NW      amdgpu.cg_mask=0x00000000
NW      amdgpu.cg_mask=0xffffffff
NW      amdgpu.pg_mask=0xffffffff
NW      amdgpu.ngg=1
NW      amdgpu.ngg=0
NW      amdgpu.job_hang_limit=1000
NW      amdgpu.job_hang_limit=100
NW      amdgpu.lbpw=1
NW      amdgpu.lbpw=0
NW      amdgpu.gpu_recovery=1
NW      amdgpu.gpu_recovery=0
NW      amdgpu.sched_policy=2
NW      amdgpu.sched_policy=1
NW      amdgpu.sched_policy=0
NW      amdgpu.ignore_crat=0
NW      amdgpu.ignore_crat=1
NW      amdgpu.ras_enable=0
NW      amdgpu.ras_enable=1
NW      amdgpu.async_gfx_ring=0
NW      amdgpu.async_gfx_ring=1
NW      amdgpu.mcbp=1
NW      amdgpu.mcbp=0
NW      amdgpu.mes=0
NW      amdgpu.mes_kiq=1
NW      amdgpu.mes_kiq=0
NW      amdgpu.reset_method=0
NW      amdgpu.reset_method=1
NW      amdgpu.reset_method=2
NW      amdgpu.reset_method=3
NW      amdgpu.reset_method=4
NW      amdgpu.reset_method=-1
NW      idle=nomwait
NB      amdgpu.pg_mask=0x00000000
NB      amdgpu.mes=1
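
Before judging a test run, it may help to confirm that the parameter under
test actually reached the kernel. A minimal sketch, assuming only that
/proc/cmdline is readable; has_param is a hypothetical helper name, not part
of any existing tool:

```sh
#!/bin/sh
# Succeed if the given parameter appears as a whole word in a kernel
# command line file (default: /proc/cmdline).
has_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # One parameter per line, then an exact, fixed-string, whole-line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

if has_param "amdgpu.msi=0"; then
    echo "amdgpu.msi=0 is on the kernel command line"
else
    echo "amdgpu.msi=0 is NOT on the kernel command line"
fi
```

Parameters exported by the amdgpu module may also be readable under
/sys/module/amdgpu/parameters/, where the driver exposes them.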



I've developed a script and a GRUB2 menu entry for live Kubuntu that triggers
the freeze and saves the dmesg output into a file called
Freeze_Dell_G5_SE_5505.sh.log at the root of the drive it's being booted from.
Replace the value of the ISO variable with the path to your ISO file if it's
not in the root directory of the drive and/or if it's a different version:

menuentry "Start Kubuntu 22.04.1 (64 bit) without Ubiquity and with a freezing
script" {
        ISO=/kubuntu-22.04.1-desktop-amd64.iso
        set gfxpayload=keep
        loopback loop "$ISO"
        probe -u $root --set=rootid
        linux   (loop)/casper/vmlinuz   iso-scan/filename="$ISO"
file=/cdrom/preseed/kubuntu.seed maybe-ubiquity quiet splash init=/bin/sh -- -c
'for script in /home/kubuntu/Desktop/Freeze_Dell_G5_SE_5505.sh ; do for autorun
in /home/kubuntu/.config/autostart/${script##*/} ; do ln -fs /dev/null
/etc/systemd/system/graphical.target.wants/ubiquity.service ; mkdir -p
${script%/*} ${autorun%/*} ; printf
\043!_/bin/sh++print\050\051_{+\tprintf_"@1"_,_seq_-s"_"_@\050\050_@\050stty_size_\074_@t_?_sed_"s/^/\050/,_s/_/_-_1_\051_*_/"\051_-_@{\0431}_\051\051_?_sed_s/[0-9]//g+}+t\075"@\050readlink_/proc/self/fd/0\051"++d\075"@\050env_LANG\075C_udisksctl_mount_-b_/dev/disk/by-uuid/$0_-o_sync_2\076_/dev/null_?_sed_"s/^Mounted_.*_at_//g,_s/\\.@//g"\051"+[_-d_"@d"_]_\046\046_f\075oflag\075direct_??_d\075"@{0%%/*}"+sudo_dmesg_-w_?_sudo_dd_of\075"@d/@{0\043\043*/}.log"_@f_\046+i\0750+seq_28_150000_?_while_read_N_,_do+\tprint_@N+\ttimeout_3_env_DISPLAY\075:0_plasma-systemmonitor_\076_/dev/null_2\076\0461+\tn\075@N_,_while_[_0_-lt_@n_]_,_do+\t\tsleep_1+\t\tn\075@\050\050_@n_-_1_\051\051+\t\ti\075@\050\050_@i_^_1_\051\051+\t\t[_"@i"_\075_1_]_\046\046_printf_"\\33[30m\\33[47m"_??_printf_"\\33[37m\\33[40m"+\t\tprint_@n+\tdone+done++echo_END!+exit+
| tr _,?@+ \40\73\174\044\n > $script ; printf
[Desktop_Entry]\nType=Application\nExec=kstart_--maximize_--_konsole_-e_  | tr
_ \40 > ${autorun%.sh}.desktop ; printf $script\n >> ${autorun%.sh}.desktop ;
chmod +x $script ${autorun%.sh}.desktop ; chown -R kubuntu:kubuntu
/home/kubuntu ; exec /sbin/init maybe-ubiquity splash --- ; done ; done'
$rootid
        initrd  (loop)/casper/initrd
}



The script generated on the live Kubuntu desktop runs KDE's System Monitor
for three seconds and then waits before running it again. With each iteration,
it waits one second longer than before. A parameter passed the test if, for
five boots in a row, the system did not freeze before the script was waiting
for 50 seconds (now I'd recommend 60, as with 50 it sometimes froze after the
second boot).
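
For a rough idea of how long one boot of this test takes, the schedule can be
summed up. A minimal sketch, assuming the first wait is 28 seconds (the start
value of `seq 28` in the script) and that every System Monitor run uses its
full 3-second timeout; time_until_wait is a hypothetical helper name:

```sh
#!/bin/sh
# Seconds elapsed before the script starts a wait of $1 seconds.
time_until_wait() {
    target=$1
    n=28        # assumed first wait, taken from `seq 28 ...`
    total=0
    while [ "$n" -lt "$target" ]; do
        total=$(( total + 3 + n ))   # one 3 s run plus an n-second wait
        n=$(( n + 1 ))
    done
    echo $(( total + 3 ))            # the run that precedes the target wait
}

time_until_wait 50   # prints 916
time_until_wait 60   # prints 1491
```

Under these assumptions, the 50-second wait begins after about 15 minutes and
the 60-second wait after about 25 minutes per boot.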

Would someone also tell us which workaround should be used under which
performance/latency requirements? ("Maybe wrong but still an" EXAMPLE: Users who
need the best performance or lowest latency should use pcie_port_pm=off, users
who need the best battery life should use amdgpu.msi=0.)

If you fix the issue, could you please tell the users (not just developers)
what the problem was? ("Maybe wrong but still an" EXAMPLE: The driver was
waiting for an interrupt, but the bus was down, therefore the message-signalled
interrupt could not arrive and the operation timed out.)

Thanks.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 216645] Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-10-31 13:23 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

Martin Šušla (ask4support@email.cz) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #303109|Kernel log created by the   |Kernel log created by the
        description|script in the menuetry      |script in the menuentry


* [Bug 216645] Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-10-31 15:40 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

Alex Deucher (alexdeucher@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeucher@gmail.com

--- Comment #1 from Alex Deucher (alexdeucher@gmail.com) ---
(In reply to Martin Šušla from comment #0)
> 
> Would someone also tell us which workaround should be used under which
> performance/latency requirements? ("Maybe wrong but still an" EXAMPLE: Users
> who need the best performance or lowest latency should use pcie_port_pm=off,
> users who need the best battery life should use amdgpu.msi=0.)
> 

You should not need to override any of the defaults other than for debugging.


* [Bug 216645] Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-10-31 15:42 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

--- Comment #2 from Alex Deucher (alexdeucher@gmail.com) ---
Are you getting interrupts on the GPU?  Check /proc/interrupts to see if you
are getting interrupts for the GPU.
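
One way to run this check is to sample the counters twice and compare. A
minimal sketch, assuming the GPU's lines in /proc/interrupts are labelled
"amdgpu"; the helper names amdgpu_irq_total and amdgpu_irq_delta are
hypothetical:

```sh
#!/bin/sh
# Sum the per-CPU counts of every amdgpu line in an interrupts table
# (default: /proc/interrupts).
amdgpu_irq_total() {
    awk '/amdgpu/ {
        # Fields 2..N are per-CPU counters; stop at the first non-number.
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
    } END { print total + 0 }' "${1:-/proc/interrupts}"
}

# Print how many amdgpu interrupts arrive during an interval (default 5 s).
amdgpu_irq_delta() {
    interval=${1:-5}
    before=$(amdgpu_irq_total)
    sleep "$interval"
    after=$(amdgpu_irq_total)
    echo $(( after - before ))
}

if [ -r /proc/interrupts ]; then
    echo "amdgpu interrupts in the last 2 s: $(amdgpu_irq_delta 2)"
fi
```

A delta of zero while the desktop is being redrawn would suggest that
interrupts are no longer being delivered. With two GPUs in the system the
total covers both, so inspecting the individual lines may still be necessary.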


* [Bug 216645] Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-10-31 17:53 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

--- Comment #3 from Martin Šušla (ask4support@email.cz) ---
After the message mentioned in the title appears, not even a single interrupt
is registered.


* [Bug 216645] Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-10-31 17:56 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

--- Comment #4 from Martin Šušla (ask4support@email.cz) ---
Created attachment 303110
  --> https://bugzilla.kernel.org/attachment.cgi?id=303110&action=edit
Kernel log interlaced with contents of /proc/interrupts polled every second

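#! /bin/sh

# Print $1, then pad with spaces so that the output fills all but one row of
# the terminal (repaints the screen in the current colours).
print() {
        printf "$1" ; seq -s" " $(( $(stty size < $t | sed "s/^/(/; s/ / - 1 ) * /") - ${#1} )) | sed 's/[0-9]//g'
}
t="$(readlink /proc/self/fd/0)"

# Mount the partition whose UUID is passed as $1; if that fails, fall back to
# the directory containing this script.
d="$(env LANG=C udisksctl mount -b /dev/disk/by-uuid/$1 -o sync 2> /dev/null | sed "s/^Mounted .* at //g; s/\.$//g")"
[ -d "$d" ] && f=oflag=direct || d="${0%/*}" f=oflag=direct

# Log kernel messages interleaved with /proc/interrupts, polled every second.
(sudo dmesg -w & while sleep 1 ; do cat /proc/interrupts ; done) | sudo dd of="$d/${0##*/}.log" $f &
i=0
# Run System Monitor for up to 3 s, then count down N seconds while flipping
# the colours each second; N grows by one per iteration.
seq 28 150000 | while read N ; do
        print $N
        timeout 3 env DISPLAY=:0 plasma-systemmonitor > /dev/null 2>&1
        n=$N ; while [ 0 -lt $n ] ; do
                sleep 1
                n=$(( $n - 1 ))
                i=$(( $i ^ 1 ))
                [ "$i" = 1 ] && printf "\33[30m\33[47m" || printf "\33[37m\33[40m"
                print $n
        done
done

echo END!
exit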

# This script was used to generate it


* [Bug 216645] Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-10-31 18:02 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

--- Comment #5 from Martin Šušla (ask4support@email.cz) ---
(In reply to Martin Šušla from comment #3)
> After the message mentioned in the title appears, not even a single interrupt
> is registered.

(Valid for both interrupts of the amdgpu driver.)


* [Bug 216645] Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-10-31 21:53 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

--- Comment #6 from Alex Deucher (alexdeucher@gmail.com) ---
(In reply to Martin Šušla from comment #5)
> (In reply to Martin Šušla from comment #3)
> > After the message mentioned in the title appears, not even a single
> > interrupt is registered.
> 
> (Valid for both interrupts of the amdgpu driver.)

There are two GPUs in the system.  You appear to be getting at least some
interrupts.

Are you using the dGPU at all or just the APU?  You might try a newer system
BIOS if there is one available for your system.
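
The currently installed firmware version can be read from a running system for
comparison with the vendor's downloads. A minimal sketch, assuming the kernel
exposes DMI data under /sys/class/dmi/id; bios_info is a hypothetical helper
name:

```sh
#!/bin/sh
# Print the BIOS vendor, version and date from a DMI sysfs directory
# (default: /sys/class/dmi/id), skipping entries that are not readable.
bios_info() {
    dmi_dir=${1:-/sys/class/dmi/id}
    for f in bios_vendor bios_version bios_date; do
        if [ -r "$dmi_dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dmi_dir/$f")"
        fi
    done
}

bios_info
```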


* [Bug 216645] Fence fallback timer expired on ring gfx
From: bugzilla-daemon @ 2022-11-01  8:41 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=216645

--- Comment #7 from Martin Šušla (ask4support@email.cz) ---
Sure, the GPU (at 0000:03:00.0) that is being initialized before the message
mentioned in the title appears is the dGPU. Line 1078 in the "Kernel log
created by the script in the menuentry" confirms this: the APU (at
0000:07:00.0, line 1184) doesn't use 6128M of VRAM, it uses just 512M.
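
The same comparison can be pulled out of a kernel log mechanically. A minimal
sketch, assuming each GPU's log line contains "amdgpu <PCI address>: " followed
later by "VRAM: <size>M"; the exact log format is an assumption here, and
vram_by_device is a hypothetical helper name:

```sh
#!/bin/sh
# Read a kernel log on stdin and print "<PCI address> <VRAM size>" for every
# line that matches the assumed amdgpu VRAM message format.
vram_by_device() {
    sed -n 's/.*amdgpu \([0-9a-f:.]*\): .*VRAM: \([0-9]*M\).*/\1 \2/p'
}

# Usage (reading the kernel log may require root):
#   sudo dmesg | vram_by_device
#   vram_by_device < Freeze_Dell_G5_SE_5505.sh.log
```

The device reporting the larger VRAM size would then be the dGPU, and the one
reporting the smaller carve-out the APU.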

