From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 110781] Radeon: heavy r300 performance drop regression between
11.x and 19.x
Date: Tue, 28 May 2019 11:51:35 +0000
Message-ID:
Bug ID: 110781
Summary: Radeon: heavy r300 performance drop regression between 11.x and 19.x
Product: Mesa
Version: git
Hardware: x86 (IA32)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/r300
Assignee: dri-devel@lists.freedesktop.org
Reporter: u9vata@gmail.com
QA Contact: dri-devel@lists.freedesktop.org
Dear mesa/freedesktop team!
I am a happy user of the open source radeon r300 driver for the Mobility Radeon
200M card in my (pretty old, but good enough) laptop.
I updated my system and changed distro, and everything slowed down badly. I
checked whether the kernel or the distro was responsible and found that I get
my performance back if I revert to mesa 11.x and the corresponding xorg while
still using the latest linux kernel. This looks like a performance regression.
The performance drop is heavy: a 50%..100% slowdown and very high CPU usage. For
example, Extreme Tux Racer now shows 100% CPU usage, where before it showed
25-50% at most (including any spikes) and around 30% on average.
I used the perf tool to locate the cause of the heavy CPU usage and found
that a lot of memory movement happens under a buffer-object creation call
(radeon_bo_create).
See this log:
Samples: 171K of event 'cycles', Event count (approx.): 67632101793
  Children      Self  Command  Shared Object     Symbol
-   61,12%     0,09%  etr      [kernel.vmlinux]  [k] entry_SYSENTER_32
  - 61,07% entry_SYSENTER_32
    - 60,94% do_fast_syscall_32
      - 57,92% sys_ioctl
        - 57,84% do_vfs_ioctl
          - 57,43% radeon_drm_ioctl
            - 57,04% drm_ioctl
              - 56,81% drm_ioctl_kernel
                - 55,86% radeon_gem_create_ioctl
                  - 55,46% radeon_gem_object_create
                    - 55,36% radeon_bo_create
                      - 55,20% ttm_bo_init
                        - 55,14% ttm_bo_init_reserved
                          - 54,75% ttm_bo_validate
                            - 54,42% ttm_bo_handle_move_mem
                              - 54,07% ttm_tt_bind
                                - 53,36% radeon_ttm_tt_populate
                                  + 53,33% ttm_populate_and_map_pages
                                  0,62% radeon_ttm_backend_bind
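To compare such call chains between the mesa 11.x and 19.x setups, the pasted rows can be reduced to (symbol, percentage) pairs. The following is only a minimal Python sketch: `parse_chain` and `top_symbols` are hypothetical helper names (not part of perf), and it assumes rows of the form `- 55,86% symbol` with a comma or dot as decimal separator, as in the log above. The top-level row with the Children/Self/Command columns is deliberately skipped.

```python
import re

# Matches call-chain rows such as
#   "- 55,86% radeon_gem_create_ioctl"
#   "+ 53,33% ttm_populate_and_map_pages"
#   "0,62% radeon_ttm_backend_bind"
# i.e. an optional +/- marker, one percentage, and a single symbol name.
ROW = re.compile(r"^\s*[-+]?\s*(\d+)[.,](\d+)%\s+(\S+)\s*$")


def parse_chain(text):
    """Return (symbol, percent) pairs for every call-chain row in text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            percent = float(f"{m.group(1)}.{m.group(2)}")
            rows.append((m.group(3), percent))
    return rows


def top_symbols(text, threshold=50.0):
    """Return the symbols whose share is at least threshold percent."""
    return [sym for sym, pct in parse_chain(text) if pct >= threshold]


if __name__ == "__main__":
    sample = (
        "- 55,86% radeon_gem_create_ioctl\n"
        "  - 53,36% radeon_ttm_tt_populate\n"
        "    + 53,33% ttm_populate_and_map_pages\n"
        "  0,62% radeon_ttm_backend_bind\n"
    )
    for symbol, percent in parse_chain(sample):
        print(f"{percent:6.2f}  {symbol}")
```

Running the same helper over a log captured with the old mesa would make it easy to see which symbols gained the most share, without comparing the two listings by eye.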
I see these are code paths in the kernel, but the same thing happens regardless
of whether I use an old kernel (and modules) or the newest one, while it almost
completely disappears when I revert to the old mesa and X.
I do not see anything related to bo (buffer object??) creation in the mesa
sources below the gallium/r300 directory, but I have found code that issues
ioctls in:
gallium/winsys/radeon/drm/radeon_drm_bo.c
^^Is this also used for r300 cards? The "source tree" documentation page seems
to tell me this is "shared code for r600 and radeonsi", but where does r300
make its ioctl calls then?
Something changed in the last 1-3 years that makes the driver move memory
around excessively and spend much more CPU time on it. It certainly did not
have this heavy slowdown before. It is now nearly as slow as llvmpipe in
practical cases (but not slower!).
Can anyone help me with this? I am a developer myself, but I am not well versed
in the source code of mesa or in how to analyse its performance bottlenecks.
PS.: I have already been analysing the problem on phoronix for a long time:
https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-source-amd-linux/1099745-how-to-tell-if-a-driver-is-gallium-or-just-mesa-slow-renderng-with-radeon
^^Every step of what I tried is written up there, but maybe only the perf
outputs are of interest...
Feel free to ask me anything about the issue. If I am able to help solve this
myself I will be happy to, but I have never really written any patches for
this kind of core system library and I am quite a rookie with gpu drivers...