From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla-daemon@freedesktop.org
Subject: [Bug 102646] Screen flickering under amdgpu-experimental [buggy auto
power profile]
Date: Tue, 06 Aug 2019 18:54:36 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0077330890=="
Return-path:
Received: from culpepper.freedesktop.org (culpepper.freedesktop.org
[IPv6:2610:10:20:722:a800:ff:fe98:4b55])
by gabe.freedesktop.org (Postfix) with ESMTP id C64998911D
for ; Tue, 6 Aug 2019 18:54:37 +0000 (UTC)
In-Reply-To:
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"
To: dri-devel@lists.freedesktop.org
List-Id: dri-devel@lists.freedesktop.org
--===============0077330890==
Content-Type: multipart/alternative; boundary="15651176777.B621b7259.32212"
Content-Transfer-Encoding: 7bit
--15651176777.B621b7259.32212
Date: Tue, 6 Aug 2019 18:54:37 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.freedesktop.org/
Auto-Submitted: auto-generated
https://bugs.freedesktop.org/show_bug.cgi?id=102646
--- Comment #101 from Maxim Ivanov ---
(In reply to Ahzo from comment #97)
> Created attachment 144950 [details] [review]
> Patch to fix the problem
>
> TLDR: A script to reproduce and a patch to fix this problem are attached.
>
> The problem occurs when switching between high and low GPU memory
> frequencies at specific time intervals. It can be reproduced with the
> attached script, which optionally accepts a time parameter, defaulting to 1
> ms.
> With a 75 Hz display mode, screen corruption occurs rather reliably by using
> a time parameter in the following ranges:
> 0.000-0.002, 0.011-0.015, 0.024-0.028, 0.038-0.042, 0.051-0.055,
> 0.064-0.068, 0.078-0.082, 0.091-0.095, 0.104-0.108
>
> However, using sleep times between these intervals, e.g. 0.1, does not
> produce any screen corruption.
> For a frequency of 75 Hz the frame time is T = 1000 / 75 ms = 13.3 ms and
> the screen corruption happens for sleep times of:
> S = n * T +- 2 ms
> Here n is a natural number, i.e. 0, 1, 2, 3, and so on.
>
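The window rule quoted above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the tolerance is taken to be exactly the 2 ms from the formula, times are passed in microseconds to stay in integer arithmetic, and in_corruption_window is a hypothetical helper name, not something from the attached script or patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the attached script or patch.
 * Returns true when sleep_us lies within tol_us of a whole multiple
 * (n = 0, 1, 2, ...) of the frame time for refresh_hz.
 * All times are in microseconds.
 */
static bool in_corruption_window(long sleep_us, long refresh_hz, long tol_us)
{
    assert(sleep_us >= 0 && tol_us >= 0);
    assert(refresh_hz > 0 && refresh_hz <= 1000000);

    long frame_us = 1000000 / refresh_hz;   /* 13333 us at 75 Hz */
    long past = sleep_us % frame_us;        /* time past the previous multiple */
    long until = frame_us - past;           /* time until the next multiple */
    long dist = past < until ? past : until;

    return dist <= tol_us;
}
```

At 75 Hz with a 2000 us tolerance, in_corruption_window(1000, 75, 2000) is true (the script's 1 ms default), while in_corruption_window(100000, 75, 2000) is false, matching the 0.1 case mentioned above.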
> Linux 4.14 is not affected by this problem, as is noted in comment 93.
> However, that version only works by accident: When the display mode is not
> yet known, default parameters, in particular 60 Hz, are used to calculate
> frame_time_x2 as (1000000 / 60) * 2 / 100 = 333, which is then used to set
> VBITimeout. Later, when the refresh rate of 75 Hz is known, frame_time_x2
> gets updated to 266, but VBITimeout is never actually set to that value via
> smu7_notify_smc_display.
>
> Linux 4.15 included the DC patches, and when using DC (e.g. by using the
> boot argument amdgpu.dc=1), VBITimeout is never set to the default 333, but
> directly to 266, which triggers the screen corruption and flickering
> problems described in this bug.
>
> With Linux 4.17 the problem got more widespread, because the default was
> accidentally switched to enable DC by erroneously removing the 'return
> amdgpu_dc > 0;' line with:
> commit 367e66870e9cc20b867b11c4484ae83336efcb67
> Author: Alex Deucher
> Date: Thu Jan 25 16:53:25 2018 -0500
>
> drm/amdgpu: remove DC special casing for KB/ML
>
> It seems to be working now.
>
> Bug: https://bugs.freedesktop.org/show_bug.cgi?id=102372
> Reviewed-by: Mike Lothian
> Reviewed-by: Harry Wentland
> Signed-off-by: Alex Deucher
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 309977ef5b51..2ad9de42b65b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1704,6 +1704,8 @@ bool amdgpu_device_asic_has_dc_support(enum
> amd_asic_type asic_type)
> case CHIP_BONAIRE:
> case CHIP_HAWAII:
> case CHIP_KAVERI:
> + case CHIP_KABINI:
> + case CHIP_MULLINS:
> case CHIP_CARRIZO:
> case CHIP_STONEY:
> case CHIP_POLARIS11:
> @@ -1714,9 +1716,6 @@ bool amdgpu_device_asic_has_dc_support(enum
> amd_asic_type asic_type)
> #if defined(CONFIG_DRM_AMD_DC_PRE_VEGA)
> return amdgpu_dc != 0;
> #endif
> - case CHIP_KABINI:
> - case CHIP_MULLINS:
> - return amdgpu_dc > 0;
> case CHIP_VEGA10:
> #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
> case CHIP_RAVEN:
>
>
> Linux 4.18 aligns the Non-DC case more closely with the DC case and thus
> VBITimeout gets actually set to the updated frame_time_x2 via
> smu7_notify_smc_display. Thus the Non-DC case is also affected by this bug
> since:
> commit 555fd70c59bc7f7acd8bc429d92bd59a66a7b83b
> Author: Rex Zhu
> Date: Tue Mar 27 13:32:02 2018 +0800
>
> drm/amd/pp: Not call cgs interface to get display info
>
> DC/Non DC all will update display configuration
> when the display state changed
> No need to get display info through cgs interface
>
> Reviewed-by: Evan Quan
> Signed-off-by: Rex Zhu
> Signed-off-by: Alex Deucher
>
> Linux 4.20 contains a commit trying to fix flickering issues:
> commit ec2e082a79b5d46addf2e7b83a13fb015fca6149
> Author: Alex Deucher
> Date: Thu Aug 9 14:24:08 2018 -0500
>
> drm/amdgpu/powerplay: check vrefresh when when changing displays
>
> Compare the current vrefresh in addition to the number of displays
> when determining whether or not the smu needs updates when changing
> modes. The SMU needs to be updated if the vbi timeout changes due
> to a different refresh rate. Fixes flickering around mode changes
> in some cases on polaris parts.
>
> Reviewed-by: Rex Zhu
> Reviewed-by: Huang Rui
> Signed-off-by: Alex Deucher
>
> But that doesn't fix the screen corruption described in this bug, because
> the problem is not that VBITimeout isn't updated enough, but rather the
> opposite, i.e. that it gets set to the frame_time_x2 value calculated from
> the correct, high refresh rate instead of the default value of 333.
>
> At least for 75 Hz, this problem can be fixed by preventing frame_time_x2
> and thus VBITimeout from being smaller than 280, as in the attached patch.
> Setting VBITimeout to higher values than the calculated frame_time_x2 does
> not seem to cause any problems.
> It would be great if someone could test this patch with higher refresh
> rates, as well.
People are reporting success with this patch. Could you submit it for review
so it can be merged into the kernel? By the way, I see this issue with the
amdgpu package, not amdgpu-experimental.
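For anyone following along, the arithmetic from comment #97 can be sketched as below. This is an illustration only and not the attached patch: the function and macro names are made up for this example, and the only things taken from the quoted comment are the formula, the resulting values 333 and 266, and the lower bound of 280.

```c
#include <assert.h>
#include <stdint.h>

/* Lower bound named in the quoted comment; the macro name is hypothetical. */
#define FRAME_TIME_X2_MIN 280u

/* Computes (1000000 / refresh_hz) * 2 / 100 with integer division,
 * as in the quoted comment. */
static uint32_t frame_time_x2(uint32_t refresh_hz)
{
    assert(refresh_hz > 0);
    return (1000000u / refresh_hz) * 2u / 100u;
}

/* Same value, but never smaller than FRAME_TIME_X2_MIN. */
static uint32_t clamped_frame_time_x2(uint32_t refresh_hz)
{
    uint32_t v = frame_time_x2(refresh_hz);
    return v < FRAME_TIME_X2_MIN ? FRAME_TIME_X2_MIN : v;
}
```

With these definitions, frame_time_x2(60) gives 333 and frame_time_x2(75) gives 266, while clamped_frame_time_x2(75) gives 280. Whether 280 is also a suitable bound for higher refresh rates is the open question from the quoted comment.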
-- 
You are receiving this mail because:
You are the assignee for the bug.
--15651176777.B621b7259.32212--
--===============0077330890==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
--===============0077330890==--