linux-kernel.vger.kernel.org archive mirror
* [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
@ 2025-07-16 16:17 Brian Geffon
  2025-07-16 16:33 ` Alex Deucher
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Geffon @ 2025-07-16 16:17 UTC (permalink / raw)
  To: Alex Deucher, christian.koenig
  Cc: David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans,
	Thadeu Lima de Souza Cascardo, Brian Geffon, stable

Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
allowed newer ASICs to mix GTT and VRAM. That change also noted that
some older boards, such as Stoney and Carrizo, do not support this.
It appears that at least one additional ASIC, Raven, does not support
it either.

We observed this issue when migrating a device from a 5.4 to 6.6 kernel
and have confirmed that Raven also needs to be excluded from mixing GTT
and VRAM.

Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1+
Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Signed-off-by: Brian Geffon <bgeffon@google.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 73403744331a..5d7f13e25b7c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
 					    uint32_t domain)
 {
 	if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
-	    ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
+	    ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
+	     (adev->asic_type == CHIP_RAVEN))) {
 		domain = AMDGPU_GEM_DOMAIN_VRAM;
 		if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
 			domain = AMDGPU_GEM_DOMAIN_GTT;
-- 
2.50.0.727.gbf7dc18ff4-goog
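
For readers following along outside the kernel tree, the domain selection that the diff touches can be sketched as a small standalone program. This is only an illustration under stated assumptions: `fake_adev`, `single_domain_only()`, `preferred_domain()` and the `with_patch` flag are hypothetical names, the numeric values of the macros are placeholders, and the final `return domain` is assumed because the hunk above ends before the function does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in definitions so the sketch compiles outside the kernel tree.
 * The numeric values are placeholders, not taken from the thread. */
#define DOMAIN_GTT   0x2u
#define DOMAIN_VRAM  0x4u
#define SG_THRESHOLD (256ull << 20)

enum chip { CHIP_CARRIZO, CHIP_STONEY, CHIP_RAVEN, CHIP_OTHER };

/* Hypothetical, heavily reduced stand-in for struct amdgpu_device. */
struct fake_adev {
	enum chip asic_type;
	uint64_t real_vram_size;
};

/* True for ASICs that must keep display buffers in a single domain.
 * with_patch selects whether Raven is treated like Carrizo/Stoney. */
static bool single_domain_only(const struct fake_adev *adev, bool with_patch)
{
	switch (adev->asic_type) {
	case CHIP_CARRIZO:
	case CHIP_STONEY:
		return true;
	case CHIP_RAVEN:
		return with_patch;
	default:
		return false;
	}
}

/* Mirrors the logic visible in the hunk: a VRAM|GTT request is narrowed
 * to VRAM, or to GTT when VRAM is at or below the threshold. */
uint32_t preferred_domain(const struct fake_adev *adev, uint32_t domain,
			  bool with_patch)
{
	if (domain == (DOMAIN_VRAM | DOMAIN_GTT) &&
	    single_domain_only(adev, with_patch)) {
		domain = DOMAIN_VRAM;
		if (adev->real_vram_size <= SG_THRESHOLD)
			domain = DOMAIN_GTT;
	}
	return domain; /* assumed: the hunk does not show the return */
}

/* Usage: a Raven device with a small VRAM carve-out. */
void example_usage(void)
{
	struct fake_adev raven = { CHIP_RAVEN, 64ull << 20 };
	uint32_t both = DOMAIN_VRAM | DOMAIN_GTT;

	assert(preferred_domain(&raven, both, false) == both);
	assert(preferred_domain(&raven, both, true) == DOMAIN_GTT);
}
```

With the patch applied, a Raven request for both domains collapses to one; without it, the request passes through unchanged.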



* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-16 16:17 [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM Brian Geffon
@ 2025-07-16 16:33 ` Alex Deucher
  2025-07-16 16:40   ` Brian Geffon
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Deucher @ 2025-07-16 16:33 UTC (permalink / raw)
  To: Brian Geffon
  Cc: Alex Deucher, christian.koenig, David Airlie, Simona Vetter,
	Tvrtko Ursulin, Yunxiang Li, Lijo Lazar, Prike Liang,
	Pratap Nirujogi, Luben Tuikov, amd-gfx, dri-devel, linux-kernel,
	Garrick Evans, Thadeu Lima de Souza Cascardo, stable

On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
>
> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> some older boards, such as Stoney and Carrizo do not support this.
> It appears that at least one additional ASIC does not support this which
> is Raven.
>
> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> and have confirmed that Raven also needs to be excluded from mixing GTT
> and VRAM.

Can you elaborate a bit on what the problem is?  For carrizo and
stoney this is a hardware limitation (all display buffers need to be
in GTT or VRAM, but not both).  Raven and newer don't have this
limitation and we tested raven pretty extensively at the time.

Alex



* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-16 16:33 ` Alex Deucher
@ 2025-07-16 16:40   ` Brian Geffon
  2025-07-16 21:02     ` Alex Deucher
  0 siblings, 1 reply; 18+ messages in thread
From: Brian Geffon @ 2025-07-16 16:40 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Alex Deucher, christian.koenig, David Airlie, Simona Vetter,
	Tvrtko Ursulin, Yunxiang Li, Lijo Lazar, Prike Liang,
	Pratap Nirujogi, Luben Tuikov, amd-gfx, dri-devel, linux-kernel,
	Garrick Evans, Thadeu Lima de Souza Cascardo, stable

On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> >
> > Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> > some older boards, such as Stoney and Carrizo do not support this.
> > It appears that at least one additional ASIC does not support this which
> > is Raven.
> >
> > We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> > and have confirmed that Raven also needs to be excluded from mixing GTT
> > and VRAM.
>
> Can you elaborate a bit on what the problem is?  For carrizo and
> stoney this is a hardware limitation (all display buffers need to be
> in GTT or VRAM, but not both).  Raven and newer don't have this
> limitation and we tested raven pretty extensively at the time.

Thanks for taking the time to look. We have automated testing, and a
few IGT GPU Tools tests failed. After debugging, we found that
commit 81d0bcf99009 is what introduced the failures on this hardware
on 6.1+ kernels. The specific tests that fail are kms_async_flips and
kms_plane_alpha_blend. Excluding Raven from this sharing of GTT and
VRAM buffers resolves the issue.

Brian



* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-16 16:40   ` Brian Geffon
@ 2025-07-16 21:02     ` Alex Deucher
  2025-07-17  0:12       ` Brian Geffon
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Deucher @ 2025-07-16 21:02 UTC (permalink / raw)
  To: Brian Geffon, Wentland, Harry, Leo (Sunpeng) Li
  Cc: Alex Deucher, christian.koenig, David Airlie, Simona Vetter,
	Tvrtko Ursulin, Yunxiang Li, Lijo Lazar, Prike Liang,
	Pratap Nirujogi, Luben Tuikov, amd-gfx, dri-devel, linux-kernel,
	Garrick Evans, Thadeu Lima de Souza Cascardo, stable

On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
>
> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> > >
> > > Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > > allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> > > some older boards, such as Stoney and Carrizo do not support this.
> > > It appears that at least one additional ASIC does not support this which
> > > is Raven.
> > >
> > > We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> > > and have confirmed that Raven also needs to be excluded from mixing GTT
> > > and VRAM.
> >
> > Can you elaborate a bit on what the problem is?  For carrizo and
> > stoney this is a hardware limitation (all display buffers need to be
> > in GTT or VRAM, but not both).  Raven and newer don't have this
> > limitation and we tested raven pretty extensively at the time.
>
> Thanks for taking the time to look. We have automated testing and a
> few igt gpu tools tests failed and after debugging we found that
> commit 81d0bcf99009 is what introduced the failures on this hardware
> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> VRAM buffers resolves the issue.

+ Harry and Leo

This sounds like the memory placement issue we discussed last week.
There, the issue is related to where the buffer ends up when we
try to do an async flip.  We can't do an async flip
without a full modeset if the buffer locations are different than at the
last modeset, because we need to update more than just the buffer base
addresses.  This change works around that limitation by always forcing
display buffers into VRAM or GTT.  Adding raven to this case may fix
those tests but will make the overall experience worse because we'll
end up effectively not being able to fully utilize both gtt and
vram for display, which would reintroduce all of the problems fixed by
81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").

Alex



* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-16 21:02     ` Alex Deucher
@ 2025-07-17  0:12       ` Brian Geffon
  2025-07-17  7:58         ` Christian König
  2025-07-17 14:58         ` Alex Deucher
  0 siblings, 2 replies; 18+ messages in thread
From: Brian Geffon @ 2025-07-17  0:12 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Wentland, Harry, Leo (Sunpeng) Li, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans,
	Thadeu Lima de Souza Cascardo, stable

On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
> >
> > On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > >
> > > On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> > > >
> > > > Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > > > allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> > > > some older boards, such as Stoney and Carrizo do not support this.
> > > > It appears that at least one additional ASIC does not support this which
> > > > is Raven.
> > > >
> > > > We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> > > > and have confirmed that Raven also needs to be excluded from mixing GTT
> > > > and VRAM.
> > >
> > > Can you elaborate a bit on what the problem is?  For carrizo and
> > > stoney this is a hardware limitation (all display buffers need to be
> > > in GTT or VRAM, but not both).  Raven and newer don't have this
> > > limitation and we tested raven pretty extensively at the time.
> >
> > Thanks for taking the time to look. We have automated testing and a
> > few igt gpu tools tests failed and after debugging we found that
> > commit 81d0bcf99009 is what introduced the failures on this hardware
> > on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> > kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> > VRAM buffers resolves the issue.
>
> + Harry and Leo
>
> This sounds like the memory placement issue we discussed last week.
> In that case, the issue is related to where the buffer ends up when we
> try to do an async flip.  In that case, we can't do an async flip
> without a full modeset if the buffers locations are different than the
> last modeset because we need to update more than just the buffer base
> addresses.  This change works around that limitation by always forcing
> display buffers into VRAM or GTT.  Adding raven to this case may fix
> those tests but will make the overall experience worse because we'll
> end up effectively not being able to not fully utilize both gtt and
> vram for display which would reintroduce all of the problems fixed by
> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").

Thanks Alex. The thing is, we only observe this on Raven boards. Why
would only Raven be impacted by this? It would seem that all devices
would have this issue, no? Also, I'm not familiar with how
kms_plane_alpha_blend works, but does this explanation also account
for that test failing?

Thanks again,
Brian



* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-17  0:12       ` Brian Geffon
@ 2025-07-17  7:58         ` Christian König
  2025-07-17 14:58         ` Alex Deucher
  1 sibling, 0 replies; 18+ messages in thread
From: Christian König @ 2025-07-17  7:58 UTC (permalink / raw)
  To: Brian Geffon, Alex Deucher
  Cc: Wentland, Harry, Leo (Sunpeng) Li, Alex Deucher, David Airlie,
	Simona Vetter, Tvrtko Ursulin, Yunxiang Li, Lijo Lazar,
	Prike Liang, Pratap Nirujogi, amd-gfx, dri-devel, linux-kernel,
	Garrick Evans, Thadeu Lima de Souza Cascardo, stable

On 17.07.25 02:12, Brian Geffon wrote:
> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>
>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
>>>
>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>
>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>
>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
>>>>> some older boards, such as Stoney and Carrizo do not support this.
>>>>> It appears that at least one additional ASIC does not support this which
>>>>> is Raven.
>>>>>
>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
>>>>> and VRAM.
>>>>
>>>> Can you elaborate a bit on what the problem is?  For carrizo and
>>>> stoney this is a hardware limitation (all display buffers need to be
>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
>>>> limitation and we tested raven pretty extensively at the time.
>>>
>>> Thanks for taking the time to look. We have automated testing and a
>>> few igt gpu tools tests failed and after debugging we found that
>>> commit 81d0bcf99009 is what introduced the failures on this hardware
>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
>>> VRAM buffers resolves the issue.
>>
>> + Harry and Leo
>>
>> This sounds like the memory placement issue we discussed last week.
>> In that case, the issue is related to where the buffer ends up when we
>> try to do an async flip.  In that case, we can't do an async flip
>> without a full modeset if the buffers locations are different than the
>> last modeset because we need to update more than just the buffer base
>> addresses.  This change works around that limitation by always forcing
>> display buffers into VRAM or GTT.  Adding raven to this case may fix
>> those tests but will make the overall experience worse because we'll
>> end up effectively not being able to not fully utilize both gtt and
>> vram for display which would reintroduce all of the problems fixed by
>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> 
> Thanks Alex, the thing is, we only observe this on Raven boards, why
> would Raven only be impacted by this?

That's exactly the point: this is most likely just a coincidence.

As far as I understand when the registers are already initialized properly by a previous modeset it just works out of the box.

We potentially need to change the DC design to always program all registers independent of the current placement of the display buffer.

But I'm not sure if that is even possible or if that has some more problematic side effects (drawing more power etc...)

> It would seem that all devices
> would have this issue, no? Also, I'm not familiar with how
> kms_plane_alpha_blend works, but does this also support that test
> failing as the cause?

Correct, it affects all APUs which can do scanout from GTT.

dGPUs are not affected because they can't do scanout from GTT over the PCIe connection in the first place.
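
The APU/dGPU distinction above can be made concrete with a small model. It is a sketch under assumptions, not driver code: `fake_gpu`, `scanout_domains()` and `placement_can_change()` are hypothetical names, the domain values are placeholders, and the model ignores the single-domain override applied to Carrizo and Stoney.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder domain bits; the values are not taken from the thread. */
#define DOMAIN_GTT  0x2u
#define DOMAIN_VRAM 0x4u

/* Hypothetical description of a device for this illustration only. */
struct fake_gpu {
	bool is_apu;       /* display and system memory share one fabric */
	bool gtt_scanout;  /* display hardware can scan out from GTT */
};

/* Domains a display buffer may end up in on this device. */
unsigned int scanout_domains(const struct fake_gpu *gpu)
{
	unsigned int domains = DOMAIN_VRAM;

	if (gpu->is_apu && gpu->gtt_scanout)
		domains |= DOMAIN_GTT;
	return domains;
}

/* The placement can only differ between two flips when more than one
 * domain is allowed, i.e. when more than one bit is set. */
bool placement_can_change(const struct fake_gpu *gpu)
{
	unsigned int d = scanout_domains(gpu);

	return (d & (d - 1)) != 0;
}

/* Usage: an APU that scans out from GTT versus a discrete GPU. */
void example_usage(void)
{
	struct fake_gpu apu = { true, true };
	struct fake_gpu dgpu = { false, false };

	assert(placement_can_change(&apu));
	assert(!placement_can_change(&dgpu));
}
```

In this model only devices with two permitted scanout domains can hit the placement mismatch, which matches the claim that dGPUs are unaffected.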

Anyway Harry and Leo need to take a look, it's clearly not that easy to fix.

Regards,
Christian.




* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-17  0:12       ` Brian Geffon
  2025-07-17  7:58         ` Christian König
@ 2025-07-17 14:58         ` Alex Deucher
  2025-07-17 15:34           ` Michel Dänzer
  2025-07-18 17:56           ` Brian Geffon
  1 sibling, 2 replies; 18+ messages in thread
From: Alex Deucher @ 2025-07-17 14:58 UTC (permalink / raw)
  To: Brian Geffon
  Cc: Wentland, Harry, Leo (Sunpeng) Li, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans,
	Thadeu Lima de Souza Cascardo, stable

On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
>
> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
> > >
> > > On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > > >
> > > > On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> > > > >
> > > > > Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > > > > allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> > > > > some older boards, such as Stoney and Carrizo do not support this.
> > > > > It appears that at least one additional ASIC does not support this which
> > > > > is Raven.
> > > > >
> > > > > We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> > > > > and have confirmed that Raven also needs to be excluded from mixing GTT
> > > > > and VRAM.
> > > >
> > > > Can you elaborate a bit on what the problem is?  For carrizo and
> > > > stoney this is a hardware limitation (all display buffers need to be
> > > > in GTT or VRAM, but not both).  Raven and newer don't have this
> > > > limitation and we tested raven pretty extensively at the time.
> > >
> > > Thanks for taking the time to look. We have automated testing and a
> > > few igt gpu tools tests failed and after debugging we found that
> > > commit 81d0bcf99009 is what introduced the failures on this hardware
> > > on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> > > kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> > > VRAM buffers resolves the issue.
> >
> > + Harry and Leo
> >
> > This sounds like the memory placement issue we discussed last week.
> > In that case, the issue is related to where the buffer ends up when we
> > try to do an async flip.  In that case, we can't do an async flip
> > without a full modeset if the buffers locations are different than the
> > last modeset because we need to update more than just the buffer base
> > addresses.  This change works around that limitation by always forcing
> > display buffers into VRAM or GTT.  Adding raven to this case may fix
> > those tests but will make the overall experience worse because we'll
> > end up effectively not being able to not fully utilize both gtt and
> > vram for display which would reintroduce all of the problems fixed by
> > 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
>
> Thanks Alex, the thing is, we only observe this on Raven boards, why
> would Raven only be impacted by this? It would seem that all devices
> would have this issue, no? Also, I'm not familiar with how

It depends on memory pressure and available memory in each pool.
E.g., initially the display buffer is in VRAM when the initial mode
set happens.  The watermarks, etc. are set for that scenario.  One of
the next frames ends up in a pool different than the original.  Now
the buffer is in GTT.  The async flip interface does a fast validation
to try and flip as soon as possible, but that validation fails because
the watermarks need to be updated which requires a full modeset.

It's tricky to fix because you don't want to use the worst case
watermarks all the time because that will limit the number of available
display options and you don't want to force everything to a particular
memory pool because that will limit the amount of memory that can be
used for display (which is what the patch in question fixed).  Ideally
the caller would do a test commit before the page flip to determine
whether or not it would succeed before issuing it and then we'd have
some feedback mechanism to tell the caller that the commit would fail
due to buffer placement so it would do a full modeset instead.  We
discussed this feedback mechanism last week at the display hackfest.
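
The test-commit-then-fallback flow described above can be sketched as a toy model. Everything here is hypothetical: `display_state`, `test_async_flip()` and `flip()` are illustrative names, and the real feedback mechanism was still under discussion, so this is not the DRM or DC API.

```c
#include <assert.h>
#include <stdbool.h>

enum placement { PLACE_VRAM, PLACE_GTT };
enum commit_path { PATH_ASYNC_FLIP, PATH_FULL_MODESET };

/* Hypothetical display state: remembers which memory pool the
 * watermarks were programmed for at the last full modeset. */
struct display_state {
	enum placement watermarks_for;
};

/* Stand-in for the test commit: the fast validation only passes when
 * the new buffer sits in the pool the watermarks were set up for. */
bool test_async_flip(const struct display_state *st, enum placement new_fb)
{
	return st->watermarks_for == new_fb;
}

/* Caller-side policy: try the async flip first and fall back to a full
 * modeset, which reprograms the watermarks for the new placement. */
enum commit_path flip(struct display_state *st, enum placement new_fb)
{
	if (test_async_flip(st, new_fb))
		return PATH_ASYNC_FLIP;

	st->watermarks_for = new_fb;
	return PATH_FULL_MODESET;
}

/* Usage: a buffer migrates from VRAM to GTT under memory pressure. */
void example_usage(void)
{
	struct display_state st = { PLACE_VRAM };

	assert(flip(&st, PLACE_VRAM) == PATH_ASYNC_FLIP);
	assert(flip(&st, PLACE_GTT) == PATH_FULL_MODESET);
	assert(flip(&st, PLACE_GTT) == PATH_ASYNC_FLIP);
}
```

The first flip into GTT costs a full modeset in this model; later flips that stay in GTT take the fast path again.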


> kms_plane_alpha_blend works, but does this also support that test
> failing as the cause?

That may be related.  I'm not too familiar with that test either, but
Leo or Harry can provide some guidance.

Alex



* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-17 14:58         ` Alex Deucher
@ 2025-07-17 15:34           ` Michel Dänzer
  2025-07-18 17:56           ` Brian Geffon
  1 sibling, 0 replies; 18+ messages in thread
From: Michel Dänzer @ 2025-07-17 15:34 UTC (permalink / raw)
  To: Alex Deucher, Brian Geffon
  Cc: Wentland, Harry, Leo (Sunpeng) Li, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, amd-gfx, dri-devel,
	linux-kernel, Garrick Evans, Thadeu Lima de Souza Cascardo

On 17.07.25 16:58, Alex Deucher wrote:
> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>
>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
>>>>>> some older boards, such as Stoney and Carrizo do not support this.
>>>>>> It appears that at least one additional ASIC does not support this which
>>>>>> is Raven.
>>>>>>
>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
>>>>>> and VRAM.
>>>>>
>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
>>>>> stoney this is a hardware limitation (all display buffers need to be
>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
>>>>> limitation and we tested raven pretty extensively at the time.
>>>>
>>>> Thanks for taking the time to look. We have automated testing and a
>>>> few igt gpu tools tests failed and after debugging we found that
>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
>>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
>>>> VRAM buffers resolves the issue.
>>>
>>> + Harry and Leo
>>>
>>> This sounds like the memory placement issue we discussed last week.
>>> In that case, the issue is related to where the buffer ends up when we
>>> try to do an async flip.  In that case, we can't do an async flip
>>> without a full modeset if the buffers locations are different than the
>>> last modeset because we need to update more than just the buffer base
>>> addresses.  This change works around that limitation by always forcing
>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
>>> those tests but will make the overall experience worse because we'll
>>> end up effectively not being able to not fully utilize both gtt and
>>> vram for display which would reintroduce all of the problems fixed by
>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
>>
>> Thanks Alex, the thing is, we only observe this on Raven boards, why
>> would Raven only be impacted by this? It would seem that all devices
>> would have this issue, no? Also, I'm not familiar with how
> 
> It depends on memory pressure and available memory in each pool.
> E.g., initially the display buffer is in VRAM when the initial mode
> set happens.  The watermarks, etc. are set for that scenario.  One of
> the next frames ends up in a pool different than the original.  Now
> the buffer is in GTT.  The async flip interface does a fast validation
> to try and flip as soon as possible, but that validation fails because
> the watermarks need to be updated which requires a full modeset.
> 
> It's tricky to fix because you don't want to use the worst case
> watermarks all the time because that will limit the number available
> display options and you don't want to force everything to a particular
> memory pool because that will limit the amount of memory that can be
> used for display (which is what the patch in question fixed).  Ideally
> the caller would do a test commit before the page flip to determine
> whether or not it would succeed before issuing it and then we'd have
> some feedback mechanism to tell the caller that the commit would fail
> due to buffer placement so it would do a full modeset instead.  We
> discussed this feedback mechanism last week at the display hackfest.

(A separate test commit may not buy anything; the compositor can just try the flip and react to errors.)

Most compositors won't want to set the DRM_MODE_ATOMIC_ALLOW_MODESET flag for a "simple flip", since it could result in user-visible artifacts such as the display intermittently blanking.

If the driver can make it work without user-visible artifacts (e.g. by reprogramming watermarks), it should just do so without DRM_MODE_ATOMIC_ALLOW_MODESET. If not, it should return an error (and possibly more information via the future mechanism).


P.S. Without DRM_MODE_PAGE_FLIP_ASYNC, the driver must always be able to flip at least the primary plane (it can require disabling overlay planes) without DRM_MODE_ATOMIC_ALLOW_MODESET, or the compositor could end up in a corner it can't get out of.
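
The "try it and react to errors" approach might look roughly like the sketch below. This is only an illustration under stated assumptions: the commit hook, the fake driver and all `sketch_`/`SKETCH_` names are hypothetical, and the flag values are assumed to mirror the DRM uAPI flags of similar name. The compositor attempts an async flip and, if the driver rejects it, retries as a regular flip, never setting the allow-modeset flag for a simple flip.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical names; values assumed to mirror the DRM uAPI flags. */
#define SKETCH_PAGE_FLIP_ASYNC      0x02u
#define SKETCH_ATOMIC_ALLOW_MODESET 0x0400u

/* Hypothetical commit hook: returns 0 on success or a negative errno. */
typedef int (*sketch_commit_fn)(void *ctx, uint32_t flags);

/*
 * Try an async flip first. If the driver rejects it, retry as a regular
 * (non-async) flip. ALLOW_MODESET is never requested here, so a simple
 * flip cannot turn into a modeset that blanks the display.
 */
int sketch_flip(sketch_commit_fn commit, void *ctx)
{
	int ret = commit(ctx, SKETCH_PAGE_FLIP_ASYNC);

	if (ret == 0)
		return 0;

	/* React to the error: fall back to a regular flip. */
	return commit(ctx, 0);
}

/* Toy driver, for illustration only: it rejects async flips whenever
 * the buffer placement changed since the last modeset. */
struct sketch_fake_driver {
	int placement_changed;
	int async_attempts;
	int sync_attempts;
};

int sketch_fake_commit(void *ctx, uint32_t flags)
{
	struct sketch_fake_driver *drv = ctx;

	if (flags & SKETCH_ATOMIC_ALLOW_MODESET)
		return -EINVAL; /* the sketch never asks for a modeset */

	if (flags & SKETCH_PAGE_FLIP_ASYNC) {
		drv->async_attempts++;
		return drv->placement_changed ? -EINVAL : 0;
	}

	drv->sync_attempts++;
	return 0;
}
```

The fallback relies on the point in the P.S.: a non-async flip of the primary plane has to remain possible without a modeset, otherwise the retry could fail as well.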


-- 
Earthling Michel Dänzer       \        GNOME / Xwayland / Mesa developer
https://redhat.com             \               Libre software enthusiast


* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-17 14:58         ` Alex Deucher
  2025-07-17 15:34           ` Michel Dänzer
@ 2025-07-18 17:56           ` Brian Geffon
  2025-07-18 20:07             ` Alex Deucher
  1 sibling, 1 reply; 18+ messages in thread
From: Brian Geffon @ 2025-07-18 17:56 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Wentland, Harry, Leo (Sunpeng) Li, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans,
	Thadeu Lima de Souza Cascardo, stable

On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
> >
> > On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > >
> > > On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
> > > >
> > > > On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > > > >
> > > > > On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> > > > > >
> > > > > > Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > > > > > allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> > > > > > some older boards, such as Stoney and Carrizo do not support this.
> > > > > > It appears that at least one additional ASIC does not support this which
> > > > > > is Raven.
> > > > > >
> > > > > > We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> > > > > > and have confirmed that Raven also needs to be excluded from mixing GTT
> > > > > > and VRAM.
> > > > >
> > > > > Can you elaborate a bit on what the problem is?  For carrizo and
> > > > > stoney this is a hardware limitation (all display buffers need to be
> > > > > in GTT or VRAM, but not both).  Raven and newer don't have this
> > > > > limitation and we tested raven pretty extensively at the time.
> > > >
> > > > Thanks for taking the time to look. We have automated testing and a
> > > > few igt gpu tools tests failed and after debugging we found that
> > > > commit 81d0bcf99009 is what introduced the failures on this hardware
> > > > on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> > > > kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> > > > VRAM buffers resolves the issue.
> > >
> > > + Harry and Leo
> > >
> > > This sounds like the memory placement issue we discussed last week.
> > > In that case, the issue is related to where the buffer ends up when we
> > > try to do an async flip.  In that case, we can't do an async flip
> > > without a full modeset if the buffers locations are different than the
> > > last modeset because we need to update more than just the buffer base
> > > addresses.  This change works around that limitation by always forcing
> > > display buffers into VRAM or GTT.  Adding raven to this case may fix
> > > those tests but will make the overall experience worse because we'll
> > > end up effectively not being able to not fully utilize both gtt and
> > > vram for display which would reintroduce all of the problems fixed by
> > > 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> >
> > Thanks Alex, the thing is, we only observe this on Raven boards, why
> > would Raven only be impacted by this? It would seem that all devices
> > would have this issue, no? Also, I'm not familiar with how
>
> It depends on memory pressure and available memory in each pool.
> E.g., initially the display buffer is in VRAM when the initial mode
> set happens.  The watermarks, etc. are set for that scenario.  One of
> the next frames ends up in a pool different than the original.  Now
> the buffer is in GTT.  The async flip interface does a fast validation
> to try and flip as soon as possible, but that validation fails because
> the watermarks need to be updated which requires a full modeset.
>
> It's tricky to fix because you don't want to use the worst case
> watermarks all the time because that will limit the number available
> display options and you don't want to force everything to a particular
> memory pool because that will limit the amount of memory that can be
> used for display (which is what the patch in question fixed).  Ideally
> the caller would do a test commit before the page flip to determine
> whether or not it would succeed before issuing it and then we'd have
> some feedback mechanism to tell the caller that the commit would fail
> due to buffer placement so it would do a full modeset instead.  We
> discussed this feedback mechanism last week at the display hackfest.
>
>
> > kms_plane_alpha_blend works, but does this also support that test
> > failing as the cause?
>
> That may be related.  I'm not too familiar with that test either, but
> Leo or Harry can provide some guidance.
>
> Alex

Thanks everyone for the input so far. I have a question for the
maintainers, given that it seems that this is functionally broken for
ASICs which are iGPUs, and there does not seem to be an easy fix, does
it make sense to extend this proposed patch to all iGPUs until a more
permanent fix can be identified? At the end of the day I'll take
functional correctness over performance.

Brian

>
> >
> > Thanks again,
> > Brian
> >
> > >
> > > Alex
> > >
> > > >
> > > > Brian
> > > >
> > > > >
> > > > >
> > > > > Alex
> > > > >
> > > > > >
> > > > > > Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > > > > > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > > > > > Cc: Christian König <christian.koenig@amd.com>
> > > > > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > > > > Cc: stable@vger.kernel.org # 6.1+
> > > > > > Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> > > > > > Signed-off-by: Brian Geffon <bgeffon@google.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > > > index 73403744331a..5d7f13e25b7c 100644
> > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > > > @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> > > > > >                                             uint32_t domain)
> > > > > >  {
> > > > > >         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> > > > > > -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> > > > > > +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> > > > > > +            (adev->asic_type == CHIP_RAVEN))) {
> > > > > >                 domain = AMDGPU_GEM_DOMAIN_VRAM;
> > > > > >                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> > > > > >                         domain = AMDGPU_GEM_DOMAIN_GTT;
> > > > > > --
> > > > > > 2.50.0.727.gbf7dc18ff4-goog
> > > > > >


* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-18 17:56           ` Brian Geffon
@ 2025-07-18 20:07             ` Alex Deucher
  2025-07-18 21:02               ` Leo Li
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Deucher @ 2025-07-18 20:07 UTC (permalink / raw)
  To: Brian Geffon
  Cc: Wentland, Harry, Leo (Sunpeng) Li, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans,
	Thadeu Lima de Souza Cascardo, stable

On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@google.com> wrote:
>
> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
> > >
> > > On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > > >
> > > > On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
> > > > >
> > > > > On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> > > > > > >
> > > > > > > Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > > > > > > allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> > > > > > > some older boards, such as Stoney and Carrizo do not support this.
> > > > > > > It appears that at least one additional ASIC does not support this which
> > > > > > > is Raven.
> > > > > > >
> > > > > > > We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> > > > > > > and have confirmed that Raven also needs to be excluded from mixing GTT
> > > > > > > and VRAM.
> > > > > >
> > > > > > Can you elaborate a bit on what the problem is?  For carrizo and
> > > > > > stoney this is a hardware limitation (all display buffers need to be
> > > > > > in GTT or VRAM, but not both).  Raven and newer don't have this
> > > > > > limitation and we tested raven pretty extensively at the time.
> > > > >
> > > > > Thanks for taking the time to look. We have automated testing and a
> > > > > few igt gpu tools tests failed and after debugging we found that
> > > > > commit 81d0bcf99009 is what introduced the failures on this hardware
> > > > > on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> > > > > kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> > > > > VRAM buffers resolves the issue.
> > > >
> > > > + Harry and Leo
> > > >
> > > > This sounds like the memory placement issue we discussed last week.
> > > > In that case, the issue is related to where the buffer ends up when we
> > > > try to do an async flip.  In that case, we can't do an async flip
> > > > without a full modeset if the buffers locations are different than the
> > > > last modeset because we need to update more than just the buffer base
> > > > addresses.  This change works around that limitation by always forcing
> > > > display buffers into VRAM or GTT.  Adding raven to this case may fix
> > > > those tests but will make the overall experience worse because we'll
> > > > end up effectively not being able to not fully utilize both gtt and
> > > > vram for display which would reintroduce all of the problems fixed by
> > > > 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> > >
> > > Thanks Alex, the thing is, we only observe this on Raven boards, why
> > > would Raven only be impacted by this? It would seem that all devices
> > > would have this issue, no? Also, I'm not familiar with how
> >
> > It depends on memory pressure and available memory in each pool.
> > E.g., initially the display buffer is in VRAM when the initial mode
> > set happens.  The watermarks, etc. are set for that scenario.  One of
> > the next frames ends up in a pool different than the original.  Now
> > the buffer is in GTT.  The async flip interface does a fast validation
> > to try and flip as soon as possible, but that validation fails because
> > the watermarks need to be updated which requires a full modeset.
> >
> > It's tricky to fix because you don't want to use the worst case
> > watermarks all the time because that will limit the number available
> > display options and you don't want to force everything to a particular
> > memory pool because that will limit the amount of memory that can be
> > used for display (which is what the patch in question fixed).  Ideally
> > the caller would do a test commit before the page flip to determine
> > whether or not it would succeed before issuing it and then we'd have
> > some feedback mechanism to tell the caller that the commit would fail
> > due to buffer placement so it would do a full modeset instead.  We
> > discussed this feedback mechanism last week at the display hackfest.
> >
> >
> > > kms_plane_alpha_blend works, but does this also support that test
> > > failing as the cause?
> >
> > That may be related.  I'm not too familiar with that test either, but
> > Leo or Harry can provide some guidance.
> >
> > Alex
>
> Thanks everyone for the input so far. I have a question for the
> maintainers, given that it seems that this is functionally broken for
> ASICs which are iGPUs, and there does not seem to be an easy fix, does
> it make sense to extend this proposed patch to all iGPUs until a more
> permanent fix can be identified? At the end of the day I'll take
> functional correctness over performance.

It's not functional correctness, it's usability.  All that is
potentially broken is async flips (which depend on memory pressure and
buffer placement).  If you effectively revert the patch, you end up
limiting all display buffers to either VRAM or GTT, which may make it
impossible to display anything because there is not enough memory in
that pool for the next modeset.  We'll start getting bug reports about
blank screens and failure to set modes because of memory pressure.  I
think if we want a short term fix, it would be to always set the worst
case watermarks.  The downside is that it could cause some working
display setups to stop working if they were on the margins to begin
with.

Alex

>
> Brian
>
> >
> > >
> > > Thanks again,
> > > Brian
> > >
> > > >
> > > > Alex
> > > >
> > > > >
> > > > > Brian
> > > > >
> > > > > >
> > > > > >
> > > > > > Alex
> > > > > >
> > > > > > >
> > > > > > > Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > > > > > > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > > > > > > Cc: Christian König <christian.koenig@amd.com>
> > > > > > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > > > > > Cc: stable@vger.kernel.org # 6.1+
> > > > > > > Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> > > > > > > Signed-off-by: Brian Geffon <bgeffon@google.com>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > > > > index 73403744331a..5d7f13e25b7c 100644
> > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > > > > @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> > > > > > >                                             uint32_t domain)
> > > > > > >  {
> > > > > > >         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> > > > > > > -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> > > > > > > +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> > > > > > > +            (adev->asic_type == CHIP_RAVEN))) {
> > > > > > >                 domain = AMDGPU_GEM_DOMAIN_VRAM;
> > > > > > >                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> > > > > > >                         domain = AMDGPU_GEM_DOMAIN_GTT;
> > > > > > > --
> > > > > > > 2.50.0.727.gbf7dc18ff4-goog
> > > > > > >


* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-18 20:07             ` Alex Deucher
@ 2025-07-18 21:02               ` Leo Li
  2025-07-18 21:33                 ` Alex Deucher
  0 siblings, 1 reply; 18+ messages in thread
From: Leo Li @ 2025-07-18 21:02 UTC (permalink / raw)
  To: Alex Deucher, Brian Geffon
  Cc: Wentland, Harry, Alex Deucher, christian.koenig, David Airlie,
	Simona Vetter, Tvrtko Ursulin, Yunxiang Li, Lijo Lazar,
	Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx, dri-devel,
	linux-kernel, Garrick Evans, Thadeu Lima de Souza Cascardo,
	stable



On 2025-07-18 16:07, Alex Deucher wrote:
> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@google.com> wrote:
>>
>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>
>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>
>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>>
>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>
>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>>>>
>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>>>
>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
>>>>>>>> some older boards, such as Stoney and Carrizo do not support this.
>>>>>>>> It appears that at least one additional ASIC does not support this which
>>>>>>>> is Raven.
>>>>>>>>
>>>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
>>>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
>>>>>>>> and VRAM.
>>>>>>>
>>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
>>>>>>> stoney this is a hardware limitation (all display buffers need to be
>>>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
>>>>>>> limitation and we tested raven pretty extensively at the time.
>>>>>>
>>>>>> Thanks for taking the time to look. We have automated testing and a
>>>>>> few igt gpu tools tests failed and after debugging we found that
>>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
>>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
>>>>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
>>>>>> VRAM buffers resolves the issue.
>>>>>
>>>>> + Harry and Leo
>>>>>
>>>>> This sounds like the memory placement issue we discussed last week.
>>>>> In that case, the issue is related to where the buffer ends up when we
>>>>> try to do an async flip.  In that case, we can't do an async flip
>>>>> without a full modeset if the buffers locations are different than the
>>>>> last modeset because we need to update more than just the buffer base
>>>>> addresses.  This change works around that limitation by always forcing
>>>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
>>>>> those tests but will make the overall experience worse because we'll
>>>>> end up effectively not being able to not fully utilize both gtt and
>>>>> vram for display which would reintroduce all of the problems fixed by
>>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
>>>>
>>>> Thanks Alex, the thing is, we only observe this on Raven boards, why
>>>> would Raven only be impacted by this? It would seem that all devices
>>>> would have this issue, no? Also, I'm not familiar with how
>>>
>>> It depends on memory pressure and available memory in each pool.
>>> E.g., initially the display buffer is in VRAM when the initial mode
>>> set happens.  The watermarks, etc. are set for that scenario.  One of
>>> the next frames ends up in a pool different than the original.  Now
>>> the buffer is in GTT.  The async flip interface does a fast validation
>>> to try and flip as soon as possible, but that validation fails because
>>> the watermarks need to be updated which requires a full modeset.

Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced
a check for same memory placement on async flips was on a system with a DGPU,
for which VRAM placement does matter:
https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507

Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
bandwidth validation depending on memory placement. There's a gpuvm_enable flag
for SG, but it's statically set to 1 on APU DCN versions. It sounds like for
APUs specifically, we *should* be able to ignore the mem placement check. I can
spin up a patch to test this out.
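
As a rough illustration of the idea (not the actual DM code, and not the patch itself), the fast-path decision could be sketched as follows. The function name, the `is_apu` parameter and the `SKETCH_` domain constants are hypothetical, and the APU shortcut encodes the still-untested assumption that placement does not affect DCN validation on APUs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GTT and VRAM memory domains. */
#define SKETCH_DOMAIN_GTT  0x2u
#define SKETCH_DOMAIN_VRAM 0x4u

/*
 * Returns true if an async flip may take the fast path, i.e. without
 * requiring a full modeset.
 */
bool sketch_async_flip_fast_path_ok(bool is_apu, uint32_t old_domain,
				    uint32_t new_domain)
{
	/* Assumption to be tested: on APUs, buffer placement does not
	 * change bandwidth validation, so the check can be skipped. */
	if (is_apu)
		return true;

	/* On a dGPU, placement matters: the new buffer must live in the
	 * same pool as the one used at the last modeset. */
	return old_domain == new_domain;
}
```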

Thanks,
Leo

>>>
>>> It's tricky to fix because you don't want to use the worst case
>>> watermarks all the time because that will limit the number available
>>> display options and you don't want to force everything to a particular
>>> memory pool because that will limit the amount of memory that can be
>>> used for display (which is what the patch in question fixed).  Ideally
>>> the caller would do a test commit before the page flip to determine
>>> whether or not it would succeed before issuing it and then we'd have
>>> some feedback mechanism to tell the caller that the commit would fail
>>> due to buffer placement so it would do a full modeset instead.  We
>>> discussed this feedback mechanism last week at the display hackfest.
>>>
>>>
>>>> kms_plane_alpha_blend works, but does this also support that test
>>>> failing as the cause?
>>>
>>> That may be related.  I'm not too familiar with that test either, but
>>> Leo or Harry can provide some guidance.
>>>
>>> Alex
>>
>> Thanks everyone for the input so far. I have a question for the
>> maintainers, given that it seems that this is functionally broken for
>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
>> it make sense to extend this proposed patch to all iGPUs until a more
>> permanent fix can be identified? At the end of the day I'll take
>> functional correctness over performance.
> 
> It's not functional correctness, it's usability.  All that is
> potentially broken is async flips (which depend on memory pressure and
> buffer placement), while if you effectively revert the patch, you end
> up  limiting all display buffers to either VRAM or GTT which may end
> up causing the inability to display anything because there is not
> enough memory in that pool for the next modeset.  We'll start getting
> bug reports about blank screens and failure to set modes because of
> memory pressure.  I think if we want a short term fix, it would be to
> always set the worst case watermarks.  The downside to that is that it
> would possibly cause some working display setups to stop working if
> they were on the margins to begin with.
> 
> Alex
> 
>>
>> Brian
>>
>>>
>>>>
>>>> Thanks again,
>>>> Brian
>>>>
>>>>>
>>>>> Alex
>>>>>
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>>>
>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>> Cc: stable@vger.kernel.org # 6.1+
>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
>>>>>>>> Signed-off-by: Brian Geffon <bgeffon@google.com>
>>>>>>>> ---
>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
>>>>>>>>                                             uint32_t domain)
>>>>>>>>  {
>>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
>>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
>>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
>>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
>>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
>>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
>>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
>>>>>>>> --
>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
>>>>>>>>



* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-18 21:02               ` Leo Li
@ 2025-07-18 21:33                 ` Alex Deucher
  2025-07-18 22:01                   ` Leo Li
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Deucher @ 2025-07-18 21:33 UTC (permalink / raw)
  To: Leo Li
  Cc: Brian Geffon, Wentland, Harry, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans,
	Thadeu Lima de Souza Cascardo, stable

On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng.li@amd.com> wrote:
>
>
>
> On 2025-07-18 16:07, Alex Deucher wrote:
> > On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@google.com> wrote:
> >>
> >> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>
> >>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>
> >>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>>>
> >>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>
> >>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>>>>>
> >>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>>>
> >>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> >>>>>>>> some older boards, such as Stoney and Carrizo do not support this.
> >>>>>>>> It appears that at least one additional ASIC does not support this which
> >>>>>>>> is Raven.
> >>>>>>>>
> >>>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> >>>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
> >>>>>>>> and VRAM.
> >>>>>>>
> >>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
> >>>>>>> stoney this is a hardware limitation (all display buffers need to be
> >>>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
> >>>>>>> limitation and we tested raven pretty extensively at the time.
> >>>>>>
> >>>>>> Thanks for taking the time to look. We have automated testing and a
> >>>>>> few igt gpu tools tests failed and after debugging we found that
> >>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
> >>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> >>>>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> >>>>>> VRAM buffers resolves the issue.
> >>>>>
> >>>>> + Harry and Leo
> >>>>>
> >>>>> This sounds like the memory placement issue we discussed last week.
> >>>>> In that case, the issue is related to where the buffer ends up when we
> >>>>> try to do an async flip.  In that case, we can't do an async flip
> >>>>> without a full modeset if the buffers locations are different than the
> >>>>> last modeset because we need to update more than just the buffer base
> >>>>> addresses.  This change works around that limitation by always forcing
> >>>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
> >>>>> those tests but will make the overall experience worse because we'll
> >>>>> end up effectively not being able to not fully utilize both gtt and
> >>>>> vram for display which would reintroduce all of the problems fixed by
> >>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> >>>>
> >>>> Thanks Alex, the thing is, we only observe this on Raven boards, why
> >>>> would Raven only be impacted by this? It would seem that all devices
> >>>> would have this issue, no? Also, I'm not familiar with how
> >>>
> >>> It depends on memory pressure and available memory in each pool.
> >>> E.g., initially the display buffer is in VRAM when the initial mode
> >>> set happens.  The watermarks, etc. are set for that scenario.  One of
> >>> the next frames ends up in a pool different than the original.  Now
> >>> the buffer is in GTT.  The async flip interface does a fast validation
> >>> to try and flip as soon as possible, but that validation fails because
> >>> the watermarks need to be updated which requires a full modeset.
>
> Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced
> a check for same memory placement on async flips was on a system with a DGPU,
> for which VRAM placement does matter:
> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
>
> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
> bandwidth validation depending on memory placement. There's a gpuvm_enable flag
> for SG, but it's statically set to 1 on APU DCN versions. It sounds like for
> APUs specifically, we *should* be able to ignore the mem placement check. I can
> spin up a patch to test this out.

Is the gpu_vm_support flag ever set for dGPUs?  The allowed domains
for display buffers are determined by
amdgpu_display_supported_domains() and we only allow GTT as a domain
if gpu_vm_support is set, which I think is just for APUs.  In that
case, we would probably only need the checks specifically for
CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
and GTT (only one or the other?).  dGPUs and really old APUs will
always get VRAM, and newer APUs will get VRAM | GTT.
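
As a rough illustration of the selection described above, here is a
simplified sketch, not the driver code. The sketch_* names and SKETCH_*
constants are made up for this example, and the real
amdgpu_display_supported_domains() may look at more than this one flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the AMDGPU_GEM_DOMAIN_* bits. */
#define SKETCH_DOMAIN_GTT  0x2u
#define SKETCH_DOMAIN_VRAM 0x4u

/*
 * Hypothetical, reduced model of the display domain selection:
 * VRAM is always allowed for scanout, and GTT is added only when
 * gpu_vm_support is set (which, per the discussion, should only
 * happen on APUs).
 */
uint32_t sketch_supported_domains(bool gpu_vm_support)
{
	uint32_t domain = SKETCH_DOMAIN_VRAM;

	if (gpu_vm_support)
		domain |= SKETCH_DOMAIN_GTT;

	return domain;
}

/* Usage: a dGPU-like device gets VRAM only, a newer APU gets both. */
void sketch_check_supported(void)
{
	assert(sketch_supported_domains(false) == SKETCH_DOMAIN_VRAM);
	assert(sketch_supported_domains(true) ==
	       (SKETCH_DOMAIN_VRAM | SKETCH_DOMAIN_GTT));
}
```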

Alex

>
> Thanks,
> Leo
>
> >>>
> >>> It's tricky to fix because you don't want to use the worst case
> >>> watermarks all the time because that will limit the number available
> >>> display options and you don't want to force everything to a particular
> >>> memory pool because that will limit the amount of memory that can be
> >>> used for display (which is what the patch in question fixed).  Ideally
> >>> the caller would do a test commit before the page flip to determine
> >>> whether or not it would succeed before issuing it and then we'd have
> >>> some feedback mechanism to tell the caller that the commit would fail
> >>> due to buffer placement so it would do a full modeset instead.  We
> >>> discussed this feedback mechanism last week at the display hackfest.
> >>>
> >>>
> >>>> kms_plane_alpha_blend works, but does this also support that test
> >>>> failing as the cause?
> >>>
> >>> That may be related.  I'm not too familiar with that test either, but
> >>> Leo or Harry can provide some guidance.
> >>>
> >>> Alex
> >>
> >> Thanks everyone for the input so far. I have a question for the
> >> maintainers, given that it seems that this is functionally broken for
> >> ASICs which are iGPUs, and there does not seem to be an easy fix, does
> >> it make sense to extend this proposed patch to all iGPUs until a more
> >> permanent fix can be identified? At the end of the day I'll take
> >> functional correctness over performance.
> >
> > It's not functional correctness, it's usability.  All that is
> > potentially broken is async flips (which depend on memory pressure and
> > buffer placement), while if you effectively revert the patch, you end
> > up  limiting all display buffers to either VRAM or GTT which may end
> > up causing the inability to display anything because there is not
> > enough memory in that pool for the next modeset.  We'll start getting
> > bug reports about blank screens and failure to set modes because of
> > memory pressure.  I think if we want a short term fix, it would be to
> > always set the worst case watermarks.  The downside to that is that it
> > would possibly cause some working display setups to stop working if
> > they were on the margins to begin with.
> >
> > Alex
> >
> >>
> >> Brian
> >>
> >>>
> >>>>
> >>>> Thanks again,
> >>>> Brian
> >>>>
> >>>>>
> >>>>> Alex
> >>>>>
> >>>>>>
> >>>>>> Brian
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Alex
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>>>>> Cc: Christian König <christian.koenig@amd.com>
> >>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>>>>> Cc: stable@vger.kernel.org # 6.1+
> >>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> >>>>>>>> Signed-off-by: Brian Geffon <bgeffon@google.com>
> >>>>>>>> ---
> >>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> >>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>> index 73403744331a..5d7f13e25b7c 100644
> >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> >>>>>>>>                                             uint32_t domain)
> >>>>>>>>  {
> >>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> >>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> >>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> >>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
> >>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
> >>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> >>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
> >>>>>>>> --
> >>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
> >>>>>>>>
>


* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-18 21:33                 ` Alex Deucher
@ 2025-07-18 22:01                   ` Leo Li
  2025-07-18 23:00                     ` Alex Deucher
  0 siblings, 1 reply; 18+ messages in thread
From: Leo Li @ 2025-07-18 22:01 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Brian Geffon, Wentland, Harry, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans,
	Thadeu Lima de Souza Cascardo, stable



On 2025-07-18 17:33, Alex Deucher wrote:
> On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng.li@amd.com> wrote:
>>
>>
>>
>> On 2025-07-18 16:07, Alex Deucher wrote:
>>> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>
>>>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>>
>>>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>
>>>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>>>>
>>>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>>>
>>>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
>>>>>>>>>> some older boards, such as Stoney and Carrizo do not support this.
>>>>>>>>>> It appears that at least one additional ASIC does not support this which
>>>>>>>>>> is Raven.
>>>>>>>>>>
>>>>>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
>>>>>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
>>>>>>>>>> and VRAM.
>>>>>>>>>
>>>>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
>>>>>>>>> stoney this is a hardware limitation (all display buffers need to be
>>>>>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
>>>>>>>>> limitation and we tested raven pretty extensively at the time.
>>>>>>>>
>>>>>>>> Thanks for taking the time to look. We have automated testing and a
>>>>>>>> few igt gpu tools tests failed and after debugging we found that
>>>>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
>>>>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
>>>>>>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
>>>>>>>> VRAM buffers resolves the issue.
>>>>>>>
>>>>>>> + Harry and Leo
>>>>>>>
>>>>>>> This sounds like the memory placement issue we discussed last week.
>>>>>>> In that case, the issue is related to where the buffer ends up when we
>>>>>>> try to do an async flip.  In that case, we can't do an async flip
>>>>>>> without a full modeset if the buffers locations are different than the
>>>>>>> last modeset because we need to update more than just the buffer base
>>>>>>> addresses.  This change works around that limitation by always forcing
>>>>>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
>>>>>>> those tests but will make the overall experience worse because we'll
>>>>>>> end up effectively not being able to not fully utilize both gtt and
>>>>>>> vram for display which would reintroduce all of the problems fixed by
>>>>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
>>>>>>
>>>>>> Thanks Alex, the thing is, we only observe this on Raven boards, why
>>>>>> would Raven only be impacted by this? It would seem that all devices
>>>>>> would have this issue, no? Also, I'm not familiar with how
>>>>>
>>>>> It depends on memory pressure and available memory in each pool.
>>>>> E.g., initially the display buffer is in VRAM when the initial mode
>>>>> set happens.  The watermarks, etc. are set for that scenario.  One of
>>>>> the next frames ends up in a pool different than the original.  Now
>>>>> the buffer is in GTT.  The async flip interface does a fast validation
>>>>> to try and flip as soon as possible, but that validation fails because
>>>>> the watermarks need to be updated which requires a full modeset.
>>
>> Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced
>> a check for same memory placement on async flips was on a system with a DGPU,
>> for which VRAM placement does matter:
>> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
>>
>> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
>> bandwidth validation depending on memory placement. There's a gpuvm_enable flag
>> for SG, but it's statically set to 1 on APU DCN versions. It sounds like for
>> APUs specifically, we *should* be able to ignore the mem placement check. I can
>> spin up a patch to test this out.
> 
> Is the gpu_vm_support flag ever set for dGPUs?  The allowed domains
> for display buffers are determined by
> amdgpu_display_supported_domains() and we only allow GTT as a domain
> if gpu_vm_support is set, which I think is just for APUs.  In that
> case, we would probably only need the checks specifically for
> CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
> and GTT (only one or the other?).  dGPUs and really old APUs will
> always get VRAM, and newer APUs will get VRAM | GTT.

It doesn't look like gpu_vm_support is set for DGPUs
https://elixir.bootlin.com/linux/v6.15.6/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L1866

Though interestingly, further up at #L1858, Raven has gpu_vm_support = 0. Maybe it had stability issues?
https://github.com/torvalds/linux/commit/098c13079c6fdd44f10586b69132c392ebf87450

- Leo

> 
> Alex
> 
>>
>> Thanks,
>> Leo
>>
>>>>>
>>>>> It's tricky to fix because you don't want to use the worst case
>>>>> watermarks all the time because that will limit the number available
>>>>> display options and you don't want to force everything to a particular
>>>>> memory pool because that will limit the amount of memory that can be
>>>>> used for display (which is what the patch in question fixed).  Ideally
>>>>> the caller would do a test commit before the page flip to determine
>>>>> whether or not it would succeed before issuing it and then we'd have
>>>>> some feedback mechanism to tell the caller that the commit would fail
>>>>> due to buffer placement so it would do a full modeset instead.  We
>>>>> discussed this feedback mechanism last week at the display hackfest.
>>>>>
>>>>>
>>>>>> kms_plane_alpha_blend works, but does this also support that test
>>>>>> failing as the cause?
>>>>>
>>>>> That may be related.  I'm not too familiar with that test either, but
>>>>> Leo or Harry can provide some guidance.
>>>>>
>>>>> Alex
>>>>
>>>> Thanks everyone for the input so far. I have a question for the
>>>> maintainers, given that it seems that this is functionally broken for
>>>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
>>>> it make sense to extend this proposed patch to all iGPUs until a more
>>>> permanent fix can be identified? At the end of the day I'll take
>>>> functional correctness over performance.
>>>
>>> It's not functional correctness, it's usability.  All that is
>>> potentially broken is async flips (which depend on memory pressure and
>>> buffer placement), while if you effectively revert the patch, you end
>>> up  limiting all display buffers to either VRAM or GTT which may end
>>> up causing the inability to display anything because there is not
>>> enough memory in that pool for the next modeset.  We'll start getting
>>> bug reports about blank screens and failure to set modes because of
>>> memory pressure.  I think if we want a short term fix, it would be to
>>> always set the worst case watermarks.  The downside to that is that it
>>> would possibly cause some working display setups to stop working if
>>> they were on the margins to begin with.
>>>
>>> Alex
>>>
>>>>
>>>> Brian
>>>>
>>>>>
>>>>>>
>>>>>> Thanks again,
>>>>>> Brian
>>>>>>
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>>>
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Alex
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>>>> Cc: stable@vger.kernel.org # 6.1+
>>>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
>>>>>>>>>> Signed-off-by: Brian Geffon <bgeffon@google.com>
>>>>>>>>>> ---
>>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
>>>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
>>>>>>>>>>                                             uint32_t domain)
>>>>>>>>>>  {
>>>>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
>>>>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
>>>>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
>>>>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
>>>>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
>>>>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
>>>>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
>>>>>>>>>> --
>>>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
>>>>>>>>>>
>>



* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-18 22:01                   ` Leo Li
@ 2025-07-18 23:00                     ` Alex Deucher
  2025-07-22 11:21                       ` Thadeu Lima de Souza Cascardo
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Deucher @ 2025-07-18 23:00 UTC (permalink / raw)
  To: Leo Li
  Cc: Brian Geffon, Wentland, Harry, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans,
	Thadeu Lima de Souza Cascardo, stable

[-- Attachment #1: Type: text/plain, Size: 11105 bytes --]

On Fri, Jul 18, 2025 at 6:01 PM Leo Li <sunpeng.li@amd.com> wrote:
>
>
>
> On 2025-07-18 17:33, Alex Deucher wrote:
> > On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng.li@amd.com> wrote:
> >>
> >>
> >>
> >> On 2025-07-18 16:07, Alex Deucher wrote:
> >>> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>
> >>>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>>>
> >>>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>
> >>>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>>>>>
> >>>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> >>>>>>>>>> some older boards, such as Stoney and Carrizo do not support this.
> >>>>>>>>>> It appears that at least one additional ASIC does not support this which
> >>>>>>>>>> is Raven.
> >>>>>>>>>>
> >>>>>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> >>>>>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
> >>>>>>>>>> and VRAM.
> >>>>>>>>>
> >>>>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
> >>>>>>>>> stoney this is a hardware limitation (all display buffers need to be
> >>>>>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
> > >>>>>>>>> limitation and we tested raven pretty extensively at the time.
> >>>>>>>>
> >>>>>>>> Thanks for taking the time to look. We have automated testing and a
> >>>>>>>> few igt gpu tools tests failed and after debugging we found that
> >>>>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
> >>>>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> >>>>>>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> >>>>>>>> VRAM buffers resolves the issue.
> >>>>>>>
> >>>>>>> + Harry and Leo
> >>>>>>>
> >>>>>>> This sounds like the memory placement issue we discussed last week.
> >>>>>>> In that case, the issue is related to where the buffer ends up when we
> >>>>>>> try to do an async flip.  In that case, we can't do an async flip
> >>>>>>> without a full modeset if the buffers locations are different than the
> >>>>>>> last modeset because we need to update more than just the buffer base
> >>>>>>> addresses.  This change works around that limitation by always forcing
> >>>>>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
> >>>>>>> those tests but will make the overall experience worse because we'll
> >>>>>>> end up effectively not being able to not fully utilize both gtt and
> >>>>>>> vram for display which would reintroduce all of the problems fixed by
> >>>>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> >>>>>>
> >>>>>> Thanks Alex, the thing is, we only observe this on Raven boards, why
> >>>>>> would Raven only be impacted by this? It would seem that all devices
> >>>>>> would have this issue, no? Also, I'm not familiar with how
> >>>>>
> >>>>> It depends on memory pressure and available memory in each pool.
> >>>>> E.g., initially the display buffer is in VRAM when the initial mode
> >>>>> set happens.  The watermarks, etc. are set for that scenario.  One of
> >>>>> the next frames ends up in a pool different than the original.  Now
> >>>>> the buffer is in GTT.  The async flip interface does a fast validation
> >>>>> to try and flip as soon as possible, but that validation fails because
> >>>>> the watermarks need to be updated which requires a full modeset.
> >>
> >> Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced
> >> a check for same memory placement on async flips was on a system with a DGPU,
> >> for which VRAM placement does matter:
> >> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
> >>
> >> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
> >> bandwidth validation depending on memory placement. There's a gpuvm_enable flag
> >> for SG, but it's statically set to 1 on APU DCN versions. It sounds like for
> >> APUs specifically, we *should* be able to ignore the mem placement check. I can
> >> spin up a patch to test this out.
> >
> > Is the gpu_vm_support flag ever set for dGPUs?  The allowed domains
> > for display buffers are determined by
> > amdgpu_display_supported_domains() and we only allow GTT as a domain
> > if gpu_vm_support is set, which I think is just for APUs.  In that
> > case, we would probably only need the checks specifically for
> > CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
> > and GTT (only one or the other?).  dGPUs and really old APUs will
> > always get VRAM, and newer APUs will get VRAM | GTT.
>
> It doesn't look like gpu_vm_support is set for DGPUs
> https://elixir.bootlin.com/linux/v6.15.6/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L1866
>
> Though interestingly, further up at #L1858, Raven has gpu_vm_support = 0. Maybe it had stability issues?
> https://github.com/torvalds/linux/commit/098c13079c6fdd44f10586b69132c392ebf87450

We need to be a little careful here: asic_type == CHIP_RAVEN covers
several variants:
apu_flags & AMD_APU_IS_RAVEN - raven1 (gpu_vm_support = false)
apu_flags & AMD_APU_IS_RAVEN2 - raven2 (gpu_vm_support = true)
apu_flags & AMD_APU_IS_PICASSO - picasso (gpu_vm_support = true)

amdgpu_display_supported_domains() only sets AMDGPU_GEM_DOMAIN_GTT if
gpu_vm_support is true, so we'd never get into the check in
amdgpu_bo_get_preferred_domain() for raven1.

Anyway, back to your suggestion, I think we can probably drop the
checks as you should always get a compatible memory buffer due to
amdgpu_bo_get_preferred_domain(). Pinning should fail if we can't pin
in the required domain.  amdgpu_display_supported_domains() will
ensure you always get VRAM or GTT or VRAM | GTT depending on what the
chip supports.  Then amdgpu_bo_get_preferred_domain() will either
leave that as is, or force VRAM or GTT for the STONEY/CARRIZO case.
On the off chance we do get incompatible memory, something like the
attached patch should do the trick.
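
To make the second step concrete, a minimal sketch of the
STONEY/CARRIZO forcing, modeled on the unpatched
amdgpu_bo_get_preferred_domain() logic visible in the diff quoted in
this thread. Everything named sketch_* or SKETCH_* is a stand-in for
illustration, and the threshold value is an assumption; the driver uses
AMDGPU_SG_THRESHOLD:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the AMDGPU_GEM_DOMAIN_* bits. */
#define SKETCH_DOMAIN_GTT  0x2u
#define SKETCH_DOMAIN_VRAM 0x4u
/* Assumed value, for illustration only. */
#define SKETCH_SG_THRESHOLD (256ull * 1024 * 1024)

enum sketch_asic { SKETCH_CARRIZO, SKETCH_STONEY, SKETCH_RAVEN, SKETCH_OTHER };

/*
 * If both pools are allowed but the chip cannot mix them, pick one:
 * VRAM by default, GTT when the carve-out is small.  Everything else
 * passes through unchanged.
 */
uint32_t sketch_preferred_domain(enum sketch_asic asic,
				 uint64_t real_vram_size,
				 uint32_t domain)
{
	if (domain == (SKETCH_DOMAIN_VRAM | SKETCH_DOMAIN_GTT) &&
	    (asic == SKETCH_CARRIZO || asic == SKETCH_STONEY)) {
		domain = SKETCH_DOMAIN_VRAM;
		if (real_vram_size <= SKETCH_SG_THRESHOLD)
			domain = SKETCH_DOMAIN_GTT;
	}

	return domain;
}

/* Usage: only the carrizo/stoney case is narrowed to a single pool. */
void sketch_check_preferred(void)
{
	const uint32_t both = SKETCH_DOMAIN_VRAM | SKETCH_DOMAIN_GTT;

	assert(sketch_preferred_domain(SKETCH_CARRIZO, 128ull << 20, both) ==
	       SKETCH_DOMAIN_GTT);
	assert(sketch_preferred_domain(SKETCH_STONEY, 1ull << 30, both) ==
	       SKETCH_DOMAIN_VRAM);
	assert(sketch_preferred_domain(SKETCH_RAVEN, 128ull << 20, both) == both);
}
```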

Alex


>
> - Leo
>
> >
> > Alex
> >
> >>
> >> Thanks,
> >> Leo
> >>
> >>>>>
> >>>>> It's tricky to fix because you don't want to use the worst case
> >>>>> watermarks all the time because that will limit the number available
> >>>>> display options and you don't want to force everything to a particular
> >>>>> memory pool because that will limit the amount of memory that can be
> >>>>> used for display (which is what the patch in question fixed).  Ideally
> >>>>> the caller would do a test commit before the page flip to determine
> >>>>> whether or not it would succeed before issuing it and then we'd have
> >>>>> some feedback mechanism to tell the caller that the commit would fail
> >>>>> due to buffer placement so it would do a full modeset instead.  We
> >>>>> discussed this feedback mechanism last week at the display hackfest.
> >>>>>
> >>>>>
> >>>>>> kms_plane_alpha_blend works, but does this also support that test
> >>>>>> failing as the cause?
> >>>>>
> >>>>> That may be related.  I'm not too familiar with that test either, but
> >>>>> Leo or Harry can provide some guidance.
> >>>>>
> >>>>> Alex
> >>>>
> >>>> Thanks everyone for the input so far. I have a question for the
> >>>> maintainers, given that it seems that this is functionally broken for
> >>>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
> >>>> it make sense to extend this proposed patch to all iGPUs until a more
> >>>> permanent fix can be identified? At the end of the day I'll take
> >>>> functional correctness over performance.
> >>>
> >>> It's not functional correctness, it's usability.  All that is
> >>> potentially broken is async flips (which depend on memory pressure and
> >>> buffer placement), while if you effectively revert the patch, you end
> >>> up  limiting all display buffers to either VRAM or GTT which may end
> >>> up causing the inability to display anything because there is not
> >>> enough memory in that pool for the next modeset.  We'll start getting
> >>> bug reports about blank screens and failure to set modes because of
> >>> memory pressure.  I think if we want a short term fix, it would be to
> >>> always set the worst case watermarks.  The downside to that is that it
> >>> would possibly cause some working display setups to stop working if
> >>> they were on the margins to begin with.
> >>>
> >>> Alex
> >>>
> >>>>
> >>>> Brian
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks again,
> >>>>>> Brian
> >>>>>>
> >>>>>>>
> >>>>>>> Alex
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Brian
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Alex
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>>>>>>> Cc: Christian König <christian.koenig@amd.com>
> >>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>>>>>>> Cc: stable@vger.kernel.org # 6.1+
> >>>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> >>>>>>>>>> Signed-off-by: Brian Geffon <bgeffon@google.com>
> >>>>>>>>>> ---
> >>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> >>>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
> >>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> >>>>>>>>>>                                             uint32_t domain)
> >>>>>>>>>>  {
> >>>>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> >>>>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> >>>>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> >>>>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
> >>>>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
> >>>>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> >>>>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
> >>>>>>>>>> --
> >>>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
> >>>>>>>>>>
> >>
>

[-- Attachment #2: 0001-drm-amd-display-refine-framebuffer-placement-checks.patch --]
[-- Type: text/x-patch, Size: 3966 bytes --]

From cce1652c62c42c858de64c306ea0ddc7af3bd0b1 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Fri, 18 Jul 2025 18:40:26 -0400
Subject: [PATCH] drm/amd/display: refine framebuffer placement checks

When we commit planes, we need to make sure the
framebuffer memory locations are compatible. Various
hardware has the following requirements for display buffers:
dGPUs, old APUs, raven1 - must be in VRAM
carrizo/stoney - must be in VRAM or GTT, but not both
newer APUs (raven2/picasso and newer) - can be in VRAM or GTT

You should always get a compatible memory buffer due to
amdgpu_bo_get_preferred_domain(). amdgpu_display_supported_domains()
will ensure you always get VRAM or GTT or VRAM | GTT depending on
what the chip supports.  Then amdgpu_bo_get_preferred_domain()
will either leave that as is when pinning, or force VRAM or GTT
for the STONEY/CARRIZO case.

As such the checks could probably be removed, but on the off chance
we do end up getting different memory pools for the old
and new framebuffers, refine the check to take into account the
hardware capabilities.

Fixes: a7c0cad0dc06 ("drm/amd/display: ensure async flips are only accepted for fast updates")
Reported-by: Brian Geffon <bgeffon@google.com>
Cc: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 ++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 129476b6d5fa9..de2bd789ec15b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9288,6 +9288,18 @@ static void amdgpu_dm_enable_self_refresh(struct amdgpu_crtc *acrtc_attach,
 	}
 }
 
+static bool amdgpu_dm_mem_type_compatible(struct amdgpu_device *adev,
+					  struct drm_framebuffer *old_fb,
+					  struct drm_framebuffer *new_fb)
+{
+	if (!adev->mode_info.gpu_vm_support ||
+	    (adev->asic_type == CHIP_CARRIZO) ||
+	    (adev->asic_type == CHIP_STONEY))
+		return get_mem_type(old_fb) == get_mem_type(new_fb);
+
+	return true;
+}
+
 static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 				    struct drm_device *dev,
 				    struct amdgpu_display_manager *dm,
@@ -9465,7 +9477,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 		 */
 		if (crtc->state->async_flip &&
 		    (acrtc_state->update_type != UPDATE_TYPE_FAST ||
-		     get_mem_type(old_plane_state->fb) != get_mem_type(fb)))
+		     !amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb)))
 			drm_warn_once(state->dev,
 				      "[PLANE:%d:%s] async flip with non-fast update\n",
 				      plane->base.id, plane->name);
@@ -9473,7 +9485,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 		bundle->flip_addrs[planes_count].flip_immediate =
 			crtc->state->async_flip &&
 			acrtc_state->update_type == UPDATE_TYPE_FAST &&
-			get_mem_type(old_plane_state->fb) == get_mem_type(fb);
+			amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb);
 
 		timestamp_ns = ktime_get_ns();
 		bundle->flip_addrs[planes_count].flip_timestamp_in_us = div_u64(timestamp_ns, 1000);
@@ -11760,6 +11772,7 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
 					    struct drm_atomic_state *state,
 					    struct drm_crtc_state *crtc_state)
 {
+	struct amdgpu_device *adev = drm_to_adev(dev);
 	struct drm_plane *plane;
 	struct drm_plane_state *new_plane_state, *old_plane_state;
 
@@ -11773,7 +11786,8 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
 		}
 
 		if (old_plane_state->fb && new_plane_state->fb &&
-		    get_mem_type(old_plane_state->fb) != get_mem_type(new_plane_state->fb))
+		    !amdgpu_dm_mem_type_compatible(adev, old_plane_state->fb,
+						   new_plane_state->fb))
 			return true;
 	}
 
-- 
2.50.1

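For readers who want to try the decision logic outside the kernel, the
check added by the patch above can be modeled in a few lines. This is a
standalone sketch under stated assumptions: struct sketch_dev and enum
sketch_mem_type are hypothetical stand-ins for the amdgpu_device fields
and the get_mem_type() result, not kernel types:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for what get_mem_type() reports for a framebuffer. */
enum sketch_mem_type { SKETCH_MEM_VRAM, SKETCH_MEM_GTT };

/* Hypothetical reduction of the device state the check depends on. */
struct sketch_dev {
	bool gpu_vm_support;
	bool is_carrizo_or_stoney;
};

/*
 * Devices without gpu_vm_support, and carrizo/stoney, need the old and
 * new framebuffers in the same pool.  Newer APUs can scan out from
 * either pool, so any combination is treated as compatible.
 */
bool sketch_mem_type_compatible(const struct sketch_dev *dev,
				enum sketch_mem_type old_type,
				enum sketch_mem_type new_type)
{
	if (!dev->gpu_vm_support || dev->is_carrizo_or_stoney)
		return old_type == new_type;

	return true;
}

/* Usage: a VRAM to GTT move only blocks the fast path on restricted chips. */
void sketch_check_compatible(void)
{
	struct sketch_dev newer_apu = { .gpu_vm_support = true };
	struct sketch_dev stoney = { .gpu_vm_support = true,
				     .is_carrizo_or_stoney = true };

	assert(sketch_mem_type_compatible(&newer_apu, SKETCH_MEM_VRAM, SKETCH_MEM_GTT));
	assert(!sketch_mem_type_compatible(&stoney, SKETCH_MEM_VRAM, SKETCH_MEM_GTT));
}
```
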


* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-18 23:00                     ` Alex Deucher
@ 2025-07-22 11:21                       ` Thadeu Lima de Souza Cascardo
  2025-07-22 13:54                         ` Leo Li
  0 siblings, 1 reply; 18+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2025-07-22 11:21 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Leo Li, Brian Geffon, Wentland, Harry, Alex Deucher,
	christian.koenig, David Airlie, Simona Vetter, Tvrtko Ursulin,
	Yunxiang Li, Lijo Lazar, Prike Liang, Pratap Nirujogi,
	Luben Tuikov, amd-gfx, dri-devel, linux-kernel, Garrick Evans,
	stable

On Fri, Jul 18, 2025 at 07:00:39PM -0400, Alex Deucher wrote:
> On Fri, Jul 18, 2025 at 6:01 PM Leo Li <sunpeng.li@amd.com> wrote:
> >
> >
> >
> > On 2025-07-18 17:33, Alex Deucher wrote:
> > > On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng.li@amd.com> wrote:
> > >>
> > >>
> > >>
> > >> On 2025-07-18 16:07, Alex Deucher wrote:
> > >>> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@google.com> wrote:
> > >>>>
> > >>>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
> > >>>>>
> > >>>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
> > >>>>>>
> > >>>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
> > >>>>>>>>
> > >>>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > >>>>>>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> > >>>>>>>>>> some older boards, such as Stoney and Carrizo do not support this.
> > >>>>>>>>>> It appears that at least one additional ASIC does not support this which
> > >>>>>>>>>> is Raven.
> > >>>>>>>>>>
> > >>>>>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> > >>>>>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
> > >>>>>>>>>> and VRAM.
> > >>>>>>>>>
> > >>>>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
> > >>>>>>>>> stoney this is a hardware limitation (all display buffers need to be
> > >>>>>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
> > >>>>>>>>> limitation and we tested raven pretty extensively at the time.
> > >>>>>>>>
> > >>>>>>>> Thanks for taking the time to look. We have automated testing and a
> > >>>>>>>> few igt gpu tools tests failed and after debugging we found that
> > >>>>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
> > >>>>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> > >>>>>>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> > >>>>>>>> VRAM buffers resolves the issue.
> > >>>>>>>
> > >>>>>>> + Harry and Leo
> > >>>>>>>
> > >>>>>>> This sounds like the memory placement issue we discussed last week.
> > >>>>>>> In that case, the issue is related to where the buffer ends up when we
> > >>>>>>> try to do an async flip.  In that case, we can't do an async flip
> > >>>>>>> without a full modeset if the buffers locations are different than the
> > >>>>>>> last modeset because we need to update more than just the buffer base
> > >>>>>>> addresses.  This change works around that limitation by always forcing
> > >>>>>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
> > >>>>>>> those tests but will make the overall experience worse because we'll
> > >>>>>>> end up effectively not being able to fully utilize both gtt and
> > >>>>>>> vram for display which would reintroduce all of the problems fixed by
> > >>>>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> > >>>>>>
> > >>>>>> Thanks Alex, the thing is, we only observe this on Raven boards, why
> > >>>>>> would Raven only be impacted by this? It would seem that all devices
> > >>>>>> would have this issue, no? Also, I'm not familiar with how
> > >>>>>
> > >>>>> It depends on memory pressure and available memory in each pool.
> > >>>>> E.g., initially the display buffer is in VRAM when the initial mode
> > >>>>> set happens.  The watermarks, etc. are set for that scenario.  One of
> > >>>>> the next frames ends up in a pool different than the original.  Now
> > >>>>> the buffer is in GTT.  The async flip interface does a fast validation
> > >>>>> to try and flip as soon as possible, but that validation fails because
> > >>>>> the watermarks need to be updated which requires a full modeset.
> > >>
> > >> Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced
> > >> a check for same memory placement on async flips was on a system with a DGPU,
> > >> for which VRAM placement does matter:
> > >> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
> > >>
> > >> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
> > >> bandwidth validation depending on memory placement. There's a gpuvm_enable flag
> > >> for SG, but it's statically set to 1 on APU DCN versions. It sounds like for
> > >> APUs specifically, we *should* be able to ignore the mem placement check. I can
> > >> spin up a patch to test this out.
> > >
> > > Is the gpu_vm_support flag ever set for dGPUs?  The allowed domains
> > > for display buffers are determined by
> > > amdgpu_display_supported_domains() and we only allow GTT as a domain
> > > if gpu_vm_support is set, which I think is just for APUs.  In that
> > > case, we could probably only need the checks specifically for
> > > CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
> > > and GTT (only one or the other?).  dGPUs and really old APUs will
> > > always get VRAM, and newer APUs will get VRAM | GTT.
> >
> > It doesn't look like gpu_vm_support is set for DGPUs
> > https://elixir.bootlin.com/linux/v6.15.6/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L1866
> >
> > Though interestingly, further up at #L1858, Raven has gpu_vm_support = 0. Maybe it had stability issues?
> > https://github.com/torvalds/linux/commit/098c13079c6fdd44f10586b69132c392ebf87450
> 
> We need to be a little careful here asic_type == CHIP_RAVEN covers
> several variants:
> apu_flags & AMD_APU_IS_RAVEN - raven1 (gpu_vm_support = false)
> apu_flags & AMD_APU_IS_RAVEN2 - raven2 (gpu_vm_support = true)
> apu_flags & AMD_APU_IS_PICASSO - picasso (gpu_vm_support = true)
> 
> amdgpu_display_supported_domains() only sets AMDGPU_GEM_DOMAIN_GTT if
> gpu_vm_support is true.  so we'd never get into the check in
> amdgpu_bo_get_preferred_domain() for raven1.
> 
> Anyway, back to your suggestion, I think we can probably drop the
> checks as you should always get a compatible memory buffer due to
> amdgpu_bo_get_preferred_domain(). Pinning should fail if we can't pin
> in the required domain.  amdgpu_display_supported_domains() will
> ensure you always get VRAM or GTT or VRAM | GTT depending on what the
> chip supports.  Then amdgpu_bo_get_preferred_domain() will either
> leave that as is, or force VRAM or GTT for the STONEY/CARRIZO case.
> On the off chance we do get incompatible memory, something like the
> attached patch should do the trick.
> 
> Alex
> 

Thanks for the patch, Alex.

I have tested it, and though kms_async_flips and kms_plane_alpha_blend
pass, kms_plane_cursor still fails.

I am going to investigate a little more today and send more details from my
findings.

Thanks.
Cascardo.
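
For readers following the thread, the placement rules Alex describes in the
quoted message can be modelled in a few lines of standalone C. This is only a
sketch under stated assumptions, not the driver code: every model_* name, the
per-variant chip enum, the domain bit values and the 256 MiB threshold are
stand-ins chosen for illustration, and the real
amdgpu_display_supported_domains() looks at more than gpu_vm_support.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; not the kernel's definitions. */
#define MODEL_DOMAIN_GTT   0x2u
#define MODEL_DOMAIN_VRAM  0x4u
#define MODEL_SG_THRESHOLD (256ull << 20)	/* assumed threshold */

/* Hypothetical enum: the kernel distinguishes Raven variants via apu_flags. */
enum model_chip {
	MODEL_DGPU, MODEL_CARRIZO, MODEL_STONEY,
	MODEL_RAVEN1, MODEL_RAVEN2, MODEL_PICASSO
};

enum model_mem { MODEL_MEM_VRAM, MODEL_MEM_GTT };

struct model_dev {
	enum model_chip chip;
	bool gpu_vm_support;
	uint64_t vram_size;
};

/* VRAM is always allowed; GTT only when gpu_vm_support is set. */
static uint32_t model_supported_domains(const struct model_dev *d)
{
	uint32_t domain = MODEL_DOMAIN_VRAM;

	if (d->gpu_vm_support)
		domain |= MODEL_DOMAIN_GTT;
	return domain;
}

/* Carrizo/Stoney cannot mix pools, so VRAM | GTT collapses to one of them. */
static uint32_t model_preferred_domain(const struct model_dev *d,
				       uint32_t domain)
{
	if (domain == (MODEL_DOMAIN_VRAM | MODEL_DOMAIN_GTT) &&
	    (d->chip == MODEL_CARRIZO || d->chip == MODEL_STONEY)) {
		domain = MODEL_DOMAIN_VRAM;
		if (d->vram_size <= MODEL_SG_THRESHOLD)
			domain = MODEL_DOMAIN_GTT;
	}
	return domain;
}

/* Mirrors the shape of the check in the attached patch. */
static bool model_mem_type_compatible(const struct model_dev *d,
				      enum model_mem old_mem,
				      enum model_mem new_mem)
{
	if (!d->gpu_vm_support ||
	    d->chip == MODEL_CARRIZO || d->chip == MODEL_STONEY)
		return old_mem == new_mem;
	return true;
}
```

In this model a Picasso-like device treats any VRAM/GTT pair as compatible,
while a raven1-like device (gpu_vm_support = false) only accepts matching
placements.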

> 
> >
> > - Leo
> >
> > >
> > > Alex
> > >
> > >>
> > >> Thanks,
> > >> Leo
> > >>
> > >>>>>
> > >>>>> It's tricky to fix because you don't want to use the worst case
> > >>>>> watermarks all the time because that will limit the number of available
> > >>>>> display options and you don't want to force everything to a particular
> > >>>>> memory pool because that will limit the amount of memory that can be
> > >>>>> used for display (which is what the patch in question fixed).  Ideally
> > >>>>> the caller would do a test commit before the page flip to determine
> > >>>>> whether or not it would succeed before issuing it and then we'd have
> > >>>>> some feedback mechanism to tell the caller that the commit would fail
> > >>>>> due to buffer placement so it would do a full modeset instead.  We
> > >>>>> discussed this feedback mechanism last week at the display hackfest.
> > >>>>>
> > >>>>>
> > >>>>>> kms_plane_alpha_blend works, but does this also support that test
> > >>>>>> failing as the cause?
> > >>>>>
> > >>>>> That may be related.  I'm not too familiar with that test either, but
> > >>>>> Leo or Harry can provide some guidance.
> > >>>>>
> > >>>>> Alex
> > >>>>
> > >>>> Thanks everyone for the input so far. I have a question for the
> > >>>> maintainers, given that it seems that this is functionally broken for
> > >>>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
> > >>>> it make sense to extend this proposed patch to all iGPUs until a more
> > >>>> permanent fix can be identified? At the end of the day I'll take
> > >>>> functional correctness over performance.
> > >>>
> > >>> It's not functional correctness, it's usability.  All that is
> > >>> potentially broken is async flips (which depend on memory pressure and
> > >>> buffer placement), while if you effectively revert the patch, you end
> > >>> up  limiting all display buffers to either VRAM or GTT which may end
> > >>> up causing the inability to display anything because there is not
> > >>> enough memory in that pool for the next modeset.  We'll start getting
> > >>> bug reports about blank screens and failure to set modes because of
> > >>> memory pressure.  I think if we want a short term fix, it would be to
> > >>> always set the worst case watermarks.  The downside to that is that it
> > >>> would possibly cause some working display setups to stop working if
> > >>> they were on the margins to begin with.
> > >>>
> > >>> Alex
> > >>>
> > >>>>
> > >>>> Brian
> > >>>>
> > >>>>>
> > >>>>>>
> > >>>>>> Thanks again,
> > >>>>>> Brian
> > >>>>>>
> > >>>>>>>
> > >>>>>>> Alex
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> Brian
> > >>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Alex
> > >>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > >>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> > >>>>>>>>>> Cc: Christian König <christian.koenig@amd.com>
> > >>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> > >>>>>>>>>> Cc: stable@vger.kernel.org # 6.1+
> > >>>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> > >>>>>>>>>> Signed-off-by: Brian Geffon <bgeffon@google.com>
> > >>>>>>>>>> ---
> > >>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> > >>>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> > >>>>>>>>>>
> > >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > >>>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
> > >>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > >>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > >>>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> > >>>>>>>>>>                                             uint32_t domain)
> > >>>>>>>>>>  {
> > >>>>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> > >>>>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> > >>>>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> > >>>>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
> > >>>>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
> > >>>>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> > >>>>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
> > >>>>>>>>>> --
> > >>>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
> > >>>>>>>>>>
> > >>
> >

> From cce1652c62c42c858de64c306ea0ddc7af3bd0b1 Mon Sep 17 00:00:00 2001
> From: Alex Deucher <alexander.deucher@amd.com>
> Date: Fri, 18 Jul 2025 18:40:26 -0400
> Subject: [PATCH] drm/amd/display: refine framebuffer placement checks
> 
> When we commit planes, we need to make sure the
> framebuffer memory locations are compatible. Various
> hardware has the following requirements for display buffers:
> dGPUs, old APUs, raven1 - must be in VRAM
> carrizo/stoney - must be in VRAM or GTT, but not both
> newer APUs (raven2/picasso and newer) - can be in VRAM or GTT
> 
> You should always get a compatible memory buffer due to
> amdgpu_bo_get_preferred_domain(). amdgpu_display_supported_domains()
> will ensure you always get VRAM or GTT or VRAM | GTT depending on
> what the chip supports.  Then amdgpu_bo_get_preferred_domain()
> will either leave that as is when pinning, or force VRAM or GTT
> for the STONEY/CARRIZO case.
> 
> As such the checks could probably be removed, but on the off chance
> we do end up getting different memory pool for the old
> and new framebuffers, refine the check to take into account the
> hardware capabilities.
> 
> Fixes: a7c0cad0dc06 ("drm/amd/display: ensure async flips are only accepted for fast updates")
> Reported-by: Brian Geffon <bgeffon@google.com>
> Cc: Leo Li <sunpeng.li@amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 ++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 129476b6d5fa9..de2bd789ec15b 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -9288,6 +9288,18 @@ static void amdgpu_dm_enable_self_refresh(struct amdgpu_crtc *acrtc_attach,
>  	}
>  }
>  
> +static bool amdgpu_dm_mem_type_compatible(struct amdgpu_device *adev,
> +					  struct drm_framebuffer *old_fb,
> +					  struct drm_framebuffer *new_fb)
> +{
> +	if (!adev->mode_info.gpu_vm_support ||
> +	    (adev->asic_type == CHIP_CARRIZO) ||
> +	    (adev->asic_type == CHIP_STONEY))
> +		return get_mem_type(old_fb) == get_mem_type(new_fb);
> +
> +	return true;
> +}
> +
>  static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>  				    struct drm_device *dev,
>  				    struct amdgpu_display_manager *dm,
> @@ -9465,7 +9477,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>  		 */
>  		if (crtc->state->async_flip &&
>  		    (acrtc_state->update_type != UPDATE_TYPE_FAST ||
> -		     get_mem_type(old_plane_state->fb) != get_mem_type(fb)))
> +		     !amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb)))
>  			drm_warn_once(state->dev,
>  				      "[PLANE:%d:%s] async flip with non-fast update\n",
>  				      plane->base.id, plane->name);
> @@ -9473,7 +9485,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>  		bundle->flip_addrs[planes_count].flip_immediate =
>  			crtc->state->async_flip &&
>  			acrtc_state->update_type == UPDATE_TYPE_FAST &&
> -			get_mem_type(old_plane_state->fb) == get_mem_type(fb);
> +			amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb);
>  
>  		timestamp_ns = ktime_get_ns();
>  		bundle->flip_addrs[planes_count].flip_timestamp_in_us = div_u64(timestamp_ns, 1000);
> @@ -11760,6 +11772,7 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
>  					    struct drm_atomic_state *state,
>  					    struct drm_crtc_state *crtc_state)
>  {
> +	struct amdgpu_device *adev = drm_to_adev(dev);
>  	struct drm_plane *plane;
>  	struct drm_plane_state *new_plane_state, *old_plane_state;
>  
> @@ -11773,7 +11786,8 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
>  		}
>  
>  		if (old_plane_state->fb && new_plane_state->fb &&
> -		    get_mem_type(old_plane_state->fb) != get_mem_type(new_plane_state->fb))
> +		    !amdgpu_dm_mem_type_compatible(adev, old_plane_state->fb,
> +						   new_plane_state->fb))
>  			return true;
>  	}
>  
> -- 
> 2.50.1
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-22 11:21                       ` Thadeu Lima de Souza Cascardo
@ 2025-07-22 13:54                         ` Leo Li
  2025-07-28 16:38                           ` Alex Deucher
  0 siblings, 1 reply; 18+ messages in thread
From: Leo Li @ 2025-07-22 13:54 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo, Alex Deucher
  Cc: Brian Geffon, Wentland, Harry, Alex Deucher, christian.koenig,
	David Airlie, Simona Vetter, Tvrtko Ursulin, Yunxiang Li,
	Lijo Lazar, Prike Liang, Pratap Nirujogi, Luben Tuikov, amd-gfx,
	dri-devel, linux-kernel, Garrick Evans, stable



On 2025-07-22 07:21, Thadeu Lima de Souza Cascardo wrote:
> On Fri, Jul 18, 2025 at 07:00:39PM -0400, Alex Deucher wrote:
>> On Fri, Jul 18, 2025 at 6:01 PM Leo Li <sunpeng.li@amd.com> wrote:
>>>
>>>
>>>
>>> On 2025-07-18 17:33, Alex Deucher wrote:
>>>> On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng.li@amd.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2025-07-18 16:07, Alex Deucher wrote:
>>>>>> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>>
>>>>>>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>>>>>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
>>>>>>>>>>>>> some older boards, such as Stoney and Carrizo do not support this.
>>>>>>>>>>>>> It appears that at least one additional ASIC does not support this which
>>>>>>>>>>>>> is Raven.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
>>>>>>>>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
>>>>>>>>>>>>> and VRAM.
>>>>>>>>>>>>
>>>>>>>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
>>>>>>>>>>>> stoney this is a hardware limitation (all display buffers need to be
>>>>>>>>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
>>>>>>>>>>>> limitation and we tested raven pretty extensively at the time.
>>>>>>>>>>>
>>>>>>>>>>> Thanks for taking the time to look. We have automated testing and a
>>>>>>>>>>> few igt gpu tools tests failed and after debugging we found that
>>>>>>>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
>>>>>>>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
>>>>>>>>>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
>>>>>>>>>>> VRAM buffers resolves the issue.
>>>>>>>>>>
>>>>>>>>>> + Harry and Leo
>>>>>>>>>>
>>>>>>>>>> This sounds like the memory placement issue we discussed last week.
>>>>>>>>>> In that case, the issue is related to where the buffer ends up when we
>>>>>>>>>> try to do an async flip.  In that case, we can't do an async flip
>>>>>>>>>> without a full modeset if the buffers locations are different than the
>>>>>>>>>> last modeset because we need to update more than just the buffer base
>>>>>>>>>> addresses.  This change works around that limitation by always forcing
>>>>>>>>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
>>>>>>>>>> those tests but will make the overall experience worse because we'll
>>>>>>>>>> end up effectively not being able to fully utilize both gtt and
>>>>>>>>>> vram for display which would reintroduce all of the problems fixed by
>>>>>>>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
>>>>>>>>>
>>>>>>>>> Thanks Alex, the thing is, we only observe this on Raven boards, why
>>>>>>>>> would Raven only be impacted by this? It would seem that all devices
>>>>>>>>> would have this issue, no? Also, I'm not familiar with how
>>>>>>>>
>>>>>>>> It depends on memory pressure and available memory in each pool.
>>>>>>>> E.g., initially the display buffer is in VRAM when the initial mode
>>>>>>>> set happens.  The watermarks, etc. are set for that scenario.  One of
>>>>>>>> the next frames ends up in a pool different than the original.  Now
>>>>>>>> the buffer is in GTT.  The async flip interface does a fast validation
>>>>>>>> to try and flip as soon as possible, but that validation fails because
>>>>>>>> the watermarks need to be updated which requires a full modeset.
>>>>>
>>>>> Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced
>>>>> a check for same memory placement on async flips was on a system with a DGPU,
>>>>> for which VRAM placement does matter:
>>>>> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
>>>>>
>>>>> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
>>>>> bandwidth validation depending on memory placement. There's a gpuvm_enable flag
>>>>> for SG, but it's statically set to 1 on APU DCN versions. It sounds like for
>>>>> APUs specifically, we *should* be able to ignore the mem placement check. I can
>>>>> spin up a patch to test this out.
>>>>
>>>> Is the gpu_vm_support flag ever set for dGPUs?  The allowed domains
>>>> for display buffers are determined by
>>>> amdgpu_display_supported_domains() and we only allow GTT as a domain
>>>> if gpu_vm_support is set, which I think is just for APUs.  In that
>>>> case, we could probably only need the checks specifically for
>>>> CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
>>>> and GTT (only one or the other?).  dGPUs and really old APUs will
>>>> always get VRAM, and newer APUs will get VRAM | GTT.
>>>
>>> It doesn't look like gpu_vm_support is set for DGPUs
>>> https://elixir.bootlin.com/linux/v6.15.6/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L1866
>>>
>>> Though interestingly, further up at #L1858, Raven has gpu_vm_support = 0. Maybe it had stability issues?
>>> https://github.com/torvalds/linux/commit/098c13079c6fdd44f10586b69132c392ebf87450
>>
>> We need to be a little careful here asic_type == CHIP_RAVEN covers
>> several variants:
>> apu_flags & AMD_APU_IS_RAVEN - raven1 (gpu_vm_support = false)
>> apu_flags & AMD_APU_IS_RAVEN2 - raven2 (gpu_vm_support = true)
>> apu_flags & AMD_APU_IS_PICASSO - picasso (gpu_vm_support = true)
>>
>> amdgpu_display_supported_domains() only sets AMDGPU_GEM_DOMAIN_GTT if
>> gpu_vm_support is true.  so we'd never get into the check in
>> amdgpu_bo_get_preferred_domain() for raven1.
>>
>> Anyway, back to your suggestion, I think we can probably drop the
>> checks as you should always get a compatible memory buffer due to
>> amdgpu_bo_get_preferred_domain(). Pinning should fail if we can't pin
>> in the required domain.  amdgpu_display_supported_domains() will
>> ensure you always get VRAM or GTT or VRAM | GTT depending on what the
>> chip supports.  Then amdgpu_bo_get_preferred_domain() will either
>> leave that as is, or force VRAM or GTT for the STONEY/CARRIZO case.
>> On the off chance we do get incompatible memory, something like the
>> attached patch should do the trick.

Thanks for the patch, this makes sense to me.

Somewhat unrelated: I wonder if setting AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS is necessary before
bo_pin(). FWIU from chatting with our DCN experts, DCN doesn't really care if the fb is
contiguous or not.

Which begs the question -- what exactly does AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS mean? From git
history, it seems setting this flag doesn't necessarily move the bo to be contiguous. But
rather:

    When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
    - This means contiguous is not mandatory.
    - we will try to allocate the contiguous buffer. Say if the
      allocation fails, we fallback to allocate the individual pages.

https://github.com/torvalds/linux/commit/e362b7c8f8c7af00d06f0ab609629101aebae993

Does that mean -- if the buffer is already in the required domain -- that bo_pin() will also
attempt to make it contiguous? Or will it just pin it from being moved and leave it at that?

I guess in any case, it sounds like VRAM_CONTIGUOUS is not necessary for DCN scanout.
I can give dropping it a spin and see if IGT complains.
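
The "try contiguous, fall back to individual pages" behaviour quoted above can
be pictured with a toy page pool. This is a sketch of the semantics only: the
toy_* names and the fixed 16-page pool are invented for illustration, and it
makes no claim about how amdgpu or TTM actually allocates VRAM or what
bo_pin() does with an already-placed buffer.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_POOL_PAGES 16	/* hypothetical pool size */

struct toy_pool {
	bool used[TOY_POOL_PAGES];
};

/*
 * Allocate n pages. Prefer one adjacent run; if none exists, fall back to
 * any n free pages. out[] receives the page indices and *contiguous tells
 * the caller which path was taken. Returns false (pool untouched) if fewer
 * than n pages are free.
 */
static bool toy_alloc(struct toy_pool *p, size_t n, size_t *out,
		      bool *contiguous)
{
	size_t i, k, run = 0, got = 0, free_pages = 0;

	if (n == 0 || n > TOY_POOL_PAGES)
		return false;

	/* First pass: look for n adjacent free pages. */
	for (i = 0; i < TOY_POOL_PAGES; i++) {
		run = p->used[i] ? 0 : run + 1;
		if (run == n) {
			size_t start = i + 1 - n;

			for (k = 0; k < n; k++) {
				p->used[start + k] = true;
				out[k] = start + k;
			}
			*contiguous = true;
			return true;
		}
	}

	/* Fallback: count first so a failure leaves the pool unchanged. */
	for (i = 0; i < TOY_POOL_PAGES; i++)
		if (!p->used[i])
			free_pages++;
	if (free_pages < n)
		return false;

	/* Gather scattered free pages in index order. */
	for (i = 0; i < TOY_POOL_PAGES && got < n; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			out[got++] = i;
		}
	}
	*contiguous = false;
	return true;
}
```

In this model the flag is a preference rather than a requirement: a fragmented
pool still satisfies the request, just without adjacency.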

Thanks,
Leo 

>>
>> Alex
>>
> 
> Thanks for the patch, Alex.
> 
> I have tested it, and though kms_async_flips and kms_plane_alpha_blend
> pass, kms_plane_cursor still fails.
> 
> I am going to investigate a little more today and send more details from my
> findings.
> 
> Thanks.
> Cascardo.
> 
>>
>>>
>>> - Leo
>>>
>>>>
>>>> Alex
>>>>
>>>>>
>>>>> Thanks,
>>>>> Leo
>>>>>
>>>>>>>>
>>>>>>>> It's tricky to fix because you don't want to use the worst case
>>>>>>>> watermarks all the time because that will limit the number of available
>>>>>>>> display options and you don't want to force everything to a particular
>>>>>>>> memory pool because that will limit the amount of memory that can be
>>>>>>>> used for display (which is what the patch in question fixed).  Ideally
>>>>>>>> the caller would do a test commit before the page flip to determine
>>>>>>>> whether or not it would succeed before issuing it and then we'd have
>>>>>>>> some feedback mechanism to tell the caller that the commit would fail
>>>>>>>> due to buffer placement so it would do a full modeset instead.  We
>>>>>>>> discussed this feedback mechanism last week at the display hackfest.
>>>>>>>>
>>>>>>>>
>>>>>>>>> kms_plane_alpha_blend works, but does this also support that test
>>>>>>>>> failing as the cause?
>>>>>>>>
>>>>>>>> That may be related.  I'm not too familiar with that test either, but
>>>>>>>> Leo or Harry can provide some guidance.
>>>>>>>>
>>>>>>>> Alex
>>>>>>>
>>>>>>> Thanks everyone for the input so far. I have a question for the
>>>>>>> maintainers, given that it seems that this is functionally broken for
>>>>>>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
>>>>>>> it make sense to extend this proposed patch to all iGPUs until a more
>>>>>>> permanent fix can be identified? At the end of the day I'll take
>>>>>>> functional correctness over performance.
>>>>>>
>>>>>> It's not functional correctness, it's usability.  All that is
>>>>>> potentially broken is async flips (which depend on memory pressure and
>>>>>> buffer placement), while if you effectively revert the patch, you end
>>>>>> up  limiting all display buffers to either VRAM or GTT which may end
>>>>>> up causing the inability to display anything because there is not
>>>>>> enough memory in that pool for the next modeset.  We'll start getting
>>>>>> bug reports about blank screens and failure to set modes because of
>>>>>> memory pressure.  I think if we want a short term fix, it would be to
>>>>>> always set the worst case watermarks.  The downside to that is that it
>>>>>> would possibly cause some working display setups to stop working if
>>>>>> they were on the margins to begin with.
>>>>>>
>>>>>> Alex
>>>>>>
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks again,
>>>>>>>>> Brian
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Alex
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Brian
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Alex
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>>>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>>>>>>> Cc: stable@vger.kernel.org # 6.1+
>>>>>>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
>>>>>>>>>>>>> Signed-off-by: Brian Geffon <bgeffon@google.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
>>>>>>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
>>>>>>>>>>>>>                                             uint32_t domain)
>>>>>>>>>>>>>  {
>>>>>>>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
>>>>>>>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
>>>>>>>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
>>>>>>>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
>>>>>>>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
>>>>>>>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
>>>>>>>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
>>>>>>>>>>>>> --
>>>>>>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
>>>>>>>>>>>>>
>>>>>
>>>
> 
>> From cce1652c62c42c858de64c306ea0ddc7af3bd0b1 Mon Sep 17 00:00:00 2001
>> From: Alex Deucher <alexander.deucher@amd.com>
>> Date: Fri, 18 Jul 2025 18:40:26 -0400
>> Subject: [PATCH] drm/amd/display: refine framebuffer placement checks
>>
>> When we commit planes, we need to make sure the
>> framebuffer memory locations are compatible. Various
>> hardware has the following requirements for display buffers:
>> dGPUs, old APUs, raven1 - must be in VRAM
>> carrizo/stoney - must be in VRAM or GTT, but not both
>> newer APUs (raven2/picasso and newer) - can be in VRAM or GTT
>>
>> You should always get a compatible memory buffer due to
>> amdgpu_bo_get_preferred_domain(). amdgpu_display_supported_domains()
>> will ensure you always get VRAM or GTT or VRAM | GTT depending on
>> what the chip supports.  Then amdgpu_bo_get_preferred_domain()
>> will either leave that as is when pinning, or force VRAM or GTT
>> for the STONEY/CARRIZO case.
>>
>> As such the checks could probably be removed, but on the off chance
>> we do end up getting different memory pool for the old
>> and new framebuffers, refine the check to take into account the
>> hardware capabilities.
>>
>> Fixes: a7c0cad0dc06 ("drm/amd/display: ensure async flips are only accepted for fast updates")
>> Reported-by: Brian Geffon <bgeffon@google.com>
>> Cc: Leo Li <sunpeng.li@amd.com>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> ---
>>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 ++++++++++++++++---
>>  1 file changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index 129476b6d5fa9..de2bd789ec15b 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -9288,6 +9288,18 @@ static void amdgpu_dm_enable_self_refresh(struct amdgpu_crtc *acrtc_attach,
>>  	}
>>  }
>>  
>> +static bool amdgpu_dm_mem_type_compatible(struct amdgpu_device *adev,
>> +					  struct drm_framebuffer *old_fb,
>> +					  struct drm_framebuffer *new_fb)
>> +{
>> +	if (!adev->mode_info.gpu_vm_support ||
>> +	    (adev->asic_type == CHIP_CARRIZO) ||
>> +	    (adev->asic_type == CHIP_STONEY))
>> +		return get_mem_type(old_fb) == get_mem_type(new_fb);
>> +
>> +	return true;
>> +}
>> +
>>  static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>>  				    struct drm_device *dev,
>>  				    struct amdgpu_display_manager *dm,
>> @@ -9465,7 +9477,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>>  		 */
>>  		if (crtc->state->async_flip &&
>>  		    (acrtc_state->update_type != UPDATE_TYPE_FAST ||
>> -		     get_mem_type(old_plane_state->fb) != get_mem_type(fb)))
>> +		     !amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb)))
>>  			drm_warn_once(state->dev,
>>  				      "[PLANE:%d:%s] async flip with non-fast update\n",
>>  				      plane->base.id, plane->name);
>> @@ -9473,7 +9485,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>>  		bundle->flip_addrs[planes_count].flip_immediate =
>>  			crtc->state->async_flip &&
>>  			acrtc_state->update_type == UPDATE_TYPE_FAST &&
>> -			get_mem_type(old_plane_state->fb) == get_mem_type(fb);
>> +			amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb);
>>  
>>  		timestamp_ns = ktime_get_ns();
>>  		bundle->flip_addrs[planes_count].flip_timestamp_in_us = div_u64(timestamp_ns, 1000);
>> @@ -11760,6 +11772,7 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
>>  					    struct drm_atomic_state *state,
>>  					    struct drm_crtc_state *crtc_state)
>>  {
>> +	struct amdgpu_device *adev = drm_to_adev(dev);
>>  	struct drm_plane *plane;
>>  	struct drm_plane_state *new_plane_state, *old_plane_state;
>>  
>> @@ -11773,7 +11786,8 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
>>  		}
>>  
>>  		if (old_plane_state->fb && new_plane_state->fb &&
>> -		    get_mem_type(old_plane_state->fb) != get_mem_type(new_plane_state->fb))
>> +		    !amdgpu_dm_mem_type_compatible(adev, old_plane_state->fb,
>> +						   new_plane_state->fb))
>>  			return true;
>>  	}
>>  
>> -- 
>> 2.50.1
>>
> 




* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-22 13:54                         ` Leo Li
@ 2025-07-28 16:38                           ` Alex Deucher
  2025-08-05 10:16                             ` Christian König
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Deucher @ 2025-07-28 16:38 UTC (permalink / raw)
  To: Leo Li
  Cc: Thadeu Lima de Souza Cascardo, Brian Geffon, Wentland, Harry,
	Alex Deucher, christian.koenig, David Airlie, Simona Vetter,
	Tvrtko Ursulin, Yunxiang Li, Lijo Lazar, Prike Liang,
	Pratap Nirujogi, Luben Tuikov, amd-gfx, dri-devel, linux-kernel,
	Garrick Evans, stable

On Tue, Jul 22, 2025 at 9:54 AM Leo Li <sunpeng.li@amd.com> wrote:
>
>
>
> On 2025-07-22 07:21, Thadeu Lima de Souza Cascardo wrote:
> > On Fri, Jul 18, 2025 at 07:00:39PM -0400, Alex Deucher wrote:
> >> On Fri, Jul 18, 2025 at 6:01 PM Leo Li <sunpeng.li@amd.com> wrote:
> >>>
> >>>
> >>>
> >>> On 2025-07-18 17:33, Alex Deucher wrote:
> >>>> On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng.li@amd.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 2025-07-18 16:07, Alex Deucher wrote:
> >>>>>> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>>
> >>>>>>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>>>>
> >>>>>>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgeffon@google.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>>>>>>> allowed for newer ASICs to mix GTT and VRAM, this change also noted that
> >>>>>>>>>>>>> some older boards, such as Stoney and Carrizo do not support this.
> >>>>>>>>>>>>> It appears that at least one additional ASIC does not support this which
> >>>>>>>>>>>>> is Raven.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We observed this issue when migrating a device from a 5.4 to 6.6 kernel
> >>>>>>>>>>>>> and have confirmed that Raven also needs to be excluded from mixing GTT
> >>>>>>>>>>>>> and VRAM.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
> >>>>>>>>>>>> stoney this is a hardware limitation (all display buffers need to be
> >>>>>>>>>>>> in GTT or VRAM, but not both).  Raven and newer don't have this
> >>>>>>>>>>>> limitation and we tested raven pretty extensively at the time.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for taking the time to look. We have automated testing and a
> >>>>>>>>>>> few igt gpu tools tests failed and after debugging we found that
> >>>>>>>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
> >>>>>>>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
> >>>>>>>>>>> kms_plane_alpha_blend, excluding Raven from this sharing of GTT and
> >>>>>>>>>>> VRAM buffers resolves the issue.
> >>>>>>>>>>
> >>>>>>>>>> + Harry and Leo
> >>>>>>>>>>
> >>>>>>>>>> This sounds like the memory placement issue we discussed last week.
> >>>>>>>>>> In that case, the issue is related to where the buffer ends up when we
> >>>>>>>>>> try to do an async flip.  In that case, we can't do an async flip
> >>>>>>>>>> without a full modeset if the buffers locations are different than the
> >>>>>>>>>> last modeset because we need to update more than just the buffer base
> >>>>>>>>>> addresses.  This change works around that limitation by always forcing
> >>>>>>>>>> display buffers into VRAM or GTT.  Adding raven to this case may fix
> >>>>>>>>>> those tests but will make the overall experience worse because we'll
> >>>>>>>>>> end up effectively not being able to fully utilize both gtt and
> >>>>>>>>>> vram for display which would reintroduce all of the problems fixed by
> >>>>>>>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> >>>>>>>>>
> >>>>>>>>> Thanks Alex, the thing is, we only observe this on Raven boards, why
> >>>>>>>>> would Raven only be impacted by this? It would seem that all devices
> >>>>>>>>> would have this issue, no? Also, I'm not familiar with how
> >>>>>>>>
> >>>>>>>> It depends on memory pressure and available memory in each pool.
> >>>>>>>> E.g., initially the display buffer is in VRAM when the initial mode
> >>>>>>>> set happens.  The watermarks, etc. are set for that scenario.  One of
> >>>>>>>> the next frames ends up in a pool different than the original.  Now
> >>>>>>>> the buffer is in GTT.  The async flip interface does a fast validation
> >>>>>>>> to try and flip as soon as possible, but that validation fails because
> >>>>>>>> the watermarks need to be updated which requires a full modeset.
> >>>>>
> >>>>> Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced
> >>>>> a check for same memory placement on async flips was on a system with a DGPU,
> >>>>> for which VRAM placement does matter:
> >>>>> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
> >>>>>
> >>>>> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
> >>>>> bandwidth validation depending on memory placement. There's a gpuvm_enable flag
> >>>>> for SG, but it's statically set to 1 on APU DCN versions. It sounds like for
> >>>>> APUs specifically, we *should* be able to ignore the mem placement check. I can
> >>>>> spin up a patch to test this out.
> >>>>
> >>>> Is the gpu_vm_support flag ever set for dGPUs?  The allowed domains
> >>>> for display buffers are determined by
> >>>> amdgpu_display_supported_domains() and we only allow GTT as a domain
> >>>> if gpu_vm_support is set, which I think is just for APUs.  In that
> >>>> case, we would probably only need the checks specifically for
> >>>> CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
> >>>> and GTT (only one or the other?).  dGPUs and really old APUs will
> >>>> always get VRAM, and newer APUs will get VRAM | GTT.
> >>>
> >>> It doesn't look like gpu_vm_support is set for DGPUs
> >>> https://elixir.bootlin.com/linux/v6.15.6/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L1866
> >>>
> >>> Though interestingly, further up at #L1858, Raven has gpu_vm_support = 0. Maybe it had stability issues?
> >>> https://github.com/torvalds/linux/commit/098c13079c6fdd44f10586b69132c392ebf87450
> >>
> >> We need to be a little careful here asic_type == CHIP_RAVEN covers
> >> several variants:
> >> apu_flags & AMD_APU_IS_RAVEN - raven1 (gpu_vm_support = false)
> >> apu_flags & AMD_APU_IS_RAVEN2 - raven2 (gpu_vm_support = true)
> >> apu_flags & AMD_APU_IS_PICASSO - picasso (gpu_vm_support = true)
> >>
> >> amdgpu_display_supported_domains() only sets AMDGPU_GEM_DOMAIN_GTT if
> >> gpu_vm_support is true.  So we'd never get into the check in
> >> amdgpu_bo_get_preferred_domain() for raven1.
> >>
> >> Anyway, back to your suggestion, I think we can probably drop the
> >> checks as you should always get a compatible memory buffer due to
> >> amdgpu_bo_get_preferred_domain(). Pinning should fail if we can't pin
> >> in the required domain.  amdgpu_display_supported_domains() will
> >> ensure you always get VRAM or GTT or VRAM | GTT depending on what the
> >> chip supports.  Then amdgpu_bo_get_preferred_domain() will either
> >> leave that as is, or force VRAM or GTT for the STONEY/CARRIZO case.
> >> On the off chance we do get incompatible memory, something like the
> >> attached patch should do the trick.
>
> Thanks for the patch, this makes sense to me.
>
> Somewhat unrelated: I wonder if setting AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS is necessary before
> bo_pin(). FWIU from chatting with our DCN experts, DCN doesn't really care if the fb is
> contiguous or not.

Is this an APU statement or a dGPU statement?  At least on older dGPUs,
they required contiguous VRAM.  This may not be an issue on newer
chips with DCHUB. At the moment, we use the FB aperture to access VRAM
directly in the kernel driver, so we do not set up page tables for
VRAM.  We'd need to do that to support linear mappings of
non-contiguous VRAM buffers in the kernel driver.  We do support it on
some MI chips, so it's doable, but it adds overhead.

>
> Which begs the question -- what exactly does AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS mean? From git
> history, it seems setting this flag doesn't necessarily move the bo to be contiguous. But
> rather:
>
>     When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
>     - This means contiguous is not mandatory.
>     - we will try to allocate the contiguous buffer. Say if the
>       allocation fails, we fallback to allocate the individual pages.
>
> https://github.com/torvalds/linux/commit/e362b7c8f8c7af00d06f0ab609629101aebae993
>
> Does that mean -- if the buffer is already in the required domain -- that bo_pin() will also
> attempt to make it contiguous? Or will it just pin it from being moved and leave it at that?
>

It means that the VRAM backing for the buffer will be physically contiguous.

> I guess in any case, it sounds like VRAM_CONTIGUOUS is not necessary for DCN scanout.
> I can give dropping it a spin and see if IGT complains.

That won't work unless we change how we manage VRAM in vmid0.  Right
now we use the FB aperture to directly access it; if we wanted to use
non-contiguous pages, we'd need to use page tables for VRAM as well.
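
To make the difference concrete, here is a rough userspace sketch, not
driver code.  The sketch_* names and the 4 KiB page size are assumptions
made up for illustration.  With a direct aperture, an address is just a
base plus a linear offset, which only works if the backing pages are
physically consecutive.  With page tables, every page is looked up
individually, so the pages can live anywhere.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical direct-aperture access: one base plus a linear offset.
 * This is only correct when the buffer is physically contiguous.
 */
static uint64_t sketch_aperture_addr(uint64_t aperture_base,
				     uint64_t bo_start, uint64_t offset)
{
	return aperture_base + bo_start + offset;
}

/*
 * Hypothetical page-table access: each page is translated on its own.
 * Returns UINT64_MAX when the offset is past the end of the buffer.
 */
static uint64_t sketch_paged_addr(const uint64_t *page_addrs,
				  size_t num_pages, uint64_t offset)
{
	size_t idx = (size_t)(offset / SKETCH_PAGE_SIZE);

	if (idx >= num_pages)
		return UINT64_MAX;

	return page_addrs[idx] + (offset % SKETCH_PAGE_SIZE);
}

/* True when every page directly follows the previous one. */
static bool sketch_is_contiguous(const uint64_t *page_addrs,
				 size_t num_pages)
{
	for (size_t i = 1; i < num_pages; i++) {
		if (page_addrs[i] != page_addrs[i - 1] + SKETCH_PAGE_SIZE)
			return false;
	}
	return true;
}
```

For a contiguous buffer both lookups agree.  For a scattered one, only
the per-page lookup gives the right address, which is the extra
bookkeeping referred to above.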

Alex


>
> Thanks,
> Leo
>
> >>
> >> Alex
> >>
> >
> > Thanks for the patch, Alex.
> >
> > I have tested it, and though kms_async_flips and kms_plane_alpha_blend
> > pass, kms_plane_cursor still fails.
> >
> > I am going to investigate a little more today and send more details from my
> > findings.
> >
> > Thanks.
> > Cascardo.
> >
> >>
> >>>
> >>> - Leo
> >>>
> >>>>
> >>>> Alex
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>> Leo
> >>>>>
> >>>>>>>>
> >>>>>>>> It's tricky to fix because you don't want to use the worst case
> >>>>>>>> watermarks all the time because that will limit the number of available
> >>>>>>>> display options and you don't want to force everything to a particular
> >>>>>>>> memory pool because that will limit the amount of memory that can be
> >>>>>>>> used for display (which is what the patch in question fixed).  Ideally
> >>>>>>>> the caller would do a test commit before the page flip to determine
> >>>>>>>> whether or not it would succeed before issuing it and then we'd have
> >>>>>>>> some feedback mechanism to tell the caller that the commit would fail
> >>>>>>>> due to buffer placement so it would do a full modeset instead.  We
> >>>>>>>> discussed this feedback mechanism last week at the display hackfest.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> kms_plane_alpha_blend works, but does this also support that test
> >>>>>>>>> failing as the cause?
> >>>>>>>>
> >>>>>>>> That may be related.  I'm not too familiar with that test either, but
> >>>>>>>> Leo or Harry can provide some guidance.
> >>>>>>>>
> >>>>>>>> Alex
> >>>>>>>
> >>>>>>> Thanks everyone for the input so far. I have a question for the
> >>>>>>> maintainers, given that it seems that this is functionally broken for
> >>>>>>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
> >>>>>>> it make sense to extend this proposed patch to all iGPUs until a more
> >>>>>>> permanent fix can be identified? At the end of the day I'll take
> >>>>>>> functional correctness over performance.
> >>>>>>
> >>>>>> It's not functional correctness, it's usability.  All that is
> >>>>>> potentially broken is async flips (which depend on memory pressure and
> >>>>>> buffer placement), while if you effectively revert the patch, you end
> >>>>>> up limiting all display buffers to either VRAM or GTT which may end
> >>>>>> up causing the inability to display anything because there is not
> >>>>>> enough memory in that pool for the next modeset.  We'll start getting
> >>>>>> bug reports about blank screens and failure to set modes because of
> >>>>>> memory pressure.  I think if we want a short term fix, it would be to
> >>>>>> always set the worst case watermarks.  The downside to that is that it
> >>>>>> would possibly cause some working display setups to stop working if
> >>>>>> they were on the margins to begin with.
> >>>>>>
> >>>>>> Alex
> >>>>>>
> >>>>>>>
> >>>>>>> Brian
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks again,
> >>>>>>>>> Brian
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Alex
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Brian
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Alex
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>>>>>>>>>> Cc: Christian König <christian.koenig@amd.com>
> >>>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>>>>>>>>>> Cc: stable@vger.kernel.org # 6.1+
> >>>>>>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> >>>>>>>>>>>>> Signed-off-by: Brian Geffon <bgeffon@google.com>
> >>>>>>>>>>>>> ---
> >>>>>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> >>>>>>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
> >>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> >>>>>>>>>>>>>                                             uint32_t domain)
> >>>>>>>>>>>>>  {
> >>>>>>>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> >>>>>>>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> >>>>>>>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> >>>>>>>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
> >>>>>>>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
> >>>>>>>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> >>>>>>>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
> >>>>>>>>>>>>>
> >>>>>
> >>>
> >
> >> From cce1652c62c42c858de64c306ea0ddc7af3bd0b1 Mon Sep 17 00:00:00 2001
> >> From: Alex Deucher <alexander.deucher@amd.com>
> >> Date: Fri, 18 Jul 2025 18:40:26 -0400
> >> Subject: [PATCH] drm/amd/display: refine framebuffer placement checks
> >>
> >> When we commit planes, we need to make sure the
> >> framebuffer memory locations are compatible. Various
> >> hardware has the following requirements for display buffers:
> >> dGPUs, old APUs, raven1 - must be in VRAM
> >> carrizo/stoney - must be in VRAM or GTT, but not both
> >> newer APUs (raven2/picasso and newer) - can be in VRAM or GTT
> >>
> >> You should always get a compatible memory buffer due to
> >> amdgpu_bo_get_preferred_domain(). amdgpu_display_supported_domains()
> >> will ensure you always get VRAM or GTT or VRAM | GTT depending on
> >> what the chip supports.  Then amdgpu_bo_get_preferred_domain()
> >> will either leave that as is when pinning, or force VRAM or GTT
> >> for the STONEY/CARRIZO case.
> >>
> >> As such the checks could probably be removed, but on the off chance
> >> we do end up getting different memory pools for the old
> >> and new framebuffers, refine the check to take into account the
> >> hardware capabilities.
> >>
> >> Fixes: a7c0cad0dc06 ("drm/amd/display: ensure async flips are only accepted for fast updates")
> >> Reported-by: Brian Geffon <bgeffon@google.com>
> >> Cc: Leo Li <sunpeng.li@amd.com>
> >> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> >> ---
> >>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 ++++++++++++++++---
> >>  1 file changed, 17 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> index 129476b6d5fa9..de2bd789ec15b 100644
> >> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> @@ -9288,6 +9288,18 @@ static void amdgpu_dm_enable_self_refresh(struct amdgpu_crtc *acrtc_attach,
> >>      }
> >>  }
> >>
> >> +static bool amdgpu_dm_mem_type_compatible(struct amdgpu_device *adev,
> >> +                                      struct drm_framebuffer *old_fb,
> >> +                                      struct drm_framebuffer *new_fb)
> >> +{
> >> +    if (!adev->mode_info.gpu_vm_support ||
> >> +        (adev->asic_type == CHIP_CARRIZO) ||
> >> +        (adev->asic_type == CHIP_STONEY))
> >> +            return get_mem_type(old_fb) == get_mem_type(new_fb);
> >> +
> >> +    return true;
> >> +}
> >> +
> >>  static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> >>                                  struct drm_device *dev,
> >>                                  struct amdgpu_display_manager *dm,
> >> @@ -9465,7 +9477,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> >>               */
> >>              if (crtc->state->async_flip &&
> >>                  (acrtc_state->update_type != UPDATE_TYPE_FAST ||
> >> -                 get_mem_type(old_plane_state->fb) != get_mem_type(fb)))
> >> +                 !amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb)))
> >>                      drm_warn_once(state->dev,
> >>                                    "[PLANE:%d:%s] async flip with non-fast update\n",
> >>                                    plane->base.id, plane->name);
> >> @@ -9473,7 +9485,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
> >>              bundle->flip_addrs[planes_count].flip_immediate =
> >>                      crtc->state->async_flip &&
> >>                      acrtc_state->update_type == UPDATE_TYPE_FAST &&
> >> -                    get_mem_type(old_plane_state->fb) == get_mem_type(fb);
> >> +                    amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb);
> >>
> >>              timestamp_ns = ktime_get_ns();
> >>              bundle->flip_addrs[planes_count].flip_timestamp_in_us = div_u64(timestamp_ns, 1000);
> >> @@ -11760,6 +11772,7 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
> >>                                          struct drm_atomic_state *state,
> >>                                          struct drm_crtc_state *crtc_state)
> >>  {
> >> +    struct amdgpu_device *adev = drm_to_adev(dev);
> >>      struct drm_plane *plane;
> >>      struct drm_plane_state *new_plane_state, *old_plane_state;
> >>
> >> @@ -11773,7 +11786,8 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
> >>              }
> >>
> >>              if (old_plane_state->fb && new_plane_state->fb &&
> >> -                get_mem_type(old_plane_state->fb) != get_mem_type(new_plane_state->fb))
> >> +                !amdgpu_dm_mem_type_compatible(adev, old_plane_state->fb,
> >> +                                               new_plane_state->fb))
> >>                      return true;
> >>      }
> >>
> >> --
> >> 2.50.1
> >>
> >
>
>
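
For reference, the placement rules from the commit message quoted above
can be sketched roughly as follows.  This is a standalone illustration,
not the driver code: the sketch_* types, the domain bit values and the
sg_threshold field are stand-ins chosen for the example, and only the
decision logic mirrors what the quoted patches show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in domain bits and ASIC ids, assumed for this sketch only. */
#define SKETCH_DOMAIN_GTT  0x2u
#define SKETCH_DOMAIN_VRAM 0x4u

enum sketch_asic { SKETCH_CARRIZO, SKETCH_STONEY, SKETCH_RAVEN, SKETCH_OTHER };

struct sketch_dev {
	enum sketch_asic asic;
	bool gpu_vm_support;     /* false for dGPUs, old APUs, raven1 */
	uint64_t real_vram_size;
	uint64_t sg_threshold;   /* stand-in for AMDGPU_SG_THRESHOLD */
};

/* GTT is only offered as a display domain when gpu_vm_support is set. */
static uint32_t sketch_supported_domains(const struct sketch_dev *d)
{
	uint32_t domain = SKETCH_DOMAIN_VRAM;

	if (d->gpu_vm_support)
		domain |= SKETCH_DOMAIN_GTT;
	return domain;
}

/* Carrizo/Stoney cannot mix pools, so pick exactly one for them. */
static uint32_t sketch_preferred_domain(const struct sketch_dev *d,
					uint32_t domain)
{
	if (domain == (SKETCH_DOMAIN_VRAM | SKETCH_DOMAIN_GTT) &&
	    (d->asic == SKETCH_CARRIZO || d->asic == SKETCH_STONEY)) {
		domain = SKETCH_DOMAIN_VRAM;
		if (d->real_vram_size <= d->sg_threshold)
			domain = SKETCH_DOMAIN_GTT;
	}
	return domain;
}

/* Old and new framebuffers must match only on restricted hardware. */
static bool sketch_mem_type_compatible(const struct sketch_dev *d,
				       uint32_t old_type, uint32_t new_type)
{
	if (!d->gpu_vm_support ||
	    d->asic == SKETCH_CARRIZO || d->asic == SKETCH_STONEY)
		return old_type == new_type;

	return true;
}
```

With these rules a raven2/picasso-style device keeps VRAM | GTT and a
change of pool between flips is treated as compatible, while a
Stoney-style device is forced into a single pool and any mismatch is
rejected.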


* Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM
  2025-07-28 16:38                           ` Alex Deucher
@ 2025-08-05 10:16                             ` Christian König
  0 siblings, 0 replies; 18+ messages in thread
From: Christian König @ 2025-08-05 10:16 UTC (permalink / raw)
  To: Alex Deucher, Leo Li
  Cc: Thadeu Lima de Souza Cascardo, Brian Geffon, Wentland, Harry,
	Alex Deucher, David Airlie, Simona Vetter, Tvrtko Ursulin,
	Yunxiang Li, Lijo Lazar, Prike Liang, Pratap Nirujogi,
	Luben Tuikov, amd-gfx, dri-devel, linux-kernel, Garrick Evans,
	stable

On 28.07.25 18:38, Alex Deucher wrote:
>>>> Anyway, back to your suggestion, I think we can probably drop the
>>>> checks as you should always get a compatible memory buffer due to
>>>> amdgpu_bo_get_preferred_domain(). Pinning should fail if we can't pin
>>>> in the required domain.  amdgpu_display_supported_domains() will
>>>> ensure you always get VRAM or GTT or VRAM | GTT depending on what the
>>>> chip supports.  Then amdgpu_bo_get_preferred_domain() will either
>>>> leave that as is, or force VRAM or GTT for the STONEY/CARRIZO case.
>>>> On the off chance we do get incompatible memory, something like the
>>>> attached patch should do the trick.
>>
>> Thanks for the patch, this makes sense to me.
>>
>> Somewhat unrelated: I wonder if setting AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS is necessary before
>> bo_pin(). FWIU from chatting with our DCN experts, DCN doesn't really care if the fb is
>> contiguous or not.
> 
> Is this an APU statement or a dGPU statement?  At least on older dGPUs,
> they required contiguous VRAM.  This may not be an issue on newer
> chips with DCHUB. At the moment, we use the FB aperture to access VRAM
> directly in the kernel driver, so we do not set up page tables for
> VRAM.  We'd need to do that to support linear mappings of
> non-contiguous VRAM buffers in the kernel driver.  We do support it on
> some MI chips, so it's doable, but it adds overhead.
> 
>>
>> Which begs the question -- what exactly does AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS mean? From git
>> history, it seems setting this flag doesn't necessarily move the bo to be contiguous. But
>> rather:
>>
>>     When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS
>>     - This means contiguous is not mandatory.
>>     - we will try to allocate the contiguous buffer. Say if the
>>       allocation fails, we fallback to allocate the individual pages.
>>
>> https://github.com/torvalds/linux/commit/e362b7c8f8c7af00d06f0ab609629101aebae993
>>
>> Does that mean -- if the buffer is already in the required domain -- that bo_pin() will also
>> attempt to make it contiguous? Or will it just pin it from being moved and leave it at that?
>>
> 
> It means that the VRAM backing for the buffer will be physically contiguous.
> 
>> I guess in any case, it sounds like VRAM_CONTIGUOUS is not necessary for DCN scanout.
>> I can give dropping it a spin and see if IGT complains.
> 
> That won't work unless we change how we manage VRAM in vmid0.  Right
> now we use the FB aperture to directly access it; if we wanted to use
> non-contiguous pages, we'd need to use page tables for VRAM as well.

Yeah, that isn't easily doable. We looked into that on the first HW generation with DCHUB.

In the end it was more trouble managing the page tables for VRAM than allocating VRAM contiguously.

Regards,
Christian.

> 
> Alex


Thread overview: 18+ messages
2025-07-16 16:17 [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM Brian Geffon
2025-07-16 16:33 ` Alex Deucher
2025-07-16 16:40   ` Brian Geffon
2025-07-16 21:02     ` Alex Deucher
2025-07-17  0:12       ` Brian Geffon
2025-07-17  7:58         ` Christian König
2025-07-17 14:58         ` Alex Deucher
2025-07-17 15:34           ` Michel Dänzer
2025-07-18 17:56           ` Brian Geffon
2025-07-18 20:07             ` Alex Deucher
2025-07-18 21:02               ` Leo Li
2025-07-18 21:33                 ` Alex Deucher
2025-07-18 22:01                   ` Leo Li
2025-07-18 23:00                     ` Alex Deucher
2025-07-22 11:21                       ` Thadeu Lima de Souza Cascardo
2025-07-22 13:54                         ` Leo Li
2025-07-28 16:38                           ` Alex Deucher
2025-08-05 10:16                             ` Christian König
