* [RFC PATCH] drm/amd/display: Pin native scanout to VRAM on large-carveout APUs
@ 2026-06-16  7:10 Matthew Schwartz
  2026-06-16  7:21 ` sashiko-bot
  2026-06-16  7:31 ` Christian König
  0 siblings, 2 replies; 3+ messages in thread
From: Matthew Schwartz @ 2026-06-16  7:10 UTC
  To: Harry Wentland, Melissa Wen, Leo Li, Rodrigo Siqueira,
	christian.koenig, Alex Deucher, natalie.vock
  Cc: amd-gfx, dri-devel, Pierre-Loup A . Griffais, Matthew Schwartz

Native scanout buffers on APUs are pinned with the VRAM|GTT domain, so
under VRAM carveout pressure a swapchain can end up split across VRAM and
GTT. The scanout buffer's memory type then changes from one flip to the
next, and amdgpu_dm_crtc_mem_type_changed() rejects an async page flip
across the change. The result is repeated async page flip failures,
observed as choppy updates under carveout pressure, until the buffers
reconverge to a single domain.

Pin native scanout buffers in VRAM only so the swapchain stays in one
memory domain. Restrict this to APUs whose carveout is larger than
AMDGPU_SG_THRESHOLD, so that small-carveout parts keep their existing
VRAM|GTT placement. If the buffer does not fit in VRAM, fall back to GTT
so the flip still succeeds and the swapchain stays in one domain.
Imported buffers may only be pinnable in GTT, so leave those on the
default domains.

Signed-off-by: Matthew Schwartz <matthew.schwartz@linux.dev>
---
Hi,

This came up while testing my kernel patch to fix mem_type detection for
async flips here: https://lore.kernel.org/amd-gfx/20260611154438.571685-1-matthew.schwartz@linux.dev/

I found a new issue where splitting a swapchain between VRAM and GTT
causes a noticeable stutter in gameplay if gamescope is using direct
scanout and tearing is enabled while a game is already running.

Once a swapchain is split across the VRAM carveout and GTT, the scanout
buffer's mem_type changes from one flip to the next, so
amdgpu_dm_crtc_mem_type_changed() rejects the async flip. Under direct
scanout with tearing that rejection recurs every time the displayed buffer
crosses domains, which is what surfaces as the choppiness.
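
To make the failure pattern above concrete, here is a toy userspace model
of it. This is only a sketch, not the driver code: enum mem_type,
mem_type_changed() and count_rejected_flips() are hypothetical names, and
it assumes a rejected async flip is still presented some other way, so the
swapchain keeps rotating round-robin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the placement of a scanout buffer. */
enum mem_type { MEM_GTT, MEM_VRAM };

/*
 * Toy version of the check: an async flip is rejected when the new
 * buffer's memory type differs from the one currently on screen.
 */
static bool mem_type_changed(enum mem_type old_type, enum mem_type new_type)
{
	return old_type != new_type;
}

/*
 * Count rejected async flips while rotating through the swapchain
 * round-robin for `flips` flips, starting with buffer 0 on screen.
 * Assumption: a rejected flip still advances to the next buffer.
 */
static unsigned int count_rejected_flips(const enum mem_type *chain,
					 size_t n, unsigned int flips)
{
	unsigned int rejected = 0;
	size_t cur = 0;

	assert(n > 0);
	for (unsigned int i = 0; i < flips; i++) {
		size_t next = (cur + 1) % n;

		if (mem_type_changed(chain[cur], chain[next]))
			rejected++;
		cur = next;
	}
	return rejected;
}
```

In this model, a three-buffer swapchain placed as GTT, GTT, VRAM rejects
two of every three flips, while a swapchain that sits entirely in one
domain rejects none.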

With this patch, I can enable tearing mid-game with the frame limit
already disabled and no longer reproduce the choppiness.

amdgpu_gem_info confirms the swapchain converges to a single domain
instead of splitting across VRAM and GTT.

Before:
0x00000f81:      3981312 byte GTT exported as ino:275 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC     write fence:drm_sched gfx_0.0.0 seq 88248 signalled
0x00000f82:      3981312 byte GTT exported as ino:276 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC     write fence:drm_sched gfx_0.0.0 seq 88224 signalled
0x00000f83:      3981312 byte VRAM VISIBLE pin count 1 exported as ino:277 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC        write fence:drm_sched gfx_0.0.0 seq 88236 signalled

After:
0x00000f82:      3981312 byte VRAM VISIBLE pin count 1 exported as ino:548 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC        write fence:drm_sched gfx_0.0.0 seq 822258 signalled
0x00000f83:      3981312 byte VRAM VISIBLE exported as ino:549 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC    write fence:drm_sched gfx_0.0.0 seq 822255 signalled
0x00000f84:      3981312 byte VRAM VISIBLE exported as ino:550 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC    write fence:drm_sched gfx_0.0.0 seq 822261 signalled
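
For reviewers who want the placement policy at a glance, it can be
sketched as a standalone decision function. This is a simplified userspace
model, not the kernel code: struct scanout_req, the DOMAIN_* values and
the helper names are all hypothetical, and the carveout threshold is
passed in as a parameter rather than taken from AMDGPU_SG_THRESHOLD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical domain bits, for illustration only. */
#define DOMAIN_GTT	0x2u
#define DOMAIN_VRAM	0x4u

/* Hypothetical summary of the inputs the patch looks at. */
struct scanout_req {
	bool is_cursor;			/* cursor plane */
	bool is_apu;			/* device is an APU */
	bool is_imported;		/* buffer came from another exporter */
	uint64_t vram_size;		/* carveout size in bytes */
	uint64_t sg_threshold;		/* large-carveout cutoff in bytes */
	uint32_t supported_domains;	/* default scanout domains */
};

/* Non-cursor, native buffers on large-carveout APUs go to VRAM only. */
static bool pin_vram_only(const struct scanout_req *r)
{
	assert(r != NULL);
	return !r->is_cursor && r->is_apu && !r->is_imported &&
	       r->vram_size > r->sg_threshold;
}

/* Domain mask for the first pin attempt. */
static uint32_t first_domain(const struct scanout_req *r)
{
	if (r->is_cursor || pin_vram_only(r))
		return DOMAIN_VRAM;
	return r->supported_domains;
}

/*
 * Domain to retry with after the VRAM-only pin ran out of memory,
 * or 0 when no retry applies.
 */
static uint32_t fallback_domain(const struct scanout_req *r)
{
	if (pin_vram_only(r) && (r->supported_domains & DOMAIN_GTT))
		return DOMAIN_GTT;
	return 0;
}
```

Keeping the retry separate from the first attempt mirrors the patch: GTT
is only tried once the VRAM-only pin has failed for lack of memory, and
cursor planes never take that path.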

Does this seem like the correct approach to take for fixing the observed
issue? I wanted to start with an RFC to make sure I didn't overlook
anything obvious or miss any better methods of fixing this.

Thanks,
Matt
---
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 29 +++++++++++++++++--
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 23a9faa2ea89..b99f938e58ec 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -932,6 +932,7 @@ static int amdgpu_dm_plane_helper_prepare_fb(struct drm_plane *plane,
 	struct amdgpu_bo *rbo;
 	struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
 	uint32_t domain;
+	bool pin_vram_only;
 	int r;
 
 	if (!new_state->fb) {
@@ -958,13 +959,35 @@ static int amdgpu_dm_plane_helper_prepare_fb(struct drm_plane *plane,
 	if (r)
 		goto error_unlock;
 
-	if (plane->type != DRM_PLANE_TYPE_CURSOR)
-		domain = amdgpu_display_supported_domains(adev, rbo->flags);
-	else
+	/*
+	 * Pin native scanout in VRAM on APUs so a swapchain stays in one
+	 * memory domain. A VRAM/GTT split changes its mem_type between flips
+	 * and amdgpu_dm_crtc_mem_type_changed() rejects the async flip. Skip
+	 * small carveouts that may not fit, and imported buffers.
+	 */
+	pin_vram_only = plane->type != DRM_PLANE_TYPE_CURSOR &&
+			(adev->flags & AMD_IS_APU) &&
+			!rbo->tbo.base.import_attach &&
+			adev->gmc.real_vram_size > AMDGPU_SG_THRESHOLD;
+
+	if (plane->type == DRM_PLANE_TYPE_CURSOR || pin_vram_only)
 		domain = AMDGPU_GEM_DOMAIN_VRAM;
+	else
+		domain = amdgpu_display_supported_domains(adev, rbo->flags);
 
 	rbo->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
 	r = amdgpu_bo_pin(rbo, domain);
+	if (r == -ENOMEM && pin_vram_only) {
+		/*
+		 * VRAM could not fit the buffer. Fall back to GTT where
+		 * allowed so the swapchain stays in one domain.
+		 */
+		domain = amdgpu_display_supported_domains(adev, rbo->flags);
+		if (domain & AMDGPU_GEM_DOMAIN_GTT) {
+			domain = AMDGPU_GEM_DOMAIN_GTT;
+			r = amdgpu_bo_pin(rbo, domain);
+		}
+	}
 	if (unlikely(r != 0)) {
 		if (r != -ERESTARTSYS)
 			DRM_ERROR("Failed to pin framebuffer with error %d\n", r);
-- 
2.54.0

