From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matthew Schwartz
To: Harry Wentland, Melissa Wen, Leo Li, Rodrigo Siqueira,
	christian.koenig@amd.com, Alex Deucher, natalie.vock@gmx.de
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	"Pierre-Loup A .
	Griffais", Matthew Schwartz
Subject: [RFC PATCH] drm/amd/display: Pin native scanout to VRAM on large-carveout APUs
Date: Tue, 16 Jun 2026 00:10:37 -0700
Message-ID: <20260616071037.26718-1-matthew.schwartz@linux.dev>
List-Id: Direct Rendering Infrastructure - Development

Native scanout buffers on APUs are pinned with the VRAM|GTT domain, so
under VRAM carveout pressure a swapchain can end up split across VRAM
and GTT. The scanout buffer's memory type then changes from one flip to
the next, and amdgpu_dm_crtc_mem_type_changed() rejects an async page
flip across the change. The result is repeated async page flip
failures, observed as choppy updates under carveout pressure, until the
buffers reconverge to a single domain.

Pin native scanout buffers in VRAM only so the swapchain stays in one
memory domain. Restrict this to APUs whose carveout is larger than
AMDGPU_SG_THRESHOLD, so small-carveout parts keep their existing
VRAM|GTT placement. Fall back to GTT when the buffer does not fit in
VRAM, so the flip still succeeds and the swapchain stays in one domain.
Imported buffers may only be pinnable in GTT, so leave those on the
default domains.

Signed-off-by: Matthew Schwartz
---
Hi,

This came up while testing my kernel patch to fix mem_type detection
for async flips here:
https://lore.kernel.org/amd-gfx/20260611154438.571685-1-matthew.schwartz@linux.dev/

I found a new issue where splitting a swapchain between VRAM and GTT
causes a noticeable stutter in gameplay if gamescope is using direct
scanout and tearing is enabled while a game is already running.
Once a swapchain is split across the VRAM carveout and GTT, the scanout
buffer's mem_type changes from one flip to the next, so
amdgpu_dm_crtc_mem_type_changed() rejects the async flip. Under direct
scanout with tearing that rejection recurs every time the displayed
buffer crosses domains, which is what surfaces as the choppiness.

With this patch, I can enable tearing on top of an already-disabled
frame limit mid-game and no longer reproduce the choppiness.
amdgpu_gem_info confirms the swapchain converges to a single domain
instead of splitting across VRAM and GTT.

Before:
	0x00000f81: 3981312 byte GTT exported as ino:275 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC
		write fence:drm_sched gfx_0.0.0 seq 88248 signalled
	0x00000f82: 3981312 byte GTT exported as ino:276 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC
		write fence:drm_sched gfx_0.0.0 seq 88224 signalled
	0x00000f83: 3981312 byte VRAM VISIBLE pin count 1 exported as ino:277 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC
		write fence:drm_sched gfx_0.0.0 seq 88236 signalled

After:
	0x00000f82: 3981312 byte VRAM VISIBLE pin count 1 exported as ino:548 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC
		write fence:drm_sched gfx_0.0.0 seq 822258 signalled
	0x00000f83: 3981312 byte VRAM VISIBLE exported as ino:549 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC
		write fence:drm_sched gfx_0.0.0 seq 822255 signalled
	0x00000f84: 3981312 byte VRAM VISIBLE exported as ino:550 NO_CPU_ACCESS CPU_GTT_USWC VRAM_CLEARED VRAM_CONTIGUOUS EXPLICIT_SYNC
		write fence:drm_sched gfx_0.0.0 seq 822261 signalled

Does this seem like the correct approach to take for fixing the
observed issue? I wanted to start with an RFC to make sure I didn't
overlook anything obvious or miss any better methods of fixing this.
Thanks,
Matt
---
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c | 29 +++++++++++++++++--
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 23a9faa2ea89..b99f938e58ec 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -932,6 +932,7 @@ static int amdgpu_dm_plane_helper_prepare_fb(struct drm_plane *plane,
 	struct amdgpu_bo *rbo;
 	struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
 	uint32_t domain;
+	bool pin_vram_only;
 	int r;
 
 	if (!new_state->fb) {
@@ -958,13 +959,35 @@ static int amdgpu_dm_plane_helper_prepare_fb(struct drm_plane *plane,
 	if (r)
 		goto error_unlock;
 
-	if (plane->type != DRM_PLANE_TYPE_CURSOR)
-		domain = amdgpu_display_supported_domains(adev, rbo->flags);
-	else
+	/*
+	 * Pin native scanout in VRAM on APUs so a swapchain stays in one
+	 * memory domain. A VRAM/GTT split changes its mem_type between flips
+	 * and amdgpu_dm_crtc_mem_type_changed() rejects the async flip. Skip
+	 * small carveouts that may not fit, and imported buffers.
+	 */
+	pin_vram_only = plane->type != DRM_PLANE_TYPE_CURSOR &&
+			(adev->flags & AMD_IS_APU) &&
+			!rbo->tbo.base.import_attach &&
+			adev->gmc.real_vram_size > AMDGPU_SG_THRESHOLD;
+
+	if (plane->type == DRM_PLANE_TYPE_CURSOR || pin_vram_only)
 		domain = AMDGPU_GEM_DOMAIN_VRAM;
+	else
+		domain = amdgpu_display_supported_domains(adev, rbo->flags);
 
 	rbo->flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
 	r = amdgpu_bo_pin(rbo, domain);
+	if (r == -ENOMEM && pin_vram_only) {
+		/*
+		 * VRAM could not fit the buffer. Fall back to GTT where
+		 * allowed so the swapchain stays in one domain.
+		 */
+		domain = amdgpu_display_supported_domains(adev, rbo->flags);
+		if (domain & AMDGPU_GEM_DOMAIN_GTT) {
+			domain = AMDGPU_GEM_DOMAIN_GTT;
+			r = amdgpu_bo_pin(rbo, domain);
+		}
+	}
 	if (unlikely(r != 0)) {
 		if (r != -ERESTARTSYS)
 			DRM_ERROR("Failed to pin framebuffer with error %d\n", r);
-- 
2.54.0
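
The placement decision in the second hunk can be exercised outside the kernel. The following is a minimal user-space sketch, not driver code: `struct scanout_req`, `model_pin()`, `pin_scanout()`, `count_mem_type_changes()` and the `DOMAIN_*` values are all hypothetical stand-ins, and `model_pin()` only assumes that a pin request lands in VRAM when there is room and spills to GTT when the mask allows it. The helper at the end counts how often consecutive buffers differ in domain, which is the condition the cover notes describe as causing the async flip rejection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's domain bits (values illustrative). */
enum mem_domain {
	DOMAIN_GTT  = 1u << 1,
	DOMAIN_VRAM = 1u << 2,
};

/* Hypothetical model of the inputs the patch inspects. */
struct scanout_req {
	bool is_cursor;        /* plane->type == DRM_PLANE_TYPE_CURSOR */
	bool is_apu;           /* adev->flags & AMD_IS_APU */
	bool imported;         /* rbo->tbo.base.import_attach != NULL */
	uint64_t vram_size;    /* carveout size in bytes */
	uint64_t sg_threshold; /* stand-in for AMDGPU_SG_THRESHOLD */
	uint32_t supported;    /* stand-in for amdgpu_display_supported_domains() */
	bool vram_has_room;    /* assumption: whether a VRAM pin would fit */
};

/* Mirrors the pin_vram_only condition from the patch. */
bool wants_vram_only(const struct scanout_req *req)
{
	return !req->is_cursor && req->is_apu && !req->imported &&
	       req->vram_size > req->sg_threshold;
}

/* Toy pin: returns the domain used, or 0 to model -ENOMEM. */
uint32_t model_pin(uint32_t allowed, bool vram_has_room)
{
	if ((allowed & DOMAIN_VRAM) && vram_has_room)
		return DOMAIN_VRAM;
	if (allowed & DOMAIN_GTT)
		return DOMAIN_GTT;
	return 0;
}

/* VRAM-only first, then a GTT retry when the VRAM-only pin fails. */
uint32_t pin_scanout(const struct scanout_req *req)
{
	bool vram_only = wants_vram_only(req);
	uint32_t domain = (req->is_cursor || vram_only) ? DOMAIN_VRAM
							: req->supported;
	uint32_t placed = model_pin(domain, req->vram_has_room);

	if (!placed && vram_only && (req->supported & DOMAIN_GTT))
		placed = model_pin(DOMAIN_GTT, req->vram_has_room);
	return placed;
}

/* Counts domain changes between consecutive buffers of a swapchain. */
size_t count_mem_type_changes(const uint32_t *placed, size_t n)
{
	size_t changes = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (placed[i] != placed[i - 1])
			changes++;
	return changes;
}
```

Feeding `count_mem_type_changes()` the domains from the "Before" listing (GTT, GTT, VRAM) gives one change, and the "After" listing (VRAM, VRAM, VRAM) gives none. This treats the listing order as the flip order, which is an assumption made only for illustration.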