From: Hamza Mahfooz
To: dri-devel@lists.freedesktop.org
Cc: Hamza Mahfooz, Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
 Christian König, David Airlie, Simona Vetter, Maarten Lankhorst,
 Maxime Ripard, Thomas Zimmermann, Mario Limonciello, Alex Hung, Ray Wu,
 Wayne Lin, Aurabindo Pillai, Timur Kristóf, "Mario Limonciello (AMD)",
 Ivan Lipski, Chenyu Chen, Matthew Schwartz, Yussuf Khalil, Tom Chung,
 Colin Ian King, Charlene Liu, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v5 2/2] drm/amd/display: add DMU timeout recovery support
Date: Fri, 1 May 2026 16:35:44 -0400
Message-ID: <20260501203552.749080-2-someguy@effective-light.com>
In-Reply-To: <20260501203552.749080-1-someguy@effective-light.com>
References: <20260501203552.749080-1-someguy@effective-light.com>

DMU already has robust hung state tracking, but timeout recovery was
never hooked up, so hook it up now.

On a DMU timeout, reinitialize the DMUB firmware through
amdgpu_dm_dmub_hw_init() (formerly the static dm_dmub_hw_init()). If
that fails, emit a DRM wedged event that requests a rebind or a bus
reset. Recovery is enabled only when amdgpu_device_should_recover_gpu()
allows it.

Also make prepare_flip_isr() send a still-pending flip completion event
instead of only warning about it, since compositors will refuse to make
forward progress without it.

Signed-off-by: Hamza Mahfooz
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 ++++++++++++++-----
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  1 +
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 12 ++++++++--
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index e96a12ff2d31..7be4ebee1cb7 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1246,7 +1246,7 @@ static void amdgpu_dm_audio_eld_notify(struct amdgpu_device *adev, int pin)
 	}
 }

-static int dm_dmub_hw_init(struct amdgpu_device *adev)
+int amdgpu_dm_dmub_hw_init(struct amdgpu_device *adev)
 {
 	const struct dmcub_firmware_header_v1_0 *hdr;
 	struct dmub_srv *dmub_srv = adev->dm.dmub_srv;
@@ -1315,7 +1315,7 @@ static int dm_dmub_hw_init(struct amdgpu_device *adev)
 	/* if adev->firmware.load_type == AMDGPU_FW_LOAD_PSP,
 	 * amdgpu_ucode_init_single_fw will load dmub firmware
 	 * fw_inst_const part to cw0; otherwise, the firmware back door load
-	 * will be done by dm_dmub_hw_init
+	 * will be done by amdgpu_dm_dmub_hw_init().
 	 */
 	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
 		memcpy(fb_info->fb[DMUB_WINDOW_0_INST_CONST].cpu_addr, fw_inst_const,
@@ -1457,7 +1457,7 @@ static void dm_dmub_hw_resume(struct amdgpu_device *adev)
 			drm_warn(adev_to_drm(adev), "Wait for DMUB auto-load failed: %d\n", status);
 	} else {
 		/* Perform the full hardware initialization. */
-		r = dm_dmub_hw_init(adev);
+		r = amdgpu_dm_dmub_hw_init(adev);
 		if (r)
 			drm_err(adev_to_drm(adev), "DMUB interface failed to initialize: status=%d\n", r);
 	}
@@ -2041,6 +2041,9 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
 		goto error;
 	}

+	adev->dm.dc->debug.enable_dmu_recovery =
+		amdgpu_device_should_recover_gpu(adev);
+
 	if (amdgpu_dc_debug_mask & DC_DISABLE_PIPE_SPLIT) {
 		adev->dm.dc->debug.force_single_disp_pipe_split = false;
 		adev->dm.dc->debug.pipe_split_policy = MPC_SPLIT_AVOID;
@@ -2090,7 +2093,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
 	if (adev->dm.dc->caps.dp_hdmi21_pcon_support)
 		drm_info(adev_to_drm(adev), "DP-HDMI FRL PCON supported\n");

-	r = dm_dmub_hw_init(adev);
+	r = amdgpu_dm_dmub_hw_init(adev);
 	if (r) {
 		drm_err(adev_to_drm(adev), "DMUB interface failed to initialize: status=%d\n", r);
 		goto error;
@@ -3604,7 +3607,7 @@ static int dm_resume(struct amdgpu_ip_block *ip_block)
 		 */
 		link_enc_cfg_copy(adev->dm.dc->current_state, dc_state);

-		r = dm_dmub_hw_init(adev);
+		r = amdgpu_dm_dmub_hw_init(adev);
 		if (r) {
 			drm_err(adev_to_drm(adev), "DMUB interface failed to initialize: status=%d\n", r);
 			return r;
@@ -9623,7 +9626,15 @@ static void prepare_flip_isr(struct amdgpu_crtc *acrtc)
 {

 	assert_spin_locked(&acrtc->base.dev->event_lock);
-	WARN_ON(acrtc->event);
+
+	/*
+	 * Compositors will refuse to make forward progress unless we send
+	 * the previous flip's completion event.
+	 */
+	if (WARN_ON(acrtc->event)) {
+		drm_crtc_send_vblank_event(&acrtc->base, acrtc->event);
+		drm_crtc_vblank_put(&acrtc->base);
+	}

 	acrtc->event = acrtc->base.state->event;

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 74a8fe1a1999..dc808ee83c2a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -1086,6 +1086,7 @@ int amdgpu_dm_verify_lut3d_size(struct amdgpu_device *adev,
 #define MAX_COLOR_LEGACY_LUT_ENTRIES 256

 void amdgpu_dm_init_color_mod(void);
+int amdgpu_dm_dmub_hw_init(struct amdgpu_device *adev);
 int amdgpu_dm_create_color_properties(struct amdgpu_device *adev);
 int amdgpu_dm_verify_lut_sizes(const struct drm_crtc_state *crtc_state);
 int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state *crtc);
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 3b8ae7798a93..8f10117483e2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -33,6 +33,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -1165,8 +1166,15 @@ void dm_set_dcn_clocks(struct dc_context *ctx, struct dc_clocks *clks)

 void dm_helpers_dmu_timeout(struct dc_context *ctx)
 {
-	// TODO:
-	//amdgpu_device_gpu_recover(dc_context->driver-context, NULL);
+	struct amdgpu_device *adev = ctx->driver_context;
+
+	lockdep_assert_held(&adev->dm.dc_lock);
+
+	drm_info(adev_to_drm(adev), "attempting firmware reset\n");
+	if (amdgpu_dm_dmub_hw_init(adev))
+		drm_dev_wedged_event(adev_to_drm(adev),
+				     DRM_WEDGE_RECOVERY_REBIND |
+				     DRM_WEDGE_RECOVERY_BUS_RESET, NULL);
 }

 void dm_helpers_smu_timeout(struct dc_context *ctx, unsigned int msg_id, unsigned int param, unsigned int timeout_us)
-- 
2.54.0
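
Illustrative note, not part of the patch: the recovery order used by
dm_helpers_dmu_timeout() above (try a firmware re-init first, escalate to
a wedged event only if that fails) can be modeled in plain userspace C.
This is a minimal sketch under stated assumptions. struct fake_dev,
fake_fw_hw_init(), fake_wedged_event() and handle_fw_timeout() are
hypothetical stand-ins invented for illustration; they are not amdgpu or
DRM APIs, and the assert() only approximates what lockdep_assert_held()
checks in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; this is not struct amdgpu_device. */
struct fake_dev {
	bool lock_held;     /* models the caller holding dm.dc_lock */
	bool fw_reinit_ok;  /* whether a firmware re-init would succeed */
	int reinit_calls;   /* how often re-init was attempted */
	int wedged_events;  /* how often userspace was notified */
};

enum recovery_result { RECOVERED, WEDGED };

/* Models amdgpu_dm_dmub_hw_init(): 0 on success, negative on failure. */
static int fake_fw_hw_init(struct fake_dev *dev)
{
	dev->reinit_calls++;
	return dev->fw_reinit_ok ? 0 : -1;
}

/* Models drm_dev_wedged_event(); here it only counts notifications. */
static void fake_wedged_event(struct fake_dev *dev)
{
	dev->wedged_events++;
}

/* Cheap recovery first; escalate only when the re-init fails. */
static enum recovery_result handle_fw_timeout(struct fake_dev *dev)
{
	/* The caller must already hold the lock, as in the patch. */
	assert(dev->lock_held);

	if (fake_fw_hw_init(dev) == 0)
		return RECOVERED;

	fake_wedged_event(dev);
	return WEDGED;
}
```

In this model a device whose re-init succeeds is reported as RECOVERED
with no notification sent, while a failing re-init yields WEDGED and
exactly one notification, leaving the choice of rebind or bus reset to
whoever consumes the event.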