patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Samuel Zhang <guoqing.zhang@amd.com>,
	Lijo Lazar <lijo.lazar@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.12 101/112] drm/amdgpu: fix gpu page fault after hibernation on PF passthrough
Date: Thu, 27 Nov 2025 15:46:43 +0100	[thread overview]
Message-ID: <20251127144036.538409778@linuxfoundation.org> (raw)
In-Reply-To: <20251127144032.705323598@linuxfoundation.org>

6.12-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Samuel Zhang <guoqing.zhang@amd.com>

[ Upstream commit eb6e7f520d6efa4d4ebf1671455abe4a681f7a05 ]

On PF passthrough environment, after hibernate and then resume, coralgemm
will cause gpu page fault.

Mode1 reset happens during hibernate, but partition mode is not restored
on resume, register mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL is not right
after resume. When CP access the MQD BO, wrong stride size is used,
this will cause out of bound access on the MQD BO, resulting page fault.

The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called
when resume from a hibernation.
KFD resume is called separately during a reset recovery or resume from
suspend sequence. Hence it's not required to be called as part of
partition switch.

Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5d1b32cfe4a676fe552416cb5ae847b215463a1a)
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 3 ++-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c    | 4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
index ccfd2a4b4acc8..9c89e234c7869 100644
--- a/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
+++ b/drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c
@@ -555,7 +555,8 @@ static int aqua_vanjaram_switch_partition_mode(struct amdgpu_xcp_mgr *xcp_mgr,
 		return -EINVAL;
 	}
 
-	if (adev->kfd.init_complete && !amdgpu_in_reset(adev))
+	if (adev->kfd.init_complete && !amdgpu_in_reset(adev) &&
+		!adev->in_suspend)
 		flags |= AMDGPU_XCP_OPS_KFD;
 
 	if (flags & AMDGPU_XCP_OPS_KFD) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
index f27ccb8f3c8c5..26c2d8d9e2463 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
@@ -2297,7 +2297,9 @@ static int gfx_v9_4_3_cp_resume(struct amdgpu_device *adev)
 		r = amdgpu_xcp_init(adev->xcp_mgr, num_xcp, mode);
 
 	} else {
-		if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr,
+		if (adev->in_suspend)
+			amdgpu_xcp_restore_partition_mode(adev->xcp_mgr);
+		else if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr,
 						    AMDGPU_XCP_FL_NONE) ==
 		    AMDGPU_UNKNOWN_COMPUTE_PARTITION_MODE)
 			r = amdgpu_xcp_switch_partition_mode(
-- 
2.51.0




  parent reply	other threads:[~2025-11-27 14:57 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-27 14:45 [PATCH 6.12 000/112] 6.12.60-rc1 review Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 001/112] KVM: arm64: Check the untrusted offset in FF-A memory share Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 002/112] timers: Fix NULL function pointer race in timer_shutdown_sync() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 003/112] HID: amd_sfh: Stop sensor before starting Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 004/112] HID: quirks: work around VID/PID conflict for 0x4c4a/0x4155 Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 005/112] arm64: dts: rockchip: Fix vccio4-supply on rk3566-pinetab2 Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 006/112] arm64: dts: rockchip: fix PCIe 3.3V regulator voltage on orangepi-5 Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 007/112] arm64: dts: rockchip: include rk3399-base instead of rk3399 in rk3399-op1 Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 008/112] arm64: dts: rockchip: disable HS400 on RK3588 Tiger Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 009/112] mtd: rawnand: cadence: fix DMA device NULL pointer dereference Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 010/112] mtdchar: fix integer overflow in read/write ioctls Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 011/112] isofs: check the return value of sb_min_blocksize() in isofs_fill_super Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 012/112] shmem: fix tmpfs reconfiguration (remount) when noswap is set Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 013/112] exfat: check return value of sb_min_blocksize in exfat_read_boot_sector Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 014/112] mptcp: Disallow MPTCP subflows from sockmap Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 015/112] mptcp: Fix proto fallback detection with BPF Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 016/112] ata: libata-scsi: Fix system suspend for a security locked drive Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 017/112] MIPS: mm: Prevent a TLB shutdown on initial uniquification Greg Kroah-Hartman
2025-11-28  6:01   ` Maciej W. Rozycki
2025-11-27 14:45 ` [PATCH 6.12 018/112] smb: client: introduce close_cached_dir_locked() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 019/112] ata: libata-scsi: Add missing scsi_device_put() in ata_scsi_dev_rescan() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 020/112] be2net: pass wrb_params in case of OS2BMC Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 021/112] net: dsa: microchip: lan937x: Fix RGMII delay tuning Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 022/112] Revert "drm/tegra: dsi: Clear enable register if powered by bootloader" Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 023/112] Input: cros_ec_keyb - fix an invalid memory access Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 024/112] Input: goodix - add support for ACPI ID GDIX1003 Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 025/112] Input: imx_sc_key - fix memory corruption on unload Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 026/112] Input: pegasus-notetaker - fix potential out-of-bounds access Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 027/112] mm/mempool: fix poisoning order>0 pages with HIGHMEM Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 028/112] nouveau/firmware: Add missing kfree() of nvkm_falcon_fw::boot Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 029/112] nvme: nvme-fc: move tagset removal to nvme_fc_delete_ctrl() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 030/112] nvme: nvme-fc: Ensure ->ioerr_work is cancelled in nvme_fc_delete_ctrl() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 031/112] scsi: sg: Do not sleep in atomic context Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 032/112] scsi: target: tcm_loop: Fix segfault in tcm_loop_tpg_address_show() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 033/112] MIPS: Malta: Fix !EVA SOC-it PCI MMIO Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 034/112] dt-bindings: pinctrl: toshiba,visconti: Fix number of items in groups Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 035/112] LoongArch: Dont panic if no valid cache info for PCI Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 036/112] mptcp: fix race condition in mptcp_schedule_work() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 037/112] mptcp: fix ack generation for fallback msk Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 038/112] mptcp: fix duplicate reset on fastclose Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 039/112] mptcp: fix premature close in case of fallback Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 040/112] selftests: mptcp: join: endpoints: longer timeout Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 041/112] selftests: mptcp: join: userspace: " Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 042/112] mptcp: avoid unneeded subflow-level drops Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 043/112] mptcp: decouple mptcp fastclose from tcp close Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 044/112] mptcp: do not fallback when OoO is present Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 045/112] drm/tegra: dc: Fix reference leak in tegra_dc_couple() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 046/112] drm/radeon: delete radeon_fence_process in is_signaled, no deadlock Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 047/112] drm/amd: Skip power ungate during suspend for VPE Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 048/112] drm/amdgpu: Skip emit de meta data on gfx11 with rs64 enabled Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 049/112] drm/amd/display: Increase DPCD read retries Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 050/112] drm/amd/display: Move sleep into each retry for retrieve_link_cap() Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 051/112] drm/amd/display: Fix pbn to kbps Conversion Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 052/112] drm/amd/display: Clear the CUR_ENABLE register on DCN20 on DPP5 Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 053/112] xfrm: drop SA reference in xfrm_state_update if dir doesnt match Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 054/112] xfrm: set err and extack on failure to create pcpu SA Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 055/112] pinctrl: realtek: Select REGMAP_MMIO for RTD driver Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 056/112] xfrm: Determine inner GSO type from packet inner protocol Greg Kroah-Hartman
2025-11-27 14:45 ` [PATCH 6.12 057/112] xfrm: Prevent locally generated packets from direct output in tunnel mode Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 058/112] pinctrl: cirrus: Fix fwnode leak in cs42l43_pin_probe() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 059/112] platform/x86: msi-wmi-platform: Only load on MSI devices Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 060/112] platform/x86: msi-wmi-platform: Fix typo in WMI GUID Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 061/112] mlxsw: spectrum: Fix memory leak in mlxsw_sp_flower_stats() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 062/112] drm/tegra: Add call to put_pid() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 063/112] net: dsa: hellcreek: fix missing error handling in LED registration Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 064/112] net: mlxsw: linecards: fix missing error check in mlxsw_linecard_devlink_info_get() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 065/112] net: openvswitch: remove never-working support for setting nsh fields Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 066/112] tools: riscv: Fixed misalignment of CSR related definitions Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 067/112] nvme-multipath: fix lockdep WARN due to partition scan work Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 068/112] s390/ctcm: Fix double-kfree Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 069/112] selftests: net: lib: Do not overwrite error messages Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 070/112] platform/x86/intel/speed_select_if: Convert PCIBIOS_* return codes to errnos Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 071/112] net: qlogic/qede: fix potential out-of-bounds read in qede_tpa_cont() and qede_tpa_end() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 072/112] idpf: fix possible vport_config NULL pointer deref in remove Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 073/112] ice: fix PTP cleanup on driver removal in error path Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 074/112] pinctrl: s32cc: fix uninitialized memory in s32_pinctrl_desc Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 075/112] pinctrl: s32cc: initialize gpio_pin_config::list after kmalloc() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 076/112] devlink: rate: Unset parent pointer in devl_rate_nodes_destroy Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 077/112] net/mlx5: Clean up only new IRQ glue on request_irq() failure Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 078/112] af_unix: Cache state->msg in unix_stream_read_generic() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 079/112] af_unix: Read sk_peek_offset() again after sleeping " Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 080/112] LoongArch: Use UAPI types in ptrace UAPI header Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 081/112] cifs: fix memory leak in smb3_fs_context_parse_param error path Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 082/112] vsock: Ignore signal/timeout on connect() if already established Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 083/112] bcma: dont register devices disabled in OF Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 084/112] cifs: fix typo in enable_gcm_256 module parameter Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 085/112] scsi: core: Fix a regression triggered by scsi_host_busy() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 086/112] x86/microcode/AMD: Limit Entrysign signature checking to known generations Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 087/112] selftests: net: use BASH for bareudp testing Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 088/112] net: tls: Change async resync helpers argument Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 089/112] blk-crypto: use BLK_STS_INVAL for alignment errors Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 090/112] net: tls: Cancel RX async resync request on rcd_delta overflow Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 091/112] kconfig/mconf: Initialize the default locale at startup Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 092/112] kconfig/nconf: " Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 093/112] ALSA: usb-audio: Fix missing unlock at error path of maxpacksize check Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 094/112] KVM: arm64: Make all 32bit ID registers fully writable Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 095/112] Revert "RDMA/irdma: Update Kconfig" Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 096/112] drm/xe: Prevent BIT() overflow when handling invalid prefetch region Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 097/112] s390/mm: Fix __ptep_rdp() inline assembly Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 098/112] ALSA: usb-audio: fix uac2 clock source at terminal parser Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 099/112] net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 100/112] tracing/tools: Fix incorrcet short option in usage text for --threads Greg Kroah-Hartman
2025-11-27 14:46 ` Greg Kroah-Hartman [this message]
2025-11-27 14:46 ` [PATCH 6.12 102/112] smb: client: fix incomplete backport in cfids_invalidation_worker() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 103/112] tty/vt: fix up incorrect backport to stable releases Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 104/112] maple_tree: fix tracepoint string pointers Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 105/112] drm/i915/dp_mst: Disable Panel Replay Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 106/112] mptcp: fix a race in mptcp_pm_del_add_timer() Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 107/112] xfs: Replace strncpy with memcpy Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 108/112] xfs: fix out of bounds memory read error in symlink repair Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 109/112] drm/amd/display: avoid reset DTBCLK at clock init Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 110/112] drm/amd/display: disable DPP RCG before DPP CLK enable Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 111/112] drm/amd/display: Insert dccg log for easy debug Greg Kroah-Hartman
2025-11-27 14:46 ` [PATCH 6.12 112/112] drm/amd/display: Prevent Gating DTBCLK before It Is Properly Latched Greg Kroah-Hartman
2025-11-28 12:20 ` [PATCH 6.12 000/112] 6.12.60-rc1 review Pavel Machek
2025-12-01 10:54 ` Harshit Mogalapalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251127144036.538409778@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=alexander.deucher@amd.com \
    --cc=guoqing.zhang@amd.com \
    --cc=lijo.lazar@amd.com \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).