From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D62D821ADAE; Tue, 8 Apr 2025 11:17:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744111052; cv=none; b=tP0wEhqApCCi2AbYy+t4mL9gpbNeq+vBzesTRJL4JMmXd9XDtfbl9NsrxTQTulyi0JKtHFZu2uP8eUJtV5t0cLf7z4SZf188bAZZlCpt/0JU7Dup4hv37I0hWogue1KuJZdyHvnah3lydUhIwEdPILRCMMCDGBmNYzNS0VHho5o= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744111052; c=relaxed/simple; bh=P423ZEHy9u5p0cpR4DgTBWxTRebyd2P40l4tP7lrI2A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=osOkGlikkocwVMdWGlhU3+C+mLTldzLTNDvY0yLHs1wtLz+82gZu+pac6+666UroocH44QoRu0ibXq5uJJUIcMinSSu4HaJU/rfrCg+wAKOHNZf2YcP7yaZ0xCqzzi2I4759kbVo0bxVlNXRWDa2At9uxPbwWRKXWVG5cS9mepo= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=0nh8NNBb; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="0nh8NNBb" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6490DC4CEE5; Tue, 8 Apr 2025 11:17:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1744111052; bh=P423ZEHy9u5p0cpR4DgTBWxTRebyd2P40l4tP7lrI2A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=0nh8NNBbTPabNRFsyNPWzE2HklXdudqvCu2XoWWWU2qzgVndIOE4Vlgsn7MCH3dyK tVbi9rwM3914DPMNRe4FlVmMfpESkfe7HP6ayOdwHI7e4SjcnlMftPtGLwB7MBGezS bgREiwY5LkyAWRcTRltiAisoi6a89xti/02n2HfE= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, lin cao , Jingwen Chen , Victor Skvortsov , Zhigang Luo , =?UTF-8?q?Christian=20K=C3=B6nig?= , Alex Deucher , Srinivasan Shanmugam , Sasha Levin Subject: [PATCH 6.14 291/731] drm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV Date: Tue, 8 Apr 2025 12:43:08 +0200 Message-ID: <20250408104921.047649849@linuxfoundation.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250408104914.247897328@linuxfoundation.org> References: <20250408104914.247897328@linuxfoundation.org> User-Agent: quilt/0.68 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 6.14-stable review patch. If anyone has any objections, please let me know. ------------------ From: Srinivasan Shanmugam [ Upstream commit dc0297f3198bd60108ccbd167ee5d9fa4af31ed0 ] RLCG Register Access is a way for virtual functions to safely access GPU registers in a virtualized environment., including TLB flushes and register reads. When multiple threads or VFs try to access the same registers simultaneously, it can lead to race conditions. By using the RLCG interface, the driver can serialize access to the registers. This means that only one thread can access the registers at a time, preventing conflicts and ensuring that operations are performed correctly. Additionally, when a low-priority task holds a mutex that a high-priority task needs, ie., If a thread holding a spinlock tries to acquire a mutex, it can lead to priority inversion. register access in amdgpu_virt_rlcg_reg_rw especially in a fast code path is critical. The call stack shows that the function amdgpu_virt_rlcg_reg_rw is being called, which attempts to acquire the mutex. This function is invoked from amdgpu_sriov_wreg, which in turn is called from gmc_v11_0_flush_gpu_tlb. The [ BUG: Invalid wait context ] indicates that a thread is trying to acquire a mutex while it is in a context that does not allow it to sleep (like holding a spinlock). Fixes the below: [ 253.013423] ============================= [ 253.013434] [ BUG: Invalid wait context ] [ 253.013446] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE [ 253.013464] ----------------------------- [ 253.013475] kworker/0:1/10 is trying to lock: [ 253.013487] ffff9f30542e3cf8 (&adev->virt.rlcg_reg_lock){+.+.}-{3:3}, at: amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.013815] other info that might help us debug this: [ 253.013827] context-{4:4} [ 253.013835] 3 locks held by kworker/0:1/10: [ 253.013847] #0: ffff9f3040050f58 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680 [ 253.013877] #1: ffffb789c008be40 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680 [ 253.013905] #2: ffff9f3054281838 (&adev->gmc.invalidate_lock){+.+.}-{2:2}, at: gmc_v11_0_flush_gpu_tlb+0x198/0x4f0 [amdgpu] [ 253.014154] stack backtrace: [ 253.014164] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14 [ 253.014189] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 253.014203] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/18/2024 [ 253.014224] Workqueue: events work_for_cpu_fn [ 253.014241] Call Trace: [ 253.014250] [ 253.014260] dump_stack_lvl+0x9b/0xf0 [ 253.014275] dump_stack+0x10/0x20 [ 253.014287] __lock_acquire+0xa47/0x2810 [ 253.014303] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.014321] lock_acquire+0xd1/0x300 [ 253.014333] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.014562] ? __lock_acquire+0xa6b/0x2810 [ 253.014578] __mutex_lock+0x85/0xe20 [ 253.014591] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.014782] ? sched_clock_noinstr+0x9/0x10 [ 253.014795] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.014808] ? local_clock_noinstr+0xe/0xc0 [ 253.014822] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.015012] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.015029] mutex_lock_nested+0x1b/0x30 [ 253.015044] ? mutex_lock_nested+0x1b/0x30 [ 253.015057] amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.015249] amdgpu_sriov_wreg+0xc5/0xd0 [amdgpu] [ 253.015435] gmc_v11_0_flush_gpu_tlb+0x44b/0x4f0 [amdgpu] [ 253.015667] gfx_v11_0_hw_init+0x499/0x29c0 [amdgpu] [ 253.015901] ? __pfx_smu_v13_0_update_pcie_parameters+0x10/0x10 [amdgpu] [ 253.016159] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.016173] ? smu_hw_init+0x18d/0x300 [amdgpu] [ 253.016403] amdgpu_device_init+0x29ad/0x36a0 [amdgpu] [ 253.016614] amdgpu_driver_load_kms+0x1a/0xc0 [amdgpu] [ 253.017057] amdgpu_pci_probe+0x1c2/0x660 [amdgpu] [ 253.017493] local_pci_probe+0x4b/0xb0 [ 253.017746] work_for_cpu_fn+0x1a/0x30 [ 253.017995] process_one_work+0x21e/0x680 [ 253.018248] worker_thread+0x190/0x330 [ 253.018500] ? __pfx_worker_thread+0x10/0x10 [ 253.018746] kthread+0xe7/0x120 [ 253.018988] ? __pfx_kthread+0x10/0x10 [ 253.019231] ret_from_fork+0x3c/0x60 [ 253.019468] ? __pfx_kthread+0x10/0x10 [ 253.019701] ret_from_fork_asm+0x1a/0x30 [ 253.019939] v2: s/spin_trylock/spin_lock_irqsave to be safe (Christian). Fixes: e864180ee49b ("drm/amdgpu: Add lock around VF RLCG interface") Cc: lin cao Cc: Jingwen Chen Cc: Victor Skvortsov Cc: Zhigang Luo Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam Suggested-by: Alex Deucher Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 5 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 3 ++- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 018dfccd771ba..f5909977eed4b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4223,7 +4223,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, mutex_init(&adev->grbm_idx_mutex); mutex_init(&adev->mn_lock); mutex_init(&adev->virt.vf_errors.lock); - mutex_init(&adev->virt.rlcg_reg_lock); hash_init(adev->mn_hash); mutex_init(&adev->psp.mutex); mutex_init(&adev->notifier_lock); @@ -4249,6 +4248,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, spin_lock_init(&adev->se_cac_idx_lock); spin_lock_init(&adev->audio_endpt_idx_lock); spin_lock_init(&adev->mm_stats.lock); + spin_lock_init(&adev->virt.rlcg_reg_lock); spin_lock_init(&adev->wb.lock); INIT_LIST_HEAD(&adev->reset_list); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index 0af469ec6fccd..13e5709ea1caa 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -1017,6 +1017,7 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f void *scratch_reg2; void *scratch_reg3; void *spare_int; + unsigned long flags; if (!adev->gfx.rlc.rlcg_reg_access_supported) { dev_err(adev->dev, @@ -1038,7 +1039,7 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f scratch_reg2 = (void __iomem *)adev->rmmio + 4 * reg_access_ctrl->scratch_reg2; scratch_reg3 = (void __iomem *)adev->rmmio + 4 * reg_access_ctrl->scratch_reg3; - mutex_lock(&adev->virt.rlcg_reg_lock); + spin_lock_irqsave(&adev->virt.rlcg_reg_lock, flags); if (reg_access_ctrl->spare_int) spare_int = (void __iomem *)adev->rmmio + 4 * reg_access_ctrl->spare_int; @@ -1097,7 +1098,7 @@ u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v, u32 f ret = readl(scratch_reg0); - mutex_unlock(&adev->virt.rlcg_reg_lock); + spin_unlock_irqrestore(&adev->virt.rlcg_reg_lock, flags); return ret; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h index 5381b8d596e62..0ca73343a7689 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h @@ -279,7 +279,8 @@ struct amdgpu_virt { /* the ucode id to signal the autoload */ uint32_t autoload_ucode_id; - struct mutex rlcg_reg_lock; + /* Spinlock to protect access to the RLCG register interface */ + spinlock_t rlcg_reg_lock; union amd_sriov_ras_caps ras_en_caps; union amd_sriov_ras_caps ras_telemetry_en_caps; -- 2.39.5