* FAILED: patch "[PATCH] drm/amdgpu: fix locking scope when flushing tlb" failed to apply to 6.10-stable tree
From: gregkh @ 2024-08-07 14:16 UTC (permalink / raw)
To: Yunxiang.Li, alexander.deucher, christian.koenig; +Cc: stable
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 9c33e5fd4fb63b793d9a92bf35d190630d9bada4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2024080738-tarmac-unproven-1f45@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..

Possible dependencies:

9c33e5fd4fb6 ("drm/amdgpu: fix locking scope when flushing tlb")

thanks,

greg k-h
------------------ original commit in Linus's tree ------------------
From 9c33e5fd4fb63b793d9a92bf35d190630d9bada4 Mon Sep 17 00:00:00 2001
From: Yunxiang Li <Yunxiang.Li@amd.com>
Date: Thu, 23 May 2024 07:48:19 -0400
Subject: [PATCH] drm/amdgpu: fix locking scope when flushing tlb
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Which method is used to flush the TLB does not depend on whether a reset
is in progress or not. We should skip the flush altogether if the GPU will
get reset. So put both paths under the reset_domain read lock.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 660599823050..322b8ff67cde 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -682,12 +682,17 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
 	struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring;
 	struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst];
 	unsigned int ndw;
-	signed long r;
+	int r;
 	uint32_t seq;
 
-	if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready ||
-	    !down_read_trylock(&adev->reset_domain->sem)) {
+	/*
+	 * A GPU reset should flush all TLBs anyway, so no need to do
+	 * this while one is ongoing.
+	 */
+	if (!down_read_trylock(&adev->reset_domain->sem))
+		return 0;
 
+	if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) {
 		if (adev->gmc.flush_tlb_needs_extra_type_2)
 			adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
 								 2, all_hub,
@@ -701,44 +706,41 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid,
 		adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid,
 							 flush_type, all_hub,
 							 inst);
-		return 0;
-	}
+		r = 0;
+	} else {
+		/* 2 dwords flush + 8 dwords fence */
+		ndw = kiq->pmf->invalidate_tlbs_size + 8;
 
-	/* 2 dwords flush + 8 dwords fence */
-	ndw = kiq->pmf->invalidate_tlbs_size + 8;
+		if (adev->gmc.flush_tlb_needs_extra_type_2)
+			ndw += kiq->pmf->invalidate_tlbs_size;
 
-	if (adev->gmc.flush_tlb_needs_extra_type_2)
-		ndw += kiq->pmf->invalidate_tlbs_size;
+		if (adev->gmc.flush_tlb_needs_extra_type_0)
+			ndw += kiq->pmf->invalidate_tlbs_size;
 
-	if (adev->gmc.flush_tlb_needs_extra_type_0)
-		ndw += kiq->pmf->invalidate_tlbs_size;
+		spin_lock(&adev->gfx.kiq[inst].ring_lock);
+		amdgpu_ring_alloc(ring, ndw);
+		if (adev->gmc.flush_tlb_needs_extra_type_2)
+			kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 2, all_hub);
 
-	spin_lock(&adev->gfx.kiq[inst].ring_lock);
-	amdgpu_ring_alloc(ring, ndw);
-	if (adev->gmc.flush_tlb_needs_extra_type_2)
-		kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 2, all_hub);
+		if (flush_type == 2 && adev->gmc.flush_tlb_needs_extra_type_0)
+			kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 0, all_hub);
 
-	if (flush_type == 2 && adev->gmc.flush_tlb_needs_extra_type_0)
-		kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 0, all_hub);
+		kiq->pmf->kiq_invalidate_tlbs(ring, pasid, flush_type, all_hub);
+		r = amdgpu_fence_emit_polling(ring, &seq, MAX_KIQ_REG_WAIT);
+		if (r) {
+			amdgpu_ring_undo(ring);
+			spin_unlock(&adev->gfx.kiq[inst].ring_lock);
+			goto error_unlock_reset;
+		}
 
-	kiq->pmf->kiq_invalidate_tlbs(ring, pasid, flush_type, all_hub);
-	r = amdgpu_fence_emit_polling(ring, &seq, MAX_KIQ_REG_WAIT);
-	if (r) {
-		amdgpu_ring_undo(ring);
+		amdgpu_ring_commit(ring);
 		spin_unlock(&adev->gfx.kiq[inst].ring_lock);
-		goto error_unlock_reset;
+		if (amdgpu_fence_wait_polling(ring, seq, usec_timeout) < 1) {
+			dev_err(adev->dev, "timeout waiting for kiq fence\n");
+			r = -ETIME;
+		}
 	}
 
-	amdgpu_ring_commit(ring);
-	spin_unlock(&adev->gfx.kiq[inst].ring_lock);
-	r = amdgpu_fence_wait_polling(ring, seq, usec_timeout);
-	if (r < 1) {
-		dev_err(adev->dev, "wait for kiq fence error: %ld.\n", r);
-		r = -ETIME;
-		goto error_unlock_reset;
-	}
-	r = 0;
-
 error_unlock_reset:
 	up_read(&adev->reset_domain->sem);
 	return r;
* RE: FAILED: patch "[PATCH] drm/amdgpu: fix locking scope when flushing tlb" failed to apply to 6.10-stable tree
From: Li, Yunxiang (Teddy) @ 2024-08-09 13:37 UTC (permalink / raw)
To: gregkh@linuxfoundation.org, Deucher, Alexander, Koenig, Christian
Cc: stable@vger.kernel.org
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Greg,
I believe this commit has already been picked onto 6.10-y as commit 84801d4f1e4fbd2c44dddecaec9099bdff100a42
Regards,
Yunxiang Li
* Re: FAILED: patch "[PATCH] drm/amdgpu: fix locking scope when flushing tlb" failed to apply to 6.10-stable tree
From: gregkh @ 2024-08-11 9:46 UTC (permalink / raw)
To: Li, Yunxiang (Teddy)
Cc: Deucher, Alexander, Koenig, Christian, stable@vger.kernel.org
On Fri, Aug 09, 2024 at 01:37:50PM +0000, Li, Yunxiang (Teddy) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Greg,
>
> I believe this commit has already been picked onto 6.10-y as commit 84801d4f1e4fbd2c44dddecaec9099bdff100a42
Then why is it showing up here again? What broke in your workflow to
cause this? Please fix that as there are loads of these "double"
commits with no ability for me at all to detect them being a double
commit.

If this continues, I'll just end up dropping all AMD stable-marked
patches as it's too much of a hassle to deal with, and expect you all to
send me either git ids, or patch series, to apply. No other subsystem
has this issue.

greg k-h
* RE: FAILED: patch "[PATCH] drm/amdgpu: fix locking scope when flushing tlb" failed to apply to 6.10-stable tree
From: Li, Yunxiang (Teddy) @ 2024-08-12 14:30 UTC (permalink / raw)
To: gregkh@linuxfoundation.org, Deucher, Alexander
Cc: Koenig, Christian, stable@vger.kernel.org
From what I can piece together with git history, it seems the patch was in both amd-drm-fixes-6.10-2024-06-19 and amd-drm-next-6.11-2024-06-22 and this caused the double commit. @Deucher, Alexander
Yunxiang
* Re: FAILED: patch "[PATCH] drm/amdgpu: fix locking scope when flushing tlb" failed to apply to 6.10-stable tree
From: gregkh @ 2024-08-12 14:43 UTC (permalink / raw)
To: Li, Yunxiang (Teddy)
Cc: Deucher, Alexander, Koenig, Christian, stable@vger.kernel.org
On Mon, Aug 12, 2024 at 02:30:00PM +0000, Li, Yunxiang (Teddy) wrote:
> From what I can piece together with git history, it seems the patch was in both amd-drm-fixes-6.10-2024-06-19 and amd-drm-next-6.11-2024-06-22 and this caused the double commit. @Deucher, Alexander
Again, PLEASE FIX YOUR BROKEN DEVELOPMENT PROCESS!
This isn't ok, and is wasting our time :(
greg k-h