* [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory
@ 2023-07-07 15:07 Alex Deucher
2023-07-07 15:07 ` [PATCH 2/9] drm/amdgpu: make sure that BOs have a backing store Alex Deucher
` (9 more replies)
0 siblings, 10 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable
Cc: mario.limonciello, Christian König, Alex Deucher,
Guchun Chen, Mikhail Gavrilov
From: Christian König <christian.koenig@amd.com>
We need to grab the lock of the BO or otherwise can run into a crash
when we try to inspect the current location.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e2ad8e2df432498b1cee2af04df605723f4d75e6)
Cc: stable@vger.kernel.org # 6.3.x
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 69 +++++++++++++++-----------
1 file changed, 39 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 5b3a70becbdf..a252a206f37b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -920,42 +920,51 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,
return r;
}
+static void amdgpu_vm_bo_get_memory(struct amdgpu_bo_va *bo_va,
+ struct amdgpu_mem_stats *stats)
+{
+ struct amdgpu_vm *vm = bo_va->base.vm;
+ struct amdgpu_bo *bo = bo_va->base.bo;
+
+ if (!bo)
+ return;
+
+ /*
+ * For now ignore BOs which are currently locked and potentially
+ * changing their location.
+ */
+ if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv &&
+ !dma_resv_trylock(bo->tbo.base.resv))
+ return;
+
+ amdgpu_bo_get_memory(bo, stats);
+ if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
+ dma_resv_unlock(bo->tbo.base.resv);
+}
+
void amdgpu_vm_get_memory(struct amdgpu_vm *vm,
struct amdgpu_mem_stats *stats)
{
struct amdgpu_bo_va *bo_va, *tmp;
spin_lock(&vm->status_lock);
- list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) {
- if (!bo_va->base.bo)
- continue;
- amdgpu_bo_get_memory(bo_va->base.bo, stats);
- }
- list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status) {
- if (!bo_va->base.bo)
- continue;
- amdgpu_bo_get_memory(bo_va->base.bo, stats);
- }
- list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status) {
- if (!bo_va->base.bo)
- continue;
- amdgpu_bo_get_memory(bo_va->base.bo, stats);
- }
- list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) {
- if (!bo_va->base.bo)
- continue;
- amdgpu_bo_get_memory(bo_va->base.bo, stats);
- }
- list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status) {
- if (!bo_va->base.bo)
- continue;
- amdgpu_bo_get_memory(bo_va->base.bo, stats);
- }
- list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status) {
- if (!bo_va->base.bo)
- continue;
- amdgpu_bo_get_memory(bo_va->base.bo, stats);
- }
+ list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status)
+ amdgpu_vm_bo_get_memory(bo_va, stats);
+
+ list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status)
+ amdgpu_vm_bo_get_memory(bo_va, stats);
+
+ list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status)
+ amdgpu_vm_bo_get_memory(bo_va, stats);
+
+ list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status)
+ amdgpu_vm_bo_get_memory(bo_va, stats);
+
+ list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status)
+ amdgpu_vm_bo_get_memory(bo_va, stats);
+
+ list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status)
+ amdgpu_vm_bo_get_memory(bo_va, stats);
spin_unlock(&vm->status_lock);
}
--
2.41.0
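
As a rough userspace illustration of the trylock-or-skip flow in amdgpu_vm_bo_get_memory() above, the sketch below mirrors only the control flow of the patch. All demo_* names are hypothetical, a C11 atomic_flag stands in for the dma_resv lock, and the caller is assumed to already hold the root lock. It is not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a reservation lock (dma_resv in the patch). */
struct demo_lock {
	atomic_flag held;
};

/* Hypothetical stand-in for a buffer object: a size plus a lock pointer. */
struct demo_bo {
	struct demo_lock *resv;	/* may be the same lock the VM root uses */
	uint64_t size;
};

/* Non-blocking acquire: true on success, false if the lock is already held. */
bool demo_trylock(struct demo_lock *l)
{
	return !atomic_flag_test_and_set(&l->held);
}

void demo_unlock(struct demo_lock *l)
{
	atomic_flag_clear(&l->held);
}

/*
 * Add bo->size to *total unless the BO has its own lock and that lock is
 * busy. Assumption: the caller already holds root_resv, so a BO sharing
 * that lock needs no extra locking.
 */
bool demo_account_bo(struct demo_bo *bo, struct demo_lock *root_resv,
		     uint64_t *total)
{
	bool own_lock;

	assert(total != NULL);
	if (!bo)
		return false;

	own_lock = bo->resv != root_resv;
	if (own_lock && !demo_trylock(bo->resv))
		return false;	/* busy: skip instead of waiting */

	*total += bo->size;

	if (own_lock)
		demo_unlock(bo->resv);
	return true;
}
```

In this sketch an object that shares the root lock is always counted, while an object whose own lock is busy is skipped rather than waited on, so the totals may undercount.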
* [PATCH 2/9] drm/amdgpu: make sure that BOs have a backing store
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
@ 2023-07-07 15:07 ` Alex Deucher
2023-07-07 15:07 ` [PATCH 3/9] drm/amdgpu: Skip mark offset for high priority rings Alex Deucher
` (8 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable
Cc: mario.limonciello, Christian König, Alex Deucher,
Guchun Chen, Mikhail Gavrilov
From: Christian König <christian.koenig@amd.com>
It's perfectly possible that the BO is about to be destroyed and doesn't
have a backing store associated with it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ca0b954a4315ca2228001c439ae1062561c81989)
Cc: stable@vger.kernel.org # 6.3.x
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index a70103ac0026..46557bbbc18a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1266,8 +1266,12 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
struct amdgpu_mem_stats *stats)
{
- unsigned int domain;
uint64_t size = amdgpu_bo_size(bo);
+ unsigned int domain;
+
+ /* Abort if the BO doesn't currently have a backing store */
+ if (!bo->tbo.resource)
+ return;
domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
switch (domain) {
--
2.41.0
* [PATCH 3/9] drm/amdgpu: Skip mark offset for high priority rings
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
2023-07-07 15:07 ` [PATCH 2/9] drm/amdgpu: make sure that BOs have a backing store Alex Deucher
@ 2023-07-07 15:07 ` Alex Deucher
2023-07-07 15:07 ` [PATCH 4/9] drm/amd/pm: revise the ASPM settings for thunderbolt attached scenario Alex Deucher
` (7 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable; +Cc: mario.limonciello, Jiadong Zhu, Alex Deucher
From: Jiadong Zhu <Jiadong.Zhu@amd.com>
Only low-priority rings use chunks to save the offset.
Skip the mark-offset calls from high-priority rings.
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ef3c36a6e025e9b16ca3321479ba016841fa17a0)
Cc: stable@vger.kernel.org
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
index 73516abef662..b779ee4bbaa7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
@@ -423,6 +423,9 @@ void amdgpu_sw_ring_ib_mark_offset(struct amdgpu_ring *ring, enum amdgpu_ring_mu
struct amdgpu_ring_mux *mux = &adev->gfx.muxer;
unsigned offset;
+ if (ring->hw_prio > AMDGPU_RING_PRIO_DEFAULT)
+ return;
+
offset = ring->wptr & ring->buf_mask;
amdgpu_ring_mux_ib_mark_offset(mux, ring, offset, type);
--
2.41.0
* [PATCH 4/9] drm/amd/pm: revise the ASPM settings for thunderbolt attached scenario
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
2023-07-07 15:07 ` [PATCH 2/9] drm/amdgpu: make sure that BOs have a backing store Alex Deucher
2023-07-07 15:07 ` [PATCH 3/9] drm/amdgpu: Skip mark offset for high priority rings Alex Deucher
@ 2023-07-07 15:07 ` Alex Deucher
2023-07-07 15:07 ` [PATCH 5/9] drm/amdgpu/sdma4: set align mask to 255 Alex Deucher
` (6 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable; +Cc: mario.limonciello, Evan Quan, Alex Deucher
From: Evan Quan <evan.quan@amd.com>
Revise the ASPM settings for the Thunderbolt-attached scenario.
Also, correct the comment for NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT,
as 0x0000000E stands for 400ms instead of 4ms.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fd21987274463a439c074b8f3c93d3b132e4c031)
Cc: stable@vger.kernel.org
---
drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
index aa761ff3a5fa..7ba47fc1917b 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
@@ -346,7 +346,7 @@ static void nbio_v2_3_init_registers(struct amdgpu_device *adev)
#define NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT 0x00000000 // off by default, no gains over L1
#define NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT 0x00000009 // 1=1us, 9=1ms
-#define NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT 0x0000000E // 4ms
+#define NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT 0x0000000E // 400ms
static void nbio_v2_3_enable_aspm(struct amdgpu_device *adev,
bool enable)
@@ -479,9 +479,12 @@ static void nbio_v2_3_program_aspm(struct amdgpu_device *adev)
WREG32_SOC15(NBIO, 0, mmRCC_BIF_STRAP5, data);
def = data = RREG32_PCIE(smnPCIE_LC_CNTL);
- data &= ~PCIE_LC_CNTL__LC_L0S_INACTIVITY_MASK;
- data |= 0x9 << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
- data |= 0x1 << PCIE_LC_CNTL__LC_PMI_TO_L1_DIS__SHIFT;
+ data |= NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L0S_INACTIVITY__SHIFT;
+ if (pci_is_thunderbolt_attached(adev->pdev))
+ data |= NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
+ else
+ data |= NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
+ data &= ~PCIE_LC_CNTL__LC_PMI_TO_L1_DIS_MASK;
if (def != data)
WREG32_PCIE(smnPCIE_LC_CNTL, data);
--
2.41.0
* [PATCH 5/9] drm/amdgpu/sdma4: set align mask to 255
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
` (2 preceding siblings ...)
2023-07-07 15:07 ` [PATCH 4/9] drm/amd/pm: revise the ASPM settings for thunderbolt attached scenario Alex Deucher
@ 2023-07-07 15:07 ` Alex Deucher
2023-07-07 15:07 ` [PATCH 6/9] drm/amd/pm: add abnormal fan detection for smu 13.0.0 Alex Deucher
` (5 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable
Cc: mario.limonciello, Alex Deucher, Felix Kuehling, Aaron Liu,
Christian König
The wptr needs to be incremented in intervals of at least 64 dwords;
use 256 to align with Windows. This should fix potential hangs
with unaligned updates.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e5df16d9428f5c6d2d0b1eff244d6c330ba9ef3a)
Cc: stable@vger.kernel.org
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 ++--
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 9295ac7edd56..d35c8a33d06d 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -2306,7 +2306,7 @@ const struct amd_ip_funcs sdma_v4_0_ip_funcs = {
static const struct amdgpu_ring_funcs sdma_v4_0_ring_funcs = {
.type = AMDGPU_RING_TYPE_SDMA,
- .align_mask = 0xf,
+ .align_mask = 0xff,
.nop = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP),
.support_64bit_ptrs = true,
.secure_submission_supported = true,
@@ -2338,7 +2338,7 @@ static const struct amdgpu_ring_funcs sdma_v4_0_ring_funcs = {
static const struct amdgpu_ring_funcs sdma_v4_0_page_ring_funcs = {
.type = AMDGPU_RING_TYPE_SDMA,
- .align_mask = 0xf,
+ .align_mask = 0xff,
.nop = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP),
.support_64bit_ptrs = true,
.secure_submission_supported = true,
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
index 64dcaa2670dd..ac7aa8631f6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
@@ -1740,7 +1740,7 @@ const struct amd_ip_funcs sdma_v4_4_2_ip_funcs = {
static const struct amdgpu_ring_funcs sdma_v4_4_2_ring_funcs = {
.type = AMDGPU_RING_TYPE_SDMA,
- .align_mask = 0xf,
+ .align_mask = 0xff,
.nop = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP),
.support_64bit_ptrs = true,
.get_rptr = sdma_v4_4_2_ring_get_rptr,
@@ -1771,7 +1771,7 @@ static const struct amdgpu_ring_funcs sdma_v4_4_2_ring_funcs = {
static const struct amdgpu_ring_funcs sdma_v4_4_2_page_ring_funcs = {
.type = AMDGPU_RING_TYPE_SDMA,
- .align_mask = 0xf,
+ .align_mask = 0xff,
.nop = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP),
.support_64bit_ptrs = true,
.get_rptr = sdma_v4_4_2_ring_get_rptr,
--
2.41.0
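
To illustrate what moving the mask from 0xf to 0xff means for padding, here is a small sketch. It assumes the mask is used in the usual power-of-two way, to round the write pointer up to the next boundary by inserting NOP dwords. demo_pad_dwords is a hypothetical helper and not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of padding dwords needed to bring wptr up to the next multiple
 * of (align_mask + 1). Assumption: align_mask is of the form 2^n - 1.
 */
uint32_t demo_pad_dwords(uint64_t wptr, uint32_t align_mask)
{
	/* Reject masks that are not one less than a power of two. */
	assert(((align_mask + 1) & align_mask) == 0);

	return (uint32_t)((align_mask + 1 - (wptr & align_mask)) & align_mask);
}
```

Under these assumptions a write pointer at dword 16 needs no padding with a mask of 0xf, but needs 240 dwords of padding with a mask of 0xff to reach the next 256-dword boundary.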
* [PATCH 6/9] drm/amd/pm: add abnormal fan detection for smu 13.0.0
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
` (3 preceding siblings ...)
2023-07-07 15:07 ` [PATCH 5/9] drm/amdgpu/sdma4: set align mask to 255 Alex Deucher
@ 2023-07-07 15:07 ` Alex Deucher
2023-07-07 15:07 ` [PATCH 7/9] drm/amdgpu: check RAS irq existence for VCN/JPEG Alex Deucher
` (4 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable; +Cc: mario.limonciello, Kenneth Feng, Evan Quan, Alex Deucher
From: Kenneth Feng <kenneth.feng@amd.com>
Add abnormal fan detection for SMU 13.0.0.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2da0036ea99bccb27f7fe3cf2aa2900860e9be46)
Cc: stable@vger.kernel.org # 6.1.x
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
index 08577d1b84ec..c42c0c1446f4 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
@@ -1300,6 +1300,7 @@ static int smu_v13_0_0_get_thermal_temperature_range(struct smu_context *smu,
range->mem_emergency_max = (pptable->SkuTable.TemperatureLimit[TEMP_MEM] + CTF_OFFSET_MEM)*
SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
range->software_shutdown_temp = powerplay_table->software_shutdown_temp;
+ range->software_shutdown_temp_offset = pptable->SkuTable.FanAbnormalTempLimitOffset;
return 0;
}
--
2.41.0
* [PATCH 7/9] drm/amdgpu: check RAS irq existence for VCN/JPEG
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
` (4 preceding siblings ...)
2023-07-07 15:07 ` [PATCH 6/9] drm/amd/pm: add abnormal fan detection for smu 13.0.0 Alex Deucher
@ 2023-07-07 15:07 ` Alex Deucher
2023-07-07 15:07 ` [PATCH 8/9] drm/amdgpu: fix number of fence calculations Alex Deucher
` (3 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable; +Cc: mario.limonciello, Tao Zhou, Hawking Zhang, Alex Deucher
From: Tao Zhou <tao.zhou1@amd.com>
It is valid for a VCN/JPEG instance to have no RAS irq, so check that
one exists before trying to get it.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4ff96bcc0d40b66bf3ddd6010830e9a4f9b85d53)
Cc: stable@vger.kernel.org # 6.1.x
---
drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 3 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
index 4fa019c8aefc..fb9251d9c899 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
@@ -251,7 +251,8 @@ int amdgpu_jpeg_ras_late_init(struct amdgpu_device *adev, struct ras_common_if *
if (amdgpu_ras_is_supported(adev, ras_block->block)) {
for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) {
- if (adev->jpeg.harvest_config & (1 << i))
+ if (adev->jpeg.harvest_config & (1 << i) ||
+ !adev->jpeg.inst[i].ras_poison_irq.funcs)
continue;
r = amdgpu_irq_get(adev, &adev->jpeg.inst[i].ras_poison_irq, 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 2d94f1b63bd6..b46a5771c3ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -1191,7 +1191,8 @@ int amdgpu_vcn_ras_late_init(struct amdgpu_device *adev, struct ras_common_if *r
if (amdgpu_ras_is_supported(adev, ras_block->block)) {
for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
- if (adev->vcn.harvest_config & (1 << i))
+ if (adev->vcn.harvest_config & (1 << i) ||
+ !adev->vcn.inst[i].ras_poison_irq.funcs)
continue;
r = amdgpu_irq_get(adev, &adev->vcn.inst[i].ras_poison_irq, 0);
--
2.41.0
* [PATCH 8/9] drm/amdgpu: fix number of fence calculations
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
` (5 preceding siblings ...)
2023-07-07 15:07 ` [PATCH 7/9] drm/amdgpu: check RAS irq existence for VCN/JPEG Alex Deucher
@ 2023-07-07 15:07 ` Alex Deucher
2023-07-07 15:07 ` [PATCH 9/9] drm/amd: Don't try to enable secure display TA multiple times Alex Deucher
` (2 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable; +Cc: mario.limonciello, Christian König, Alex Deucher
From: Christian König <christian.koenig@amd.com>
Since adding gang submit we need to take the gang size into account
while reserving fences.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 4624459c84d7 ("drm/amdgpu: add gang submit frontend v6")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 570b295248b00c3cf4cf59e397de5cb2361e10c2)
Cc: stable@vger.kernel.org
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 2eb2c66843a8..5612caf77dd6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -133,9 +133,6 @@ static int amdgpu_cs_p1_user_fence(struct amdgpu_cs_parser *p,
bo = amdgpu_bo_ref(gem_to_amdgpu_bo(gobj));
p->uf_entry.priority = 0;
p->uf_entry.tv.bo = &bo->tbo;
- /* One for TTM and two for the CS job */
- p->uf_entry.tv.num_shared = 3;
-
drm_gem_object_put(gobj);
size = amdgpu_bo_size(bo);
@@ -882,15 +879,19 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
mutex_lock(&p->bo_list->bo_list_mutex);
- /* One for TTM and one for the CS job */
+ /* One for TTM and one for each CS job */
amdgpu_bo_list_for_each_entry(e, p->bo_list)
- e->tv.num_shared = 2;
+ e->tv.num_shared = 1 + p->gang_size;
+ p->uf_entry.tv.num_shared = 1 + p->gang_size;
amdgpu_bo_list_get_list(p->bo_list, &p->validated);
INIT_LIST_HEAD(&duplicates);
amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd);
+ /* Two for VM updates, one for TTM and one for each CS job */
+ p->vm_pd.tv.num_shared = 3 + p->gang_size;
+
if (p->uf_entry.tv.bo && !ttm_to_amdgpu_bo(p->uf_entry.tv.bo)->parent)
list_add(&p->uf_entry.tv.head, &p->validated);
--
2.41.0
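
The fence slot arithmetic introduced by this patch can be written out as a short sketch. The helper names are hypothetical; the constants are taken from the two comments in the diff above ("One for TTM and one for each CS job" and "Two for VM updates, one for TTM and one for each CS job"), and gang_size is assumed to be at least 1.

```c
#include <assert.h>

/* Slots for a regular BO list entry or the user fence BO: TTM + one per job. */
unsigned int demo_bo_fence_slots(unsigned int gang_size)
{
	assert(gang_size >= 1);	/* assumption: a submission has at least one job */
	return 1 + gang_size;
}

/* Slots for the VM page directory: two VM updates + TTM + one per job. */
unsigned int demo_pd_fence_slots(unsigned int gang_size)
{
	assert(gang_size >= 1);
	return 3 + gang_size;
}
```

With a gang size of 1 the per-BO count is 2, matching the fixed value the patch removes for BO list entries. Larger gangs reserve one additional slot per job.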
* [PATCH 9/9] drm/amd: Don't try to enable secure display TA multiple times
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
` (6 preceding siblings ...)
2023-07-07 15:07 ` [PATCH 8/9] drm/amdgpu: fix number of fence calculations Alex Deucher
@ 2023-07-07 15:07 ` Alex Deucher
2023-07-11 21:40 ` [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Mario Limonciello
2023-07-16 19:16 ` Greg KH
9 siblings, 0 replies; 14+ messages in thread
From: Alex Deucher @ 2023-07-07 15:07 UTC (permalink / raw)
To: stable; +Cc: mario.limonciello, Filip Hejsek, Alex Deucher
From: Mario Limonciello <mario.limonciello@amd.com>
If the securedisplay TA failed to load the first time, it's unlikely
to work again after a suspend/resume cycle or reset cycle and it appears
to be causing problems in further attempts.
Fixes: e42dfa66d592 ("drm/amdgpu: Add secure display TA load for Renoir")
Reported-by: Filip Hejsek <filip.hejsek@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2633
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5c6d52ff4b61e5267b25be714eb5a9ba2a338199)
Cc: stable@vger.kernel.org
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a150b7a4b4aa..e4757a2807d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1947,6 +1947,8 @@ static int psp_securedisplay_initialize(struct psp_context *psp)
psp_securedisplay_parse_resp_status(psp, securedisplay_cmd->status);
dev_err(psp->adev->dev, "SECUREDISPLAY: query securedisplay TA failed. ret 0x%x\n",
securedisplay_cmd->securedisplay_out_message.query_ta.query_cmd_ret);
+ /* don't try again */
+ psp->securedisplay_context.context.bin_desc.size_bytes = 0;
}
return 0;
--
2.41.0
* Re: [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
` (7 preceding siblings ...)
2023-07-07 15:07 ` [PATCH 9/9] drm/amd: Don't try to enable secure display TA multiple times Alex Deucher
@ 2023-07-11 21:40 ` Mario Limonciello
2023-07-12 5:12 ` Greg Kroah-Hartman
2023-07-16 19:16 ` Greg KH
9 siblings, 1 reply; 14+ messages in thread
From: Mario Limonciello @ 2023-07-11 21:40 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Christian König, Guchun Chen, Mikhail Gavrilov, Alex Deucher,
stable@vger.kernel.org
On 7/7/23 10:07, Alex Deucher wrote:
> From: Christian König <christian.koenig@amd.com>
>
> We need to grab the lock of the BO or otherwise can run into a crash
> when we try to inspect the current location.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> Acked-by: Guchun Chen <guchun.chen@amd.com>
> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> (cherry picked from commit e2ad8e2df432498b1cee2af04df605723f4d75e6)
> Cc: stable@vger.kernel.org # 6.3.x
> ---
Greg,
Just want to make sure you saw these 9 commits as you're processing
queues since they don't stand out as being sent directly to stable.
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 69 +++++++++++++++-----------
> 1 file changed, 39 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 5b3a70becbdf..a252a206f37b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -920,42 +920,51 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> return r;
> }
>
> +static void amdgpu_vm_bo_get_memory(struct amdgpu_bo_va *bo_va,
> + struct amdgpu_mem_stats *stats)
> +{
> + struct amdgpu_vm *vm = bo_va->base.vm;
> + struct amdgpu_bo *bo = bo_va->base.bo;
> +
> + if (!bo)
> + return;
> +
> + /*
> + * For now ignore BOs which are currently locked and potentially
> + * changing their location.
> + */
> + if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv &&
> + !dma_resv_trylock(bo->tbo.base.resv))
> + return;
> +
> + amdgpu_bo_get_memory(bo, stats);
> + if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
> + dma_resv_unlock(bo->tbo.base.resv);
> +}
> +
> void amdgpu_vm_get_memory(struct amdgpu_vm *vm,
> struct amdgpu_mem_stats *stats)
> {
> struct amdgpu_bo_va *bo_va, *tmp;
>
> spin_lock(&vm->status_lock);
> - list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> + list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> spin_unlock(&vm->status_lock);
> }
>
* Re: [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory
2023-07-11 21:40 ` [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Mario Limonciello
@ 2023-07-12 5:12 ` Greg Kroah-Hartman
0 siblings, 0 replies; 14+ messages in thread
From: Greg Kroah-Hartman @ 2023-07-12 5:12 UTC (permalink / raw)
To: Mario Limonciello
Cc: Christian König, Guchun Chen, Mikhail Gavrilov, Alex Deucher,
stable@vger.kernel.org
On Tue, Jul 11, 2023 at 04:40:44PM -0500, Mario Limonciello wrote:
> On 7/7/23 10:07, Alex Deucher wrote:
> > From: Christian König <christian.koenig@amd.com>
> >
> > We need to grab the lock of the BO or otherwise can run into a crash
> > when we try to inspect the current location.
> >
> > Signed-off-by: Christian König <christian.koenig@amd.com>
> > Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> > Acked-by: Guchun Chen <guchun.chen@amd.com>
> > Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > (cherry picked from commit e2ad8e2df432498b1cee2af04df605723f4d75e6)
> > Cc: stable@vger.kernel.org # 6.3.x
> > ---
>
> Greg,
>
> Just want to make sure you saw these 9 commits as you're processing queues
> since they don't stand out as being sent directly to stable.
Thanks for the pointer, no, I had missed them in the flood of stable
patches recently. I have many hundreds of other patches to still get
to, and these are now in that review queue as well.
greg k-h
* Re: [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
` (8 preceding siblings ...)
2023-07-11 21:40 ` [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Mario Limonciello
@ 2023-07-16 19:16 ` Greg KH
2023-07-16 19:22 ` Mario Limonciello
9 siblings, 1 reply; 14+ messages in thread
From: Greg KH @ 2023-07-16 19:16 UTC (permalink / raw)
To: Alex Deucher
Cc: stable, mario.limonciello, Christian König, Guchun Chen,
Mikhail Gavrilov
On Fri, Jul 07, 2023 at 11:07:26AM -0400, Alex Deucher wrote:
> From: Christian König <christian.koenig@amd.com>
>
> We need to grab the lock of the BO or otherwise can run into a crash
> when we try to inspect the current location.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> Acked-by: Guchun Chen <guchun.chen@amd.com>
> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> (cherry picked from commit e2ad8e2df432498b1cee2af04df605723f4d75e6)
> Cc: stable@vger.kernel.org # 6.3.x
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 69 +++++++++++++++-----------
> 1 file changed, 39 insertions(+), 30 deletions(-)
I've applied the first 7 patches here to 6.4.y, which I am guessing is
where you want them applied to, yet you didn't really say?
The last 2 did not apply :(
And some of these should go into 6.1.y also? Please send a patch series
and give me a hint as to where they should be applied to next time so I
don't have to guess...
thanks,
greg k-h
* Re: [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory
2023-07-16 19:16 ` Greg KH
@ 2023-07-16 19:22 ` Mario Limonciello
2023-07-16 19:28 ` Greg KH
0 siblings, 1 reply; 14+ messages in thread
From: Mario Limonciello @ 2023-07-16 19:22 UTC (permalink / raw)
To: Greg KH, Alex Deucher
Cc: stable, Christian König, Guchun Chen, Mikhail Gavrilov
On 7/16/23 14:16, Greg KH wrote:
> On Fri, Jul 07, 2023 at 11:07:26AM -0400, Alex Deucher wrote:
>> From: Christian König <christian.koenig@amd.com>
>>
>> We need to grab the lock of the BO or otherwise can run into a crash
>> when we try to inspect the current location.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>> Acked-by: Guchun Chen <guchun.chen@amd.com>
>> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> (cherry picked from commit e2ad8e2df432498b1cee2af04df605723f4d75e6)
>> Cc: stable@vger.kernel.org # 6.3.x
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 69 +++++++++++++++-----------
>> 1 file changed, 39 insertions(+), 30 deletions(-)
>
> I've applied the first 7 patches here to 6.4.y, which I am guessing is
> where you want them applied to, yet you didn't really say?
>
> The last 2 did not apply :(
>
> And some of these should go into 6.1.y also? Please send a patch series
> and give me a hint as to where they should be applied to next time so I
> don't have to guess...
>
> thanks,
>
> greg k-h
In this case the individual patches with specific requirements have:
Cc: stable@vger.kernel.org # version
They were sent before 6.3 went EOL, so here is the updated summary from
them:
6.4.y:
1, 2, 3, 4, 5, 6, 7, 8, 9
6.1.y:
3, 4, 5, 6, 7, 8, 9
3 is particularly important for 6.1.y as we have active regressions
reported related to it on 6.1.y.
So can you please take 3-7 to 6.1.y and I'll look more closely at what
is wrong with 8 and 9 on 6.1.y and 6.4.y and resend them?
* Re: [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory
2023-07-16 19:22 ` Mario Limonciello
@ 2023-07-16 19:28 ` Greg KH
0 siblings, 0 replies; 14+ messages in thread
From: Greg KH @ 2023-07-16 19:28 UTC (permalink / raw)
To: Mario Limonciello
Cc: Alex Deucher, stable, Christian König, Guchun Chen,
Mikhail Gavrilov
On Sun, Jul 16, 2023 at 02:22:36PM -0500, Mario Limonciello wrote:
> On 7/16/23 14:16, Greg KH wrote:
> > On Fri, Jul 07, 2023 at 11:07:26AM -0400, Alex Deucher wrote:
> > > From: Christian König <christian.koenig@amd.com>
> > >
> > > We need to grab the lock of the BO or otherwise can run into a crash
> > > when we try to inspect the current location.
> > >
> > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> > > Acked-by: Guchun Chen <guchun.chen@amd.com>
> > > Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > > (cherry picked from commit e2ad8e2df432498b1cee2af04df605723f4d75e6)
> > > Cc: stable@vger.kernel.org # 6.3.x
> > > ---
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 69 +++++++++++++++-----------
> > > 1 file changed, 39 insertions(+), 30 deletions(-)
> >
> > I've applied the first 7 patches here to 6.4.y, which I am guessing is
> > where you want them applied to, yet you didn't really say?
> >
> > The last 2 did not apply :(
> >
> > And some of these should go into 6.1.y also? Please send a patch series
> > and give me a hint as to where they should be applied to next time so I
> > don't have to guess...
> >
> > thanks,
> >
> > greg k-h
>
> In this case the individual patches with specific requirements have:
>
> Cc: stable@vger.kernel.org # version
>
> They were sent before 6.3 went EOL, so here is the updated summary from
> them:
> 6.4.y:
> 1, 2, 3, 4, 5, 6, 7, 8, 9
>
> 6.1.y:
> 3, 4, 5, 6, 7, 8, 9
>
> 3 is particularly important for 6.1.y as we have active regressions reported
> related to it on 6.1.y.
>
> So can you please take 3-7 to 6.1.y and I'll look more closely at what is
> wrong with 8 and 9 on 6.1.y and 6.4.y and resend them?
I can't really pick out these for 6.1 from the larger series as I'm
drowning in patches at the moment. Please send a backported series and
I'll be glad to queue that up.
thanks,
greg k-h
Thread overview: 14+ messages
2023-07-07 15:07 [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Alex Deucher
2023-07-07 15:07 ` [PATCH 2/9] drm/amdgpu: make sure that BOs have a backing store Alex Deucher
2023-07-07 15:07 ` [PATCH 3/9] drm/amdgpu: Skip mark offset for high priority rings Alex Deucher
2023-07-07 15:07 ` [PATCH 4/9] drm/amd/pm: revise the ASPM settings for thunderbolt attached scenario Alex Deucher
2023-07-07 15:07 ` [PATCH 5/9] drm/amdgpu/sdma4: set align mask to 255 Alex Deucher
2023-07-07 15:07 ` [PATCH 6/9] drm/amd/pm: add abnormal fan detection for smu 13.0.0 Alex Deucher
2023-07-07 15:07 ` [PATCH 7/9] drm/amdgpu: check RAS irq existence for VCN/JPEG Alex Deucher
2023-07-07 15:07 ` [PATCH 8/9] drm/amdgpu: fix number of fence calculations Alex Deucher
2023-07-07 15:07 ` [PATCH 9/9] drm/amd: Don't try to enable secure display TA multiple times Alex Deucher
2023-07-11 21:40 ` [PATCH 1/9] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory Mario Limonciello
2023-07-12 5:12 ` Greg Kroah-Hartman
2023-07-16 19:16 ` Greg KH
2023-07-16 19:22 ` Mario Limonciello
2023-07-16 19:28 ` Greg KH