* [PATCH 00/73] MES support
@ 2022-04-29 17:45 Alex Deucher
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher
This patch set enables support for the MES (Micro Engine Scheduler). This
is similar to the HWS (HardWare Scheduler) on the MEC currently used
for KFD. It's a scheduling microcontroller used to schedule
engine queues to hardware slots. This series adds the basic infrastructure
to enable MES in amdgpu.
Jack Xiao (70):
drm/amdgpu: define MQD abstract layer for hw ip
drm/amdgpu: add helper function to initialize mqd from ring v4
drm/amdgpu: add the per-context meta data v3
drm/amdgpu: add mes ctx data in amdgpu_ring
drm/amdgpu: define ring structure to access rptr/wptr/fence
drm/amdgpu: use ring structure to access rptr/wptr v2
drm/amdgpu: initialize/finalize the ring for mes queue
drm/amdgpu: assign the cpu/gpu address of fence from ring
drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2
drm/amdgpu/gfx10: use per ctx CSA for ce metadata
drm/amdgpu/gfx10: use per ctx CSA for de metadata
drm/amdgpu/gfx10: associate mes queue id with fence v2
drm/amdgpu/gfx10: inherit vmid from mqd
drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2
drm/amdgpu/gmc10: skip emitting pasid mapping packet
drm/amdgpu: use the whole doorbell space for mes
drm/amdgpu: update mes process/gang/queue definitions
drm/amdgpu: add mes_kiq module parameter v2
drm/amdgpu: allocate doorbell index for mes kiq
drm/amdgpu/mes: extend mes framework to support multiple mes pipes
drm/amdgpu/gfx10: add mes queue fence handling
drm/amdgpu/gfx10: add mes support for gfx ib test
drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled
drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue
drm/amdgpu/sdma5.2: initialize sdma mqd
drm/amdgpu/sdma5.2: associate mes queue id with fence
drm/amdgpu/sdma5.2: add mes queue fence handling
drm/amdgpu/sdma5.2: add mes support for sdma ring test
drm/amdgpu/sdma5.2: add mes support for sdma ib test
drm/amdgpu/sdma5: initialize sdma mqd
drm/amdgpu/sdma5: associate mes queue id with fence
drm/amdgpu/sdma5: add mes queue fence handling
drm/amdgpu/sdma5: add mes support for sdma ring test
drm/amdgpu/sdma5: add mes support for sdma ib test
drm/amdgpu/mes: add mes kiq callback
drm/amdgpu: add mes kiq frontdoor loading support
drm/amdgpu: enable mes kiq N-1 test on sienna cichlid
drm/amdgpu/mes: manage mes doorbell allocation
drm/amdgpu: add mes queue id mask v2
drm/amdgpu/mes: initialize/finalize common mes structure v2
drm/amdgpu/mes: relocate status_fence slot allocation
drm/amdgpu/mes10.1: call general mes initialization
drm/amdgpu/mes10.1: add delay after mes engine enable
drm/amdgpu/mes10.1: implement the suspend/resume routine
drm/amdgpu/mes: implement creating mes process v2
drm/amdgpu/mes: implement destroying mes process
drm/amdgpu/mes: implement adding mes gang
drm/amdgpu/mes: implement removing mes gang
drm/amdgpu/mes: implement suspending all gangs
drm/amdgpu/mes: implement resuming all gangs
drm/amdgpu/mes: initialize mqd from queue properties
drm/amdgpu/mes: implement adding mes queue
drm/amdgpu/mes: implement removing mes queue
drm/amdgpu/mes: add helper function to convert ring to queue property
drm/amdgpu/mes: add helper function to get the ctx meta data offset
drm/amdgpu/mes: use ring for kernel queue submission
drm/amdgpu/mes: implement removing mes ring
drm/amdgpu/mes: add helper functions to alloc/free ctx metadata
drm/amdgpu: skip kfd routines when mes enabled
drm/amdgpu: skip some checking for mes queue ib submission
drm/amdgpu: skip kiq ib tests if mes enabled
drm/amdgpu: skip gds switch for mes queue
drm/amdgpu: kiq takes charge of all queues
drm/amdgpu/mes: map ctx metadata for mes self test
drm/amdgpu/mes: create gang and queues for mes self test
drm/amdgpu/mes: add ring/ib test for mes self test
drm/amdgpu/mes: implement mes self test
drm/amdgpu/mes10.1: add mes self test in late init
drm/amdgpu/mes: fix vm csa update issue
drm/amdgpu/mes: disable mes sdma queue test
Likun Gao (1):
drm/amdgpu: add mes kiq PSP GFX FW type
Mukul Joshi (2):
drm/amdgpu: Enable KFD with MES enabled
drm/amdgpu/mes: Update the doorbell function signatures
drivers/gpu/drm/amd/amdgpu/Makefile | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 24 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 42 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h | 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 1138 ++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 168 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h | 121 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 193 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 22 +
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 24 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 10 +
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 +-
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 8 +-
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 383 +++---
drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c | 8 +-
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 16 +-
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 20 +-
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 25 +-
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 6 +-
drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 4 +-
drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 4 +-
drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c | 4 +-
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 461 ++++---
drivers/gpu/drm/amd/amdgpu/nv.c | 3 +-
drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 8 +-
drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 16 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 28 +-
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 169 ++-
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 171 ++-
drivers/gpu/drm/amd/amdgpu/si_dma.c | 4 +-
drivers/gpu/drm/amd/amdgpu/soc21.c | 3 +-
drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 6 +-
drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 6 +-
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 12 +-
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 12 +-
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 12 +-
41 files changed, 2619 insertions(+), 553 deletions(-)
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
--
2.35.1
* [PATCH 01/73] drm/amdgpu: define MQD abstract layer for hw ip
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Define an MQD abstraction layer for each hw ip, so that the mqd
configuration can be passed in not only from a ring but also from
other sources, such as user queues.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 1a598e3247ca..2eed9479e854 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -720,6 +720,26 @@ struct ip_discovery_top;
(rid == 0x01) || \
(rid == 0x10))))
+struct amdgpu_mqd_prop {
+ uint64_t mqd_gpu_addr;
+ uint64_t hqd_base_gpu_addr;
+ uint64_t rptr_gpu_addr;
+ uint64_t wptr_gpu_addr;
+ uint32_t queue_size;
+ bool use_doorbell;
+ uint32_t doorbell_index;
+ uint64_t eop_gpu_addr;
+ uint32_t hqd_pipe_priority;
+ uint32_t hqd_queue_priority;
+ bool hqd_active;
+};
+
+struct amdgpu_mqd {
+ unsigned mqd_size;
+ int (*init_mqd)(struct amdgpu_device *adev, void *mqd,
+ struct amdgpu_mqd_prop *p);
+};
+
#define AMDGPU_RESET_MAGIC_NUM 64
#define AMDGPU_MAX_DF_PERFMONS 4
#define AMDGPU_PRODUCT_NAME_LEN 64
@@ -920,6 +940,7 @@ struct amdgpu_device {
/* mes */
bool enable_mes;
struct amdgpu_mes mes;
+ struct amdgpu_mqd mqds[AMDGPU_HW_IP_NUM];
/* df */
struct amdgpu_df df;
--
2.35.1
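The shape of this abstraction can be sketched outside the kernel as a per-IP ops table. The names below mirror the patch, but everything here is a simplified, hypothetical stand-in, not the real amdgpu API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for amdgpu_mqd_prop / amdgpu_mqd; illustrative only. */
enum { HW_IP_GFX, HW_IP_COMPUTE, HW_IP_DMA, HW_IP_NUM };

struct mqd_prop {
	uint64_t hqd_base_gpu_addr;
	uint64_t rptr_gpu_addr;
	uint64_t wptr_gpu_addr;
	uint32_t queue_size;
	bool use_doorbell;
};

struct mqd_ops {
	unsigned mqd_size;
	int (*init_mqd)(void *mqd, const struct mqd_prop *p);
};

/* A hypothetical gfx backend: writes the queue addresses into the mqd. */
static int gfx_init_mqd(void *mqd, const struct mqd_prop *p)
{
	uint64_t *m = mqd;

	m[0] = p->hqd_base_gpu_addr;	/* ring base */
	m[1] = p->rptr_gpu_addr;	/* rptr report address */
	m[2] = p->wptr_gpu_addr;	/* wptr poll address */
	return 0;
}

static const struct mqd_ops mqds[HW_IP_NUM] = {
	[HW_IP_GFX] = { .mqd_size = 3 * sizeof(uint64_t),
			.init_mqd = gfx_init_mqd },
};

/* Any source of queue state (a kernel ring or a user queue) fills a
 * mqd_prop and hands it to the backend registered for its hw ip. */
int init_queue_mqd(int hw_ip, void *mqd, const struct mqd_prop *p)
{
	if (!mqds[hw_ip].init_mqd)
		return -1;
	return mqds[hw_ip].init_mqd(mqd, p);
}

/* Self-check: a gfx mqd picks up the addresses from the prop. */
int mqd_demo(void)
{
	uint64_t mqd[3] = { 0 };
	struct mqd_prop p = { .hqd_base_gpu_addr = 0x1000,
			      .rptr_gpu_addr = 0x2000,
			      .wptr_gpu_addr = 0x3000 };

	if (init_queue_mqd(HW_IP_GFX, mqd, &p))
		return 1;
	return !(mqd[0] == 0x1000 && mqd[1] == 0x2000 && mqd[2] == 0x3000);
}
```

The point of the indirection is that the prop structure, not the ring, is the interface: later patches in the series fill it from mes queue properties as well.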
* [PATCH 02/73] drm/amdgpu: add helper function to initialize mqd from ring v4
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add the helper function to initialize mqd from ring configuration.
v2: use if/else pair instead of ?/: pair
v3: use simpler way to judge hqd_active
v4: fix parameters to amdgpu_gfx_is_high_priority_compute_queue
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 48 ++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +
2 files changed, 50 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 7f33ae87cb41..773954318216 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -458,3 +458,51 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
ring->sched.ready = !r;
return r;
}
+
+static void amdgpu_ring_to_mqd_prop(struct amdgpu_ring *ring,
+ struct amdgpu_mqd_prop *prop)
+{
+ struct amdgpu_device *adev = ring->adev;
+
+ memset(prop, 0, sizeof(*prop));
+
+ prop->mqd_gpu_addr = ring->mqd_gpu_addr;
+ prop->hqd_base_gpu_addr = ring->gpu_addr;
+ prop->rptr_gpu_addr = ring->rptr_gpu_addr;
+ prop->wptr_gpu_addr = ring->wptr_gpu_addr;
+ prop->queue_size = ring->ring_size;
+ prop->eop_gpu_addr = ring->eop_gpu_addr;
+ prop->use_doorbell = ring->use_doorbell;
+ prop->doorbell_index = ring->doorbell_index;
+
+ /* map_queues packet doesn't need activate the queue,
+ * so only kiq need set this field.
+ */
+ prop->hqd_active = ring->funcs->type == AMDGPU_RING_TYPE_KIQ;
+
+ if (ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE) {
+ if (amdgpu_gfx_is_high_priority_compute_queue(adev, ring)) {
+ prop->hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_HIGH;
+ prop->hqd_queue_priority =
+ AMDGPU_GFX_QUEUE_PRIORITY_MAXIMUM;
+ }
+ }
+}
+
+int amdgpu_ring_init_mqd(struct amdgpu_ring *ring)
+{
+ struct amdgpu_device *adev = ring->adev;
+ struct amdgpu_mqd *mqd_mgr;
+ struct amdgpu_mqd_prop prop;
+
+ amdgpu_ring_to_mqd_prop(ring, &prop);
+
+ ring->wptr = 0;
+
+ if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+ mqd_mgr = &adev->mqds[AMDGPU_HW_IP_COMPUTE];
+ else
+ mqd_mgr = &adev->mqds[ring->funcs->type];
+
+ return mqd_mgr->init_mqd(adev, ring->mqd_ptr, &prop);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 317d80209e95..20dfe5a19a81 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -369,6 +369,8 @@ int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
void amdgpu_debugfs_ring_init(struct amdgpu_device *adev,
struct amdgpu_ring *ring);
+int amdgpu_ring_init_mqd(struct amdgpu_ring *ring);
+
static inline u32 amdgpu_ib_get_value(struct amdgpu_ib *ib, int idx)
{
return ib->ptr[idx];
--
2.35.1
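Two details of the helper above are worth isolating: the KIQ has no mqd backend of its own (it is a compute queue, so `amdgpu_ring_init_mqd()` borrows the compute slot), and only the KIQ is marked `hqd_active` at init time, since the map_queues packet activates every other queue. A minimal sketch of that selection logic, with illustrative enum values rather than the real amdgpu ones:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative ring-type and hw-ip values; the real enums live in the
 * amdgpu headers. This only mirrors the selection logic in
 * amdgpu_ring_init_mqd()/amdgpu_ring_to_mqd_prop(). */
enum ring_type { RING_TYPE_GFX, RING_TYPE_COMPUTE, RING_TYPE_SDMA,
		 RING_TYPE_KIQ };
enum hw_ip { HW_IP_GFX, HW_IP_COMPUTE, HW_IP_DMA };

/* The KIQ reuses the compute MQD backend; other ring types map straight
 * through (this sketch assumes the two enums line up for them, as the
 * patch's indexing of adev->mqds[] does). */
int mqd_backend_for(enum ring_type t)
{
	if (t == RING_TYPE_KIQ)
		return HW_IP_COMPUTE;
	return (int)t;
}

/* map_queues activates the queue itself, so only the KIQ needs its mqd
 * marked active when it is initialized. */
bool hqd_active_at_init(enum ring_type t)
{
	return t == RING_TYPE_KIQ;
}
```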
* [PATCH 03/73] drm/amdgpu: add the per-context meta data v3
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
The per-context meta data is a data structure associated with a
mes-managed hardware ring, which includes the MCBP CSA, the ring
buffer, etc.
v2: fix typo
v3: a. use structure instead of typedef
b. move amdgpu_mes_ctx_get_offs_* to amdgpu_ring.h
c. use __aligned for alignment
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h | 118 ++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 ++
3 files changed, 128 insertions(+)
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 2eed9479e854..24bce7e691a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -91,6 +91,7 @@
#include "amdgpu_dm.h"
#include "amdgpu_virt.h"
#include "amdgpu_csa.h"
+#include "amdgpu_mes_ctx.h"
#include "amdgpu_gart.h"
#include "amdgpu_debugfs.h"
#include "amdgpu_job.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
new file mode 100644
index 000000000000..f3e1ba1a889f
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
@@ -0,0 +1,118 @@
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_MES_CTX_H__
+#define __AMDGPU_MES_CTX_H__
+
+#include "v10_structs.h"
+
+enum {
+ AMDGPU_MES_CTX_RPTR_OFFS = 0,
+ AMDGPU_MES_CTX_WPTR_OFFS,
+ AMDGPU_MES_CTX_FENCE_OFFS,
+ AMDGPU_MES_CTX_COND_EXE_OFFS,
+ AMDGPU_MES_CTX_TRAIL_FENCE_OFFS,
+ AMDGPU_MES_CTX_MAX_OFFS,
+};
+
+enum {
+ AMDGPU_MES_CTX_RING_OFFS = AMDGPU_MES_CTX_MAX_OFFS,
+ AMDGPU_MES_CTX_IB_OFFS,
+ AMDGPU_MES_CTX_PADDING_OFFS,
+};
+
+#define AMDGPU_MES_CTX_MAX_GFX_RINGS 1
+#define AMDGPU_MES_CTX_MAX_COMPUTE_RINGS 4
+#define AMDGPU_MES_CTX_MAX_SDMA_RINGS 2
+#define AMDGPU_MES_CTX_MAX_RINGS \
+ (AMDGPU_MES_CTX_MAX_GFX_RINGS + \
+ AMDGPU_MES_CTX_MAX_COMPUTE_RINGS + \
+ AMDGPU_MES_CTX_MAX_SDMA_RINGS)
+
+#define AMDGPU_CSA_SDMA_SIZE 64
+#define GFX10_MEC_HPD_SIZE 2048
+
+struct amdgpu_wb_slot {
+ uint32_t data[8];
+};
+
+struct amdgpu_mes_ctx_meta_data {
+ struct {
+ uint8_t ring[PAGE_SIZE * 4];
+
+ /* gfx csa */
+ struct v10_gfx_meta_data gfx_meta_data;
+
+ uint8_t gds_backup[64 * 1024];
+
+ struct amdgpu_wb_slot slots[AMDGPU_MES_CTX_MAX_OFFS];
+
+ /* only for ib test */
+ uint32_t ib[256] __aligned(256);
+
+ uint32_t padding[64];
+
+ } __aligned(PAGE_SIZE) gfx[AMDGPU_MES_CTX_MAX_GFX_RINGS];
+
+ struct {
+ uint8_t ring[PAGE_SIZE * 4];
+
+ uint8_t mec_hpd[GFX10_MEC_HPD_SIZE];
+
+ struct amdgpu_wb_slot slots[AMDGPU_MES_CTX_MAX_OFFS];
+
+ /* only for ib test */
+ uint32_t ib[256] __aligned(256);
+
+ uint32_t padding[64];
+
+ } __aligned(PAGE_SIZE) compute[AMDGPU_MES_CTX_MAX_COMPUTE_RINGS];
+
+ struct {
+ uint8_t ring[PAGE_SIZE * 4];
+
+ /* sdma csa for mcbp */
+ uint8_t sdma_meta_data[AMDGPU_CSA_SDMA_SIZE];
+
+ struct amdgpu_wb_slot slots[AMDGPU_MES_CTX_MAX_OFFS];
+
+ /* only for ib test */
+ uint32_t ib[256] __aligned(256);
+
+ uint32_t padding[64];
+
+ } __aligned(PAGE_SIZE) sdma[AMDGPU_MES_CTX_MAX_SDMA_RINGS];
+};
+
+struct amdgpu_mes_ctx_data {
+ struct amdgpu_bo *meta_data_obj;
+ uint64_t meta_data_gpu_addr;
+ struct amdgpu_bo_va *meta_data_va;
+ void *meta_data_ptr;
+ uint32_t gang_ids[AMDGPU_HW_IP_DMA+1];
+};
+
+#define AMDGPU_FENCE_MES_QUEUE_FLAG 0x1000000u
+#define AMDGPU_FENCE_MES_QUEUE_ID_MASK (AMDGPU_FENCE_MES_QUEUE_FLAG - 1)
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 20dfe5a19a81..112c2b0ef0b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -364,6 +364,15 @@ static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
ring->count_dw -= count_dw;
}
+#define amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset) \
+ (ring->is_mes_queue && ring->mes_ctx ? \
+ (ring->mes_ctx->meta_data_gpu_addr + offset) : 0)
+
+#define amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset) \
+ (ring->is_mes_queue && ring->mes_ctx ? \
+ (void *)((uint8_t *)(ring->mes_ctx->meta_data_ptr) + offset) : \
+ NULL)
+
int amdgpu_ring_test_helper(struct amdgpu_ring *ring);
void amdgpu_debugfs_ring_init(struct amdgpu_device *adev,
--
2.35.1
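The flag/mask pair at the end of the new header encodes a mes hw queue id alongside a marker bit in a fence payload. The round trip can be checked in isolation; this is a plain user-space sketch using the same values as `AMDGPU_FENCE_MES_QUEUE_FLAG`/`_ID_MASK`, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Same values as AMDGPU_FENCE_MES_QUEUE_FLAG/_ID_MASK in the header. */
#define MES_QUEUE_FLAG		0x1000000u
#define MES_QUEUE_ID_MASK	(MES_QUEUE_FLAG - 1)

/* Tag a fence/interrupt payload with a mes queue id; ids must fit in
 * the low 24 bits, bit 24 marks the payload as mes-queue related. */
uint32_t mes_pack_queue_id(uint32_t qid)
{
	return MES_QUEUE_FLAG | (qid & MES_QUEUE_ID_MASK);
}

int mes_is_queue_fence(uint32_t v)
{
	return (v & MES_QUEUE_FLAG) != 0;
}

uint32_t mes_unpack_queue_id(uint32_t v)
{
	return v & MES_QUEUE_ID_MASK;
}
```

Later patches in the series ("associate mes queue id with fence") rely on exactly this kind of tagging to route fence interrupts back to the right mes queue.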
* [PATCH 04/73] drm/amdgpu: add mes ctx data in amdgpu_ring
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add the mes context data structure to amdgpu_ring.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 112c2b0ef0b1..317a66bcd258 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -267,6 +267,11 @@ struct amdgpu_ring {
int hw_prio;
unsigned num_hw_submission;
atomic_t *sched_score;
+
+ /* used for mes */
+ bool is_mes_queue;
+ uint32_t hw_queue_id;
+ struct amdgpu_mes_ctx_data *mes_ctx;
};
#define amdgpu_ring_parse_cs(r, p, job, ib) ((r)->funcs->parse_cs((p), (job), (ib)))
--
2.35.1
* [PATCH 05/73] drm/amdgpu: define ring structure to access rptr/wptr/fence
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add ring structure fields for the cpu/gpu addresses of rptr/wptr/fence,
so they can be accessed directly instead of being calculated dynamically.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 317a66bcd258..7d89a52091c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -230,6 +230,8 @@ struct amdgpu_ring {
struct amdgpu_bo *ring_obj;
volatile uint32_t *ring;
unsigned rptr_offs;
+ u64 rptr_gpu_addr;
+ volatile u32 *rptr_cpu_addr;
u64 wptr;
u64 wptr_old;
unsigned ring_size;
@@ -250,7 +252,11 @@ struct amdgpu_ring {
bool use_doorbell;
bool use_pollmem;
unsigned wptr_offs;
+ u64 wptr_gpu_addr;
+ volatile u32 *wptr_cpu_addr;
unsigned fence_offs;
+ u64 fence_gpu_addr;
+ volatile u32 *fence_cpu_addr;
uint64_t current_ctx;
char name[16];
u32 trail_seq;
--
2.35.1
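The idea behind these fields — resolve each writeback slot index into an address once at ring init and cache it, rather than recomputing `adev->wb.wb[offs]` / `adev->wb.gpu_addr + offs * 4` on every access — can be sketched as follows. The helper and struct names are hypothetical stand-ins, not the actual amdgpu_ring code:

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-ins for the writeback area and the new ring fields. */
struct wb_area {
	uint32_t *cpu;		/* kernel mapping of the wb buffer */
	uint64_t gpu;		/* gpu address of the same buffer */
};

struct ring_addrs {
	uint64_t rptr_gpu_addr;
	volatile uint32_t *rptr_cpu_addr;
	uint64_t wptr_gpu_addr;
	volatile uint32_t *wptr_cpu_addr;
};

/* Resolve wb slot offsets (in dwords) into cached addresses, so hot
 * paths simply read *ring->rptr_cpu_addr. */
void ring_cache_addrs(struct ring_addrs *r, const struct wb_area *wb,
		      unsigned rptr_offs, unsigned wptr_offs)
{
	r->rptr_cpu_addr = &wb->cpu[rptr_offs];
	r->rptr_gpu_addr = wb->gpu + (uint64_t)rptr_offs * 4;
	r->wptr_cpu_addr = &wb->cpu[wptr_offs];
	r->wptr_gpu_addr = wb->gpu + (uint64_t)wptr_offs * 4;
}

/* Self-check: the cached cpu pointer observes writeback updates and
 * the gpu addresses track the slot offsets. */
int ring_addrs_demo(void)
{
	uint32_t slots[8] = { 0 };
	struct wb_area wb = { .cpu = slots, .gpu = 0x100000 };
	struct ring_addrs r;

	ring_cache_addrs(&r, &wb, 2, 5);
	slots[2] = 0xdead;	/* hardware writes the rptr slot */
	if (*r.rptr_cpu_addr != 0xdead)
		return 1;
	return !(r.rptr_gpu_addr == 0x100000 + 2 * 4 &&
		 r.wptr_gpu_addr == 0x100000 + 5 * 4);
}
```

This also explains why the follow-on patch can drop locals like `wb_offset` from the IP-specific resume functions: the address arithmetic happens once, in common code.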
* [PATCH 06/73] drm/amdgpu: use ring structure to access rptr/wptr v2
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Use the ring structure fields to access the cpu/gpu addresses of rptr/wptr.
v2: merge gfx10/sdma5/sdma5.2 patches
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 8 +++---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 37 +++++++++++++-------------
drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c | 8 +++---
drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 16 +++++------
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 20 +++++++-------
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 25 +++++++++--------
drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 4 +--
drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 4 +--
drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c | 4 +--
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 11 ++++----
drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 8 +++---
drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 16 +++++------
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 28 ++++++++-----------
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 16 +++++------
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 16 +++++------
drivers/gpu/drm/amd/amdgpu/si_dma.c | 4 +--
drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 6 ++---
drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 6 ++---
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 12 ++++-----
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 12 ++++-----
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 12 ++++-----
21 files changed, 128 insertions(+), 145 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index 6c01199e9112..5647f13b98d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -164,7 +164,7 @@ static uint64_t cik_sdma_ring_get_rptr(struct amdgpu_ring *ring)
{
u32 rptr;
- rptr = ring->adev->wb.wb[ring->rptr_offs];
+ rptr = *ring->rptr_cpu_addr;
return (rptr & 0x3fffc) >> 2;
}
@@ -436,12 +436,10 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
struct amdgpu_ring *ring;
u32 rb_cntl, ib_cntl;
u32 rb_bufsz;
- u32 wb_offset;
int i, j, r;
for (i = 0; i < adev->sdma.num_instances; i++) {
ring = &adev->sdma.instance[i].ring;
- wb_offset = (ring->rptr_offs * 4);
mutex_lock(&adev->srbm_mutex);
for (j = 0; j < 16; j++) {
@@ -477,9 +475,9 @@ static int cik_sdma_gfx_resume(struct amdgpu_device *adev)
/* set the wb address whether it's enabled or not */
WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_HI + sdma_offsets[i],
- upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+ upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_LO + sdma_offsets[i],
- ((adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC));
+ ((ring->rptr_gpu_addr) & 0xFFFFFFFC));
rb_cntl |= SDMA0_GFX_RB_CNTL__RPTR_WRITEBACK_ENABLE_MASK;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 407074f958f4..f2dd53f2af61 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3519,9 +3519,8 @@ static void gfx10_kiq_set_resources(struct amdgpu_ring *kiq_ring, uint64_t queue
static void gfx10_kiq_map_queues(struct amdgpu_ring *kiq_ring,
struct amdgpu_ring *ring)
{
- struct amdgpu_device *adev = kiq_ring->adev;
uint64_t mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj);
- uint64_t wptr_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ uint64_t wptr_addr = ring->wptr_gpu_addr;
uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
@@ -6344,12 +6343,12 @@ static int gfx_v10_0_cp_gfx_resume(struct amdgpu_device *adev)
WREG32_SOC15(GC, 0, mmCP_RB0_WPTR_HI, upper_32_bits(ring->wptr));
/* set the wb address wether it's enabled or not */
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ rptr_addr = ring->rptr_gpu_addr;
WREG32_SOC15(GC, 0, mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
WREG32_SOC15(GC, 0, mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) &
CP_RB_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK);
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO,
lower_32_bits(wptr_gpu_addr));
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI,
@@ -6382,11 +6381,11 @@ static int gfx_v10_0_cp_gfx_resume(struct amdgpu_device *adev)
WREG32_SOC15(GC, 0, mmCP_RB1_WPTR, lower_32_bits(ring->wptr));
WREG32_SOC15(GC, 0, mmCP_RB1_WPTR_HI, upper_32_bits(ring->wptr));
/* Set the wb address wether it's enabled or not */
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ rptr_addr = ring->rptr_gpu_addr;
WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR, lower_32_bits(rptr_addr));
WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR_HI, upper_32_bits(rptr_addr) &
CP_RB1_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK);
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO,
lower_32_bits(wptr_gpu_addr));
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI,
@@ -6610,13 +6609,13 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
mqd->cp_gfx_hqd_base_hi = upper_32_bits(hqd_gpu_addr);
/* set up hqd_rptr_addr/_hi, similar as CP_RB_RPTR */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ wb_gpu_addr = ring->rptr_gpu_addr;
mqd->cp_gfx_hqd_rptr_addr = wb_gpu_addr & 0xfffffffc;
mqd->cp_gfx_hqd_rptr_addr_hi =
upper_32_bits(wb_gpu_addr) & 0xffff;
/* set up rb_wptr_poll addr */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wb_gpu_addr = ring->wptr_gpu_addr;
mqd->cp_rb_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_rb_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
@@ -6730,7 +6729,7 @@ static int gfx_v10_0_gfx_init_queue(struct amdgpu_ring *ring)
memcpy(mqd, adev->gfx.me.mqd_backup[mqd_idx], sizeof(*mqd));
/* reset the ring */
ring->wptr = 0;
- adev->wb.wb[ring->wptr_offs] = 0;
+ *ring->wptr_cpu_addr = 0;
amdgpu_ring_clear_ring(ring);
#ifdef BRING_UP_DEBUG
mutex_lock(&adev->srbm_mutex);
@@ -6904,13 +6903,13 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
mqd->cp_hqd_pq_control = tmp;
/* set the wb address whether it's enabled or not */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ wb_gpu_addr = ring->rptr_gpu_addr;
mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_rptr_report_addr_hi =
upper_32_bits(wb_gpu_addr) & 0xffff;
/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wb_gpu_addr = ring->wptr_gpu_addr;
mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
@@ -7130,7 +7129,7 @@ static int gfx_v10_0_kcq_init_queue(struct amdgpu_ring *ring)
/* reset ring buffer */
ring->wptr = 0;
- atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], 0);
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr, 0);
amdgpu_ring_clear_ring(ring);
} else {
amdgpu_ring_clear_ring(ring);
@@ -8496,7 +8495,8 @@ static void gfx_v10_0_get_clockgating_state(void *handle, u64 *flags)
static u64 gfx_v10_0_ring_get_rptr_gfx(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs]; /* gfx10 is 32bit rptr*/
+ /* gfx10 is 32bit rptr*/
+ return *(uint32_t *)ring->rptr_cpu_addr;
}
static u64 gfx_v10_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
@@ -8506,7 +8506,7 @@ static u64 gfx_v10_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
/* XXX check if swapping is necessary on BE */
if (ring->use_doorbell) {
- wptr = atomic64_read((atomic64_t *)&adev->wb.wb[ring->wptr_offs]);
+ wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
} else {
wptr = RREG32_SOC15(GC, 0, mmCP_RB0_WPTR);
wptr += (u64)RREG32_SOC15(GC, 0, mmCP_RB0_WPTR_HI) << 32;
@@ -8521,7 +8521,7 @@ static void gfx_v10_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], ring->wptr);
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
WDOORBELL64(ring->doorbell_index, ring->wptr);
} else {
WREG32_SOC15(GC, 0, mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
@@ -8531,7 +8531,8 @@ static void gfx_v10_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
static u64 gfx_v10_0_ring_get_rptr_compute(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs]; /* gfx10 hardware is 32bit rptr */
+ /* gfx10 hardware is 32bit rptr */
+ return *(uint32_t *)ring->rptr_cpu_addr;
}
static u64 gfx_v10_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
@@ -8540,7 +8541,7 @@ static u64 gfx_v10_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
/* XXX check if swapping is necessary on BE */
if (ring->use_doorbell)
- wptr = atomic64_read((atomic64_t *)&ring->adev->wb.wb[ring->wptr_offs]);
+ wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
else
BUG();
return wptr;
@@ -8552,7 +8553,7 @@ static void gfx_v10_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
/* XXX check if swapping is necessary on BE */
if (ring->use_doorbell) {
- atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], ring->wptr);
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
WDOORBELL64(ring->doorbell_index, ring->wptr);
} else {
BUG(); /* only DOORBELL method supported on gfx10 now */
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
index 6a8dadea40f9..29a91b320d4f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
@@ -2117,7 +2117,7 @@ static int gfx_v6_0_cp_gfx_resume(struct amdgpu_device *adev)
WREG32(mmCP_RB0_WPTR, ring->wptr);
/* set the wb address whether it's enabled or not */
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ rptr_addr = ring->rptr_gpu_addr;
WREG32(mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
WREG32(mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
@@ -2139,7 +2139,7 @@ static int gfx_v6_0_cp_gfx_resume(struct amdgpu_device *adev)
static u64 gfx_v6_0_ring_get_rptr(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs];
+ return *ring->rptr_cpu_addr;
}
static u64 gfx_v6_0_ring_get_wptr(struct amdgpu_ring *ring)
@@ -2203,7 +2203,7 @@ static int gfx_v6_0_cp_compute_resume(struct amdgpu_device *adev)
ring->wptr = 0;
WREG32(mmCP_RB1_WPTR, ring->wptr);
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ rptr_addr = ring->rptr_gpu_addr;
WREG32(mmCP_RB1_RPTR_ADDR, lower_32_bits(rptr_addr));
WREG32(mmCP_RB1_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
@@ -2222,7 +2222,7 @@ static int gfx_v6_0_cp_compute_resume(struct amdgpu_device *adev)
WREG32(mmCP_RB2_CNTL, tmp | CP_RB2_CNTL__RB_RPTR_WR_ENA_MASK);
ring->wptr = 0;
WREG32(mmCP_RB2_WPTR, ring->wptr);
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ rptr_addr = ring->rptr_gpu_addr;
WREG32(mmCP_RB2_RPTR_ADDR, lower_32_bits(rptr_addr));
WREG32(mmCP_RB2_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index d17a6f399347..ac3f2dbba726 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -2630,8 +2630,8 @@ static int gfx_v7_0_cp_gfx_resume(struct amdgpu_device *adev)
ring->wptr = 0;
WREG32(mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
- /* set the wb address whether it's enabled or not */
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ /* set the wb address whether it's enabled or not */
+ rptr_addr = ring->rptr_gpu_addr;
WREG32(mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
WREG32(mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
@@ -2656,7 +2656,7 @@ static int gfx_v7_0_cp_gfx_resume(struct amdgpu_device *adev)
static u64 gfx_v7_0_ring_get_rptr(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs];
+ return *ring->rptr_cpu_addr;
}
static u64 gfx_v7_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
@@ -2677,7 +2677,7 @@ static void gfx_v7_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
static u64 gfx_v7_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
{
/* XXX check if swapping is necessary on BE */
- return ring->adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
}
static void gfx_v7_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
@@ -2685,7 +2685,7 @@ static void gfx_v7_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
/* XXX check if swapping is necessary on BE */
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
}
@@ -2981,12 +2981,12 @@ static void gfx_v7_0_mqd_init(struct amdgpu_device *adev,
CP_HQD_PQ_CONTROL__KMD_QUEUE_MASK; /* assuming kernel queue control */
/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wb_gpu_addr = ring->wptr_gpu_addr;
mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
- /* set the wb address whether it's enabled or not */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ /* set the wb address whether it's enabled or not */
+ wb_gpu_addr = ring->rptr_gpu_addr;
mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_rptr_report_addr_hi =
upper_32_bits(wb_gpu_addr) & 0xffff;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 25dc729d0ec2..e4e779a19c20 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -4306,11 +4306,11 @@ static int gfx_v8_0_cp_gfx_resume(struct amdgpu_device *adev)
WREG32(mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
/* set the wb address wether it's enabled or not */
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ rptr_addr = ring->rptr_gpu_addr;
WREG32(mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
WREG32(mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & 0xFF);
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32(mmCP_RB_WPTR_POLL_ADDR_LO, lower_32_bits(wptr_gpu_addr));
WREG32(mmCP_RB_WPTR_POLL_ADDR_HI, upper_32_bits(wptr_gpu_addr));
mdelay(1);
@@ -4393,7 +4393,7 @@ static int gfx_v8_0_kiq_kcq_enable(struct amdgpu_device *adev)
for (i = 0; i < adev->gfx.num_compute_rings; i++) {
struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
uint64_t mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj);
- uint64_t wptr_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ uint64_t wptr_addr = ring->wptr_gpu_addr;
/* map queues */
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
@@ -4517,13 +4517,13 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring *ring)
mqd->cp_hqd_pq_control = tmp;
/* set the wb address whether it's enabled or not */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ wb_gpu_addr = ring->rptr_gpu_addr;
mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_rptr_report_addr_hi =
upper_32_bits(wb_gpu_addr) & 0xffff;
/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wb_gpu_addr = ring->wptr_gpu_addr;
mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
@@ -6051,7 +6051,7 @@ static int gfx_v8_0_set_clockgating_state(void *handle,
static u64 gfx_v8_0_ring_get_rptr(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs];
+ return *ring->rptr_cpu_addr;
}
static u64 gfx_v8_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
@@ -6060,7 +6060,7 @@ static u64 gfx_v8_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
if (ring->use_doorbell)
/* XXX check if swapping is necessary on BE */
- return ring->adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32(mmCP_RB0_WPTR);
}
@@ -6071,7 +6071,7 @@ static void gfx_v8_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32(mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
@@ -6271,7 +6271,7 @@ static void gfx_v8_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
static u64 gfx_v8_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
}
static void gfx_v8_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
@@ -6279,7 +6279,7 @@ static void gfx_v8_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
/* XXX check if swapping is necessary on BE */
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
}
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index d58fd83524ac..06182b7e4351 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -780,9 +780,8 @@ static void gfx_v9_0_kiq_set_resources(struct amdgpu_ring *kiq_ring,
static void gfx_v9_0_kiq_map_queues(struct amdgpu_ring *kiq_ring,
struct amdgpu_ring *ring)
{
- struct amdgpu_device *adev = kiq_ring->adev;
uint64_t mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj);
- uint64_t wptr_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ uint64_t wptr_addr = ring->wptr_gpu_addr;
uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
@@ -3326,11 +3325,11 @@ static int gfx_v9_0_cp_gfx_resume(struct amdgpu_device *adev)
WREG32_SOC15(GC, 0, mmCP_RB0_WPTR_HI, upper_32_bits(ring->wptr));
/* set the wb address wether it's enabled or not */
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ rptr_addr = ring->rptr_gpu_addr;
WREG32_SOC15(GC, 0, mmCP_RB0_RPTR_ADDR, lower_32_bits(rptr_addr));
WREG32_SOC15(GC, 0, mmCP_RB0_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & CP_RB_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK);
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO, lower_32_bits(wptr_gpu_addr));
WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI, upper_32_bits(wptr_gpu_addr));
@@ -3542,13 +3541,13 @@ static int gfx_v9_0_mqd_init(struct amdgpu_ring *ring)
mqd->cp_hqd_pq_control = tmp;
/* set the wb address whether it's enabled or not */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ wb_gpu_addr = ring->rptr_gpu_addr;
mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_rptr_report_addr_hi =
upper_32_bits(wb_gpu_addr) & 0xffff;
/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wb_gpu_addr = ring->wptr_gpu_addr;
mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
@@ -3830,7 +3829,7 @@ static int gfx_v9_0_kcq_init_queue(struct amdgpu_ring *ring)
/* reset ring buffer */
ring->wptr = 0;
- atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], 0);
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr, 0);
amdgpu_ring_clear_ring(ring);
} else {
amdgpu_ring_clear_ring(ring);
@@ -5279,7 +5278,7 @@ static void gfx_v9_0_get_clockgating_state(void *handle, u64 *flags)
static u64 gfx_v9_0_ring_get_rptr_gfx(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs]; /* gfx9 is 32bit rptr*/
+ return *ring->rptr_cpu_addr; /* gfx9 is 32bit rptr*/
}
static u64 gfx_v9_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
@@ -5289,7 +5288,7 @@ static u64 gfx_v9_0_ring_get_wptr_gfx(struct amdgpu_ring *ring)
/* XXX check if swapping is necessary on BE */
if (ring->use_doorbell) {
- wptr = atomic64_read((atomic64_t *)&adev->wb.wb[ring->wptr_offs]);
+ wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
} else {
wptr = RREG32_SOC15(GC, 0, mmCP_RB0_WPTR);
wptr += (u64)RREG32_SOC15(GC, 0, mmCP_RB0_WPTR_HI) << 32;
@@ -5304,7 +5303,7 @@ static void gfx_v9_0_ring_set_wptr_gfx(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], ring->wptr);
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
WDOORBELL64(ring->doorbell_index, ring->wptr);
} else {
WREG32_SOC15(GC, 0, mmCP_RB0_WPTR, lower_32_bits(ring->wptr));
@@ -5469,7 +5468,7 @@ static void gfx_v9_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
static u64 gfx_v9_0_ring_get_rptr_compute(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs]; /* gfx9 hardware is 32bit rptr */
+ return *ring->rptr_cpu_addr; /* gfx9 hardware is 32bit rptr */
}
static u64 gfx_v9_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
@@ -5478,7 +5477,7 @@ static u64 gfx_v9_0_ring_get_wptr_compute(struct amdgpu_ring *ring)
/* XXX check if swapping is necessary on BE */
if (ring->use_doorbell)
- wptr = atomic64_read((atomic64_t *)&ring->adev->wb.wb[ring->wptr_offs]);
+ wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
else
BUG();
return wptr;
@@ -5490,7 +5489,7 @@ static void gfx_v9_0_ring_set_wptr_compute(struct amdgpu_ring *ring)
/* XXX check if swapping is necessary on BE */
if (ring->use_doorbell) {
- atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], ring->wptr);
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr, ring->wptr);
WDOORBELL64(ring->doorbell_index, ring->wptr);
} else{
BUG(); /* only DOORBELL method supported on gfx9 now */
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
index 299de1d131d8..d2722adabd1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
@@ -407,7 +407,7 @@ static uint64_t jpeg_v2_0_dec_ring_get_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(JPEG, 0, mmUVD_JRBC_RB_WPTR);
}
@@ -424,7 +424,7 @@ static void jpeg_v2_0_dec_ring_set_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(JPEG, 0, mmUVD_JRBC_RB_WPTR, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c
index 8c3227d0b8b4..c2bf036a7330 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c
@@ -402,7 +402,7 @@ static uint64_t jpeg_v2_5_dec_ring_get_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(JPEG, ring->me, mmUVD_JRBC_RB_WPTR);
}
@@ -419,7 +419,7 @@ static void jpeg_v2_5_dec_ring_set_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(JPEG, ring->me, mmUVD_JRBC_RB_WPTR, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c
index 41a00851b6c5..a1b751d9ac06 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c
@@ -427,7 +427,7 @@ static uint64_t jpeg_v3_0_dec_ring_get_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(JPEG, 0, mmUVD_JRBC_RB_WPTR);
}
@@ -444,7 +444,7 @@ static void jpeg_v3_0_dec_ring_set_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(JPEG, 0, mmUVD_JRBC_RB_WPTR, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index a7ec4ac89da5..0819ffe8e759 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -46,7 +46,7 @@ static void mes_v10_1_ring_set_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell) {
- atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs],
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr,
ring->wptr);
WDOORBELL64(ring->doorbell_index, ring->wptr);
} else {
@@ -56,7 +56,7 @@ static void mes_v10_1_ring_set_wptr(struct amdgpu_ring *ring)
static u64 mes_v10_1_ring_get_rptr(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs];
+ return *ring->rptr_cpu_addr;
}
static u64 mes_v10_1_ring_get_wptr(struct amdgpu_ring *ring)
@@ -64,8 +64,7 @@ static u64 mes_v10_1_ring_get_wptr(struct amdgpu_ring *ring)
u64 wptr;
if (ring->use_doorbell)
- wptr = atomic64_read((atomic64_t *)
- &ring->adev->wb.wb[ring->wptr_offs]);
+ wptr = atomic64_read((atomic64_t *)ring->wptr_cpu_addr);
else
BUG();
return wptr;
@@ -673,13 +672,13 @@ static int mes_v10_1_mqd_init(struct amdgpu_ring *ring)
mqd->cp_hqd_pq_control = tmp;
/* set the wb address whether it's enabled or not */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ wb_gpu_addr = ring->rptr_gpu_addr;
mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_rptr_report_addr_hi =
upper_32_bits(wb_gpu_addr) & 0xffff;
/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
- wb_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wb_gpu_addr = ring->wptr_gpu_addr;
mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffff8;
mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 84b57b06b20c..6bdffdc1c0b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -194,7 +194,7 @@ static int sdma_v2_4_init_microcode(struct amdgpu_device *adev)
static uint64_t sdma_v2_4_ring_get_rptr(struct amdgpu_ring *ring)
{
/* XXX check if swapping is necessary on BE */
- return ring->adev->wb.wb[ring->rptr_offs] >> 2;
+ return *ring->rptr_cpu_addr >> 2;
}
/**
@@ -414,12 +414,10 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
struct amdgpu_ring *ring;
u32 rb_cntl, ib_cntl;
u32 rb_bufsz;
- u32 wb_offset;
int i, j, r;
for (i = 0; i < adev->sdma.num_instances; i++) {
ring = &adev->sdma.instance[i].ring;
- wb_offset = (ring->rptr_offs * 4);
mutex_lock(&adev->srbm_mutex);
for (j = 0; j < 16; j++) {
@@ -455,9 +453,9 @@ static int sdma_v2_4_gfx_resume(struct amdgpu_device *adev)
/* set the wb address whether it's enabled or not */
WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_HI + sdma_offsets[i],
- upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+ upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_LO + sdma_offsets[i],
- lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+ lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 8af5c94d526a..2584fa3cb13e 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -350,7 +350,7 @@ static int sdma_v3_0_init_microcode(struct amdgpu_device *adev)
static uint64_t sdma_v3_0_ring_get_rptr(struct amdgpu_ring *ring)
{
/* XXX check if swapping is necessary on BE */
- return ring->adev->wb.wb[ring->rptr_offs] >> 2;
+ return *ring->rptr_cpu_addr >> 2;
}
/**
@@ -367,7 +367,7 @@ static uint64_t sdma_v3_0_ring_get_wptr(struct amdgpu_ring *ring)
if (ring->use_doorbell || ring->use_pollmem) {
/* XXX check if swapping is necessary on BE */
- wptr = ring->adev->wb.wb[ring->wptr_offs] >> 2;
+ wptr = *ring->wptr_cpu_addr >> 2;
} else {
wptr = RREG32(mmSDMA0_GFX_RB_WPTR + sdma_offsets[ring->me]) >> 2;
}
@@ -387,12 +387,12 @@ static void sdma_v3_0_ring_set_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell) {
- u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
+ u32 *wb = (u32 *)ring->wptr_cpu_addr;
/* XXX check if swapping is necessary on BE */
WRITE_ONCE(*wb, ring->wptr << 2);
WDOORBELL32(ring->doorbell_index, ring->wptr << 2);
} else if (ring->use_pollmem) {
- u32 *wb = (u32 *)&adev->wb.wb[ring->wptr_offs];
+ u32 *wb = (u32 *)ring->wptr_cpu_addr;
WRITE_ONCE(*wb, ring->wptr << 2);
} else {
@@ -649,7 +649,6 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
struct amdgpu_ring *ring;
u32 rb_cntl, ib_cntl, wptr_poll_cntl;
u32 rb_bufsz;
- u32 wb_offset;
u32 doorbell;
u64 wptr_gpu_addr;
int i, j, r;
@@ -657,7 +656,6 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
for (i = 0; i < adev->sdma.num_instances; i++) {
ring = &adev->sdma.instance[i].ring;
amdgpu_ring_clear_ring(ring);
- wb_offset = (ring->rptr_offs * 4);
mutex_lock(&adev->srbm_mutex);
for (j = 0; j < 16; j++) {
@@ -694,9 +692,9 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
/* set the wb address whether it's enabled or not */
WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_HI + sdma_offsets[i],
- upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+ upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
WREG32(mmSDMA0_GFX_RB_RPTR_ADDR_LO + sdma_offsets[i],
- lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+ lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1);
@@ -715,7 +713,7 @@ static int sdma_v3_0_gfx_resume(struct amdgpu_device *adev)
WREG32(mmSDMA0_GFX_DOORBELL + sdma_offsets[i], doorbell);
/* setup the wptr shadow polling */
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32(mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO + sdma_offsets[i],
lower_32_bits(wptr_gpu_addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 80de85847712..65181efba50e 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -722,7 +722,7 @@ static uint64_t sdma_v4_0_ring_get_rptr(struct amdgpu_ring *ring)
u64 *rptr;
/* XXX check if swapping is necessary on BE */
- rptr = ((u64 *)&ring->adev->wb.wb[ring->rptr_offs]);
+ rptr = ((u64 *)ring->rptr_cpu_addr);
DRM_DEBUG("rptr before shift == 0x%016llx\n", *rptr);
return ((*rptr) >> 2);
@@ -742,7 +742,7 @@ static uint64_t sdma_v4_0_ring_get_wptr(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- wptr = READ_ONCE(*((u64 *)&adev->wb.wb[ring->wptr_offs]));
+ wptr = READ_ONCE(*((u64 *)ring->wptr_cpu_addr));
DRM_DEBUG("wptr/doorbell before shift == 0x%016llx\n", wptr);
} else {
wptr = RREG32_SDMA(ring->me, mmSDMA0_GFX_RB_WPTR_HI);
@@ -768,7 +768,7 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
DRM_DEBUG("Setting write pointer\n");
if (ring->use_doorbell) {
- u64 *wb = (u64 *)&adev->wb.wb[ring->wptr_offs];
+ u64 *wb = (u64 *)ring->wptr_cpu_addr;
DRM_DEBUG("Using doorbell -- "
"wptr_offs == 0x%08x "
@@ -811,7 +811,7 @@ static uint64_t sdma_v4_0_page_ring_get_wptr(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- wptr = READ_ONCE(*((u64 *)&adev->wb.wb[ring->wptr_offs]));
+ wptr = READ_ONCE(*((u64 *)ring->wptr_cpu_addr));
} else {
wptr = RREG32_SDMA(ring->me, mmSDMA0_PAGE_RB_WPTR_HI);
wptr = wptr << 32;
@@ -833,7 +833,7 @@ static void sdma_v4_0_page_ring_set_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell) {
- u64 *wb = (u64 *)&adev->wb.wb[ring->wptr_offs];
+ u64 *wb = (u64 *)ring->wptr_cpu_addr;
/* XXX check if swapping is necessary on BE */
WRITE_ONCE(*wb, (ring->wptr << 2));
@@ -1174,13 +1174,10 @@ static void sdma_v4_0_gfx_resume(struct amdgpu_device *adev, unsigned int i)
{
struct amdgpu_ring *ring = &adev->sdma.instance[i].ring;
u32 rb_cntl, ib_cntl, wptr_poll_cntl;
- u32 wb_offset;
u32 doorbell;
u32 doorbell_offset;
u64 wptr_gpu_addr;
- wb_offset = (ring->rptr_offs * 4);
-
rb_cntl = RREG32_SDMA(i, mmSDMA0_GFX_RB_CNTL);
rb_cntl = sdma_v4_0_rb_cntl(ring, rb_cntl);
WREG32_SDMA(i, mmSDMA0_GFX_RB_CNTL, rb_cntl);
@@ -1193,9 +1190,9 @@ static void sdma_v4_0_gfx_resume(struct amdgpu_device *adev, unsigned int i)
/* set the wb address whether it's enabled or not */
WREG32_SDMA(i, mmSDMA0_GFX_RB_RPTR_ADDR_HI,
- upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+ upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
WREG32_SDMA(i, mmSDMA0_GFX_RB_RPTR_ADDR_LO,
- lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+ lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL,
RPTR_WRITEBACK_ENABLE, 1);
@@ -1225,7 +1222,7 @@ static void sdma_v4_0_gfx_resume(struct amdgpu_device *adev, unsigned int i)
WREG32_SDMA(i, mmSDMA0_GFX_MINOR_PTR_UPDATE, 0);
/* setup the wptr shadow polling */
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32_SDMA(i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO,
lower_32_bits(wptr_gpu_addr));
WREG32_SDMA(i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI,
@@ -1264,13 +1261,10 @@ static void sdma_v4_0_page_resume(struct amdgpu_device *adev, unsigned int i)
{
struct amdgpu_ring *ring = &adev->sdma.instance[i].page;
u32 rb_cntl, ib_cntl, wptr_poll_cntl;
- u32 wb_offset;
u32 doorbell;
u32 doorbell_offset;
u64 wptr_gpu_addr;
- wb_offset = (ring->rptr_offs * 4);
-
rb_cntl = RREG32_SDMA(i, mmSDMA0_PAGE_RB_CNTL);
rb_cntl = sdma_v4_0_rb_cntl(ring, rb_cntl);
WREG32_SDMA(i, mmSDMA0_PAGE_RB_CNTL, rb_cntl);
@@ -1283,9 +1277,9 @@ static void sdma_v4_0_page_resume(struct amdgpu_device *adev, unsigned int i)
/* set the wb address whether it's enabled or not */
WREG32_SDMA(i, mmSDMA0_PAGE_RB_RPTR_ADDR_HI,
- upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+ upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
WREG32_SDMA(i, mmSDMA0_PAGE_RB_RPTR_ADDR_LO,
- lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+ lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_PAGE_RB_CNTL,
RPTR_WRITEBACK_ENABLE, 1);
@@ -1316,7 +1310,7 @@ static void sdma_v4_0_page_resume(struct amdgpu_device *adev, unsigned int i)
WREG32_SDMA(i, mmSDMA0_PAGE_MINOR_PTR_UPDATE, 0);
/* setup the wptr shadow polling */
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32_SDMA(i, mmSDMA0_PAGE_RB_WPTR_POLL_ADDR_LO,
lower_32_bits(wptr_gpu_addr));
WREG32_SDMA(i, mmSDMA0_PAGE_RB_WPTR_POLL_ADDR_HI,
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index d3939c5f531d..ff359e7f1eb8 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -347,7 +347,7 @@ static uint64_t sdma_v5_0_ring_get_rptr(struct amdgpu_ring *ring)
u64 *rptr;
/* XXX check if swapping is necessary on BE */
- rptr = ((u64 *)&ring->adev->wb.wb[ring->rptr_offs]);
+ rptr = (u64 *)ring->rptr_cpu_addr;
DRM_DEBUG("rptr before shift == 0x%016llx\n", *rptr);
return ((*rptr) >> 2);
@@ -367,7 +367,7 @@ static uint64_t sdma_v5_0_ring_get_wptr(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- wptr = READ_ONCE(*((u64 *)&adev->wb.wb[ring->wptr_offs]));
+ wptr = READ_ONCE(*((u64 *)ring->wptr_cpu_addr));
DRM_DEBUG("wptr/doorbell before shift == 0x%016llx\n", wptr);
} else {
wptr = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI));
@@ -400,8 +400,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring)
lower_32_bits(ring->wptr << 2),
upper_32_bits(ring->wptr << 2));
/* XXX check if swapping is necessary on BE */
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr << 2);
- adev->wb.wb[ring->wptr_offs + 1] = upper_32_bits(ring->wptr << 2);
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr,
+ ring->wptr << 2);
DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
ring->doorbell_index, ring->wptr << 2);
WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
@@ -708,7 +708,6 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
struct amdgpu_ring *ring;
u32 rb_cntl, ib_cntl;
u32 rb_bufsz;
- u32 wb_offset;
u32 doorbell;
u32 doorbell_offset;
u32 temp;
@@ -718,7 +717,6 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
for (i = 0; i < adev->sdma.num_instances; i++) {
ring = &adev->sdma.instance[i].ring;
- wb_offset = (ring->rptr_offs * 4);
if (!amdgpu_sriov_vf(adev))
WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0);
@@ -741,7 +739,7 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), 0);
/* setup the wptr shadow polling */
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO),
lower_32_bits(wptr_gpu_addr));
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI),
@@ -756,9 +754,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
/* set the wb address whether it's enabled or not */
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_ADDR_HI),
- upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+ upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_ADDR_LO),
- lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+ lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 8298926f8502..bf2cf95cbf8f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -248,7 +248,7 @@ static uint64_t sdma_v5_2_ring_get_rptr(struct amdgpu_ring *ring)
u64 *rptr;
/* XXX check if swapping is necessary on BE */
- rptr = ((u64 *)&ring->adev->wb.wb[ring->rptr_offs]);
+ rptr = (u64 *)ring->rptr_cpu_addr;
DRM_DEBUG("rptr before shift == 0x%016llx\n", *rptr);
return ((*rptr) >> 2);
@@ -268,7 +268,7 @@ static uint64_t sdma_v5_2_ring_get_wptr(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- wptr = READ_ONCE(*((u64 *)&adev->wb.wb[ring->wptr_offs]));
+ wptr = READ_ONCE(*((u64 *)ring->wptr_cpu_addr));
DRM_DEBUG("wptr/doorbell before shift == 0x%016llx\n", wptr);
} else {
wptr = RREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI));
@@ -301,8 +301,8 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring)
lower_32_bits(ring->wptr << 2),
upper_32_bits(ring->wptr << 2));
/* XXX check if swapping is necessary on BE */
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr << 2);
- adev->wb.wb[ring->wptr_offs + 1] = upper_32_bits(ring->wptr << 2);
+ atomic64_set((atomic64_t *)ring->wptr_cpu_addr,
+ ring->wptr << 2);
DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
ring->doorbell_index, ring->wptr << 2);
WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
@@ -609,7 +609,6 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
struct amdgpu_ring *ring;
u32 rb_cntl, ib_cntl;
u32 rb_bufsz;
- u32 wb_offset;
u32 doorbell;
u32 doorbell_offset;
u32 temp;
@@ -619,7 +618,6 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
for (i = 0; i < adev->sdma.num_instances; i++) {
ring = &adev->sdma.instance[i].ring;
- wb_offset = (ring->rptr_offs * 4);
if (!amdgpu_sriov_vf(adev))
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0);
@@ -642,7 +640,7 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), 0);
/* setup the wptr shadow polling */
- wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+ wptr_gpu_addr = ring->wptr_gpu_addr;
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO),
lower_32_bits(wptr_gpu_addr));
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI),
@@ -657,9 +655,9 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
/* set the wb address whether it's enabled or not */
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_ADDR_HI),
- upper_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFF);
+ upper_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFF);
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_ADDR_LO),
- lower_32_bits(adev->wb.gpu_addr + wb_offset) & 0xFFFFFFFC);
+ lower_32_bits(ring->rptr_gpu_addr) & 0xFFFFFFFC);
rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RPTR_WRITEBACK_ENABLE, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c b/drivers/gpu/drm/amd/amdgpu/si_dma.c
index 2f95235bbfb3..f675111ace20 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
@@ -40,7 +40,7 @@ static void si_dma_set_irq_funcs(struct amdgpu_device *adev);
static uint64_t si_dma_ring_get_rptr(struct amdgpu_ring *ring)
{
- return ring->adev->wb.wb[ring->rptr_offs>>2];
+ return *ring->rptr_cpu_addr;
}
static uint64_t si_dma_ring_get_wptr(struct amdgpu_ring *ring)
@@ -153,7 +153,7 @@ static int si_dma_start(struct amdgpu_device *adev)
WREG32(DMA_RB_RPTR + sdma_offsets[i], 0);
WREG32(DMA_RB_WPTR + sdma_offsets[i], 0);
- rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4);
+ rptr_addr = ring->rptr_gpu_addr;
WREG32(DMA_RB_RPTR_ADDR_LO + sdma_offsets[i], lower_32_bits(rptr_addr));
WREG32(DMA_RB_RPTR_ADDR_HI + sdma_offsets[i], upper_32_bits(rptr_addr) & 0xFF);
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index 2f15b8e0f7d7..e668b3baa8c6 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -118,7 +118,7 @@ static uint64_t uvd_v7_0_enc_ring_get_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
if (ring == &adev->uvd.inst[ring->me].ring_enc[0])
return RREG32_SOC15(UVD, ring->me, mmUVD_RB_WPTR);
@@ -153,7 +153,7 @@ static void uvd_v7_0_enc_ring_set_wptr(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
return;
}
@@ -754,7 +754,7 @@ static int uvd_v7_0_mmsch_start(struct amdgpu_device *adev,
if (adev->uvd.harvest_config & (1 << i))
continue;
WDOORBELL32(adev->uvd.inst[i].ring_enc[0].doorbell_index, 0);
- adev->wb.wb[adev->uvd.inst[i].ring_enc[0].wptr_offs] = 0;
+ *adev->uvd.inst[i].ring_enc[0].wptr_cpu_addr = 0;
adev->uvd.inst[i].ring_enc[0].wptr = 0;
adev->uvd.inst[i].ring_enc[0].wptr_old = 0;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index d1fc4e0b8265..66cd3d11aa4b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -83,7 +83,7 @@ static uint64_t vce_v4_0_ring_get_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
if (ring->me == 0)
return RREG32(SOC15_REG_OFFSET(VCE, 0, mmVCE_RB_WPTR));
@@ -106,7 +106,7 @@ static void vce_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
if (ring->use_doorbell) {
/* XXX check if swapping is necessary on BE */
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
return;
}
@@ -177,7 +177,7 @@ static int vce_v4_0_mmsch_start(struct amdgpu_device *adev,
WREG32(SOC15_REG_OFFSET(VCE, 0, mmVCE_MMSCH_VF_MAILBOX_RESP), 0);
WDOORBELL32(adev->vce.ring[0].doorbell_index, 0);
- adev->wb.wb[adev->vce.ring[0].wptr_offs] = 0;
+ *adev->vce.ring[0].wptr_cpu_addr = 0;
adev->vce.ring[0].wptr = 0;
adev->vce.ring[0].wptr_old = 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 7a7f35e83dd5..8421044d5629 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -1336,7 +1336,7 @@ static uint64_t vcn_v2_0_dec_ring_get_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(UVD, 0, mmUVD_RBC_RB_WPTR);
}
@@ -1357,7 +1357,7 @@ static void vcn_v2_0_dec_ring_set_wptr(struct amdgpu_ring *ring)
lower_32_bits(ring->wptr) | 0x80000000);
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(UVD, 0, mmUVD_RBC_RB_WPTR, lower_32_bits(ring->wptr));
@@ -1565,12 +1565,12 @@ static uint64_t vcn_v2_0_enc_ring_get_wptr(struct amdgpu_ring *ring)
if (ring == &adev->vcn.inst->ring_enc[0]) {
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(UVD, 0, mmUVD_RB_WPTR);
} else {
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(UVD, 0, mmUVD_RB_WPTR2);
}
@@ -1589,14 +1589,14 @@ static void vcn_v2_0_enc_ring_set_wptr(struct amdgpu_ring *ring)
if (ring == &adev->vcn.inst->ring_enc[0]) {
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(UVD, 0, mmUVD_RB_WPTR, lower_32_bits(ring->wptr));
}
} else {
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(UVD, 0, mmUVD_RB_WPTR2, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 17d44be58877..9352d07539b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -1491,7 +1491,7 @@ static uint64_t vcn_v2_5_dec_ring_get_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(VCN, ring->me, mmUVD_RBC_RB_WPTR);
}
@@ -1508,7 +1508,7 @@ static void vcn_v2_5_dec_ring_set_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(VCN, ring->me, mmUVD_RBC_RB_WPTR, lower_32_bits(ring->wptr));
@@ -1607,12 +1607,12 @@ static uint64_t vcn_v2_5_enc_ring_get_wptr(struct amdgpu_ring *ring)
if (ring == &adev->vcn.inst[ring->me].ring_enc[0]) {
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR);
} else {
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR2);
}
@@ -1631,14 +1631,14 @@ static void vcn_v2_5_enc_ring_set_wptr(struct amdgpu_ring *ring)
if (ring == &adev->vcn.inst[ring->me].ring_enc[0]) {
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR, lower_32_bits(ring->wptr));
}
} else {
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR2, lower_32_bits(ring->wptr));
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index cb5f0a12333f..19cdad38d134 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -1695,7 +1695,7 @@ static uint64_t vcn_v3_0_dec_ring_get_wptr(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(VCN, ring->me, mmUVD_RBC_RB_WPTR);
}
@@ -1721,7 +1721,7 @@ static void vcn_v3_0_dec_ring_set_wptr(struct amdgpu_ring *ring)
}
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(VCN, ring->me, mmUVD_RBC_RB_WPTR, lower_32_bits(ring->wptr));
@@ -2012,12 +2012,12 @@ static uint64_t vcn_v3_0_enc_ring_get_wptr(struct amdgpu_ring *ring)
if (ring == &adev->vcn.inst[ring->me].ring_enc[0]) {
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR);
} else {
if (ring->use_doorbell)
- return adev->wb.wb[ring->wptr_offs];
+ return *ring->wptr_cpu_addr;
else
return RREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR2);
}
@@ -2036,14 +2036,14 @@ static void vcn_v3_0_enc_ring_set_wptr(struct amdgpu_ring *ring)
if (ring == &adev->vcn.inst[ring->me].ring_enc[0]) {
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR, lower_32_bits(ring->wptr));
}
} else {
if (ring->use_doorbell) {
- adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr);
+ *ring->wptr_cpu_addr = lower_32_bits(ring->wptr);
WDOORBELL32(ring->doorbell_index, lower_32_bits(ring->wptr));
} else {
WREG32_SOC15(VCN, ring->me, mmUVD_RB_WPTR2, lower_32_bits(ring->wptr));
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
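The hunks above all apply one pattern: instead of indexing the device-wide writeback array through a per-ring slot on every access (`adev->wb.wb[ring->wptr_offs]`), each ring caches a direct CPU pointer and GPU address once at init. A minimal userspace sketch of that caching step, with simplified stand-in structs and field names (not the actual amdgpu types):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch, not the amdgpu code: the writeback pool holds
 * CPU-visible dword slots plus the GPU address of slot 0; the ring
 * caches the resolved pointer/address pair for its own slot. */
struct wb_pool {
	uint32_t wb[64];      /* CPU-visible writeback slots */
	uint64_t gpu_addr;    /* GPU address of wb[0] */
};

struct ring {
	uint32_t wptr_offs;               /* dword slot index into the pool */
	volatile uint32_t *wptr_cpu_addr; /* cached: &wb[wptr_offs] */
	uint64_t wptr_gpu_addr;           /* cached: gpu_addr + offs * 4 */
};

/* done once at ring init, so hot paths skip the array indexing */
static void ring_cache_wptr(struct ring *r, struct wb_pool *p)
{
	r->wptr_cpu_addr = &p->wb[r->wptr_offs];
	r->wptr_gpu_addr = p->gpu_addr + (uint64_t)r->wptr_offs * 4;
}
```

Writes through the cached pointer land in the same slot the old array indexing reached, which is why the conversions in the diffs are behavior-preserving.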
* [PATCH 07/73] drm/amdgpu: initialize/finalize the ring for mes queue
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (5 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 06/73] drm/amdgpu: use ring structure to access rptr/wptr v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 08/73] drm/amdgpu: assign the cpu/gpu address of fence from ring Alex Deucher
` (65 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Initialize/finalize the ring for the mes queue, which submits the command
stream to the mes-managed hardware queue.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 145 ++++++++++++++++-------
1 file changed, 104 insertions(+), 41 deletions(-)
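The core of this patch is the pair of `amdgpu_ring_get_gpu_addr()`/`amdgpu_ring_get_cpu_addr()` macros: a MES queue resolves its rptr/wptr/fence slots as byte offsets into the per-context meta-data buffer, while a normal ring keeps using the device writeback array, whose offsets are dword slot indices (hence the `* 4`). A simplified model of that selection, with illustrative types and names rather than the kernel's:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: a per-context meta-data buffer for MES
 * queues, and the device writeback pool for normal rings. */
struct mes_ctx { uint8_t  meta[4096]; uint64_t meta_gpu_addr; };
struct wb_pool { uint32_t wb[256];    uint64_t gpu_addr; };

static uint64_t ring_get_gpu_addr(int is_mes_queue, struct mes_ctx *ctx,
				  struct wb_pool *wb, uint32_t offset)
{
	return is_mes_queue ? ctx->meta_gpu_addr + offset /* byte offset */
			    : wb->gpu_addr + offset * 4;  /* dword slot  */
}

static void *ring_get_cpu_addr(int is_mes_queue, struct mes_ctx *ctx,
			       struct wb_pool *wb, uint32_t offset)
{
	return is_mes_queue ? (void *)(ctx->meta + offset)
			    : (void *)&wb->wb[offset];
}
```

Because both paths end in the same `*_gpu_addr`/`*_cpu_addr` ring fields, all the downstream engine code in the later patches can stay ignorant of where the slots actually live.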
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 773954318216..13db99d653bd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -149,6 +149,16 @@ void amdgpu_ring_undo(struct amdgpu_ring *ring)
ring->funcs->end_use(ring);
}
+#define amdgpu_ring_get_gpu_addr(ring, offset) \
+ (ring->is_mes_queue ? \
+ (ring->mes_ctx->meta_data_gpu_addr + offset) : \
+ (ring->adev->wb.gpu_addr + offset * 4))
+
+#define amdgpu_ring_get_cpu_addr(ring, offset) \
+ (ring->is_mes_queue ? \
+ (void *)((uint8_t *)(ring->mes_ctx->meta_data_ptr) + offset) : \
+ (&ring->adev->wb.wb[offset]))
+
/**
* amdgpu_ring_init - init driver ring struct.
*
@@ -189,51 +199,88 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
return -EINVAL;
ring->adev = adev;
- ring->idx = adev->num_rings++;
- adev->rings[ring->idx] = ring;
ring->num_hw_submission = sched_hw_submission;
ring->sched_score = sched_score;
ring->vmid_wait = dma_fence_get_stub();
+
+ if (!ring->is_mes_queue) {
+ ring->idx = adev->num_rings++;
+ adev->rings[ring->idx] = ring;
+ }
+
r = amdgpu_fence_driver_init_ring(ring);
if (r)
return r;
}
- r = amdgpu_device_wb_get(adev, &ring->rptr_offs);
- if (r) {
- dev_err(adev->dev, "(%d) ring rptr_offs wb alloc failed\n", r);
- return r;
- }
+ if (ring->is_mes_queue) {
+ ring->rptr_offs = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_RPTR_OFFS);
+ ring->wptr_offs = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_WPTR_OFFS);
+ ring->fence_offs = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_FENCE_OFFS);
+ ring->trail_fence_offs = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_TRAIL_FENCE_OFFS);
+ ring->cond_exe_offs = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_COND_EXE_OFFS);
+ } else {
+ r = amdgpu_device_wb_get(adev, &ring->rptr_offs);
+ if (r) {
+ dev_err(adev->dev, "(%d) ring rptr_offs wb alloc failed\n", r);
+ return r;
+ }
- r = amdgpu_device_wb_get(adev, &ring->wptr_offs);
- if (r) {
- dev_err(adev->dev, "(%d) ring wptr_offs wb alloc failed\n", r);
- return r;
- }
+ r = amdgpu_device_wb_get(adev, &ring->wptr_offs);
+ if (r) {
+ dev_err(adev->dev, "(%d) ring wptr_offs wb alloc failed\n", r);
+ return r;
+ }
- r = amdgpu_device_wb_get(adev, &ring->fence_offs);
- if (r) {
- dev_err(adev->dev, "(%d) ring fence_offs wb alloc failed\n", r);
- return r;
- }
+ r = amdgpu_device_wb_get(adev, &ring->fence_offs);
+ if (r) {
+ dev_err(adev->dev, "(%d) ring fence_offs wb alloc failed\n", r);
+ return r;
+ }
- r = amdgpu_device_wb_get(adev, &ring->trail_fence_offs);
- if (r) {
- dev_err(adev->dev,
- "(%d) ring trail_fence_offs wb alloc failed\n", r);
- return r;
+ r = amdgpu_device_wb_get(adev, &ring->trail_fence_offs);
+ if (r) {
+ dev_err(adev->dev, "(%d) ring trail_fence_offs wb alloc failed\n", r);
+ return r;
+ }
+
+ r = amdgpu_device_wb_get(adev, &ring->cond_exe_offs);
+ if (r) {
+ dev_err(adev->dev, "(%d) ring cond_exec_polling wb alloc failed\n", r);
+ return r;
+ }
}
+
+ ring->fence_gpu_addr =
+ amdgpu_ring_get_gpu_addr(ring, ring->fence_offs);
+ ring->fence_cpu_addr =
+ amdgpu_ring_get_cpu_addr(ring, ring->fence_offs);
+
+ ring->rptr_gpu_addr =
+ amdgpu_ring_get_gpu_addr(ring, ring->rptr_offs);
+ ring->rptr_cpu_addr =
+ amdgpu_ring_get_cpu_addr(ring, ring->rptr_offs);
+
+ ring->wptr_gpu_addr =
+ amdgpu_ring_get_gpu_addr(ring, ring->wptr_offs);
+ ring->wptr_cpu_addr =
+ amdgpu_ring_get_cpu_addr(ring, ring->wptr_offs);
+
ring->trail_fence_gpu_addr =
- adev->wb.gpu_addr + (ring->trail_fence_offs * 4);
- ring->trail_fence_cpu_addr = &adev->wb.wb[ring->trail_fence_offs];
+ amdgpu_ring_get_gpu_addr(ring, ring->trail_fence_offs);
+ ring->trail_fence_cpu_addr =
+ amdgpu_ring_get_cpu_addr(ring, ring->trail_fence_offs);
+
+ ring->cond_exe_gpu_addr =
+ amdgpu_ring_get_gpu_addr(ring, ring->cond_exe_offs);
+ ring->cond_exe_cpu_addr =
+ amdgpu_ring_get_cpu_addr(ring, ring->cond_exe_offs);
- r = amdgpu_device_wb_get(adev, &ring->cond_exe_offs);
- if (r) {
- dev_err(adev->dev, "(%d) ring cond_exec_polling wb alloc failed\n", r);
- return r;
- }
- ring->cond_exe_gpu_addr = adev->wb.gpu_addr + (ring->cond_exe_offs * 4);
- ring->cond_exe_cpu_addr = &adev->wb.wb[ring->cond_exe_offs];
/* always set cond_exec_polling to CONTINUE */
*ring->cond_exe_cpu_addr = 1;
@@ -248,8 +295,20 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
ring->buf_mask = (ring->ring_size / 4) - 1;
ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
0xffffffffffffffff : ring->buf_mask;
+
/* Allocate ring buffer */
- if (ring->ring_obj == NULL) {
+ if (ring->is_mes_queue) {
+ int offset = 0;
+
+ BUG_ON(ring->ring_size > PAGE_SIZE*4);
+
+ offset = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_RING_OFFS);
+ ring->gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ ring->ring = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+ amdgpu_ring_clear_ring(ring);
+
+ } else if (ring->ring_obj == NULL) {
r = amdgpu_bo_create_kernel(adev, ring->ring_size + ring->funcs->extra_dw, PAGE_SIZE,
AMDGPU_GEM_DOMAIN_GTT,
&ring->ring_obj,
@@ -286,26 +345,30 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring)
{
/* Not to finish a ring which is not initialized */
- if (!(ring->adev) || !(ring->adev->rings[ring->idx]))
+ if (!(ring->adev) ||
+ (!ring->is_mes_queue && !(ring->adev->rings[ring->idx])))
return;
ring->sched.ready = false;
- amdgpu_device_wb_free(ring->adev, ring->rptr_offs);
- amdgpu_device_wb_free(ring->adev, ring->wptr_offs);
+ if (!ring->is_mes_queue) {
+ amdgpu_device_wb_free(ring->adev, ring->rptr_offs);
+ amdgpu_device_wb_free(ring->adev, ring->wptr_offs);
- amdgpu_device_wb_free(ring->adev, ring->cond_exe_offs);
- amdgpu_device_wb_free(ring->adev, ring->fence_offs);
+ amdgpu_device_wb_free(ring->adev, ring->cond_exe_offs);
+ amdgpu_device_wb_free(ring->adev, ring->fence_offs);
- amdgpu_bo_free_kernel(&ring->ring_obj,
- &ring->gpu_addr,
- (void **)&ring->ring);
+ amdgpu_bo_free_kernel(&ring->ring_obj,
+ &ring->gpu_addr,
+ (void **)&ring->ring);
+ }
dma_fence_put(ring->vmid_wait);
ring->vmid_wait = NULL;
ring->me = 0;
- ring->adev->rings[ring->idx] = NULL;
+ if (!ring->is_mes_queue)
+ ring->adev->rings[ring->idx] = NULL;
}
/**
--
2.35.1
* [PATCH 08/73] drm/amdgpu: assign the cpu/gpu address of fence from ring
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (6 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 07/73] drm/amdgpu: initialize/finalize the ring for mes queue Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 09/73] drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2 Alex Deucher
` (64 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Assign the cpu/gpu address of the fence for the normal or mes ring
from the ring structure.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 5d13ed376ab4..d16c8c1f72db 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -422,8 +422,8 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
uint64_t index;
if (ring->funcs->type != AMDGPU_RING_TYPE_UVD) {
- ring->fence_drv.cpu_addr = &adev->wb.wb[ring->fence_offs];
- ring->fence_drv.gpu_addr = adev->wb.gpu_addr + (ring->fence_offs * 4);
+ ring->fence_drv.cpu_addr = ring->fence_cpu_addr;
+ ring->fence_drv.gpu_addr = ring->fence_gpu_addr;
} else {
/* put fence directly behind firmware */
index = ALIGN(adev->uvd.fw->size, 8);
--
2.35.1
* [PATCH 09/73] drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (7 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 08/73] drm/amdgpu: assign the cpu/gpu address of fence from ring Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 10/73] drm/amdgpu/gfx10: use per ctx CSA for ce metadata Alex Deucher
` (63 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Refine the existing gfx/compute mqd functions and add them
to the engine mqd layer.
v2: rebase fix.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 111 +++++++++++++------------
1 file changed, 56 insertions(+), 55 deletions(-)
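The refactor below converts the MQD init functions to take a `(adev, mqd, prop)` triple instead of a ring, and registers them in a per-IP table so a generic `amdgpu_ring_init_mqd()` can dispatch without knowing the engine. A compact sketch of that table-driven dispatch, using simplified, hypothetical structures (the real `amdgpu_mqd_prop` and MQD layouts are much larger):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HW_IP_GFX, HW_IP_COMPUTE, HW_IP_NUM };

/* properties extracted from the ring, so init_mqd needs no ring */
struct mqd_prop { uint64_t mqd_gpu_addr; int use_doorbell; };

struct mqd_funcs {
	size_t mqd_size;
	int (*init_mqd)(void *mqd, struct mqd_prop *prop);
};

struct gfx_mqd { uint64_t base_addr; int doorbell_en; };

static int gfx_mqd_init(void *m, struct mqd_prop *prop)
{
	struct gfx_mqd *mqd = m;   /* fill from properties, not a ring */
	mqd->base_addr = prop->mqd_gpu_addr;
	mqd->doorbell_en = prop->use_doorbell;
	return 0;
}

/* generic dispatch, analogous in spirit to amdgpu_ring_init_mqd() */
static int ring_init_mqd(struct mqd_funcs *mqds, int hw_ip,
			 void *mqd, struct mqd_prop *prop)
{
	return mqds[hw_ip].init_mqd(mqd, prop);
}
```

Decoupling MQD initialization from the ring is what later lets MES build MQDs for queues that were never registered in `adev->rings[]`.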
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index f2dd53f2af61..cc70594d7e4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3485,6 +3485,7 @@ static void gfx_v10_0_set_ring_funcs(struct amdgpu_device *adev);
static void gfx_v10_0_set_irq_funcs(struct amdgpu_device *adev);
static void gfx_v10_0_set_gds_init(struct amdgpu_device *adev);
static void gfx_v10_0_set_rlc_funcs(struct amdgpu_device *adev);
+static void gfx_v10_0_set_mqd_funcs(struct amdgpu_device *adev);
static int gfx_v10_0_get_cu_info(struct amdgpu_device *adev,
struct amdgpu_cu_info *cu_info);
static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev);
@@ -6564,10 +6565,10 @@ static void gfx_v10_0_kiq_setting(struct amdgpu_ring *ring)
}
}
-static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
+static int gfx_v10_0_gfx_mqd_init(struct amdgpu_device *adev, void *m,
+ struct amdgpu_mqd_prop *prop)
{
- struct amdgpu_device *adev = ring->adev;
- struct v10_gfx_mqd *mqd = ring->mqd_ptr;
+ struct v10_gfx_mqd *mqd = m;
uint64_t hqd_gpu_addr, wb_gpu_addr;
uint32_t tmp;
uint32_t rb_bufsz;
@@ -6577,8 +6578,8 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
mqd->cp_gfx_hqd_wptr_hi = 0;
/* set the pointer to the MQD */
- mqd->cp_mqd_base_addr = ring->mqd_gpu_addr & 0xfffffffc;
- mqd->cp_mqd_base_addr_hi = upper_32_bits(ring->mqd_gpu_addr);
+ mqd->cp_mqd_base_addr = prop->mqd_gpu_addr & 0xfffffffc;
+ mqd->cp_mqd_base_addr_hi = upper_32_bits(prop->mqd_gpu_addr);
/* set up mqd control */
tmp = RREG32_SOC15(GC, 0, mmCP_GFX_MQD_CONTROL);
@@ -6604,23 +6605,23 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
mqd->cp_gfx_hqd_quantum = tmp;
/* set up gfx hqd base. this is similar as CP_RB_BASE */
- hqd_gpu_addr = ring->gpu_addr >> 8;
+ hqd_gpu_addr = prop->hqd_base_gpu_addr >> 8;
mqd->cp_gfx_hqd_base = hqd_gpu_addr;
mqd->cp_gfx_hqd_base_hi = upper_32_bits(hqd_gpu_addr);
/* set up hqd_rptr_addr/_hi, similar as CP_RB_RPTR */
- wb_gpu_addr = ring->rptr_gpu_addr;
+ wb_gpu_addr = prop->rptr_gpu_addr;
mqd->cp_gfx_hqd_rptr_addr = wb_gpu_addr & 0xfffffffc;
mqd->cp_gfx_hqd_rptr_addr_hi =
upper_32_bits(wb_gpu_addr) & 0xffff;
/* set up rb_wptr_poll addr */
- wb_gpu_addr = ring->wptr_gpu_addr;
+ wb_gpu_addr = prop->wptr_gpu_addr;
mqd->cp_rb_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_rb_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
/* set up the gfx_hqd_control, similar as CP_RB0_CNTL */
- rb_bufsz = order_base_2(ring->ring_size / 4) - 1;
+ rb_bufsz = order_base_2(prop->queue_size / 4) - 1;
tmp = RREG32_SOC15(GC, 0, mmCP_GFX_HQD_CNTL);
tmp = REG_SET_FIELD(tmp, CP_GFX_HQD_CNTL, RB_BUFSZ, rb_bufsz);
tmp = REG_SET_FIELD(tmp, CP_GFX_HQD_CNTL, RB_BLKSZ, rb_bufsz - 2);
@@ -6631,9 +6632,9 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
/* set up cp_doorbell_control */
tmp = RREG32_SOC15(GC, 0, mmCP_RB_DOORBELL_CONTROL);
- if (ring->use_doorbell) {
+ if (prop->use_doorbell) {
tmp = REG_SET_FIELD(tmp, CP_RB_DOORBELL_CONTROL,
- DOORBELL_OFFSET, ring->doorbell_index);
+ DOORBELL_OFFSET, prop->doorbell_index);
tmp = REG_SET_FIELD(tmp, CP_RB_DOORBELL_CONTROL,
DOORBELL_EN, 1);
} else
@@ -6641,13 +6642,7 @@ static int gfx_v10_0_gfx_mqd_init(struct amdgpu_ring *ring)
DOORBELL_EN, 0);
mqd->cp_rb_doorbell_control = tmp;
- /*if there are 2 gfx rings, set the lower doorbell range of the first ring,
- *otherwise the range of the second ring will override the first ring */
- if (ring->doorbell_index == adev->doorbell_index.gfx_ring0 << 1)
- gfx_v10_0_cp_gfx_set_doorbell(adev, ring);
-
/* reset read and write pointers, similar to CP_RB0_WPTR/_RPTR */
- ring->wptr = 0;
mqd->cp_gfx_hqd_rptr = RREG32_SOC15(GC, 0, mmCP_GFX_HQD_RPTR);
/* active the queue */
@@ -6715,7 +6710,16 @@ static int gfx_v10_0_gfx_init_queue(struct amdgpu_ring *ring)
memset((void *)mqd, 0, sizeof(*mqd));
mutex_lock(&adev->srbm_mutex);
nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
- gfx_v10_0_gfx_mqd_init(ring);
+ amdgpu_ring_init_mqd(ring);
+
+ /*
+ * if there are 2 gfx rings, set the lower doorbell
+ * range of the first ring, otherwise the range of
+ * the second ring will override the first ring
+ */
+ if (ring->doorbell_index == adev->doorbell_index.gfx_ring0 << 1)
+ gfx_v10_0_cp_gfx_set_doorbell(adev, ring);
+
#ifdef BRING_UP_DEBUG
gfx_v10_0_gfx_queue_init_register(ring);
#endif
@@ -6808,23 +6812,10 @@ static int gfx_v10_0_cp_async_gfx_ring_resume(struct amdgpu_device *adev)
return r;
}
-static void gfx_v10_0_compute_mqd_set_priority(struct amdgpu_ring *ring, struct v10_compute_mqd *mqd)
+static int gfx_v10_0_compute_mqd_init(struct amdgpu_device *adev, void *m,
+ struct amdgpu_mqd_prop *prop)
{
- struct amdgpu_device *adev = ring->adev;
-
- if (ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE) {
- if (amdgpu_gfx_is_high_priority_compute_queue(adev, ring)) {
- mqd->cp_hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_HIGH;
- mqd->cp_hqd_queue_priority =
- AMDGPU_GFX_QUEUE_PRIORITY_MAXIMUM;
- }
- }
-}
-
-static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
-{
- struct amdgpu_device *adev = ring->adev;
- struct v10_compute_mqd *mqd = ring->mqd_ptr;
+ struct v10_compute_mqd *mqd = m;
uint64_t hqd_gpu_addr, wb_gpu_addr, eop_base_addr;
uint32_t tmp;
@@ -6836,7 +6827,7 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
mqd->compute_static_thread_mgmt_se3 = 0xffffffff;
mqd->compute_misc_reserved = 0x00000003;
- eop_base_addr = ring->eop_gpu_addr >> 8;
+ eop_base_addr = prop->eop_gpu_addr >> 8;
mqd->cp_hqd_eop_base_addr_lo = eop_base_addr;
mqd->cp_hqd_eop_base_addr_hi = upper_32_bits(eop_base_addr);
@@ -6850,9 +6841,9 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
/* enable doorbell? */
tmp = RREG32_SOC15(GC, 0, mmCP_HQD_PQ_DOORBELL_CONTROL);
- if (ring->use_doorbell) {
+ if (prop->use_doorbell) {
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
- DOORBELL_OFFSET, ring->doorbell_index);
+ DOORBELL_OFFSET, prop->doorbell_index);
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
DOORBELL_EN, 1);
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
@@ -6867,15 +6858,14 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
mqd->cp_hqd_pq_doorbell_control = tmp;
/* disable the queue if it's active */
- ring->wptr = 0;
mqd->cp_hqd_dequeue_request = 0;
mqd->cp_hqd_pq_rptr = 0;
mqd->cp_hqd_pq_wptr_lo = 0;
mqd->cp_hqd_pq_wptr_hi = 0;
/* set the pointer to the MQD */
- mqd->cp_mqd_base_addr_lo = ring->mqd_gpu_addr & 0xfffffffc;
- mqd->cp_mqd_base_addr_hi = upper_32_bits(ring->mqd_gpu_addr);
+ mqd->cp_mqd_base_addr_lo = prop->mqd_gpu_addr & 0xfffffffc;
+ mqd->cp_mqd_base_addr_hi = upper_32_bits(prop->mqd_gpu_addr);
/* set MQD vmid to 0 */
tmp = RREG32_SOC15(GC, 0, mmCP_MQD_CONTROL);
@@ -6883,14 +6873,14 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
mqd->cp_mqd_control = tmp;
/* set the pointer to the HQD, this is similar CP_RB0_BASE/_HI */
- hqd_gpu_addr = ring->gpu_addr >> 8;
+ hqd_gpu_addr = prop->hqd_base_gpu_addr >> 8;
mqd->cp_hqd_pq_base_lo = hqd_gpu_addr;
mqd->cp_hqd_pq_base_hi = upper_32_bits(hqd_gpu_addr);
/* set up the HQD, this is similar to CP_RB0_CNTL */
tmp = RREG32_SOC15(GC, 0, mmCP_HQD_PQ_CONTROL);
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, QUEUE_SIZE,
- (order_base_2(ring->ring_size / 4) - 1));
+ (order_base_2(prop->queue_size / 4) - 1));
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, RPTR_BLOCK_SIZE,
((order_base_2(AMDGPU_GPU_PAGE_SIZE / 4) - 1) << 8));
#ifdef __BIG_ENDIAN
@@ -6903,22 +6893,22 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
mqd->cp_hqd_pq_control = tmp;
/* set the wb address whether it's enabled or not */
- wb_gpu_addr = ring->rptr_gpu_addr;
+ wb_gpu_addr = prop->rptr_gpu_addr;
mqd->cp_hqd_pq_rptr_report_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_rptr_report_addr_hi =
upper_32_bits(wb_gpu_addr) & 0xffff;
/* only used if CP_PQ_WPTR_POLL_CNTL.CP_PQ_WPTR_POLL_CNTL__EN_MASK=1 */
- wb_gpu_addr = ring->wptr_gpu_addr;
+ wb_gpu_addr = prop->wptr_gpu_addr;
mqd->cp_hqd_pq_wptr_poll_addr_lo = wb_gpu_addr & 0xfffffffc;
mqd->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr) & 0xffff;
tmp = 0;
/* enable the doorbell if requested */
- if (ring->use_doorbell) {
+ if (prop->use_doorbell) {
tmp = RREG32_SOC15(GC, 0, mmCP_HQD_PQ_DOORBELL_CONTROL);
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
- DOORBELL_OFFSET, ring->doorbell_index);
+ DOORBELL_OFFSET, prop->doorbell_index);
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_DOORBELL_CONTROL,
DOORBELL_EN, 1);
@@ -6931,7 +6921,6 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
mqd->cp_hqd_pq_doorbell_control = tmp;
/* reset read and write pointers, similar to CP_RB0_WPTR/_RPTR */
- ring->wptr = 0;
mqd->cp_hqd_pq_rptr = RREG32_SOC15(GC, 0, mmCP_HQD_PQ_RPTR);
/* set the vmid for the queue */
@@ -6947,13 +6936,10 @@ static int gfx_v10_0_compute_mqd_init(struct amdgpu_ring *ring)
mqd->cp_hqd_ib_control = tmp;
/* set static priority for a compute queue/ring */
- gfx_v10_0_compute_mqd_set_priority(ring, mqd);
+ mqd->cp_hqd_pipe_priority = prop->hqd_pipe_priority;
+ mqd->cp_hqd_queue_priority = prop->hqd_queue_priority;
- /* map_queues packet doesn't need activate the queue,
- * so only kiq need set this field.
- */
- if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
- mqd->cp_hqd_active = 1;
+ mqd->cp_hqd_active = prop->hqd_active;
return 0;
}
@@ -7094,7 +7080,7 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring *ring)
memset((void *)mqd, 0, sizeof(*mqd));
mutex_lock(&adev->srbm_mutex);
nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
- gfx_v10_0_compute_mqd_init(ring);
+ amdgpu_ring_init_mqd(ring);
gfx_v10_0_kiq_init_register(ring);
nv_grbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
@@ -7116,7 +7102,7 @@ static int gfx_v10_0_kcq_init_queue(struct amdgpu_ring *ring)
memset((void *)mqd, 0, sizeof(*mqd));
mutex_lock(&adev->srbm_mutex);
nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
- gfx_v10_0_compute_mqd_init(ring);
+ amdgpu_ring_init_mqd(ring);
nv_grbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
@@ -7799,6 +7785,7 @@ static int gfx_v10_0_early_init(void *handle)
gfx_v10_0_set_irq_funcs(adev);
gfx_v10_0_set_gds_init(adev);
gfx_v10_0_set_rlc_funcs(adev);
+ gfx_v10_0_set_mqd_funcs(adev);
/* init rlcg reg access ctrl */
gfx_v10_0_init_rlcg_reg_access_ctrl(adev);
@@ -9581,6 +9568,20 @@ static void gfx_v10_0_set_gds_init(struct amdgpu_device *adev)
adev->gds.oa_size = 16;
}
+static void gfx_v10_0_set_mqd_funcs(struct amdgpu_device *adev)
+{
+ /* set gfx eng mqd */
+ adev->mqds[AMDGPU_HW_IP_GFX].mqd_size =
+ sizeof(struct v10_gfx_mqd);
+ adev->mqds[AMDGPU_HW_IP_GFX].init_mqd =
+ gfx_v10_0_gfx_mqd_init;
+ /* set compute eng mqd */
+ adev->mqds[AMDGPU_HW_IP_COMPUTE].mqd_size =
+ sizeof(struct v10_compute_mqd);
+ adev->mqds[AMDGPU_HW_IP_COMPUTE].init_mqd =
+ gfx_v10_0_compute_mqd_init;
+}
+
static void gfx_v10_0_set_user_wgp_inactive_bitmap_per_sh(struct amdgpu_device *adev,
u32 bitmap)
{
--
2.35.1
* [PATCH 10/73] drm/amdgpu/gfx10: use per ctx CSA for ce metadata
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (8 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 09/73] drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 11/73] drm/amdgpu/gfx10: use per ctx CSA for de metadata Alex Deucher
` (62 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
As MES requires per-context preemption, use the per-context CSA address
for CE metadata to correctly enable per-context MCBP preemption.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 28 +++++++++++++++++---------
1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index cc70594d7e4c..56a7153474c6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -8849,26 +8849,36 @@ static void gfx_v10_0_ring_emit_ce_meta(struct amdgpu_ring *ring, bool resume)
{
struct amdgpu_device *adev = ring->adev;
struct v10_ce_ib_state ce_payload = {0};
- uint64_t csa_addr;
+ uint64_t offset, ce_payload_gpu_addr;
+ void *ce_payload_cpu_addr;
int cnt;
cnt = (sizeof(ce_payload) >> 2) + 4 - 2;
- csa_addr = amdgpu_csa_vaddr(ring->adev);
+
+ if (ring->is_mes_queue) {
+ offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+ gfx[0].gfx_meta_data) +
+ offsetof(struct v10_gfx_meta_data, ce_payload);
+ ce_payload_gpu_addr =
+ amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ ce_payload_cpu_addr =
+ amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+ } else {
+ offset = offsetof(struct v10_gfx_meta_data, ce_payload);
+ ce_payload_gpu_addr = amdgpu_csa_vaddr(ring->adev) + offset;
+ ce_payload_cpu_addr = adev->virt.csa_cpu_addr + offset;
+ }
amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, cnt));
amdgpu_ring_write(ring, (WRITE_DATA_ENGINE_SEL(2) |
WRITE_DATA_DST_SEL(8) |
WR_CONFIRM) |
WRITE_DATA_CACHE_POLICY(0));
- amdgpu_ring_write(ring, lower_32_bits(csa_addr +
- offsetof(struct v10_gfx_meta_data, ce_payload)));
- amdgpu_ring_write(ring, upper_32_bits(csa_addr +
- offsetof(struct v10_gfx_meta_data, ce_payload)));
+ amdgpu_ring_write(ring, lower_32_bits(ce_payload_gpu_addr));
+ amdgpu_ring_write(ring, upper_32_bits(ce_payload_gpu_addr));
if (resume)
- amdgpu_ring_write_multiple(ring, adev->virt.csa_cpu_addr +
- offsetof(struct v10_gfx_meta_data,
- ce_payload),
+ amdgpu_ring_write_multiple(ring, ce_payload_cpu_addr,
sizeof(ce_payload) >> 2);
else
amdgpu_ring_write_multiple(ring, (void *)&ce_payload,
--
2.35.1
* [PATCH 11/73] drm/amdgpu/gfx10: use per ctx CSA for de metadata
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (9 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 10/73] drm/amdgpu/gfx10: use per ctx CSA for ce metadata Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 12/73] drm/amdgpu/gfx10: associate mes queue id with fence v2 Alex Deucher
` (61 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
As MES requires per-context preemption, use the per-context CSA address
for DE metadata so that MCBP context preemption is enabled correctly.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 39 ++++++++++++++++++--------
1 file changed, 28 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 56a7153474c6..d06807355f5f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -8889,12 +8889,33 @@ static void gfx_v10_0_ring_emit_de_meta(struct amdgpu_ring *ring, bool resume)
{
struct amdgpu_device *adev = ring->adev;
struct v10_de_ib_state de_payload = {0};
- uint64_t csa_addr, gds_addr;
+ uint64_t offset, gds_addr, de_payload_gpu_addr;
+ void *de_payload_cpu_addr;
int cnt;
- csa_addr = amdgpu_csa_vaddr(ring->adev);
- gds_addr = ALIGN(csa_addr + AMDGPU_CSA_SIZE - adev->gds.gds_size,
- PAGE_SIZE);
+ if (ring->is_mes_queue) {
+ offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+ gfx[0].gfx_meta_data) +
+ offsetof(struct v10_gfx_meta_data, de_payload);
+ de_payload_gpu_addr =
+ amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ de_payload_cpu_addr =
+ amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+
+ offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+ gfx[0].gds_backup) +
+ offsetof(struct v10_gfx_meta_data, de_payload);
+ gds_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ } else {
+ offset = offsetof(struct v10_gfx_meta_data, de_payload);
+ de_payload_gpu_addr = amdgpu_csa_vaddr(ring->adev) + offset;
+ de_payload_cpu_addr = adev->virt.csa_cpu_addr + offset;
+
+ gds_addr = ALIGN(amdgpu_csa_vaddr(ring->adev) +
+ AMDGPU_CSA_SIZE - adev->gds.gds_size,
+ PAGE_SIZE);
+ }
+
de_payload.gds_backup_addrlo = lower_32_bits(gds_addr);
de_payload.gds_backup_addrhi = upper_32_bits(gds_addr);
@@ -8904,15 +8925,11 @@ static void gfx_v10_0_ring_emit_de_meta(struct amdgpu_ring *ring, bool resume)
WRITE_DATA_DST_SEL(8) |
WR_CONFIRM) |
WRITE_DATA_CACHE_POLICY(0));
- amdgpu_ring_write(ring, lower_32_bits(csa_addr +
- offsetof(struct v10_gfx_meta_data, de_payload)));
- amdgpu_ring_write(ring, upper_32_bits(csa_addr +
- offsetof(struct v10_gfx_meta_data, de_payload)));
+ amdgpu_ring_write(ring, lower_32_bits(de_payload_gpu_addr));
+ amdgpu_ring_write(ring, upper_32_bits(de_payload_gpu_addr));
if (resume)
- amdgpu_ring_write_multiple(ring, adev->virt.csa_cpu_addr +
- offsetof(struct v10_gfx_meta_data,
- de_payload),
+ amdgpu_ring_write_multiple(ring, de_payload_cpu_addr,
sizeof(de_payload) >> 2);
else
amdgpu_ring_write_multiple(ring, (void *)&de_payload,
--
2.35.1
* [PATCH 12/73] drm/amdgpu/gfx10: associate mes queue id with fence v2
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (10 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 11/73] drm/amdgpu/gfx10: use per ctx CSA for de metadata Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 13/73] drm/amdgpu/gfx10: inherit vmid from mqd Alex Deucher
` (60 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Associate the mes queue id with the fence, so that the EOP trap handler
can look up which queue issued the fence.
v2: move mes queue flag to amdgpu_mes_ctx.h
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h | 2 ++
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 ++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
index f3e1ba1a889f..544f1aa86edf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
@@ -115,4 +115,6 @@ struct amdgpu_mes_ctx_data {
+#define AMDGPU_FENCE_MES_QUEUE_FLAG 0x1000000u
+#define AMDGPU_FENCE_MES_QUEUE_ID_MASK (AMDGPU_FENCE_MES_QUEUE_FLAG - 1)
+
#endif
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index d06807355f5f..e6e601296097 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -8678,7 +8678,8 @@ static void gfx_v10_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
amdgpu_ring_write(ring, upper_32_bits(addr));
amdgpu_ring_write(ring, lower_32_bits(seq));
amdgpu_ring_write(ring, upper_32_bits(seq));
- amdgpu_ring_write(ring, 0);
+ amdgpu_ring_write(ring, ring->is_mes_queue ?
+ (ring->hw_queue_id | AMDGPU_FENCE_MES_QUEUE_FLAG) : 0);
}
static void gfx_v10_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)
--
2.35.1
* [PATCH 13/73] drm/amdgpu/gfx10: inherit vmid from mqd
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (11 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 12/73] drm/amdgpu/gfx10: associate mes queue id with fence v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 14/73] drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2 Alex Deucher
` (59 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Since MES manages vmid assignment, let the vmid be inherited from the
mqd instead of being set in the IB packet.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index e6e601296097..0d91632f563d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -8602,6 +8602,10 @@ static void gfx_v10_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
(!amdgpu_sriov_vf(ring->adev) && flags & AMDGPU_IB_PREEMPTED) ? true : false);
}
+ if (ring->is_mes_queue)
+ /* inherit vmid from mqd */
+ control |= 0x400000;
+
amdgpu_ring_write(ring, header);
BUG_ON(ib->gpu_addr & 0x3); /* Dword align */
amdgpu_ring_write(ring,
@@ -8621,6 +8625,10 @@ static void gfx_v10_0_ring_emit_ib_compute(struct amdgpu_ring *ring,
unsigned vmid = AMDGPU_JOB_GET_VMID(job);
u32 control = INDIRECT_BUFFER_VALID | ib->length_dw | (vmid << 24);
+ if (ring->is_mes_queue)
+ /* inherit vmid from mqd */
+ control |= 0x40000000;
+
/* Currently, there is a high possibility to get wave ID mismatch
* between ME and GDS, leading to a hw deadlock, because ME generates
* different wave IDs than the GDS expects. This situation happens
--
2.35.1
* [PATCH 14/73] drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (12 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 13/73] drm/amdgpu/gfx10: inherit vmid from mqd Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 15/73] drm/amdgpu/gmc10: skip emitting pasid mapping packet Alex Deucher
` (58 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Le Ma, Jack Xiao, Christian König,
Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
For a MES queue VM flush, use INVALIDATE_TLBS to invalidate TLBs.
This packet lets the CP firmware determine the current vmid and the
invalidation engine to use.
v2: unify invalidate_tlbs functions
Cc: Le Ma <le.ma@amd.com>
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 27 +++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0d91632f563d..2ab5259c7305 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3503,6 +3503,9 @@ static void gfx_v10_0_ring_emit_frame_cntl(struct amdgpu_ring *ring, bool start,
static u32 gfx_v10_3_get_disabled_sa(struct amdgpu_device *adev);
static void gfx_v10_3_program_pbb_mode(struct amdgpu_device *adev);
static void gfx_v10_3_set_power_brake_sequence(struct amdgpu_device *adev);
+static void gfx_v10_0_ring_invalidate_tlbs(struct amdgpu_ring *ring,
+ uint16_t pasid, uint32_t flush_type,
+ bool all_hub, uint8_t dst_sel);
static void gfx10_kiq_set_resources(struct amdgpu_ring *kiq_ring, uint64_t queue_mask)
{
@@ -3595,12 +3598,7 @@ static void gfx10_kiq_invalidate_tlbs(struct amdgpu_ring *kiq_ring,
uint16_t pasid, uint32_t flush_type,
bool all_hub)
{
- amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_INVALIDATE_TLBS, 0));
- amdgpu_ring_write(kiq_ring,
- PACKET3_INVALIDATE_TLBS_DST_SEL(1) |
- PACKET3_INVALIDATE_TLBS_ALL_HUB(all_hub) |
- PACKET3_INVALIDATE_TLBS_PASID(pasid) |
- PACKET3_INVALIDATE_TLBS_FLUSH_TYPE(flush_type));
+ gfx_v10_0_ring_invalidate_tlbs(kiq_ring, pasid, flush_type, all_hub, 1);
}
static const struct kiq_pm4_funcs gfx_v10_0_kiq_pm4_funcs = {
@@ -8700,10 +8698,25 @@ static void gfx_v10_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring)
upper_32_bits(addr), seq, 0xffffffff, 4);
}
+static void gfx_v10_0_ring_invalidate_tlbs(struct amdgpu_ring *ring,
+ uint16_t pasid, uint32_t flush_type,
+ bool all_hub, uint8_t dst_sel)
+{
+ amdgpu_ring_write(ring, PACKET3(PACKET3_INVALIDATE_TLBS, 0));
+ amdgpu_ring_write(ring,
+ PACKET3_INVALIDATE_TLBS_DST_SEL(dst_sel) |
+ PACKET3_INVALIDATE_TLBS_ALL_HUB(all_hub) |
+ PACKET3_INVALIDATE_TLBS_PASID(pasid) |
+ PACKET3_INVALIDATE_TLBS_FLUSH_TYPE(flush_type));
+}
+
static void gfx_v10_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
unsigned vmid, uint64_t pd_addr)
{
- amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
+ if (ring->is_mes_queue)
+ gfx_v10_0_ring_invalidate_tlbs(ring, 0, 0, false, 0);
+ else
+ amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
/* compute doesn't have PFP */
if (ring->funcs->type == AMDGPU_RING_TYPE_GFX) {
--
2.35.1
* [PATCH 15/73] drm/amdgpu/gmc10: skip emitting pasid mapping packet
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (13 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 14/73] drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 16/73] drm/amdgpu: use the whole doorbell space for mes Alex Deucher
` (57 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Since the MES firmware manages IH_VMID_x_LUT updates, skip emitting
the pasid mapping packet.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 487c33937a87..9b4a035a5bf1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -517,6 +517,10 @@ static void gmc_v10_0_emit_pasid_mapping(struct amdgpu_ring *ring, unsigned vmid
struct amdgpu_device *adev = ring->adev;
uint32_t reg;
+ /* MES fw manages IH_VMID_x_LUT updating */
+ if (ring->is_mes_queue)
+ return;
+
if (ring->funcs->vmhub == AMDGPU_GFXHUB_0)
reg = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_VMID_0_LUT) + vmid;
else
--
2.35.1
* [PATCH 16/73] drm/amdgpu: use the whole doorbell space for mes
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (14 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 15/73] drm/amdgpu/gmc10: skip emitting pasid mapping packet Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 17/73] drm/amdgpu: update mes process/gang/queue definitions Alex Deucher
` (56 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Use the whole doorbell space for mes. Each queue in a process occupies
one doorbell slot, which is rung on queue submission.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 32 +++++++++++++---------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f33e3c341f8f..b9844249d464 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1044,19 +1044,25 @@ static int amdgpu_device_doorbell_init(struct amdgpu_device *adev)
adev->doorbell.base = pci_resource_start(adev->pdev, 2);
adev->doorbell.size = pci_resource_len(adev->pdev, 2);
- adev->doorbell.num_doorbells = min_t(u32, adev->doorbell.size / sizeof(u32),
- adev->doorbell_index.max_assignment+1);
- if (adev->doorbell.num_doorbells == 0)
- return -EINVAL;
-
- /* For Vega, reserve and map two pages on doorbell BAR since SDMA
- * paging queue doorbell use the second page. The
- * AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the
- * doorbells are in the first page. So with paging queue enabled,
- * the max num_doorbells should + 1 page (0x400 in dword)
- */
- if (adev->asic_type >= CHIP_VEGA10)
- adev->doorbell.num_doorbells += 0x400;
+ if (adev->enable_mes) {
+ adev->doorbell.num_doorbells =
+ adev->doorbell.size / sizeof(u32);
+ } else {
+ adev->doorbell.num_doorbells =
+ min_t(u32, adev->doorbell.size / sizeof(u32),
+ adev->doorbell_index.max_assignment+1);
+ if (adev->doorbell.num_doorbells == 0)
+ return -EINVAL;
+
+ /* For Vega, reserve and map two pages on doorbell BAR since SDMA
+ * paging queue doorbell use the second page. The
+ * AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the
+ * doorbells are in the first page. So with paging queue enabled,
+ * the max num_doorbells should + 1 page (0x400 in dword)
+ */
+ if (adev->asic_type >= CHIP_VEGA10)
+ adev->doorbell.num_doorbells += 0x400;
+ }
adev->doorbell.ptr = ioremap(adev->doorbell.base,
adev->doorbell.num_doorbells *
--
2.35.1
* [PATCH 17/73] drm/amdgpu: update mes process/gang/queue definitions
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (15 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 16/73] drm/amdgpu: use the whole doorbell space for mes Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 18/73] drm/amdgpu: add mes_kiq module parameter v2 Alex Deucher
` (55 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Update the definitions of MES process/gang/queue.
v2: add missing includes
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 58 +++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 7334982ea702..52483d7ce843 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -24,6 +24,10 @@
#ifndef __AMDGPU_MES_H__
#define __AMDGPU_MES_H__
+#include "amdgpu_irq.h"
+#include "kgd_kfd_interface.h"
+#include "amdgpu_gfx.h"
+
#define AMDGPU_MES_MAX_COMPUTE_PIPES 8
#define AMDGPU_MES_MAX_GFX_PIPES 2
#define AMDGPU_MES_MAX_SDMA_PIPES 2
@@ -37,11 +41,23 @@ enum amdgpu_mes_priority_level {
AMDGPU_MES_PRIORITY_NUM_LEVELS
};
+#define AMDGPU_MES_PROC_CTX_SIZE 0x1000 /* one page area */
+#define AMDGPU_MES_GANG_CTX_SIZE 0x1000 /* one page area */
+
struct amdgpu_mes_funcs;
struct amdgpu_mes {
struct amdgpu_device *adev;
+ struct mutex mutex;
+
+ struct idr pasid_idr;
+ struct idr gang_id_idr;
+ struct idr queue_id_idr;
+ struct ida doorbell_ida;
+
+ spinlock_t queue_id_lock;
+
uint32_t total_max_queue;
uint32_t doorbell_id_offset;
uint32_t max_doorbell_slices;
@@ -90,6 +106,48 @@ struct amdgpu_mes {
const struct amdgpu_mes_funcs *funcs;
};
+struct amdgpu_mes_process {
+ int pasid;
+ struct amdgpu_vm *vm;
+ uint64_t pd_gpu_addr;
+ struct amdgpu_bo *proc_ctx_bo;
+ uint64_t proc_ctx_gpu_addr;
+ void *proc_ctx_cpu_ptr;
+ uint64_t process_quantum;
+ struct list_head gang_list;
+ uint32_t doorbell_index;
+ unsigned long *doorbell_bitmap;
+ struct mutex doorbell_lock;
+};
+
+struct amdgpu_mes_gang {
+ int gang_id;
+ int priority;
+ int inprocess_gang_priority;
+ int global_priority_level;
+ struct list_head list;
+ struct amdgpu_mes_process *process;
+ struct amdgpu_bo *gang_ctx_bo;
+ uint64_t gang_ctx_gpu_addr;
+ void *gang_ctx_cpu_ptr;
+ uint64_t gang_quantum;
+ struct list_head queue_list;
+};
+
+struct amdgpu_mes_queue {
+ struct list_head list;
+ struct amdgpu_mes_gang *gang;
+ int queue_id;
+ uint64_t doorbell_off;
+ struct amdgpu_bo *mqd_obj;
+ void *mqd_cpu_ptr;
+ uint64_t mqd_gpu_addr;
+ uint64_t wptr_gpu_addr;
+ int queue_type;
+ int paging;
+ struct amdgpu_ring *ring;
+};
+
struct mes_add_queue_input {
uint32_t process_id;
uint64_t page_table_base_addr;
--
2.35.1
* [PATCH 18/73] drm/amdgpu: add mes_kiq module parameter v2
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (16 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 17/73] drm/amdgpu: update mes process/gang/queue definitions Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 19/73] drm/amdgpu: allocate doorbell index for mes kiq Alex Deucher
` (54 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
The mes_kiq parameter is used to enable the mes kiq pipe. This module
parameter will be unnecessary, or enabled by default, in the final
version.
v2: reword commit message.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++++--
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 ++++++++++
3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 24bce7e691a8..4264abc5604d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -208,6 +208,7 @@ extern int amdgpu_async_gfx_ring;
extern int amdgpu_mcbp;
extern int amdgpu_discovery;
extern int amdgpu_mes;
+extern int amdgpu_mes_kiq;
extern int amdgpu_noretry;
extern int amdgpu_force_asic_type;
extern int amdgpu_smartshift_bias;
@@ -940,6 +941,7 @@ struct amdgpu_device {
/* mes */
bool enable_mes;
+ bool enable_mes_kiq;
struct amdgpu_mes mes;
struct amdgpu_mqd mqds[AMDGPU_HW_IP_NUM];
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b9844249d464..b2366d0d3047 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3669,8 +3669,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (amdgpu_mcbp)
DRM_INFO("MCBP is enabled\n");
- if (amdgpu_mes && adev->asic_type >= CHIP_NAVI10)
- adev->enable_mes = true;
+ if (adev->asic_type >= CHIP_NAVI10) {
+ if (amdgpu_mes || amdgpu_mes_kiq)
+ adev->enable_mes = true;
+
+ if (amdgpu_mes_kiq)
+ adev->enable_mes_kiq = true;
+ }
/*
* Reset domain needs to be present early, before XGMI hive discovered
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 80690cbac89f..9e72b3ec5d4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -171,6 +171,7 @@ int amdgpu_async_gfx_ring = 1;
int amdgpu_mcbp;
int amdgpu_discovery = -1;
int amdgpu_mes;
+int amdgpu_mes_kiq;
int amdgpu_noretry = -1;
int amdgpu_force_asic_type = -1;
int amdgpu_tmz = -1; /* auto */
@@ -636,6 +637,15 @@ MODULE_PARM_DESC(mes,
"Enable Micro Engine Scheduler (0 = disabled (default), 1 = enabled)");
module_param_named(mes, amdgpu_mes, int, 0444);
+/**
+ * DOC: mes_kiq (int)
+ * Enable Micro Engine Scheduler KIQ. This is a new engine pipe for kiq.
+ * (0 = disabled (default), 1 = enabled)
+ */
+MODULE_PARM_DESC(mes_kiq,
+ "Enable Micro Engine Scheduler KIQ (0 = disabled (default), 1 = enabled)");
+module_param_named(mes_kiq, amdgpu_mes_kiq, int, 0444);
+
/**
* DOC: noretry (int)
* Disable XNACK retry in the SQ by default on GFXv9 hardware. On ASICs that
--
2.35.1
* [PATCH 19/73] drm/amdgpu: allocate doorbell index for mes kiq
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (17 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 18/73] drm/amdgpu: add mes_kiq module parameter v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 20/73] drm/amdgpu/mes: extend mes framework to support multiple mes pipes Alex Deucher
` (53 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Allocate a doorbell index for the mes kiq queue.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h | 6 ++++--
drivers/gpu/drm/amd/amdgpu/nv.c | 3 ++-
drivers/gpu/drm/amd/amdgpu/soc21.c | 3 ++-
3 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
index 89e6ad30396f..2d9485e67125 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
@@ -53,7 +53,8 @@ struct amdgpu_doorbell_index {
uint32_t gfx_ring0;
uint32_t gfx_ring1;
uint32_t sdma_engine[8];
- uint32_t mes_ring;
+ uint32_t mes_ring0;
+ uint32_t mes_ring1;
uint32_t ih;
union {
struct {
@@ -178,7 +179,8 @@ typedef enum _AMDGPU_NAVI10_DOORBELL_ASSIGNMENT
AMDGPU_NAVI10_DOORBELL_USERQUEUE_END = 0x08A,
AMDGPU_NAVI10_DOORBELL_GFX_RING0 = 0x08B,
AMDGPU_NAVI10_DOORBELL_GFX_RING1 = 0x08C,
- AMDGPU_NAVI10_DOORBELL_MES_RING = 0x090,
+ AMDGPU_NAVI10_DOORBELL_MES_RING0 = 0x090,
+ AMDGPU_NAVI10_DOORBELL_MES_RING1 = 0x091,
/* SDMA:256~335*/
AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE0 = 0x100,
AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE1 = 0x10A,
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 0a7946c59a42..8cf1a7f8a632 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -607,7 +607,8 @@ static void nv_init_doorbell_index(struct amdgpu_device *adev)
adev->doorbell_index.userqueue_end = AMDGPU_NAVI10_DOORBELL_USERQUEUE_END;
adev->doorbell_index.gfx_ring0 = AMDGPU_NAVI10_DOORBELL_GFX_RING0;
adev->doorbell_index.gfx_ring1 = AMDGPU_NAVI10_DOORBELL_GFX_RING1;
- adev->doorbell_index.mes_ring = AMDGPU_NAVI10_DOORBELL_MES_RING;
+ adev->doorbell_index.mes_ring0 = AMDGPU_NAVI10_DOORBELL_MES_RING0;
+ adev->doorbell_index.mes_ring1 = AMDGPU_NAVI10_DOORBELL_MES_RING1;
adev->doorbell_index.sdma_engine[0] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE0;
adev->doorbell_index.sdma_engine[1] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE1;
adev->doorbell_index.sdma_engine[2] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE2;
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c
index c3069f5e299a..68985a59a6a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -409,7 +409,8 @@ static void soc21_init_doorbell_index(struct amdgpu_device *adev)
adev->doorbell_index.userqueue_end = AMDGPU_NAVI10_DOORBELL_USERQUEUE_END;
adev->doorbell_index.gfx_ring0 = AMDGPU_NAVI10_DOORBELL_GFX_RING0;
adev->doorbell_index.gfx_ring1 = AMDGPU_NAVI10_DOORBELL_GFX_RING1;
- adev->doorbell_index.mes_ring = AMDGPU_NAVI10_DOORBELL_MES_RING;
+ adev->doorbell_index.mes_ring0 = AMDGPU_NAVI10_DOORBELL_MES_RING0;
+ adev->doorbell_index.mes_ring1 = AMDGPU_NAVI10_DOORBELL_MES_RING1;
adev->doorbell_index.sdma_engine[0] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE0;
adev->doorbell_index.sdma_engine[1] = AMDGPU_NAVI10_DOORBELL_sDMA_ENGINE1;
adev->doorbell_index.ih = AMDGPU_NAVI10_DOORBELL_IH;
--
2.35.1
* [PATCH 20/73] drm/amdgpu/mes: extend mes framework to support multiple mes pipes
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (18 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 19/73] drm/amdgpu: allocate doorbell index for mes kiq Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 21/73] drm/amdgpu/gfx10: add mes queue fence handling Alex Deucher
` (52 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add support for multiple mes pipes, so that the existing code can be
reused to initialize additional mes pipes and queues.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 35 ++--
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 215 ++++++++++++++----------
2 files changed, 149 insertions(+), 101 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 52483d7ce843..91b020842eb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -46,6 +46,12 @@ enum amdgpu_mes_priority_level {
struct amdgpu_mes_funcs;
+enum admgpu_mes_pipe {
+ AMDGPU_MES_SCHED_PIPE = 0,
+ AMDGPU_MES_KIQ_PIPE,
+ AMDGPU_MAX_MES_PIPES = 2,
+};
+
struct amdgpu_mes {
struct amdgpu_device *adev;
@@ -67,27 +73,28 @@ struct amdgpu_mes {
struct amdgpu_ring ring;
- const struct firmware *fw;
+ const struct firmware *fw[AMDGPU_MAX_MES_PIPES];
/* mes ucode */
- struct amdgpu_bo *ucode_fw_obj;
- uint64_t ucode_fw_gpu_addr;
- uint32_t *ucode_fw_ptr;
- uint32_t ucode_fw_version;
- uint64_t uc_start_addr;
+ struct amdgpu_bo *ucode_fw_obj[AMDGPU_MAX_MES_PIPES];
+ uint64_t ucode_fw_gpu_addr[AMDGPU_MAX_MES_PIPES];
+ uint32_t *ucode_fw_ptr[AMDGPU_MAX_MES_PIPES];
+ uint32_t ucode_fw_version[AMDGPU_MAX_MES_PIPES];
+ uint64_t uc_start_addr[AMDGPU_MAX_MES_PIPES];
/* mes ucode data */
- struct amdgpu_bo *data_fw_obj;
- uint64_t data_fw_gpu_addr;
- uint32_t *data_fw_ptr;
- uint32_t data_fw_version;
- uint64_t data_start_addr;
+ struct amdgpu_bo *data_fw_obj[AMDGPU_MAX_MES_PIPES];
+ uint64_t data_fw_gpu_addr[AMDGPU_MAX_MES_PIPES];
+ uint32_t *data_fw_ptr[AMDGPU_MAX_MES_PIPES];
+ uint32_t data_fw_version[AMDGPU_MAX_MES_PIPES];
+ uint64_t data_start_addr[AMDGPU_MAX_MES_PIPES];
/* eop gpu obj */
- struct amdgpu_bo *eop_gpu_obj;
- uint64_t eop_gpu_addr;
+ struct amdgpu_bo *eop_gpu_obj[AMDGPU_MAX_MES_PIPES];
+ uint64_t eop_gpu_addr[AMDGPU_MAX_MES_PIPES];
- void *mqd_backup;
+ void *mqd_backup[AMDGPU_MAX_MES_PIPES];
+ struct amdgpu_irq_src irq[AMDGPU_MAX_MES_PIPES];
uint32_t vmid_mask_gfxhub;
uint32_t vmid_mask_mmhub;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index 0819ffe8e759..f82a6f981629 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -269,7 +269,8 @@ static const struct amdgpu_mes_funcs mes_v10_1_funcs = {
.resume_gang = mes_v10_1_resume_gang,
};
-static int mes_v10_1_init_microcode(struct amdgpu_device *adev)
+static int mes_v10_1_init_microcode(struct amdgpu_device *adev,
+ enum admgpu_mes_pipe pipe)
{
const char *chip_name;
char fw_name[30];
@@ -288,40 +289,56 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev)
BUG();
}
- snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes.bin", chip_name);
- err = request_firmware(&adev->mes.fw, fw_name, adev->dev);
+ if (pipe == AMDGPU_MES_SCHED_PIPE)
+ snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes.bin",
+ chip_name);
+ else
+ BUG();
+
+ err = request_firmware(&adev->mes.fw[pipe], fw_name, adev->dev);
if (err)
return err;
- err = amdgpu_ucode_validate(adev->mes.fw);
+ err = amdgpu_ucode_validate(adev->mes.fw[pipe]);
if (err) {
- release_firmware(adev->mes.fw);
- adev->mes.fw = NULL;
+ release_firmware(adev->mes.fw[pipe]);
+ adev->mes.fw[pipe] = NULL;
return err;
}
- mes_hdr = (const struct mes_firmware_header_v1_0 *)adev->mes.fw->data;
- adev->mes.ucode_fw_version = le32_to_cpu(mes_hdr->mes_ucode_version);
- adev->mes.ucode_fw_version =
+ mes_hdr = (const struct mes_firmware_header_v1_0 *)
+ adev->mes.fw[pipe]->data;
+ adev->mes.ucode_fw_version[pipe] =
+ le32_to_cpu(mes_hdr->mes_ucode_version);
+ adev->mes.data_fw_version[pipe] =
le32_to_cpu(mes_hdr->mes_ucode_data_version);
- adev->mes.uc_start_addr =
+ adev->mes.uc_start_addr[pipe] =
le32_to_cpu(mes_hdr->mes_uc_start_addr_lo) |
((uint64_t)(le32_to_cpu(mes_hdr->mes_uc_start_addr_hi)) << 32);
- adev->mes.data_start_addr =
+ adev->mes.data_start_addr[pipe] =
le32_to_cpu(mes_hdr->mes_data_start_addr_lo) |
((uint64_t)(le32_to_cpu(mes_hdr->mes_data_start_addr_hi)) << 32);
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
- info = &adev->firmware.ucode[AMDGPU_UCODE_ID_CP_MES];
- info->ucode_id = AMDGPU_UCODE_ID_CP_MES;
- info->fw = adev->mes.fw;
+ int ucode, ucode_data;
+
+ if (pipe == AMDGPU_MES_SCHED_PIPE) {
+ ucode = AMDGPU_UCODE_ID_CP_MES;
+ ucode_data = AMDGPU_UCODE_ID_CP_MES_DATA;
+ } else {
+ BUG();
+ }
+
+ info = &adev->firmware.ucode[ucode];
+ info->ucode_id = ucode;
+ info->fw = adev->mes.fw[pipe];
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(mes_hdr->mes_ucode_size_bytes),
PAGE_SIZE);
- info = &adev->firmware.ucode[AMDGPU_UCODE_ID_CP_MES_DATA];
- info->ucode_id = AMDGPU_UCODE_ID_CP_MES_DATA;
- info->fw = adev->mes.fw;
+ info = &adev->firmware.ucode[ucode_data];
+ info->ucode_id = ucode_data;
+ info->fw = adev->mes.fw[pipe];
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(mes_hdr->mes_ucode_data_size_bytes),
PAGE_SIZE);
@@ -330,13 +347,15 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev)
return 0;
}
-static void mes_v10_1_free_microcode(struct amdgpu_device *adev)
+static void mes_v10_1_free_microcode(struct amdgpu_device *adev,
+ enum admgpu_mes_pipe pipe)
{
- release_firmware(adev->mes.fw);
- adev->mes.fw = NULL;
+ release_firmware(adev->mes.fw[pipe]);
+ adev->mes.fw[pipe] = NULL;
}
-static int mes_v10_1_allocate_ucode_buffer(struct amdgpu_device *adev)
+static int mes_v10_1_allocate_ucode_buffer(struct amdgpu_device *adev,
+ enum admgpu_mes_pipe pipe)
{
int r;
const struct mes_firmware_header_v1_0 *mes_hdr;
@@ -344,31 +363,32 @@ static int mes_v10_1_allocate_ucode_buffer(struct amdgpu_device *adev)
unsigned fw_size;
mes_hdr = (const struct mes_firmware_header_v1_0 *)
- adev->mes.fw->data;
+ adev->mes.fw[pipe]->data;
- fw_data = (const __le32 *)(adev->mes.fw->data +
+ fw_data = (const __le32 *)(adev->mes.fw[pipe]->data +
le32_to_cpu(mes_hdr->mes_ucode_offset_bytes));
fw_size = le32_to_cpu(mes_hdr->mes_ucode_size_bytes);
r = amdgpu_bo_create_reserved(adev, fw_size,
PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
- &adev->mes.ucode_fw_obj,
- &adev->mes.ucode_fw_gpu_addr,
- (void **)&adev->mes.ucode_fw_ptr);
+ &adev->mes.ucode_fw_obj[pipe],
+ &adev->mes.ucode_fw_gpu_addr[pipe],
+ (void **)&adev->mes.ucode_fw_ptr[pipe]);
if (r) {
dev_err(adev->dev, "(%d) failed to create mes fw bo\n", r);
return r;
}
- memcpy(adev->mes.ucode_fw_ptr, fw_data, fw_size);
+ memcpy(adev->mes.ucode_fw_ptr[pipe], fw_data, fw_size);
- amdgpu_bo_kunmap(adev->mes.ucode_fw_obj);
- amdgpu_bo_unreserve(adev->mes.ucode_fw_obj);
+ amdgpu_bo_kunmap(adev->mes.ucode_fw_obj[pipe]);
+ amdgpu_bo_unreserve(adev->mes.ucode_fw_obj[pipe]);
return 0;
}
-static int mes_v10_1_allocate_ucode_data_buffer(struct amdgpu_device *adev)
+static int mes_v10_1_allocate_ucode_data_buffer(struct amdgpu_device *adev,
+ enum admgpu_mes_pipe pipe)
{
int r;
const struct mes_firmware_header_v1_0 *mes_hdr;
@@ -376,53 +396,63 @@ static int mes_v10_1_allocate_ucode_data_buffer(struct amdgpu_device *adev)
unsigned fw_size;
mes_hdr = (const struct mes_firmware_header_v1_0 *)
- adev->mes.fw->data;
+ adev->mes.fw[pipe]->data;
- fw_data = (const __le32 *)(adev->mes.fw->data +
+ fw_data = (const __le32 *)(adev->mes.fw[pipe]->data +
le32_to_cpu(mes_hdr->mes_ucode_data_offset_bytes));
fw_size = le32_to_cpu(mes_hdr->mes_ucode_data_size_bytes);
r = amdgpu_bo_create_reserved(adev, fw_size,
64 * 1024, AMDGPU_GEM_DOMAIN_GTT,
- &adev->mes.data_fw_obj,
- &adev->mes.data_fw_gpu_addr,
- (void **)&adev->mes.data_fw_ptr);
+ &adev->mes.data_fw_obj[pipe],
+ &adev->mes.data_fw_gpu_addr[pipe],
+ (void **)&adev->mes.data_fw_ptr[pipe]);
if (r) {
dev_err(adev->dev, "(%d) failed to create mes data fw bo\n", r);
return r;
}
- memcpy(adev->mes.data_fw_ptr, fw_data, fw_size);
+ memcpy(adev->mes.data_fw_ptr[pipe], fw_data, fw_size);
- amdgpu_bo_kunmap(adev->mes.data_fw_obj);
- amdgpu_bo_unreserve(adev->mes.data_fw_obj);
+ amdgpu_bo_kunmap(adev->mes.data_fw_obj[pipe]);
+ amdgpu_bo_unreserve(adev->mes.data_fw_obj[pipe]);
return 0;
}
-static void mes_v10_1_free_ucode_buffers(struct amdgpu_device *adev)
+static void mes_v10_1_free_ucode_buffers(struct amdgpu_device *adev,
+ enum admgpu_mes_pipe pipe)
{
- amdgpu_bo_free_kernel(&adev->mes.data_fw_obj,
- &adev->mes.data_fw_gpu_addr,
- (void **)&adev->mes.data_fw_ptr);
+ amdgpu_bo_free_kernel(&adev->mes.data_fw_obj[pipe],
+ &adev->mes.data_fw_gpu_addr[pipe],
+ (void **)&adev->mes.data_fw_ptr[pipe]);
- amdgpu_bo_free_kernel(&adev->mes.ucode_fw_obj,
- &adev->mes.ucode_fw_gpu_addr,
- (void **)&adev->mes.ucode_fw_ptr);
+ amdgpu_bo_free_kernel(&adev->mes.ucode_fw_obj[pipe],
+ &adev->mes.ucode_fw_gpu_addr[pipe],
+ (void **)&adev->mes.ucode_fw_ptr[pipe]);
}
static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
{
- uint32_t data = 0;
+ uint32_t pipe, data = 0;
if (enable) {
data = RREG32_SOC15(GC, 0, mmCP_MES_CNTL);
data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_RESET, 1);
WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
- /* set ucode start address */
- WREG32_SOC15(GC, 0, mmCP_MES_PRGRM_CNTR_START,
- (uint32_t)(adev->mes.uc_start_addr) >> 2);
+ mutex_lock(&adev->srbm_mutex);
+ for (pipe = 0; pipe < AMDGPU_MAX_MES_PIPES; pipe++) {
+ if (!adev->enable_mes_kiq &&
+ pipe == AMDGPU_MES_KIQ_PIPE)
+ continue;
+
+ nv_grbm_select(adev, 3, pipe, 0, 0);
+ WREG32_SOC15(GC, 0, mmCP_MES_PRGRM_CNTR_START,
+ (uint32_t)(adev->mes.uc_start_addr[pipe]) >> 2);
+ }
+ nv_grbm_select(adev, 0, 0, 0, 0);
+ mutex_unlock(&adev->srbm_mutex);
/* clear BYPASS_UNCACHED to avoid hangs after interrupt. */
data = RREG32_SOC15(GC, 0, mmCP_MES_DC_OP_CNTL);
@@ -445,50 +475,51 @@ static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
}
/* This function is for backdoor MES firmware */
-static int mes_v10_1_load_microcode(struct amdgpu_device *adev)
+static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
+ enum admgpu_mes_pipe pipe)
{
int r;
uint32_t data;
- if (!adev->mes.fw)
+ mes_v10_1_enable(adev, false);
+
+ if (!adev->mes.fw[pipe])
return -EINVAL;
- r = mes_v10_1_allocate_ucode_buffer(adev);
+ r = mes_v10_1_allocate_ucode_buffer(adev, pipe);
if (r)
return r;
- r = mes_v10_1_allocate_ucode_data_buffer(adev);
+ r = mes_v10_1_allocate_ucode_data_buffer(adev, pipe);
if (r) {
- mes_v10_1_free_ucode_buffers(adev);
+ mes_v10_1_free_ucode_buffers(adev, pipe);
return r;
}
- mes_v10_1_enable(adev, false);
-
WREG32_SOC15(GC, 0, mmCP_MES_IC_BASE_CNTL, 0);
mutex_lock(&adev->srbm_mutex);
/* me=3, pipe=0, queue=0 */
- nv_grbm_select(adev, 3, 0, 0, 0);
+ nv_grbm_select(adev, 3, pipe, 0, 0);
/* set ucode start address */
WREG32_SOC15(GC, 0, mmCP_MES_PRGRM_CNTR_START,
- (uint32_t)(adev->mes.uc_start_addr) >> 2);
+ (uint32_t)(adev->mes.uc_start_addr[pipe]) >> 2);
/* set ucode firmware address */
WREG32_SOC15(GC, 0, mmCP_MES_IC_BASE_LO,
- lower_32_bits(adev->mes.ucode_fw_gpu_addr));
+ lower_32_bits(adev->mes.ucode_fw_gpu_addr[pipe]));
WREG32_SOC15(GC, 0, mmCP_MES_IC_BASE_HI,
- upper_32_bits(adev->mes.ucode_fw_gpu_addr));
+ upper_32_bits(adev->mes.ucode_fw_gpu_addr[pipe]));
/* set ucode instruction cache boundary to 2M-1 */
WREG32_SOC15(GC, 0, mmCP_MES_MIBOUND_LO, 0x1FFFFF);
/* set ucode data firmware address */
WREG32_SOC15(GC, 0, mmCP_MES_MDBASE_LO,
- lower_32_bits(adev->mes.data_fw_gpu_addr));
+ lower_32_bits(adev->mes.data_fw_gpu_addr[pipe]));
WREG32_SOC15(GC, 0, mmCP_MES_MDBASE_HI,
- upper_32_bits(adev->mes.data_fw_gpu_addr));
+ upper_32_bits(adev->mes.data_fw_gpu_addr[pipe]));
/* Set 0x3FFFF (256K-1) to CP_MES_MDBOUND_LO */
WREG32_SOC15(GC, 0, mmCP_MES_MDBOUND_LO, 0x3FFFF);
@@ -538,25 +569,26 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev)
return 0;
}
-static int mes_v10_1_allocate_eop_buf(struct amdgpu_device *adev)
+static int mes_v10_1_allocate_eop_buf(struct amdgpu_device *adev,
+ enum admgpu_mes_pipe pipe)
{
int r;
u32 *eop;
r = amdgpu_bo_create_reserved(adev, MES_EOP_SIZE, PAGE_SIZE,
- AMDGPU_GEM_DOMAIN_GTT,
- &adev->mes.eop_gpu_obj,
- &adev->mes.eop_gpu_addr,
- (void **)&eop);
+ AMDGPU_GEM_DOMAIN_GTT,
+ &adev->mes.eop_gpu_obj[pipe],
+ &adev->mes.eop_gpu_addr[pipe],
+ (void **)&eop);
if (r) {
dev_warn(adev->dev, "(%d) create EOP bo failed\n", r);
return r;
}
- memset(eop, 0, adev->mes.eop_gpu_obj->tbo.base.size);
+ memset(eop, 0, adev->mes.eop_gpu_obj[pipe]->tbo.base.size);
- amdgpu_bo_kunmap(adev->mes.eop_gpu_obj);
- amdgpu_bo_unreserve(adev->mes.eop_gpu_obj);
+ amdgpu_bo_kunmap(adev->mes.eop_gpu_obj[pipe]);
+ amdgpu_bo_unreserve(adev->mes.eop_gpu_obj[pipe]);
return 0;
}
@@ -727,7 +759,7 @@ static void mes_v10_1_queue_init_register(struct amdgpu_ring *ring)
uint32_t data = 0;
mutex_lock(&adev->srbm_mutex);
- nv_grbm_select(adev, 3, 0, 0, 0);
+ nv_grbm_select(adev, 3, ring->pipe, 0, 0);
/* set CP_HQD_VMID.VMID = 0. */
data = RREG32_SOC15(GC, 0, mmCP_HQD_VMID);
@@ -842,8 +874,8 @@ static int mes_v10_1_ring_init(struct amdgpu_device *adev)
ring->ring_obj = NULL;
ring->use_doorbell = true;
- ring->doorbell_index = adev->doorbell_index.mes_ring << 1;
- ring->eop_gpu_addr = adev->mes.eop_gpu_addr;
+ ring->doorbell_index = adev->doorbell_index.mes_ring0 << 1;
+ ring->eop_gpu_addr = adev->mes.eop_gpu_addr[AMDGPU_MES_SCHED_PIPE];
ring->no_scheduler = true;
sprintf(ring->name, "mes_%d.%d.%d", ring->me, ring->pipe, ring->queue);
@@ -851,10 +883,16 @@ static int mes_v10_1_ring_init(struct amdgpu_device *adev)
AMDGPU_RING_PRIO_DEFAULT, NULL);
}
-static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev)
+static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
+ enum admgpu_mes_pipe pipe)
{
int r, mqd_size = sizeof(struct v10_compute_mqd);
- struct amdgpu_ring *ring = &adev->mes.ring;
+ struct amdgpu_ring *ring;
+
+ if (pipe == AMDGPU_MES_SCHED_PIPE)
+ ring = &adev->mes.ring;
+ else
+ BUG();
if (ring->mqd_obj)
return 0;
@@ -868,8 +906,8 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev)
}
/* prepare MQD backup */
- adev->mes.mqd_backup = kmalloc(mqd_size, GFP_KERNEL);
- if (!adev->mes.mqd_backup)
+ adev->mes.mqd_backup[pipe] = kmalloc(mqd_size, GFP_KERNEL);
+ if (!adev->mes.mqd_backup[pipe])
dev_warn(adev->dev,
"no memory to create MQD backup for ring %s\n",
ring->name);
@@ -879,21 +917,21 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev)
static int mes_v10_1_sw_init(void *handle)
{
- int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ int r, pipe = AMDGPU_MES_SCHED_PIPE;
adev->mes.adev = adev;
adev->mes.funcs = &mes_v10_1_funcs;
- r = mes_v10_1_init_microcode(adev);
+ r = mes_v10_1_init_microcode(adev, pipe);
if (r)
return r;
- r = mes_v10_1_allocate_eop_buf(adev);
+ r = mes_v10_1_allocate_eop_buf(adev, pipe);
if (r)
return r;
- r = mes_v10_1_mqd_sw_init(adev);
+ r = mes_v10_1_mqd_sw_init(adev, pipe);
if (r)
return r;
@@ -911,21 +949,23 @@ static int mes_v10_1_sw_init(void *handle)
static int mes_v10_1_sw_fini(void *handle)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ int pipe = AMDGPU_MES_SCHED_PIPE;
amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
amdgpu_device_wb_free(adev, adev->mes.query_status_fence_offs);
- kfree(adev->mes.mqd_backup);
+ kfree(adev->mes.mqd_backup[pipe]);
amdgpu_bo_free_kernel(&adev->mes.ring.mqd_obj,
&adev->mes.ring.mqd_gpu_addr,
&adev->mes.ring.mqd_ptr);
- amdgpu_bo_free_kernel(&adev->mes.eop_gpu_obj,
- &adev->mes.eop_gpu_addr,
+ amdgpu_bo_free_kernel(&adev->mes.eop_gpu_obj[pipe],
+ &adev->mes.eop_gpu_addr[pipe],
NULL);
- mes_v10_1_free_microcode(adev);
+ mes_v10_1_free_microcode(adev, pipe);
+ amdgpu_ring_fini(&adev->mes.ring);
return 0;
}
@@ -936,7 +976,8 @@ static int mes_v10_1_hw_init(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
- r = mes_v10_1_load_microcode(adev);
+ r = mes_v10_1_load_microcode(adev,
+ AMDGPU_MES_SCHED_PIPE);
if (r) {
DRM_ERROR("failed to load MES fw, r=%d\n", r);
return r;
@@ -973,7 +1014,7 @@ static int mes_v10_1_hw_fini(void *handle)
mes_v10_1_enable(adev, false);
if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT)
- mes_v10_1_free_ucode_buffers(adev);
+ mes_v10_1_free_ucode_buffers(adev, AMDGPU_MES_SCHED_PIPE);
return 0;
}
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 21/73] drm/amdgpu/gfx10: add mes queue fence handling
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (19 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 20/73] drm/amdgpu/mes: extend mes framework to support multiple mes pipes Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 22/73] drm/amdgpu/gfx10: add mes support for gfx ib test Alex Deucher
` (51 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
From the IH ring buffer, look up the corresponding kernel queue and process it.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 60 +++++++++++++++++---------
1 file changed, 40 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 2ab5259c7305..0e009bd69a9b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -9188,31 +9188,51 @@ static int gfx_v10_0_eop_irq(struct amdgpu_device *adev,
int i;
u8 me_id, pipe_id, queue_id;
struct amdgpu_ring *ring;
+ uint32_t mes_queue_id = entry->src_data[0];
DRM_DEBUG("IH: CP EOP\n");
- me_id = (entry->ring_id & 0x0c) >> 2;
- pipe_id = (entry->ring_id & 0x03) >> 0;
- queue_id = (entry->ring_id & 0x70) >> 4;
- switch (me_id) {
- case 0:
- if (pipe_id == 0)
- amdgpu_fence_process(&adev->gfx.gfx_ring[0]);
- else
- amdgpu_fence_process(&adev->gfx.gfx_ring[1]);
- break;
- case 1:
- case 2:
- for (i = 0; i < adev->gfx.num_compute_rings; i++) {
- ring = &adev->gfx.compute_ring[i];
- /* Per-queue interrupt is supported for MEC starting from VI.
- * The interrupt can only be enabled/disabled per pipe instead of per queue.
- */
- if ((ring->me == me_id) && (ring->pipe == pipe_id) && (ring->queue == queue_id))
- amdgpu_fence_process(ring);
+ if (adev->enable_mes && (mes_queue_id & AMDGPU_FENCE_MES_QUEUE_FLAG)) {
+ struct amdgpu_mes_queue *queue;
+
+ mes_queue_id &= AMDGPU_FENCE_MES_QUEUE_ID_MASK;
+
+ spin_lock(&adev->mes.queue_id_lock);
+ queue = idr_find(&adev->mes.queue_id_idr, mes_queue_id);
+ if (queue) {
+ DRM_DEBUG("process mes queue id = %d\n", mes_queue_id);
+ amdgpu_fence_process(queue->ring);
+ }
+ spin_unlock(&adev->mes.queue_id_lock);
+ } else {
+ me_id = (entry->ring_id & 0x0c) >> 2;
+ pipe_id = (entry->ring_id & 0x03) >> 0;
+ queue_id = (entry->ring_id & 0x70) >> 4;
+
+ switch (me_id) {
+ case 0:
+ if (pipe_id == 0)
+ amdgpu_fence_process(&adev->gfx.gfx_ring[0]);
+ else
+ amdgpu_fence_process(&adev->gfx.gfx_ring[1]);
+ break;
+ case 1:
+ case 2:
+ for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+ ring = &adev->gfx.compute_ring[i];
+ /* Per-queue interrupt is supported for MEC starting from VI.
+ * The interrupt can only be enabled/disabled per pipe instead
+ * of per queue.
+ */
+ if ((ring->me == me_id) &&
+ (ring->pipe == pipe_id) &&
+ (ring->queue == queue_id))
+ amdgpu_fence_process(ring);
+ }
+ break;
}
- break;
}
+
return 0;
}
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 22/73] drm/amdgpu/gfx10: add mes support for gfx ib test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (20 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 21/73] drm/amdgpu/gfx10: add mes queue fence handling Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 23/73] drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled Alex Deucher
` (50 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add mes support for gfx ib test.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 46 ++++++++++++++++++--------
1 file changed, 33 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 0e009bd69a9b..1208d01cc936 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3818,19 +3818,39 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
struct dma_fence *f = NULL;
unsigned index;
uint64_t gpu_addr;
- uint32_t tmp;
+ volatile uint32_t *cpu_ptr;
long r;
- r = amdgpu_device_wb_get(adev, &index);
- if (r)
- return r;
-
- gpu_addr = adev->wb.gpu_addr + (index * 4);
- adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
memset(&ib, 0, sizeof(ib));
- r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
- if (r)
- goto err1;
+
+ if (ring->is_mes_queue) {
+ uint32_t padding, offset;
+
+ offset = amdgpu_mes_ctx_get_offs(ring, AMDGPU_MES_CTX_IB_OFFS);
+ padding = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_PADDING_OFFS);
+
+ ib.gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ ib.ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+
+ gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, padding);
+ cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, padding);
+ *cpu_ptr = cpu_to_le32(0xCAFEDEAD);
+ } else {
+ r = amdgpu_device_wb_get(adev, &index);
+ if (r)
+ return r;
+
+ gpu_addr = adev->wb.gpu_addr + (index * 4);
+ adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
+ cpu_ptr = &adev->wb.wb[index];
+
+ r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
+ if (r) {
+ DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
+ goto err1;
+ }
+ }
ib.ptr[0] = PACKET3(PACKET3_WRITE_DATA, 3);
ib.ptr[1] = WRITE_DATA_DST_SEL(5) | WR_CONFIRM;
@@ -3851,13 +3871,13 @@ static int gfx_v10_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
goto err2;
}
- tmp = adev->wb.wb[index];
- if (tmp == 0xDEADBEEF)
+ if (le32_to_cpu(*cpu_ptr) == 0xDEADBEEF)
r = 0;
else
r = -EINVAL;
err2:
- amdgpu_ib_free(adev, &ib, NULL);
+ if (!ring->is_mes_queue)
+ amdgpu_ib_free(adev, &ib, NULL);
dma_fence_put(f);
err1:
amdgpu_device_wb_free(adev, index);
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 23/73] drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (21 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 22/73] drm/amdgpu/gfx10: add mes support for gfx ib test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 24/73] drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue Alex Deucher
` (49 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
If MES is enabled, don't use KIQ to flush the GPU TLB,
as doing so would conflict with the MES firmware.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 9b4a035a5bf1..b8c79789e1e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -328,7 +328,7 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
/* For SRIOV run time, driver shouldn't access the register through MMIO
* Directly use kiq to do the vm invalidation instead
*/
- if (adev->gfx.kiq.ring.sched.ready &&
+ if (adev->gfx.kiq.ring.sched.ready && !adev->enable_mes &&
(amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev)) &&
down_read_trylock(&adev->reset_domain->sem)) {
struct amdgpu_vmhub *hub = &adev->vmhub[vmhub];
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 24/73] drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (22 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 23/73] drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 25/73] drm/amdgpu/sdma5.2: initialize sdma mqd Alex Deucher
` (48 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Use the per-context SDMA CSA address for the MES SDMA queue.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
index e1835fd4b237..8e221a1ba937 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
@@ -74,14 +74,22 @@ uint64_t amdgpu_sdma_get_csa_mc_addr(struct amdgpu_ring *ring,
if (amdgpu_sriov_vf(adev) || vmid == 0 || !amdgpu_mcbp)
return 0;
- r = amdgpu_sdma_get_index_from_ring(ring, &index);
-
- if (r || index > 31)
- csa_mc_addr = 0;
- else
- csa_mc_addr = amdgpu_csa_vaddr(adev) +
- AMDGPU_CSA_SDMA_OFFSET +
- index * AMDGPU_CSA_SDMA_SIZE;
+ if (ring->is_mes_queue) {
+ uint32_t offset = 0;
+
+ offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+ sdma[ring->idx].sdma_meta_data);
+ csa_mc_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ } else {
+ r = amdgpu_sdma_get_index_from_ring(ring, &index);
+
+ if (r || index > 31)
+ csa_mc_addr = 0;
+ else
+ csa_mc_addr = amdgpu_csa_vaddr(adev) +
+ AMDGPU_CSA_SDMA_OFFSET +
+ index * AMDGPU_CSA_SDMA_SIZE;
+ }
return csa_mc_addr;
}
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 25/73] drm/amdgpu/sdma5.2: initialize sdma mqd
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (23 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 24/73] drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 26/73] drm/amdgpu/sdma5.2: associate mes queue id with fence Alex Deucher
` (47 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Initialize sdma mqd according to ring settings.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 44 ++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index bf2cf95cbf8f..f67801c5a6c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -903,6 +903,49 @@ static int sdma_v5_2_start(struct amdgpu_device *adev)
return r;
}
+static int sdma_v5_2_mqd_init(struct amdgpu_device *adev, void *mqd,
+ struct amdgpu_mqd_prop *prop)
+{
+ struct v10_sdma_mqd *m = mqd;
+ uint64_t wb_gpu_addr;
+
+ m->sdmax_rlcx_rb_cntl =
+ order_base_2(prop->queue_size / 4) << SDMA0_RLC0_RB_CNTL__RB_SIZE__SHIFT |
+ 1 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_ENABLE__SHIFT |
+ 6 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_TIMER__SHIFT |
+ 1 << SDMA0_RLC0_RB_CNTL__RB_PRIV__SHIFT;
+
+ m->sdmax_rlcx_rb_base = lower_32_bits(prop->hqd_base_gpu_addr >> 8);
+ m->sdmax_rlcx_rb_base_hi = upper_32_bits(prop->hqd_base_gpu_addr >> 8);
+
+ m->sdmax_rlcx_rb_wptr_poll_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, 0,
+ mmSDMA0_GFX_RB_WPTR_POLL_CNTL));
+
+ wb_gpu_addr = prop->wptr_gpu_addr;
+ m->sdmax_rlcx_rb_wptr_poll_addr_lo = lower_32_bits(wb_gpu_addr);
+ m->sdmax_rlcx_rb_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr);
+
+ wb_gpu_addr = prop->rptr_gpu_addr;
+ m->sdmax_rlcx_rb_rptr_addr_lo = lower_32_bits(wb_gpu_addr);
+ m->sdmax_rlcx_rb_rptr_addr_hi = upper_32_bits(wb_gpu_addr);
+
+ m->sdmax_rlcx_ib_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, 0,
+ mmSDMA0_GFX_IB_CNTL));
+
+ m->sdmax_rlcx_doorbell_offset =
+ prop->doorbell_index << SDMA0_RLC0_DOORBELL_OFFSET__OFFSET__SHIFT;
+
+ m->sdmax_rlcx_doorbell = REG_SET_FIELD(0, SDMA0_RLC0_DOORBELL, ENABLE, 1);
+
+ return 0;
+}
+
+static void sdma_v5_2_set_mqd_funcs(struct amdgpu_device *adev)
+{
+ adev->mqds[AMDGPU_HW_IP_DMA].mqd_size = sizeof(struct v10_sdma_mqd);
+ adev->mqds[AMDGPU_HW_IP_DMA].init_mqd = sdma_v5_2_mqd_init;
+}
+
/**
* sdma_v5_2_ring_test_ring - simple async dma engine test
*
@@ -1233,6 +1276,7 @@ static int sdma_v5_2_early_init(void *handle)
sdma_v5_2_set_buffer_funcs(adev);
sdma_v5_2_set_vm_pte_funcs(adev);
sdma_v5_2_set_irq_funcs(adev);
+ sdma_v5_2_set_mqd_funcs(adev);
return 0;
}
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 26/73] drm/amdgpu/sdma5.2: associate mes queue id with fence
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (24 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 25/73] drm/amdgpu/sdma5.2: initialize sdma mqd Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 27/73] drm/amdgpu/sdma5.2: add mes queue fence handling Alex Deucher
` (46 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Associate the MES queue id with the fence, so that the EOP trap handler
can look up which queue issued the fence.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index f67801c5a6c1..0b7de18df5f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -460,10 +460,12 @@ static void sdma_v5_2_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 se
amdgpu_ring_write(ring, upper_32_bits(seq));
}
- if (flags & AMDGPU_FENCE_FLAG_INT) {
+ if ((flags & AMDGPU_FENCE_FLAG_INT)) {
+ uint32_t ctx = ring->is_mes_queue ?
+ (ring->hw_queue_id | AMDGPU_FENCE_MES_QUEUE_FLAG) : 0;
/* generate an interrupt */
amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_TRAP));
- amdgpu_ring_write(ring, SDMA_PKT_TRAP_INT_CONTEXT_INT_CONTEXT(0));
+ amdgpu_ring_write(ring, SDMA_PKT_TRAP_INT_CONTEXT_INT_CONTEXT(ctx));
}
}
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 27/73] drm/amdgpu/sdma5.2: add mes queue fence handling
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (25 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 26/73] drm/amdgpu/sdma5.2: associate mes queue id with fence Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 28/73] drm/amdgpu/sdma5.2: add mes support for sdma ring test Alex Deucher
` (45 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
From the IH ring buffer, look up the corresponding kernel queue and process it.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 0b7de18df5f4..9f246ab942f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -1512,7 +1512,25 @@ static int sdma_v5_2_process_trap_irq(struct amdgpu_device *adev,
struct amdgpu_irq_src *source,
struct amdgpu_iv_entry *entry)
{
+ uint32_t mes_queue_id = entry->src_data[0];
+
DRM_DEBUG("IH: SDMA trap\n");
+
+ if (adev->enable_mes && (mes_queue_id & AMDGPU_FENCE_MES_QUEUE_FLAG)) {
+ struct amdgpu_mes_queue *queue;
+
+ mes_queue_id &= AMDGPU_FENCE_MES_QUEUE_ID_MASK;
+
+ spin_lock(&adev->mes.queue_id_lock);
+ queue = idr_find(&adev->mes.queue_id_idr, mes_queue_id);
+ if (queue) {
+ DRM_DEBUG("process sdma queue id = %d\n", mes_queue_id);
+ amdgpu_fence_process(queue->ring);
+ }
+ spin_unlock(&adev->mes.queue_id_lock);
+ return 0;
+ }
+
switch (entry->client_id) {
case SOC15_IH_CLIENTID_SDMA0:
switch (entry->ring_id) {
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 28/73] drm/amdgpu/sdma5.2: add mes support for sdma ring test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (26 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 27/73] drm/amdgpu/sdma5.2: add mes queue fence handling Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 29/73] drm/amdgpu/sdma5.2: add mes support for sdma ib test Alex Deucher
` (44 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add MES support for sdma ring test.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 37 ++++++++++++++++++--------
1 file changed, 26 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 9f246ab942f9..7c9c70382591 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -965,18 +965,29 @@ static int sdma_v5_2_ring_test_ring(struct amdgpu_ring *ring)
int r;
u32 tmp;
u64 gpu_addr;
+ volatile uint32_t *cpu_ptr = NULL;
- r = amdgpu_device_wb_get(adev, &index);
- if (r) {
- dev_err(adev->dev, "(%d) failed to allocate wb slot\n", r);
- return r;
- }
-
- gpu_addr = adev->wb.gpu_addr + (index * 4);
tmp = 0xCAFEDEAD;
- adev->wb.wb[index] = cpu_to_le32(tmp);
- r = amdgpu_ring_alloc(ring, 5);
+ if (ring->is_mes_queue) {
+ uint32_t offset = 0;
+ offset = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_PADDING_OFFS);
+ gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+ *cpu_ptr = tmp;
+ } else {
+ r = amdgpu_device_wb_get(adev, &index);
+ if (r) {
+ dev_err(adev->dev, "(%d) failed to allocate wb slot\n", r);
+ return r;
+ }
+
+ gpu_addr = adev->wb.gpu_addr + (index * 4);
+ adev->wb.wb[index] = cpu_to_le32(tmp);
+ }
+
+ r = amdgpu_ring_alloc(ring, 20);
if (r) {
DRM_ERROR("amdgpu: dma failed to lock ring %d (%d).\n", ring->idx, r);
amdgpu_device_wb_free(adev, index);
@@ -992,7 +1003,10 @@ static int sdma_v5_2_ring_test_ring(struct amdgpu_ring *ring)
amdgpu_ring_commit(ring);
for (i = 0; i < adev->usec_timeout; i++) {
- tmp = le32_to_cpu(adev->wb.wb[index]);
+ if (ring->is_mes_queue)
+ tmp = le32_to_cpu(*cpu_ptr);
+ else
+ tmp = le32_to_cpu(adev->wb.wb[index]);
if (tmp == 0xDEADBEEF)
break;
if (amdgpu_emu_mode == 1)
@@ -1004,7 +1018,8 @@ static int sdma_v5_2_ring_test_ring(struct amdgpu_ring *ring)
if (i >= adev->usec_timeout)
r = -ETIMEDOUT;
- amdgpu_device_wb_free(adev, index);
+ if (!ring->is_mes_queue)
+ amdgpu_device_wb_free(adev, index);
return r;
}
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 29/73] drm/amdgpu/sdma5.2: add mes support for sdma ib test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (27 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 28/73] drm/amdgpu/sdma5.2: add mes support for sdma ring test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 30/73] drm/amdgpu/sdma5: initialize sdma mqd Alex Deucher
` (43 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add MES support for sdma ib test.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 50 ++++++++++++++++++--------
1 file changed, 36 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 7c9c70382591..83c6ccaaa9e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -1042,21 +1042,37 @@ static int sdma_v5_2_ring_test_ib(struct amdgpu_ring *ring, long timeout)
long r;
u32 tmp = 0;
u64 gpu_addr;
+ volatile uint32_t *cpu_ptr = NULL;
- r = amdgpu_device_wb_get(adev, &index);
- if (r) {
- dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
- return r;
- }
-
- gpu_addr = adev->wb.gpu_addr + (index * 4);
tmp = 0xCAFEDEAD;
- adev->wb.wb[index] = cpu_to_le32(tmp);
memset(&ib, 0, sizeof(ib));
- r = amdgpu_ib_get(adev, NULL, 256, AMDGPU_IB_POOL_DIRECT, &ib);
- if (r) {
- DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
- goto err0;
+
+ if (ring->is_mes_queue) {
+ uint32_t offset = 0;
+ offset = amdgpu_mes_ctx_get_offs(ring, AMDGPU_MES_CTX_IB_OFFS);
+ ib.gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ ib.ptr = (void *)amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+
+ offset = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_PADDING_OFFS);
+ gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+ *cpu_ptr = tmp;
+ } else {
+ r = amdgpu_device_wb_get(adev, &index);
+ if (r) {
+ dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
+ return r;
+ }
+
+ gpu_addr = adev->wb.gpu_addr + (index * 4);
+ adev->wb.wb[index] = cpu_to_le32(tmp);
+
+ r = amdgpu_ib_get(adev, NULL, 256, AMDGPU_IB_POOL_DIRECT, &ib);
+ if (r) {
+ DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
+ goto err0;
+ }
}
ib.ptr[0] = SDMA_PKT_HEADER_OP(SDMA_OP_WRITE) |
@@ -1083,7 +1099,12 @@ static int sdma_v5_2_ring_test_ib(struct amdgpu_ring *ring, long timeout)
DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
goto err1;
}
- tmp = le32_to_cpu(adev->wb.wb[index]);
+
+ if (ring->is_mes_queue)
+ tmp = le32_to_cpu(*cpu_ptr);
+ else
+ tmp = le32_to_cpu(adev->wb.wb[index]);
+
if (tmp == 0xDEADBEEF)
r = 0;
else
@@ -1093,7 +1114,8 @@ static int sdma_v5_2_ring_test_ib(struct amdgpu_ring *ring, long timeout)
amdgpu_ib_free(adev, &ib, NULL);
dma_fence_put(f);
err0:
- amdgpu_device_wb_free(adev, index);
+ if (!ring->is_mes_queue)
+ amdgpu_device_wb_free(adev, index);
return r;
}
--
2.35.1
* [PATCH 30/73] drm/amdgpu/sdma5: initialize sdma mqd
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (28 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 29/73] drm/amdgpu/sdma5.2: add mes support for sdma ib test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 31/73] drm/amdgpu/sdma5: associate mes queue id with fence Alex Deucher
` (42 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Initialize sdma mqd according to ring settings.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 44 ++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index ff359e7f1eb8..30d12c9df911 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -959,6 +959,49 @@ static int sdma_v5_0_start(struct amdgpu_device *adev)
return r;
}
+static int sdma_v5_0_mqd_init(struct amdgpu_device *adev, void *mqd,
+ struct amdgpu_mqd_prop *prop)
+{
+ struct v10_sdma_mqd *m = mqd;
+ uint64_t wb_gpu_addr;
+
+ m->sdmax_rlcx_rb_cntl =
+ order_base_2(prop->queue_size / 4) << SDMA0_RLC0_RB_CNTL__RB_SIZE__SHIFT |
+ 1 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_ENABLE__SHIFT |
+ 6 << SDMA0_RLC0_RB_CNTL__RPTR_WRITEBACK_TIMER__SHIFT |
+ 1 << SDMA0_RLC0_RB_CNTL__RB_PRIV__SHIFT;
+
+ m->sdmax_rlcx_rb_base = lower_32_bits(prop->hqd_base_gpu_addr >> 8);
+ m->sdmax_rlcx_rb_base_hi = upper_32_bits(prop->hqd_base_gpu_addr >> 8);
+
+ m->sdmax_rlcx_rb_wptr_poll_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, 0,
+ mmSDMA0_GFX_RB_WPTR_POLL_CNTL));
+
+ wb_gpu_addr = prop->wptr_gpu_addr;
+ m->sdmax_rlcx_rb_wptr_poll_addr_lo = lower_32_bits(wb_gpu_addr);
+ m->sdmax_rlcx_rb_wptr_poll_addr_hi = upper_32_bits(wb_gpu_addr);
+
+ wb_gpu_addr = prop->rptr_gpu_addr;
+ m->sdmax_rlcx_rb_rptr_addr_lo = lower_32_bits(wb_gpu_addr);
+ m->sdmax_rlcx_rb_rptr_addr_hi = upper_32_bits(wb_gpu_addr);
+
+ m->sdmax_rlcx_ib_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, 0,
+ mmSDMA0_GFX_IB_CNTL));
+
+ m->sdmax_rlcx_doorbell_offset =
+ prop->doorbell_index << SDMA0_RLC0_DOORBELL_OFFSET__OFFSET__SHIFT;
+
+ m->sdmax_rlcx_doorbell = REG_SET_FIELD(0, SDMA0_RLC0_DOORBELL, ENABLE, 1);
+
+ return 0;
+}
+
+static void sdma_v5_0_set_mqd_funcs(struct amdgpu_device *adev)
+{
+ adev->mqds[AMDGPU_HW_IP_DMA].mqd_size = sizeof(struct v10_sdma_mqd);
+ adev->mqds[AMDGPU_HW_IP_DMA].init_mqd = sdma_v5_0_mqd_init;
+}
+
/**
* sdma_v5_0_ring_test_ring - simple async dma engine test
*
@@ -1289,6 +1332,7 @@ static int sdma_v5_0_early_init(void *handle)
sdma_v5_0_set_buffer_funcs(adev);
sdma_v5_0_set_vm_pte_funcs(adev);
sdma_v5_0_set_irq_funcs(adev);
+ sdma_v5_0_set_mqd_funcs(adev);
return 0;
}
--
2.35.1
* [PATCH 31/73] drm/amdgpu/sdma5: associate mes queue id with fence
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (29 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 30/73] drm/amdgpu/sdma5: initialize sdma mqd Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 32/73] drm/amdgpu/sdma5: add mes queue fence handling Alex Deucher
` (41 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Associate the mes queue id with the fence, so that the EOP trap handler can look up which queue issued the fence.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 30d12c9df911..b73e45597031 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -562,9 +562,11 @@ static void sdma_v5_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 se
}
if (flags & AMDGPU_FENCE_FLAG_INT) {
+ uint32_t ctx = ring->is_mes_queue ?
+ (ring->hw_queue_id | AMDGPU_FENCE_MES_QUEUE_FLAG) : 0;
/* generate an interrupt */
amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_TRAP));
- amdgpu_ring_write(ring, SDMA_PKT_TRAP_INT_CONTEXT_INT_CONTEXT(0));
+ amdgpu_ring_write(ring, SDMA_PKT_TRAP_INT_CONTEXT_INT_CONTEXT(ctx));
}
}
--
2.35.1
* [PATCH 32/73] drm/amdgpu/sdma5: add mes queue fence handling
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (30 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 31/73] drm/amdgpu/sdma5: associate mes queue id with fence Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 33/73] drm/amdgpu/sdma5: add mes support for sdma ring test Alex Deucher
` (40 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
From the IH ring buffer entry, look up the corresponding kernel queue and process its fences.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index b73e45597031..564adc7b010c 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -1555,7 +1555,25 @@ static int sdma_v5_0_process_trap_irq(struct amdgpu_device *adev,
struct amdgpu_irq_src *source,
struct amdgpu_iv_entry *entry)
{
+ uint32_t mes_queue_id = entry->src_data[0];
+
DRM_DEBUG("IH: SDMA trap\n");
+
+ if (adev->enable_mes && (mes_queue_id & AMDGPU_FENCE_MES_QUEUE_FLAG)) {
+ struct amdgpu_mes_queue *queue;
+
+ mes_queue_id &= AMDGPU_FENCE_MES_QUEUE_ID_MASK;
+
+ spin_lock(&adev->mes.queue_id_lock);
+ queue = idr_find(&adev->mes.queue_id_idr, mes_queue_id);
+ if (queue) {
+ DRM_DEBUG("process sdma queue id = %d\n", mes_queue_id);
+ amdgpu_fence_process(queue->ring);
+ }
+ spin_unlock(&adev->mes.queue_id_lock);
+ return 0;
+ }
+
switch (entry->client_id) {
case SOC15_IH_CLIENTID_SDMA0:
switch (entry->ring_id) {
--
2.35.1
* [PATCH 33/73] drm/amdgpu/sdma5: add mes support for sdma ring test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (31 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 32/73] drm/amdgpu/sdma5: add mes queue fence handling Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 34/73] drm/amdgpu/sdma5: add mes support for sdma ib test Alex Deucher
` (39 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add MES support for sdma ring test.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 37 ++++++++++++++++++--------
1 file changed, 26 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 564adc7b010c..1f0b19e18494 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -1021,18 +1021,29 @@ static int sdma_v5_0_ring_test_ring(struct amdgpu_ring *ring)
int r;
u32 tmp;
u64 gpu_addr;
+ volatile uint32_t *cpu_ptr = NULL;
- r = amdgpu_device_wb_get(adev, &index);
- if (r) {
- dev_err(adev->dev, "(%d) failed to allocate wb slot\n", r);
- return r;
- }
-
- gpu_addr = adev->wb.gpu_addr + (index * 4);
tmp = 0xCAFEDEAD;
- adev->wb.wb[index] = cpu_to_le32(tmp);
- r = amdgpu_ring_alloc(ring, 5);
+ if (ring->is_mes_queue) {
+ uint32_t offset = 0;
+ offset = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_PADDING_OFFS);
+ gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+ *cpu_ptr = tmp;
+ } else {
+ r = amdgpu_device_wb_get(adev, &index);
+ if (r) {
+ dev_err(adev->dev, "(%d) failed to allocate wb slot\n", r);
+ return r;
+ }
+
+ gpu_addr = adev->wb.gpu_addr + (index * 4);
+ adev->wb.wb[index] = cpu_to_le32(tmp);
+ }
+
+ r = amdgpu_ring_alloc(ring, 20);
if (r) {
DRM_ERROR("amdgpu: dma failed to lock ring %d (%d).\n", ring->idx, r);
amdgpu_device_wb_free(adev, index);
@@ -1048,7 +1059,10 @@ static int sdma_v5_0_ring_test_ring(struct amdgpu_ring *ring)
amdgpu_ring_commit(ring);
for (i = 0; i < adev->usec_timeout; i++) {
- tmp = le32_to_cpu(adev->wb.wb[index]);
+ if (ring->is_mes_queue)
+ tmp = le32_to_cpu(*cpu_ptr);
+ else
+ tmp = le32_to_cpu(adev->wb.wb[index]);
if (tmp == 0xDEADBEEF)
break;
if (amdgpu_emu_mode == 1)
@@ -1060,7 +1074,8 @@ static int sdma_v5_0_ring_test_ring(struct amdgpu_ring *ring)
if (i >= adev->usec_timeout)
r = -ETIMEDOUT;
- amdgpu_device_wb_free(adev, index);
+ if (!ring->is_mes_queue)
+ amdgpu_device_wb_free(adev, index);
return r;
}
--
2.35.1
* [PATCH 34/73] drm/amdgpu/sdma5: add mes support for sdma ib test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (32 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 33/73] drm/amdgpu/sdma5: add mes support for sdma ring test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 35/73] drm/amdgpu: add mes kiq PSP GFX FW type Alex Deucher
` (38 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add MES support for sdma ib test.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 50 ++++++++++++++++++--------
1 file changed, 36 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 1f0b19e18494..1f9021f896a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -1098,22 +1098,38 @@ static int sdma_v5_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
long r;
u32 tmp = 0;
u64 gpu_addr;
+ volatile uint32_t *cpu_ptr = NULL;
- r = amdgpu_device_wb_get(adev, &index);
- if (r) {
- dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
- return r;
- }
-
- gpu_addr = adev->wb.gpu_addr + (index * 4);
tmp = 0xCAFEDEAD;
- adev->wb.wb[index] = cpu_to_le32(tmp);
memset(&ib, 0, sizeof(ib));
- r = amdgpu_ib_get(adev, NULL, 256,
+
+ if (ring->is_mes_queue) {
+ uint32_t offset = 0;
+ offset = amdgpu_mes_ctx_get_offs(ring, AMDGPU_MES_CTX_IB_OFFS);
+ ib.gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ ib.ptr = (void *)amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+
+ offset = amdgpu_mes_ctx_get_offs(ring,
+ AMDGPU_MES_CTX_PADDING_OFFS);
+ gpu_addr = amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ cpu_ptr = amdgpu_mes_ctx_get_offs_cpu_addr(ring, offset);
+ *cpu_ptr = tmp;
+ } else {
+ r = amdgpu_device_wb_get(adev, &index);
+ if (r) {
+ dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
+ return r;
+ }
+
+ gpu_addr = adev->wb.gpu_addr + (index * 4);
+ adev->wb.wb[index] = cpu_to_le32(tmp);
+
+ r = amdgpu_ib_get(adev, NULL, 256,
AMDGPU_IB_POOL_DIRECT, &ib);
- if (r) {
- DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
- goto err0;
+ if (r) {
+ DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
+ goto err0;
+ }
}
ib.ptr[0] = SDMA_PKT_HEADER_OP(SDMA_OP_WRITE) |
@@ -1140,7 +1156,12 @@ static int sdma_v5_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
goto err1;
}
- tmp = le32_to_cpu(adev->wb.wb[index]);
+
+ if (ring->is_mes_queue)
+ tmp = le32_to_cpu(*cpu_ptr);
+ else
+ tmp = le32_to_cpu(adev->wb.wb[index]);
+
if (tmp == 0xDEADBEEF)
r = 0;
else
@@ -1150,7 +1171,8 @@ static int sdma_v5_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
amdgpu_ib_free(adev, &ib, NULL);
dma_fence_put(f);
err0:
- amdgpu_device_wb_free(adev, index);
+ if (!ring->is_mes_queue)
+ amdgpu_device_wb_free(adev, index);
return r;
}
--
2.35.1
* [PATCH 35/73] drm/amdgpu: add mes kiq PSP GFX FW type
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (33 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 34/73] drm/amdgpu/sdma5: add mes support for sdma ib test Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 36/73] drm/amdgpu/mes: add mes kiq callback Alex Deucher
` (37 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Likun Gao, Hawking Zhang
From: Likun Gao <Likun.Gao@amd.com>
Add the MES KIQ PSP GFX FW types and the corresponding ucode-to-FW-type conversion.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index aabb208bebde..d0fb14ef645c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -2190,6 +2190,12 @@ static int psp_get_fw_type(struct amdgpu_firmware_info *ucode,
case AMDGPU_UCODE_ID_CP_MES_DATA:
*type = GFX_FW_TYPE_MES_STACK;
break;
+ case AMDGPU_UCODE_ID_CP_MES1:
+ *type = GFX_FW_TYPE_CP_MES_KIQ;
+ break;
+ case AMDGPU_UCODE_ID_CP_MES1_DATA:
+ *type = GFX_FW_TYPE_MES_KIQ_STACK;
+ break;
case AMDGPU_UCODE_ID_CP_CE:
*type = GFX_FW_TYPE_CP_CE;
break;
--
2.35.1
* [PATCH 36/73] drm/amdgpu/mes: add mes kiq callback
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (34 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 35/73] drm/amdgpu: add mes kiq PSP GFX FW type Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 37/73] drm/amdgpu: add mes kiq frontdoor loading support Alex Deucher
` (36 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Needed to properly initialize mes kiq.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 91b020842eb0..117c95acfd48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -109,6 +109,9 @@ struct amdgpu_mes {
uint64_t query_status_fence_gpu_addr;
uint64_t *query_status_fence_ptr;
+ /* initialize kiq pipe */
+ int (*kiq_hw_init)(struct amdgpu_device *adev);
+
/* ip specific functions */
const struct amdgpu_mes_funcs *funcs;
};
@@ -204,4 +207,6 @@ struct amdgpu_mes_funcs {
struct mes_resume_gang_input *input);
};
+#define amdgpu_mes_kiq_hw_init(adev) (adev)->mes.kiq_hw_init((adev))
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 37/73] drm/amdgpu: add mes kiq frontdoor loading support
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (35 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 36/73] drm/amdgpu/mes: add mes kiq callback Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 38/73] drm/amdgpu: enable mes kiq N-1 test on sienna cichlid Alex Deucher
` (35 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add mes kiq frontdoor loading support.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index b7d575c7bcdc..a67f41465337 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -711,6 +711,16 @@ static int amdgpu_ucode_init_single_fw(struct amdgpu_device *adev,
ucode_addr = (u8 *)ucode->fw->data +
le32_to_cpu(mes_hdr->mes_ucode_data_offset_bytes);
break;
+ case AMDGPU_UCODE_ID_CP_MES1:
+ ucode->ucode_size = le32_to_cpu(mes_hdr->mes_ucode_size_bytes);
+ ucode_addr = (u8 *)ucode->fw->data +
+ le32_to_cpu(mes_hdr->mes_ucode_offset_bytes);
+ break;
+ case AMDGPU_UCODE_ID_CP_MES1_DATA:
+ ucode->ucode_size = le32_to_cpu(mes_hdr->mes_ucode_data_size_bytes);
+ ucode_addr = (u8 *)ucode->fw->data +
+ le32_to_cpu(mes_hdr->mes_ucode_data_offset_bytes);
+ break;
case AMDGPU_UCODE_ID_DMCU_ERAM:
ucode->ucode_size = le32_to_cpu(header->ucode_size_bytes) -
le32_to_cpu(dmcu_hdr->intv_size_bytes);
--
2.35.1
* [PATCH 38/73] drm/amdgpu: enable mes kiq N-1 test on sienna cichlid
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (36 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 37/73] drm/amdgpu: add mes kiq frontdoor loading support Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 39/73] drm/amdgpu/mes: manage mes doorbell allocation Alex Deucher
` (34 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Enable kiq support on gfx10.3 and enable the mes kiq (N-1)
test on sienna cichlid, so that mes kiq can be tested there.
This patch can be dropped once mes kiq is fully functional.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 32 ++--
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 202 ++++++++++++++++++++-----
2 files changed, 184 insertions(+), 50 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 1208d01cc936..9042e0b480ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4897,16 +4897,18 @@ static int gfx_v10_0_sw_init(void *handle)
}
}
- r = amdgpu_gfx_kiq_init(adev, GFX10_MEC_HPD_SIZE);
- if (r) {
- DRM_ERROR("Failed to init KIQ BOs!\n");
- return r;
- }
+ if (!adev->enable_mes_kiq) {
+ r = amdgpu_gfx_kiq_init(adev, GFX10_MEC_HPD_SIZE);
+ if (r) {
+ DRM_ERROR("Failed to init KIQ BOs!\n");
+ return r;
+ }
- kiq = &adev->gfx.kiq;
- r = amdgpu_gfx_kiq_init_ring(adev, &kiq->ring, &kiq->irq);
- if (r)
- return r;
+ kiq = &adev->gfx.kiq;
+ r = amdgpu_gfx_kiq_init_ring(adev, &kiq->ring, &kiq->irq);
+ if (r)
+ return r;
+ }
r = amdgpu_gfx_mqd_sw_init(adev, sizeof(struct v10_compute_mqd));
if (r)
@@ -4958,8 +4960,11 @@ static int gfx_v10_0_sw_fini(void *handle)
amdgpu_ring_fini(&adev->gfx.compute_ring[i]);
amdgpu_gfx_mqd_sw_fini(adev);
- amdgpu_gfx_kiq_free_ring(&adev->gfx.kiq.ring);
- amdgpu_gfx_kiq_fini(adev);
+
+ if (!adev->enable_mes_kiq) {
+ amdgpu_gfx_kiq_free_ring(&adev->gfx.kiq.ring);
+ amdgpu_gfx_kiq_fini(adev);
+ }
gfx_v10_0_pfp_fini(adev);
gfx_v10_0_ce_fini(adev);
@@ -7213,7 +7218,10 @@ static int gfx_v10_0_cp_resume(struct amdgpu_device *adev)
return r;
}
- r = gfx_v10_0_kiq_resume(adev);
+ if (adev->enable_mes_kiq && adev->mes.kiq_hw_init)
+ r = amdgpu_mes_kiq_hw_init(adev);
+ else
+ r = gfx_v10_0_kiq_resume(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index f82a6f981629..fecf3f26bf7c 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -33,11 +33,15 @@
#define mmCP_MES_IC_OP_CNTL_Sienna_Cichlid 0x2820
#define mmCP_MES_IC_OP_CNTL_Sienna_Cichlid_BASE_IDX 1
+#define mmRLC_CP_SCHEDULERS_Sienna_Cichlid 0x4ca1
+#define mmRLC_CP_SCHEDULERS_Sienna_Cichlid_BASE_IDX 1
MODULE_FIRMWARE("amdgpu/navi10_mes.bin");
MODULE_FIRMWARE("amdgpu/sienna_cichlid_mes.bin");
+MODULE_FIRMWARE("amdgpu/sienna_cichlid_mes1.bin");
static int mes_v10_1_hw_fini(void *handle);
+static int mes_v10_1_kiq_hw_init(struct amdgpu_device *adev);
#define MES_EOP_SIZE 2048
@@ -278,11 +282,11 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev,
const struct mes_firmware_header_v1_0 *mes_hdr;
struct amdgpu_firmware_info *info;
- switch (adev->asic_type) {
- case CHIP_NAVI10:
+ switch (adev->ip_versions[GC_HWIP][0]) {
+ case IP_VERSION(10, 1, 10):
chip_name = "navi10";
break;
- case CHIP_SIENNA_CICHLID:
+ case IP_VERSION(10, 3, 0):
chip_name = "sienna_cichlid";
break;
default:
@@ -293,7 +297,8 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev,
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes.bin",
chip_name);
else
- BUG();
+ snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mes1.bin",
+ chip_name);
err = request_firmware(&adev->mes.fw[pipe], fw_name, adev->dev);
if (err)
@@ -326,7 +331,8 @@ static int mes_v10_1_init_microcode(struct amdgpu_device *adev,
ucode = AMDGPU_UCODE_ID_CP_MES;
ucode_data = AMDGPU_UCODE_ID_CP_MES_DATA;
} else {
- BUG();
+ ucode = AMDGPU_UCODE_ID_CP_MES1;
+ ucode_data = AMDGPU_UCODE_ID_CP_MES1_DATA;
}
info = &adev->firmware.ucode[ucode];
@@ -439,6 +445,8 @@ static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
if (enable) {
data = RREG32_SOC15(GC, 0, mmCP_MES_CNTL);
data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_RESET, 1);
+ data = REG_SET_FIELD(data, CP_MES_CNTL,
+ MES_PIPE1_RESET, adev->enable_mes_kiq ? 1 : 0);
WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
mutex_lock(&adev->srbm_mutex);
@@ -462,13 +470,18 @@ static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
/* unhalt MES and activate pipe0 */
data = REG_SET_FIELD(0, CP_MES_CNTL, MES_PIPE0_ACTIVE, 1);
+ data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE1_ACTIVE,
+ adev->enable_mes_kiq ? 1 : 0);
WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
} else {
data = RREG32_SOC15(GC, 0, mmCP_MES_CNTL);
data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_ACTIVE, 0);
+ data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE1_ACTIVE, 0);
data = REG_SET_FIELD(data, CP_MES_CNTL,
MES_INVALIDATE_ICACHE, 1);
data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_RESET, 1);
+ data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE1_RESET,
+ adev->enable_mes_kiq ? 1 : 0);
data = REG_SET_FIELD(data, CP_MES_CNTL, MES_HALT, 1);
WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
}
@@ -525,8 +538,8 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
WREG32_SOC15(GC, 0, mmCP_MES_MDBOUND_LO, 0x3FFFF);
/* invalidate ICACHE */
- switch (adev->asic_type) {
- case CHIP_SIENNA_CICHLID:
+ switch (adev->ip_versions[GC_HWIP][0]) {
+ case IP_VERSION(10, 3, 0):
data = RREG32_SOC15(GC, 0, mmCP_MES_IC_OP_CNTL_Sienna_Cichlid);
break;
default:
@@ -535,8 +548,8 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
}
data = REG_SET_FIELD(data, CP_MES_IC_OP_CNTL, PRIME_ICACHE, 0);
data = REG_SET_FIELD(data, CP_MES_IC_OP_CNTL, INVALIDATE_CACHE, 1);
- switch (adev->asic_type) {
- case CHIP_SIENNA_CICHLID:
+ switch (adev->ip_versions[GC_HWIP][0]) {
+ case IP_VERSION(10, 3, 0):
WREG32_SOC15(GC, 0, mmCP_MES_IC_OP_CNTL_Sienna_Cichlid, data);
break;
default:
@@ -545,8 +558,8 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
}
/* prime the ICACHE. */
- switch (adev->asic_type) {
- case CHIP_SIENNA_CICHLID:
+ switch (adev->ip_versions[GC_HWIP][0]) {
+ case IP_VERSION(10, 3, 0):
data = RREG32_SOC15(GC, 0, mmCP_MES_IC_OP_CNTL_Sienna_Cichlid);
break;
default:
@@ -554,8 +567,8 @@ static int mes_v10_1_load_microcode(struct amdgpu_device *adev,
break;
}
data = REG_SET_FIELD(data, CP_MES_IC_OP_CNTL, PRIME_ICACHE, 1);
- switch (adev->asic_type) {
- case CHIP_SIENNA_CICHLID:
+ switch (adev->ip_versions[GC_HWIP][0]) {
+ case IP_VERSION(10, 3, 0):
WREG32_SOC15(GC, 0, mmCP_MES_IC_OP_CNTL_Sienna_Cichlid, data);
break;
default:
@@ -883,13 +896,40 @@ static int mes_v10_1_ring_init(struct amdgpu_device *adev)
AMDGPU_RING_PRIO_DEFAULT, NULL);
}
+static int mes_v10_1_kiq_ring_init(struct amdgpu_device *adev)
+{
+ struct amdgpu_ring *ring;
+
+ spin_lock_init(&adev->gfx.kiq.ring_lock);
+
+ ring = &adev->gfx.kiq.ring;
+
+ ring->me = 3;
+ ring->pipe = 1;
+ ring->queue = 0;
+
+ ring->adev = NULL;
+ ring->ring_obj = NULL;
+ ring->use_doorbell = true;
+ ring->doorbell_index = adev->doorbell_index.mes_ring1 << 1;
+ ring->eop_gpu_addr = adev->mes.eop_gpu_addr[AMDGPU_MES_KIQ_PIPE];
+ ring->no_scheduler = true;
+ sprintf(ring->name, "mes_kiq_%d.%d.%d",
+ ring->me, ring->pipe, ring->queue);
+
+ return amdgpu_ring_init(adev, ring, 1024, NULL, 0,
+ AMDGPU_RING_PRIO_DEFAULT, NULL);
+}
+
static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
enum admgpu_mes_pipe pipe)
{
int r, mqd_size = sizeof(struct v10_compute_mqd);
struct amdgpu_ring *ring;
- if (pipe == AMDGPU_MES_SCHED_PIPE)
+ if (pipe == AMDGPU_MES_KIQ_PIPE)
+ ring = &adev->gfx.kiq.ring;
+ else if (pipe == AMDGPU_MES_SCHED_PIPE)
ring = &adev->mes.ring;
else
BUG();
@@ -918,22 +958,34 @@ static int mes_v10_1_mqd_sw_init(struct amdgpu_device *adev,
static int mes_v10_1_sw_init(void *handle)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
- int r, pipe = AMDGPU_MES_SCHED_PIPE;
+ int pipe, r;
adev->mes.adev = adev;
adev->mes.funcs = &mes_v10_1_funcs;
+ adev->mes.kiq_hw_init = &mes_v10_1_kiq_hw_init;
- r = mes_v10_1_init_microcode(adev, pipe);
- if (r)
- return r;
+ for (pipe = 0; pipe < AMDGPU_MAX_MES_PIPES; pipe++) {
+ if (!adev->enable_mes_kiq && pipe == AMDGPU_MES_KIQ_PIPE)
+ continue;
- r = mes_v10_1_allocate_eop_buf(adev, pipe);
- if (r)
- return r;
+ r = mes_v10_1_init_microcode(adev, pipe);
+ if (r)
+ return r;
- r = mes_v10_1_mqd_sw_init(adev, pipe);
- if (r)
- return r;
+ r = mes_v10_1_allocate_eop_buf(adev, pipe);
+ if (r)
+ return r;
+
+ r = mes_v10_1_mqd_sw_init(adev, pipe);
+ if (r)
+ return r;
+ }
+
+ if (adev->enable_mes_kiq) {
+ r = mes_v10_1_kiq_ring_init(adev);
+ if (r)
+ return r;
+ }
r = mes_v10_1_ring_init(adev);
if (r)
@@ -949,43 +1001,115 @@ static int mes_v10_1_sw_init(void *handle)
static int mes_v10_1_sw_fini(void *handle)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
- int pipe = AMDGPU_MES_SCHED_PIPE;
+ int pipe;
amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
amdgpu_device_wb_free(adev, adev->mes.query_status_fence_offs);
- kfree(adev->mes.mqd_backup[pipe]);
+ for (pipe = 0; pipe < AMDGPU_MAX_MES_PIPES; pipe++) {
+ kfree(adev->mes.mqd_backup[pipe]);
+
+ amdgpu_bo_free_kernel(&adev->mes.eop_gpu_obj[pipe],
+ &adev->mes.eop_gpu_addr[pipe],
+ NULL);
+
+ mes_v10_1_free_microcode(adev, pipe);
+ }
+
+ amdgpu_bo_free_kernel(&adev->gfx.kiq.ring.mqd_obj,
+ &adev->gfx.kiq.ring.mqd_gpu_addr,
+ &adev->gfx.kiq.ring.mqd_ptr);
amdgpu_bo_free_kernel(&adev->mes.ring.mqd_obj,
&adev->mes.ring.mqd_gpu_addr,
&adev->mes.ring.mqd_ptr);
- amdgpu_bo_free_kernel(&adev->mes.eop_gpu_obj[pipe],
- &adev->mes.eop_gpu_addr[pipe],
- NULL);
-
- mes_v10_1_free_microcode(adev, pipe);
+ amdgpu_ring_fini(&adev->gfx.kiq.ring);
amdgpu_ring_fini(&adev->mes.ring);
return 0;
}
-static int mes_v10_1_hw_init(void *handle)
+static void mes_v10_1_kiq_setting(struct amdgpu_ring *ring)
{
- int r;
- struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ uint32_t tmp;
+ struct amdgpu_device *adev = ring->adev;
+
+ /* tell RLC which is KIQ queue */
+ switch (adev->ip_versions[GC_HWIP][0]) {
+ case IP_VERSION(10, 3, 0):
+ case IP_VERSION(10, 3, 2):
+ case IP_VERSION(10, 3, 1):
+ case IP_VERSION(10, 3, 4):
+ tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid);
+ tmp &= 0xffffff00;
+ tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue);
+ WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid, tmp);
+ tmp |= 0x80;
+ WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid, tmp);
+ break;
+ default:
+ tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS);
+ tmp &= 0xffffff00;
+ tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue);
+ WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp);
+ tmp |= 0x80;
+ WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp);
+ break;
+ }
+}
+
+static int mes_v10_1_kiq_hw_init(struct amdgpu_device *adev)
+{
+ int r = 0;
if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
- r = mes_v10_1_load_microcode(adev,
- AMDGPU_MES_SCHED_PIPE);
+ r = mes_v10_1_load_microcode(adev, AMDGPU_MES_KIQ_PIPE);
+ if (r) {
+ DRM_ERROR("failed to load MES kiq fw, r=%d\n", r);
+ return r;
+ }
+
+ r = mes_v10_1_load_microcode(adev, AMDGPU_MES_SCHED_PIPE);
if (r) {
- DRM_ERROR("failed to MES fw, r=%d\n", r);
+ DRM_ERROR("failed to load MES fw, r=%d\n", r);
return r;
}
}
mes_v10_1_enable(adev, true);
+ mes_v10_1_kiq_setting(&adev->gfx.kiq.ring);
+
+ r = mes_v10_1_queue_init(adev);
+ if (r)
+ goto failure;
+
+ return r;
+
+failure:
+ mes_v10_1_hw_fini(adev);
+ return r;
+}
+
+static int mes_v10_1_hw_init(void *handle)
+{
+ int r;
+ struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+ if (!adev->enable_mes_kiq) {
+ if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
+ r = mes_v10_1_load_microcode(adev,
+ AMDGPU_MES_SCHED_PIPE);
+ if (r) {
+ DRM_ERROR("failed to load MES fw, r=%d\n", r);
+ return r;
+ }
+ }
+
+ mes_v10_1_enable(adev, true);
+ }
+
r = mes_v10_1_queue_init(adev);
if (r)
goto failure;
@@ -1013,8 +1137,10 @@ static int mes_v10_1_hw_fini(void *handle)
mes_v10_1_enable(adev, false);
- if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT)
+ if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) {
+ mes_v10_1_free_ucode_buffers(adev, AMDGPU_MES_KIQ_PIPE);
mes_v10_1_free_ucode_buffers(adev, AMDGPU_MES_SCHED_PIPE);
+ }
return 0;
}
--
2.35.1
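The mes_v10_1_kiq_setting() hunk above tells the RLC which slot hosts the KIQ by packing me/pipe/queue into the low byte of RLC_CP_SCHEDULERS and then setting bit 7 in a second write. A minimal sketch of that bit layout, as read from the patch (the helper names are ours, not driver API):

```c
#include <stdint.h>

/* Low byte of RLC_CP_SCHEDULERS as programmed by mes_v10_1_kiq_setting():
 * queue in bits [2:0], pipe in bits [4:3], me in bits [6:5].
 * Helper names are illustrative only. */
static uint32_t kiq_scheduler_byte(uint32_t me, uint32_t pipe, uint32_t queue)
{
	return (me << 5) | (pipe << 3) | queue;
}

/* The second register write additionally sets bit 7 (enable). */
static uint32_t kiq_scheduler_activate(uint32_t byte)
{
	return byte | 0x80;
}
```

For the MES KIQ ring above (me=3, pipe=1, queue=0) this yields 0x68, then 0xe8 once activated.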
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 39/73] drm/amdgpu/mes: manage mes doorbell allocation
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (37 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 38/73] drm/amdgpu: enable mes kiq N-1 test on sienna cichlid Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 40/73] drm/amdgpu: add mes queue id mask v2 Alex Deucher
` (33 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
It is used to manage the doorbell allocation for mes processes and queues.
The driver first allocates a doorbell slice for the process, then the
doorbell for each queue is allocated from that process doorbell slice.
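The two-level arithmetic the patch implements can be sketched in isolation, assuming a 4 KiB page size and the constants from amdgpu_mes.c below (the standalone helper names are ours):

```c
#include <stddef.h>
#include <stdint.h>

#define ONE_DOORBELL_SIZE   8     /* bytes per 64-bit doorbell */
#define MAX_QUEUES_PER_PROC 1024
#define PAGE_SZ             4096  /* assumed page size */

/* Bytes reserved per process, page aligned
 * (mirrors amdgpu_mes_doorbell_process_slice()). */
static size_t doorbell_process_slice(void)
{
	size_t raw = ONE_DOORBELL_SIZE * MAX_QUEUES_PER_PROC;

	return (raw + PAGE_SZ - 1) / PAGE_SZ * PAGE_SZ;
}

/* Dword doorbell index for queue bit 'found' inside the slice of
 * process 'proc_idx' (mirrors amdgpu_mes_queue_doorbell_get()):
 * each queue consumes two dwords, hence the 'found * 2'. */
static uint64_t queue_doorbell_index(unsigned int proc_idx, unsigned int found)
{
	return (uint64_t)proc_idx * doorbell_process_slice() / sizeof(uint32_t) +
	       found * 2;
}
```

With these constants a process slice is 8192 bytes, so process 2's first two queues land at dword indices 4096 and 4098.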
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/Makefile | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 133 ++++++++++++++++++++++++
2 files changed, 134 insertions(+)
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 7a1b13fabebb..803e7f5dc458 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -144,6 +144,7 @@ amdgpu-y += \
# add MES block
amdgpu-y += \
+ amdgpu_mes.o \
mes_v10_1.o
# add UVD block
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
new file mode 100644
index 000000000000..1c591cb45fd9
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -0,0 +1,133 @@
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "amdgpu_mes.h"
+#include "amdgpu.h"
+#include "soc15_common.h"
+#include "amdgpu_mes_ctx.h"
+
+#define AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS 1024
+#define AMDGPU_ONE_DOORBELL_SIZE 8
+
+static int amdgpu_mes_doorbell_process_slice(struct amdgpu_device *adev)
+{
+ return roundup(AMDGPU_ONE_DOORBELL_SIZE *
+ AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS,
+ PAGE_SIZE);
+}
+
+static int amdgpu_mes_alloc_process_doorbells(struct amdgpu_device *adev,
+ struct amdgpu_mes_process *process)
+{
+ int r = ida_simple_get(&adev->mes.doorbell_ida, 2,
+ adev->mes.max_doorbell_slices,
+ GFP_KERNEL);
+ if (r > 0)
+ process->doorbell_index = r;
+
+ return r;
+}
+
+static void amdgpu_mes_free_process_doorbells(struct amdgpu_device *adev,
+ struct amdgpu_mes_process *process)
+{
+ if (process->doorbell_index)
+ ida_simple_remove(&adev->mes.doorbell_ida,
+ process->doorbell_index);
+}
+
+static int amdgpu_mes_queue_doorbell_get(struct amdgpu_device *adev,
+ struct amdgpu_mes_process *process,
+ int ip_type, uint64_t *doorbell_index)
+{
+ unsigned int offset, found;
+
+ if (ip_type == AMDGPU_RING_TYPE_SDMA) {
+ offset = adev->doorbell_index.sdma_engine[0];
+ found = find_next_zero_bit(process->doorbell_bitmap,
+ AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS,
+ offset);
+ } else {
+ found = find_first_zero_bit(process->doorbell_bitmap,
+ AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS);
+ }
+
+ if (found >= AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS) {
+ DRM_WARN("No doorbell available\n");
+ return -ENOSPC;
+ }
+
+ set_bit(found, process->doorbell_bitmap);
+
+ *doorbell_index =
+ (process->doorbell_index *
+ amdgpu_mes_doorbell_process_slice(adev)) / sizeof(u32) +
+ found * 2;
+
+ return 0;
+}
+
+static void amdgpu_mes_queue_doorbell_free(struct amdgpu_device *adev,
+ struct amdgpu_mes_process *process,
+ uint32_t doorbell_index)
+{
+ unsigned int old, doorbell_id;
+
+ doorbell_id = doorbell_index -
+ (process->doorbell_index *
+ amdgpu_mes_doorbell_process_slice(adev)) / sizeof(u32);
+ doorbell_id /= 2;
+
+ old = test_and_clear_bit(doorbell_id, process->doorbell_bitmap);
+ WARN_ON(!old);
+}
+
+static int amdgpu_mes_doorbell_init(struct amdgpu_device *adev)
+{
+ size_t doorbell_start_offset;
+ size_t doorbell_aperture_size;
+ size_t doorbell_process_limit;
+
+ doorbell_start_offset = (adev->doorbell_index.max_assignment+1) * sizeof(u32);
+ doorbell_start_offset =
+ roundup(doorbell_start_offset,
+ amdgpu_mes_doorbell_process_slice(adev));
+
+ doorbell_aperture_size = adev->doorbell.size;
+ doorbell_aperture_size =
+ rounddown(doorbell_aperture_size,
+ amdgpu_mes_doorbell_process_slice(adev));
+
+ if (doorbell_aperture_size > doorbell_start_offset)
+ doorbell_process_limit =
+ (doorbell_aperture_size - doorbell_start_offset) /
+ amdgpu_mes_doorbell_process_slice(adev);
+ else
+ return -ENOSPC;
+
+ adev->mes.doorbell_id_offset = doorbell_start_offset / sizeof(u32);
+ adev->mes.max_doorbell_slices = doorbell_process_limit;
+
+ DRM_INFO("max_doorbell_slices=%zu\n", doorbell_process_limit);
+ return 0;
+}
--
2.35.1
* [PATCH 40/73] drm/amdgpu: add mes queue id mask v2
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (38 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 39/73] drm/amdgpu/mes: manage mes doorbell allocation Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 41/73] drm/amdgpu/mes: initialize/finalize common mes structure v2 Alex Deucher
` (32 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add MES queue id mask.
v2: move queue id mask to amdgpu_mes_ctx.h
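The new mask complements AMDGPU_FENCE_MES_QUEUE_FLAG: a fence value carries the MES-queue flag in bit 24 and the queue id in the low 24 bits. A small sketch of how such a value would decode (the decode helpers are illustrative, not driver API):

```c
#include <stdbool.h>
#include <stdint.h>

#define AMDGPU_FENCE_MES_QUEUE_FLAG    0x1000000u
#define AMDGPU_FENCE_MES_QUEUE_ID_MASK (AMDGPU_FENCE_MES_QUEUE_FLAG - 1)

/* True if the fence value was emitted by an MES queue. */
static bool fence_is_mes_queue(uint32_t fence)
{
	return (fence & AMDGPU_FENCE_MES_QUEUE_FLAG) != 0;
}

/* Extract the queue id from the low 24 bits. */
static uint32_t fence_mes_queue_id(uint32_t fence)
{
	return fence & AMDGPU_FENCE_MES_QUEUE_ID_MASK;
}
```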
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
index 544f1aa86edf..c000f656aae5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes_ctx.h
@@ -116,5 +116,6 @@ struct amdgpu_mes_ctx_data {
#define AMDGPU_FENCE_MES_QUEUE_FLAG 0x1000000u
+#define AMDGPU_FENCE_MES_QUEUE_ID_MASK (AMDGPU_FENCE_MES_QUEUE_FLAG - 1)
#endif
--
2.35.1
* [PATCH 41/73] drm/amdgpu/mes: initialize/finalize common mes structure v2
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (39 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 40/73] drm/amdgpu: add mes queue id mask v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 42/73] drm/amdgpu/mes: relocate status_fence slot allocation Alex Deucher
` (31 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Le Ma, Jack Xiao, Christian König,
Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Initialize/finalize common mes structure.
v2: add mutex_init for adev->mes.mutex
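Part of the common initialization is carving out which hardware queue slots MES may own. The hqd mask setup from amdgpu_mes_init() can be sketched in isolation (the pipe counts and the wrapper function are our assumptions for illustration):

```c
#include <stdint.h>
#include <string.h>

#define MAX_COMPUTE_PIPES 8	/* assumed AMDGPU_MES_MAX_COMPUTE_PIPES */
#define MAX_GFX_PIPES     2	/* assumed AMDGPU_MES_MAX_GFX_PIPES */

/* Mirrors the mask setup in amdgpu_mes_init(): only the 1st MEC's four
 * pipes expose HQDs 2-3 (0xc) to MES, and only gfx pipe 0 exposes
 * HQDs 1-31 (0xfffffffe); HQD 0 stays reserved for the kernel rings. */
static void mes_init_hqd_masks(uint32_t *compute_hqd_mask,
			       uint32_t *gfx_hqd_mask)
{
	int i;

	memset(compute_hqd_mask, 0, MAX_COMPUTE_PIPES * sizeof(uint32_t));
	for (i = 0; i < MAX_COMPUTE_PIPES; i++) {
		/* use only 1st MEC pipes */
		if (i >= 4)
			continue;
		compute_hqd_mask[i] = 0xc;
	}

	for (i = 0; i < MAX_GFX_PIPES; i++)
		gfx_hqd_mask[i] = i ? 0 : 0xfffffffe;
}
```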
Cc: Le Ma <le.ma@amd.com>
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 72 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 4 ++
2 files changed, 76 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 1c591cb45fd9..90c400564540 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -131,3 +131,75 @@ static int amdgpu_mes_doorbell_init(struct amdgpu_device *adev)
DRM_INFO("max_doorbell_slices=%ld\n", doorbell_process_limit);
return 0;
}
+
+int amdgpu_mes_init(struct amdgpu_device *adev)
+{
+ int i, r;
+
+ adev->mes.adev = adev;
+
+ idr_init(&adev->mes.pasid_idr);
+ idr_init(&adev->mes.gang_id_idr);
+ idr_init(&adev->mes.queue_id_idr);
+ ida_init(&adev->mes.doorbell_ida);
+ spin_lock_init(&adev->mes.queue_id_lock);
+ mutex_init(&adev->mes.mutex);
+
+ adev->mes.total_max_queue = AMDGPU_FENCE_MES_QUEUE_ID_MASK;
+ adev->mes.vmid_mask_mmhub = 0xffffff00;
+ adev->mes.vmid_mask_gfxhub = 0xffffff00;
+
+ for (i = 0; i < AMDGPU_MES_MAX_COMPUTE_PIPES; i++) {
+ /* use only 1st MEC pipes */
+ if (i >= 4)
+ continue;
+ adev->mes.compute_hqd_mask[i] = 0xc;
+ }
+
+ for (i = 0; i < AMDGPU_MES_MAX_GFX_PIPES; i++)
+ adev->mes.gfx_hqd_mask[i] = i ? 0 : 0xfffffffe;
+
+ for (i = 0; i < AMDGPU_MES_MAX_SDMA_PIPES; i++)
+ adev->mes.sdma_hqd_mask[i] = i ? 0 : 0x3fc;
+
+ for (i = 0; i < AMDGPU_MES_PRIORITY_NUM_LEVELS; i++)
+ adev->mes.agreegated_doorbells[i] = 0xffffffff;
+
+ r = amdgpu_device_wb_get(adev, &adev->mes.sch_ctx_offs);
+ if (r) {
+ dev_err(adev->dev,
+ "(%d) sch_ctx_offs wb alloc failed\n", r);
+ goto error_ids;
+ }
+ adev->mes.sch_ctx_gpu_addr =
+ adev->wb.gpu_addr + (adev->mes.sch_ctx_offs * 4);
+ adev->mes.sch_ctx_ptr =
+ (uint64_t *)&adev->wb.wb[adev->mes.sch_ctx_offs];
+
+ r = amdgpu_mes_doorbell_init(adev);
+ if (r)
+ goto error;
+
+ return 0;
+
+error:
+ amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
+error_ids:
+ idr_destroy(&adev->mes.pasid_idr);
+ idr_destroy(&adev->mes.gang_id_idr);
+ idr_destroy(&adev->mes.queue_id_idr);
+ ida_destroy(&adev->mes.doorbell_ida);
+ mutex_destroy(&adev->mes.mutex);
+ return r;
+}
+
+void amdgpu_mes_fini(struct amdgpu_device *adev)
+{
+ amdgpu_device_wb_free(adev, adev->mes.sch_ctx_offs);
+
+ idr_destroy(&adev->mes.pasid_idr);
+ idr_destroy(&adev->mes.gang_id_idr);
+ idr_destroy(&adev->mes.queue_id_idr);
+ ida_destroy(&adev->mes.doorbell_ida);
+ mutex_destroy(&adev->mes.mutex);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 117c95acfd48..e64b2114c7ba 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -207,6 +207,10 @@ struct amdgpu_mes_funcs {
struct mes_resume_gang_input *input);
};
+
#define amdgpu_mes_kiq_hw_init(adev) (adev)->mes.kiq_hw_init((adev))
+int amdgpu_mes_init(struct amdgpu_device *adev);
+void amdgpu_mes_fini(struct amdgpu_device *adev);
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 42/73] drm/amdgpu/mes: relocate status_fence slot allocation
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (40 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 41/73] drm/amdgpu/mes: initialize/finalize common mes structure v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 43/73] drm/amdgpu/mes10.1: call general mes initialization Alex Deucher
` (30 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Move the status_fence slot allocation from the ip specific function
to the general mes function.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 11 +++++++++
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 33 -------------------------
2 files changed, 11 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 90c400564540..a988c232b4a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -176,6 +176,17 @@ int amdgpu_mes_init(struct amdgpu_device *adev)
adev->mes.sch_ctx_ptr =
(uint64_t *)&adev->wb.wb[adev->mes.sch_ctx_offs];
+ r = amdgpu_device_wb_get(adev, &adev->mes.query_status_fence_offs);
+ if (r) {
+ dev_err(adev->dev,
+ "(%d) query_status_fence_offs wb alloc failed\n", r);
+ return r;
+ }
+ adev->mes.query_status_fence_gpu_addr =
+ adev->wb.gpu_addr + (adev->mes.query_status_fence_offs * 4);
+ adev->mes.query_status_fence_ptr =
+ (uint64_t *)&adev->wb.wb[adev->mes.query_status_fence_offs];
+
r = amdgpu_mes_doorbell_init(adev);
if (r)
goto error;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index fecf3f26bf7c..d77242e0360e 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -606,35 +606,6 @@ static int mes_v10_1_allocate_eop_buf(struct amdgpu_device *adev,
return 0;
}
-static int mes_v10_1_allocate_mem_slots(struct amdgpu_device *adev)
-{
- int r;
-
- r = amdgpu_device_wb_get(adev, &adev->mes.sch_ctx_offs);
- if (r) {
- dev_err(adev->dev,
- "(%d) mes sch_ctx_offs wb alloc failed\n", r);
- return r;
- }
- adev->mes.sch_ctx_gpu_addr =
- adev->wb.gpu_addr + (adev->mes.sch_ctx_offs * 4);
- adev->mes.sch_ctx_ptr =
- (uint64_t *)&adev->wb.wb[adev->mes.sch_ctx_offs];
-
- r = amdgpu_device_wb_get(adev, &adev->mes.query_status_fence_offs);
- if (r) {
- dev_err(adev->dev,
- "(%d) query_status_fence_offs wb alloc failed\n", r);
- return r;
- }
- adev->mes.query_status_fence_gpu_addr =
- adev->wb.gpu_addr + (adev->mes.query_status_fence_offs * 4);
- adev->mes.query_status_fence_ptr =
- (uint64_t *)&adev->wb.wb[adev->mes.query_status_fence_offs];
-
- return 0;
-}
-
static int mes_v10_1_mqd_init(struct amdgpu_ring *ring)
{
struct amdgpu_device *adev = ring->adev;
@@ -991,10 +962,6 @@ static int mes_v10_1_sw_init(void *handle)
if (r)
return r;
- r = mes_v10_1_allocate_mem_slots(adev);
- if (r)
- return r;
-
return 0;
}
--
2.35.1
* [PATCH 43/73] drm/amdgpu/mes10.1: call general mes initialization
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (41 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 42/73] drm/amdgpu/mes: relocate status_fence slot allocation Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 44/73] drm/amdgpu/mes10.1: add delay after mes engine enable Alex Deucher
` (29 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Call general mes initialization/finalization.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index d77242e0360e..94812164998a 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -935,6 +935,10 @@ static int mes_v10_1_sw_init(void *handle)
adev->mes.funcs = &mes_v10_1_funcs;
adev->mes.kiq_hw_init = &mes_v10_1_kiq_hw_init;
+ r = amdgpu_mes_init(adev);
+ if (r)
+ return r;
+
for (pipe = 0; pipe < AMDGPU_MAX_MES_PIPES; pipe++) {
if (!adev->enable_mes_kiq && pipe == AMDGPU_MES_KIQ_PIPE)
continue;
@@ -994,6 +998,7 @@ static int mes_v10_1_sw_fini(void *handle)
amdgpu_ring_fini(&adev->gfx.kiq.ring);
amdgpu_ring_fini(&adev->mes.ring);
+ amdgpu_mes_fini(adev);
return 0;
}
--
2.35.1
* [PATCH 44/73] drm/amdgpu/mes10.1: add delay after mes engine enable
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (42 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 43/73] drm/amdgpu/mes10.1: call general mes initialization Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 45/73] drm/amdgpu/mes10.1: implement the suspend/resume routine Alex Deucher
` (28 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add a delay after enabling the mes engine, since it needs more time
to complete engine initialization.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index 94812164998a..d4e64c5a3215 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -473,6 +473,7 @@ static void mes_v10_1_enable(struct amdgpu_device *adev, bool enable)
data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE1_ACTIVE,
adev->enable_mes_kiq ? 1 : 0);
WREG32_SOC15(GC, 0, mmCP_MES_CNTL, data);
+ udelay(50);
} else {
data = RREG32_SOC15(GC, 0, mmCP_MES_CNTL);
data = REG_SET_FIELD(data, CP_MES_CNTL, MES_PIPE0_ACTIVE, 0);
--
2.35.1
* [PATCH 45/73] drm/amdgpu/mes10.1: implement the suspend/resume routine
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (43 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 44/73] drm/amdgpu/mes10.1: add delay after mes engine enable Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 46/73] drm/amdgpu/mes: implement creating mes process v2 Alex Deucher
` (27 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Implement the suspend/resume routine of mes.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index d4e64c5a3215..d468cb5a8854 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -1120,12 +1120,26 @@ static int mes_v10_1_hw_fini(void *handle)
static int mes_v10_1_suspend(void *handle)
{
- return 0;
+ int r;
+ struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+ r = amdgpu_mes_suspend(adev);
+ if (r)
+ return r;
+
+ return mes_v10_1_hw_fini(adev);
}
static int mes_v10_1_resume(void *handle)
{
- return 0;
+ int r;
+ struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+ r = mes_v10_1_hw_init(adev);
+ if (r)
+ return r;
+
+ return amdgpu_mes_resume(adev);
}
static const struct amd_ip_funcs mes_v10_1_ip_funcs = {
--
2.35.1
* [PATCH 46/73] drm/amdgpu/mes: implement creating mes process v2
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (44 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 45/73] drm/amdgpu/mes10.1: implement the suspend/resume routine Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 47/73] drm/amdgpu/mes: implement destroying mes process Alex Deucher
` (26 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Create a mes process which contains process-related resources,
such as the vm, doorbell bitmap, process ctx bo, etc.
v2: move the simple variable to the end
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 77 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 3 +
2 files changed, 80 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index a988c232b4a3..55005a594be1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -214,3 +214,80 @@ void amdgpu_mes_fini(struct amdgpu_device *adev)
ida_destroy(&adev->mes.doorbell_ida);
mutex_destroy(&adev->mes.mutex);
}
+
+int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
+ struct amdgpu_vm *vm)
+{
+ struct amdgpu_mes_process *process;
+ int r;
+
+ mutex_lock(&adev->mes.mutex);
+
+ /* allocate the mes process buffer */
+ process = kzalloc(sizeof(struct amdgpu_mes_process), GFP_KERNEL);
+ if (!process) {
+ DRM_ERROR("no more memory to create mes process\n");
+ mutex_unlock(&adev->mes.mutex);
+ return -ENOMEM;
+ }
+
+ process->doorbell_bitmap =
+ kzalloc(DIV_ROUND_UP(AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS,
+ BITS_PER_BYTE), GFP_KERNEL);
+ if (!process->doorbell_bitmap) {
+ DRM_ERROR("failed to allocate doorbell bitmap\n");
+ kfree(process);
+ mutex_unlock(&adev->mes.mutex);
+ return -ENOMEM;
+ }
+
+ /* add the mes process to idr list */
+ r = idr_alloc(&adev->mes.pasid_idr, process, pasid, pasid + 1,
+ GFP_KERNEL);
+ if (r < 0) {
+ DRM_ERROR("failed to lock pasid=%d\n", pasid);
+ goto clean_up_memory;
+ }
+
+ /* allocate the process context bo and map it */
+ r = amdgpu_bo_create_kernel(adev, AMDGPU_MES_PROC_CTX_SIZE, PAGE_SIZE,
+ AMDGPU_GEM_DOMAIN_GTT,
+ &process->proc_ctx_bo,
+ &process->proc_ctx_gpu_addr,
+ &process->proc_ctx_cpu_ptr);
+ if (r) {
+ DRM_ERROR("failed to allocate process context bo\n");
+ goto clean_up_pasid;
+ }
+ memset(process->proc_ctx_cpu_ptr, 0, AMDGPU_MES_PROC_CTX_SIZE);
+
+ /* allocate the starting doorbell index of the process */
+ r = amdgpu_mes_alloc_process_doorbells(adev, process);
+ if (r < 0) {
+ DRM_ERROR("failed to allocate doorbell for process\n");
+ goto clean_up_ctx;
+ }
+
+ DRM_DEBUG("process doorbell index = %d\n", process->doorbell_index);
+
+ INIT_LIST_HEAD(&process->gang_list);
+ process->vm = vm;
+ process->pasid = pasid;
+ process->process_quantum = adev->mes.default_process_quantum;
+ process->pd_gpu_addr = amdgpu_bo_gpu_offset(vm->root.bo);
+
+ mutex_unlock(&adev->mes.mutex);
+ return 0;
+
+clean_up_ctx:
+ amdgpu_bo_free_kernel(&process->proc_ctx_bo,
+ &process->proc_ctx_gpu_addr,
+ &process->proc_ctx_cpu_ptr);
+clean_up_pasid:
+ idr_remove(&adev->mes.pasid_idr, pasid);
+clean_up_memory:
+ kfree(process->doorbell_bitmap);
+ kfree(process);
+ mutex_unlock(&adev->mes.mutex);
+ return r;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index e64b2114c7ba..010a9727cbec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -213,4 +213,7 @@ struct amdgpu_mes_funcs {
int amdgpu_mes_init(struct amdgpu_device *adev);
void amdgpu_mes_fini(struct amdgpu_device *adev);
+int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
+ struct amdgpu_vm *vm);
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 47/73] drm/amdgpu/mes: implement destroying mes process
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (45 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 46/73] drm/amdgpu/mes: implement creating mes process v2 Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:45 ` [PATCH 48/73] drm/amdgpu/mes: implement adding mes gang Alex Deucher
` (25 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Destroy the mes process and free the resources of the process.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 58 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 +
2 files changed, 59 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 55005a594be1..05e27636ce20 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -291,3 +291,61 @@ int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
mutex_unlock(&adev->mes.mutex);
return r;
}
+
+void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid)
+{
+ struct amdgpu_mes_process *process;
+ struct amdgpu_mes_gang *gang, *tmp1;
+ struct amdgpu_mes_queue *queue, *tmp2;
+ struct mes_remove_queue_input queue_input;
+ unsigned long flags;
+ int r;
+
+ mutex_lock(&adev->mes.mutex);
+
+ process = idr_find(&adev->mes.pasid_idr, pasid);
+ if (!process) {
+ DRM_WARN("pasid %d doesn't exist\n", pasid);
+ mutex_unlock(&adev->mes.mutex);
+ return;
+ }
+
+ /* free all gangs in the process */
+ list_for_each_entry_safe(gang, tmp1, &process->gang_list, list) {
+ /* free all queues in the gang */
+ list_for_each_entry_safe(queue, tmp2, &gang->queue_list, list) {
+ spin_lock_irqsave(&adev->mes.queue_id_lock, flags);
+ idr_remove(&adev->mes.queue_id_idr, queue->queue_id);
+ spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+
+ queue_input.doorbell_offset = queue->doorbell_off;
+ queue_input.gang_context_addr = gang->gang_ctx_gpu_addr;
+
+ r = adev->mes.funcs->remove_hw_queue(&adev->mes,
+ &queue_input);
+ if (r)
+ DRM_WARN("failed to remove hardware queue\n");
+
+ list_del(&queue->list);
+ kfree(queue);
+ }
+
+ idr_remove(&adev->mes.gang_id_idr, gang->gang_id);
+ amdgpu_bo_free_kernel(&gang->gang_ctx_bo,
+ &gang->gang_ctx_gpu_addr,
+ &gang->gang_ctx_cpu_ptr);
+ list_del(&gang->list);
+ kfree(gang);
+ }
+
+ amdgpu_mes_free_process_doorbells(adev, process);
+
+ idr_remove(&adev->mes.pasid_idr, pasid);
+ amdgpu_bo_free_kernel(&process->proc_ctx_bo,
+ &process->proc_ctx_gpu_addr,
+ &process->proc_ctx_cpu_ptr);
+ kfree(process->doorbell_bitmap);
+ kfree(process);
+
+ mutex_unlock(&adev->mes.mutex);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 010a9727cbec..fa2f47e4cd5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -215,5 +215,6 @@ void amdgpu_mes_fini(struct amdgpu_device *adev);
int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
struct amdgpu_vm *vm);
+void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid);
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
* [PATCH 48/73] drm/amdgpu/mes: implement adding mes gang
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (46 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 47/73] drm/amdgpu/mes: implement destroying mes process Alex Deucher
@ 2022-04-29 17:45 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 49/73] drm/amdgpu/mes: implement removing " Alex Deucher
` (24 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:45 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
A gang is a group of queues of the same type; it is the scheduling
unit of the MES hardware scheduler.
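At a pseudocode level (kernel context, not standalone-compilable), a caller creates a MES process for a pasid first and then attaches gangs to it with the new helper. The property names below come from `struct amdgpu_mes_gang_properties` added by this patch; the priority value is illustrative:

```c
/* Sketch: create a MES process, then add one gang to it.
 * Error handling trimmed for brevity. */
struct amdgpu_mes_gang_properties gprops = {
	.priority = 0,          /* illustrative */
	.gang_quantum = 0,      /* 0 -> adev->mes.default_gang_quantum */
};
int gang_id;

amdgpu_mes_create_process(adev, pasid, vm);
amdgpu_mes_add_gang(adev, pasid, &gprops, &gang_id);
/* ... add queues to the gang; the counterpart
 * amdgpu_mes_remove_gang() arrives in the next patch ... */
```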
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 67 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 12 +++++
2 files changed, 79 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 05e27636ce20..74385e4b45c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -349,3 +349,70 @@ void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid)
mutex_unlock(&adev->mes.mutex);
}
+
+int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
+ struct amdgpu_mes_gang_properties *gprops,
+ int *gang_id)
+{
+ struct amdgpu_mes_process *process;
+ struct amdgpu_mes_gang *gang;
+ int r;
+
+ mutex_lock(&adev->mes.mutex);
+
+ process = idr_find(&adev->mes.pasid_idr, pasid);
+ if (!process) {
+ DRM_ERROR("pasid %d doesn't exist\n", pasid);
+ mutex_unlock(&adev->mes.mutex);
+ return -EINVAL;
+ }
+
+ /* allocate the mes gang buffer */
+ gang = kzalloc(sizeof(struct amdgpu_mes_gang), GFP_KERNEL);
+ if (!gang) {
+ mutex_unlock(&adev->mes.mutex);
+ return -ENOMEM;
+ }
+
+ /* add the mes gang to idr list */
+ r = idr_alloc(&adev->mes.gang_id_idr, gang, 1, 0,
+ GFP_KERNEL);
+ if (r < 0) {
+ kfree(gang);
+ mutex_unlock(&adev->mes.mutex);
+ return r;
+ }
+
+ gang->gang_id = r;
+ *gang_id = r;
+
+ /* allocate the gang context bo and map it to cpu space */
+ r = amdgpu_bo_create_kernel(adev, AMDGPU_MES_GANG_CTX_SIZE, PAGE_SIZE,
+ AMDGPU_GEM_DOMAIN_GTT,
+ &gang->gang_ctx_bo,
+ &gang->gang_ctx_gpu_addr,
+ &gang->gang_ctx_cpu_ptr);
+ if (r) {
+ DRM_ERROR("failed to allocate gang context bo\n");
+ goto clean_up;
+ }
+ memset(gang->gang_ctx_cpu_ptr, 0, AMDGPU_MES_GANG_CTX_SIZE);
+
+ INIT_LIST_HEAD(&gang->queue_list);
+ gang->process = process;
+ gang->priority = gprops->priority;
+ gang->gang_quantum = gprops->gang_quantum ?
+ gprops->gang_quantum : adev->mes.default_gang_quantum;
+ gang->global_priority_level = gprops->global_priority_level;
+ gang->inprocess_gang_priority = gprops->inprocess_gang_priority;
+ list_add_tail(&gang->list, &process->gang_list);
+
+ mutex_unlock(&adev->mes.mutex);
+ return 0;
+
+clean_up:
+ idr_remove(&adev->mes.gang_id_idr, gang->gang_id);
+ kfree(gang);
+ mutex_unlock(&adev->mes.mutex);
+ return r;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index fa2f47e4cd5a..3109bd1db6bc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -158,6 +158,14 @@ struct amdgpu_mes_queue {
struct amdgpu_ring *ring;
};
+struct amdgpu_mes_gang_properties {
+ uint32_t priority;
+ uint32_t gang_quantum;
+ uint32_t inprocess_gang_priority;
+ uint32_t priority_level;
+ int global_priority_level;
+};
+
struct mes_add_queue_input {
uint32_t process_id;
uint64_t page_table_base_addr;
@@ -217,4 +225,8 @@ int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
struct amdgpu_vm *vm);
void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid);
+int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
+ struct amdgpu_mes_gang_properties *gprops,
+ int *gang_id);
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 49/73] drm/amdgpu/mes: implement removing mes gang
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (47 preceding siblings ...)
2022-04-29 17:45 ` [PATCH 48/73] drm/amdgpu/mes: implement adding mes gang Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 50/73] drm/amdgpu/mes: implement suspending all gangs Alex Deucher
` (23 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Free the mes gang and its resources.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 30 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 +
2 files changed, 31 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 74385e4b45c4..07ddf7bf6a3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -416,3 +416,33 @@ int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
mutex_unlock(&adev->mes.mutex);
return r;
}
+
+int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id)
+{
+ struct amdgpu_mes_gang *gang;
+
+ mutex_lock(&adev->mes.mutex);
+
+ gang = idr_find(&adev->mes.gang_id_idr, gang_id);
+ if (!gang) {
+ DRM_ERROR("gang id %d doesn't exist\n", gang_id);
+ mutex_unlock(&adev->mes.mutex);
+ return -EINVAL;
+ }
+
+ if (!list_empty(&gang->queue_list)) {
+ DRM_ERROR("queue list is not empty\n");
+ mutex_unlock(&adev->mes.mutex);
+ return -EBUSY;
+ }
+
+ idr_remove(&adev->mes.gang_id_idr, gang->gang_id);
+ amdgpu_bo_free_kernel(&gang->gang_ctx_bo,
+ &gang->gang_ctx_gpu_addr,
+ &gang->gang_ctx_cpu_ptr);
+ list_del(&gang->list);
+ kfree(gang);
+
+ mutex_unlock(&adev->mes.mutex);
+ return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 3109bd1db6bc..f401a0a3eebd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -228,5 +228,6 @@ void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid);
int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
struct amdgpu_mes_gang_properties *gprops,
int *gang_id);
+int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id);
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 50/73] drm/amdgpu/mes: implement suspending all gangs
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (48 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 49/73] drm/amdgpu/mes: implement removing " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 51/73] drm/amdgpu/mes: implement resuming " Alex Deucher
` (22 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Implement suspending all gangs.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 25 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 2 ++
2 files changed, 27 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 07ddf7bf6a3b..e64f2a4b5a3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -446,3 +446,28 @@ int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id)
mutex_unlock(&adev->mes.mutex);
return 0;
}
+
+int amdgpu_mes_suspend(struct amdgpu_device *adev)
+{
+ struct idr *idp;
+ struct amdgpu_mes_process *process;
+ struct amdgpu_mes_gang *gang;
+ struct mes_suspend_gang_input input;
+ int r, pasid;
+
+ mutex_lock(&adev->mes.mutex);
+
+ idp = &adev->mes.pasid_idr;
+
+ idr_for_each_entry(idp, process, pasid) {
+ list_for_each_entry(gang, &process->gang_list, list) {
+ r = adev->mes.funcs->suspend_gang(&adev->mes, &input);
+ if (r)
+ DRM_ERROR("failed to suspend pasid %d gangid %d",
+ pasid, gang->gang_id);
+ }
+ }
+
+ mutex_unlock(&adev->mes.mutex);
+ return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index f401a0a3eebd..667fc9f9b21b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -230,4 +230,6 @@ int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
int *gang_id);
int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id);
+int amdgpu_mes_suspend(struct amdgpu_device *adev);
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 51/73] drm/amdgpu/mes: implement resuming all gangs
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (49 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 50/73] drm/amdgpu/mes: implement suspending all gangs Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 52/73] drm/amdgpu/mes: initialize mqd from queue properties Alex Deucher
` (21 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Implement resuming all gangs.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 25 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 +
2 files changed, 26 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index e64f2a4b5a3b..b58af81f04a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -471,3 +471,28 @@ int amdgpu_mes_suspend(struct amdgpu_device *adev)
mutex_unlock(&adev->mes.mutex);
return 0;
}
+
+int amdgpu_mes_resume(struct amdgpu_device *adev)
+{
+ struct idr *idp;
+ struct amdgpu_mes_process *process;
+ struct amdgpu_mes_gang *gang;
+ struct mes_resume_gang_input input;
+ int r, pasid;
+
+ mutex_lock(&adev->mes.mutex);
+
+ idp = &adev->mes.pasid_idr;
+
+ idr_for_each_entry(idp, process, pasid) {
+ list_for_each_entry(gang, &process->gang_list, list) {
+ r = adev->mes.funcs->resume_gang(&adev->mes, &input);
+ if (r)
+ DRM_ERROR("failed to resume pasid %d gangid %d",
+ pasid, gang->gang_id);
+ }
+ }
+
+ mutex_unlock(&adev->mes.mutex);
+ return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 667fc9f9b21b..43d3a689732a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -231,5 +231,6 @@ int amdgpu_mes_add_gang(struct amdgpu_device *adev, int pasid,
int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id);
int amdgpu_mes_suspend(struct amdgpu_device *adev);
+int amdgpu_mes_resume(struct amdgpu_device *adev);
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 52/73] drm/amdgpu/mes: initialize mqd from queue properties
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (50 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 51/73] drm/amdgpu/mes: implement resuming " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 53/73] drm/amdgpu/mes: implement adding mes queue Alex Deucher
` (20 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add a helper function to initialize an MQD from queue properties.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 54 +++++++++++++++++++++++++
1 file changed, 54 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b58af81f04a3..2cd2fa76b5c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -496,3 +496,57 @@ int amdgpu_mes_resume(struct amdgpu_device *adev)
mutex_unlock(&adev->mes.mutex);
return 0;
}
+
+static int amdgpu_mes_queue_init_mqd(struct amdgpu_device *adev,
+ struct amdgpu_mes_queue *q,
+ struct amdgpu_mes_queue_properties *p)
+{
+ struct amdgpu_mqd *mqd_mgr = &adev->mqds[p->queue_type];
+ u32 mqd_size = mqd_mgr->mqd_size;
+ struct amdgpu_mqd_prop mqd_prop = {0};
+ int r;
+
+ r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
+ AMDGPU_GEM_DOMAIN_GTT,
+ &q->mqd_obj,
+ &q->mqd_gpu_addr, &q->mqd_cpu_ptr);
+ if (r) {
+ dev_warn(adev->dev, "failed to create queue mqd bo (%d)", r);
+ return r;
+ }
+ memset(q->mqd_cpu_ptr, 0, mqd_size);
+
+ mqd_prop.mqd_gpu_addr = q->mqd_gpu_addr;
+ mqd_prop.hqd_base_gpu_addr = p->hqd_base_gpu_addr;
+ mqd_prop.rptr_gpu_addr = p->rptr_gpu_addr;
+ mqd_prop.wptr_gpu_addr = p->wptr_gpu_addr;
+ mqd_prop.queue_size = p->queue_size;
+ mqd_prop.use_doorbell = true;
+ mqd_prop.doorbell_index = p->doorbell_off;
+ mqd_prop.eop_gpu_addr = p->eop_gpu_addr;
+ mqd_prop.hqd_pipe_priority = p->hqd_pipe_priority;
+ mqd_prop.hqd_queue_priority = p->hqd_queue_priority;
+ mqd_prop.hqd_active = false;
+
+ r = amdgpu_bo_reserve(q->mqd_obj, false);
+ if (unlikely(r != 0))
+ goto clean_up;
+
+ mqd_mgr->init_mqd(adev, q->mqd_cpu_ptr, &mqd_prop);
+
+ amdgpu_bo_unreserve(q->mqd_obj);
+ return 0;
+
+clean_up:
+ amdgpu_bo_free_kernel(&q->mqd_obj,
+ &q->mqd_gpu_addr,
+ &q->mqd_cpu_ptr);
+ return r;
+}
+
+static void amdgpu_mes_queue_free_mqd(struct amdgpu_mes_queue *q)
+{
+ amdgpu_bo_free_kernel(&q->mqd_obj,
+ &q->mqd_gpu_addr,
+ &q->mqd_cpu_ptr);
+}
--
2.35.1
* [PATCH 53/73] drm/amdgpu/mes: implement adding mes queue
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (51 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 52/73] drm/amdgpu/mes: initialize mqd from queue properties Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 54/73] drm/amdgpu/mes: implement removing " Alex Deucher
` (19 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Allocate the resources needed by the queue and add it to MES
for scheduling.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 105 ++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 19 +++++
2 files changed, 124 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 2cd2fa76b5c8..9f059c32c6c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -550,3 +550,108 @@ static void amdgpu_mes_queue_free_mqd(struct amdgpu_mes_queue *q)
&q->mqd_gpu_addr,
&q->mqd_cpu_ptr);
}
+
+int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
+ struct amdgpu_mes_queue_properties *qprops,
+ int *queue_id)
+{
+ struct amdgpu_mes_queue *queue;
+ struct amdgpu_mes_gang *gang;
+ struct mes_add_queue_input queue_input;
+ unsigned long flags;
+ int r;
+
+ mutex_lock(&adev->mes.mutex);
+
+ gang = idr_find(&adev->mes.gang_id_idr, gang_id);
+ if (!gang) {
+ DRM_ERROR("gang id %d doesn't exist\n", gang_id);
+ mutex_unlock(&adev->mes.mutex);
+ return -EINVAL;
+ }
+
+ /* allocate the mes queue buffer */
+ queue = kzalloc(sizeof(struct amdgpu_mes_queue), GFP_KERNEL);
+ if (!queue) {
+ mutex_unlock(&adev->mes.mutex);
+ return -ENOMEM;
+ }
+
+ /* add the mes queue to idr list */
+ spin_lock_irqsave(&adev->mes.queue_id_lock, flags);
+ r = idr_alloc(&adev->mes.queue_id_idr, queue, 1, 0,
+ GFP_ATOMIC);
+ if (r < 0) {
+ spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+ goto clean_up_memory;
+ }
+ spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+ *queue_id = queue->queue_id = r;
+
+ /* allocate a doorbell index for the queue */
+ r = amdgpu_mes_queue_doorbell_get(adev, gang->process,
+ qprops->queue_type,
+ &qprops->doorbell_off);
+ if (r)
+ goto clean_up_queue_id;
+
+ /* initialize the queue mqd */
+ r = amdgpu_mes_queue_init_mqd(adev, queue, qprops);
+ if (r)
+ goto clean_up_doorbell;
+
+ /* add hw queue to mes */
+ queue_input.process_id = gang->process->pasid;
+ queue_input.page_table_base_addr = gang->process->pd_gpu_addr;
+ queue_input.process_va_start = 0;
+ queue_input.process_va_end =
+ (adev->vm_manager.max_pfn - 1) << AMDGPU_GPU_PAGE_SHIFT;
+ queue_input.process_quantum = gang->process->process_quantum;
+ queue_input.process_context_addr = gang->process->proc_ctx_gpu_addr;
+ queue_input.gang_quantum = gang->gang_quantum;
+ queue_input.gang_context_addr = gang->gang_ctx_gpu_addr;
+ queue_input.inprocess_gang_priority = gang->inprocess_gang_priority;
+ queue_input.gang_global_priority_level = gang->global_priority_level;
+ queue_input.doorbell_offset = qprops->doorbell_off;
+ queue_input.mqd_addr = queue->mqd_gpu_addr;
+ queue_input.wptr_addr = qprops->wptr_gpu_addr;
+ queue_input.queue_type = qprops->queue_type;
+ queue_input.paging = qprops->paging;
+
+ r = adev->mes.funcs->add_hw_queue(&adev->mes, &queue_input);
+ if (r) {
+ DRM_ERROR("failed to add hardware queue to MES, doorbell=0x%llx\n",
+ qprops->doorbell_off);
+ goto clean_up_mqd;
+ }
+
+ DRM_DEBUG("MES hw queue was added, pasid=%d, gang id=%d, "
+ "queue type=%d, doorbell=0x%llx\n",
+ gang->process->pasid, gang_id, qprops->queue_type,
+ qprops->doorbell_off);
+
+ queue->ring = qprops->ring;
+ queue->doorbell_off = qprops->doorbell_off;
+ queue->wptr_gpu_addr = qprops->wptr_gpu_addr;
+ queue->queue_type = qprops->queue_type;
+ queue->paging = qprops->paging;
+ queue->gang = gang;
+ list_add_tail(&queue->list, &gang->queue_list);
+
+ mutex_unlock(&adev->mes.mutex);
+ return 0;
+
+clean_up_mqd:
+ amdgpu_mes_queue_free_mqd(queue);
+clean_up_doorbell:
+ amdgpu_mes_queue_doorbell_free(adev, gang->process,
+ qprops->doorbell_off);
+clean_up_queue_id:
+ spin_lock_irqsave(&adev->mes.queue_id_lock, flags);
+ idr_remove(&adev->mes.queue_id_idr, queue->queue_id);
+ spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+clean_up_memory:
+ kfree(queue);
+ mutex_unlock(&adev->mes.mutex);
+ return r;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 43d3a689732a..ec727c2109bc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -158,6 +158,21 @@ struct amdgpu_mes_queue {
struct amdgpu_ring *ring;
};
+struct amdgpu_mes_queue_properties {
+ int queue_type;
+ uint64_t hqd_base_gpu_addr;
+ uint64_t rptr_gpu_addr;
+ uint64_t wptr_gpu_addr;
+ uint32_t queue_size;
+ uint64_t eop_gpu_addr;
+ uint32_t hqd_pipe_priority;
+ uint32_t hqd_queue_priority;
+ bool paging;
+ struct amdgpu_ring *ring;
+ /* out */
+ uint64_t doorbell_off;
+};
+
struct amdgpu_mes_gang_properties {
uint32_t priority;
uint32_t gang_quantum;
@@ -233,4 +248,8 @@ int amdgpu_mes_remove_gang(struct amdgpu_device *adev, int gang_id);
int amdgpu_mes_suspend(struct amdgpu_device *adev);
int amdgpu_mes_resume(struct amdgpu_device *adev);
+int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
+ struct amdgpu_mes_queue_properties *qprops,
+ int *queue_id);
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 54/73] drm/amdgpu/mes: implement removing mes queue
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (52 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 53/73] drm/amdgpu/mes: implement adding mes queue Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 55/73] drm/amdgpu/mes: add helper function to convert ring to queue property Alex Deucher
` (18 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Remove the MES queue from MES scheduling and free its resources.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 45 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 1 +
2 files changed, 46 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 9f059c32c6c2..df0e542bd687 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -655,3 +655,48 @@ int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
mutex_unlock(&adev->mes.mutex);
return r;
}
+
+int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id)
+{
+ unsigned long flags;
+ struct amdgpu_mes_queue *queue;
+ struct amdgpu_mes_gang *gang;
+ struct mes_remove_queue_input queue_input;
+ int r;
+
+ mutex_lock(&adev->mes.mutex);
+
+ /* remove the mes queue from idr list */
+ spin_lock_irqsave(&adev->mes.queue_id_lock, flags);
+
+ queue = idr_find(&adev->mes.queue_id_idr, queue_id);
+ if (!queue) {
+ spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+ mutex_unlock(&adev->mes.mutex);
+ DRM_ERROR("queue id %d doesn't exist\n", queue_id);
+ return -EINVAL;
+ }
+
+ idr_remove(&adev->mes.queue_id_idr, queue_id);
+ spin_unlock_irqrestore(&adev->mes.queue_id_lock, flags);
+
+ DRM_DEBUG("try to remove queue, doorbell off = 0x%llx\n",
+ queue->doorbell_off);
+
+ gang = queue->gang;
+ queue_input.doorbell_offset = queue->doorbell_off;
+ queue_input.gang_context_addr = gang->gang_ctx_gpu_addr;
+
+ r = adev->mes.funcs->remove_hw_queue(&adev->mes, &queue_input);
+ if (r)
+ DRM_ERROR("failed to remove hardware queue, queue id = %d\n",
+ queue_id);
+
+ amdgpu_mes_queue_free_mqd(queue);
+ list_del(&queue->list);
+ amdgpu_mes_queue_doorbell_free(adev, gang->process,
+ queue->doorbell_off);
+ kfree(queue);
+ mutex_unlock(&adev->mes.mutex);
+ return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index ec727c2109bc..bf90863852a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -251,5 +251,6 @@ int amdgpu_mes_resume(struct amdgpu_device *adev);
int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
struct amdgpu_mes_queue_properties *qprops,
int *queue_id);
+int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id);
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 55/73] drm/amdgpu/mes: add helper function to convert ring to queue property
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (53 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 54/73] drm/amdgpu/mes: implement removing " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 56/73] drm/amdgpu/mes: add helper function to get the ctx meta data offset Alex Deucher
` (17 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add a helper function to convert a ring's configuration to queue properties.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index df0e542bd687..8cb74d0d0a1f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -700,3 +700,20 @@ int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id)
mutex_unlock(&adev->mes.mutex);
return 0;
}
+
+static void
+amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev,
+ struct amdgpu_ring *ring,
+ struct amdgpu_mes_queue_properties *props)
+{
+ props->queue_type = ring->funcs->type;
+ props->hqd_base_gpu_addr = ring->gpu_addr;
+ props->rptr_gpu_addr = ring->rptr_gpu_addr;
+ props->wptr_gpu_addr = ring->wptr_gpu_addr;
+ props->queue_size = ring->ring_size;
+ props->eop_gpu_addr = ring->eop_gpu_addr;
+ props->hqd_pipe_priority = AMDGPU_GFX_PIPE_PRIO_NORMAL;
+ props->hqd_queue_priority = AMDGPU_GFX_QUEUE_PRIORITY_MINIMUM;
+ props->paging = false;
+ props->ring = ring;
+}
--
2.35.1
* [PATCH 56/73] drm/amdgpu/mes: add helper function to get the ctx meta data offset
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (54 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 55/73] drm/amdgpu/mes: add helper function to convert ring to queue property Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 57/73] drm/amdgpu/mes: use ring for kernel queue submission Alex Deucher
` (16 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add the helper function to get the corresponding ctx meta data offset.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 36 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 3 ++-
2 files changed, 38 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 8cb74d0d0a1f..4e99adcfbb0e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -717,3 +717,39 @@ amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev,
props->paging = false;
props->ring = ring;
}
+
+#define DEFINE_AMDGPU_MES_CTX_GET_OFFS_ENG(_eng) \
+do { \
+ if (id_offs < AMDGPU_MES_CTX_MAX_OFFS) \
+ return offsetof(struct amdgpu_mes_ctx_meta_data, \
+ _eng[ring->idx].slots[id_offs]); \
+ else if (id_offs == AMDGPU_MES_CTX_RING_OFFS) \
+ return offsetof(struct amdgpu_mes_ctx_meta_data, \
+ _eng[ring->idx].ring); \
+ else if (id_offs == AMDGPU_MES_CTX_IB_OFFS) \
+ return offsetof(struct amdgpu_mes_ctx_meta_data, \
+ _eng[ring->idx].ib); \
+ else if (id_offs == AMDGPU_MES_CTX_PADDING_OFFS) \
+ return offsetof(struct amdgpu_mes_ctx_meta_data, \
+ _eng[ring->idx].padding); \
+} while(0)
+
+int amdgpu_mes_ctx_get_offs(struct amdgpu_ring *ring, unsigned int id_offs)
+{
+ switch (ring->funcs->type) {
+ case AMDGPU_RING_TYPE_GFX:
+ DEFINE_AMDGPU_MES_CTX_GET_OFFS_ENG(gfx);
+ break;
+ case AMDGPU_RING_TYPE_COMPUTE:
+ DEFINE_AMDGPU_MES_CTX_GET_OFFS_ENG(compute);
+ break;
+ case AMDGPU_RING_TYPE_SDMA:
+ DEFINE_AMDGPU_MES_CTX_GET_OFFS_ENG(sdma);
+ break;
+ default:
+ break;
+ }
+
+ WARN_ON(1);
+ return -EINVAL;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index bf90863852a7..36684416f277 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -230,9 +230,10 @@ struct amdgpu_mes_funcs {
struct mes_resume_gang_input *input);
};
-
#define amdgpu_mes_kiq_hw_init(adev) (adev)->mes.kiq_hw_init((adev))
+int amdgpu_mes_ctx_get_offs(struct amdgpu_ring *ring, unsigned int id_offs);
+
int amdgpu_mes_init(struct amdgpu_device *adev);
void amdgpu_mes_fini(struct amdgpu_device *adev);
--
2.35.1
* [PATCH 57/73] drm/amdgpu/mes: use ring for kernel queue submission
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (55 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 56/73] drm/amdgpu/mes: add helper function to get the ctx meta data offset Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 58/73] drm/amdgpu/mes: implement removing mes ring Alex Deucher
` (15 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Use a ring as the front end for kernel queue submission.
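In pseudocode (kernel context, not standalone-compilable), a kernel user allocates the per-context meta data introduced earlier in this series and then asks MES for a ring backed by a hardware queue:

```c
/* Sketch: back a kernel-side ring with a MES hardware queue.
 * gang_id comes from a prior amdgpu_mes_add_gang(); ctx_data is the
 * per-context meta data. Error handling trimmed for brevity. */
struct amdgpu_ring *ring;
int r;

r = amdgpu_mes_add_ring(adev, gang_id, AMDGPU_RING_TYPE_COMPUTE,
			0 /* idx */, ctx_data, &ring);
if (!r) {
	/* submit work through `ring` as with any other amdgpu ring */
}
```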
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 93 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 5 ++
2 files changed, 98 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 4e99adcfbb0e..827391fcb2a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -753,3 +753,96 @@ int amdgpu_mes_ctx_get_offs(struct amdgpu_ring *ring, unsigned int id_offs)
WARN_ON(1);
return -EINVAL;
}
+
+int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
+ int queue_type, int idx,
+ struct amdgpu_mes_ctx_data *ctx_data,
+ struct amdgpu_ring **out)
+{
+ struct amdgpu_ring *ring;
+ struct amdgpu_mes_gang *gang;
+ struct amdgpu_mes_queue_properties qprops = {0};
+ int r, queue_id, pasid;
+
+ mutex_lock(&adev->mes.mutex);
+ gang = idr_find(&adev->mes.gang_id_idr, gang_id);
+ if (!gang) {
+ DRM_ERROR("gang id %d doesn't exist\n", gang_id);
+ mutex_unlock(&adev->mes.mutex);
+ return -EINVAL;
+ }
+ pasid = gang->process->pasid;
+
+ ring = kzalloc(sizeof(struct amdgpu_ring), GFP_KERNEL);
+ if (!ring) {
+ mutex_unlock(&adev->mes.mutex);
+ return -ENOMEM;
+ }
+
+ ring->ring_obj = NULL;
+ ring->use_doorbell = true;
+ ring->is_mes_queue = true;
+ ring->mes_ctx = ctx_data;
+ ring->idx = idx;
+ ring->no_scheduler = true;
+
+ if (queue_type == AMDGPU_RING_TYPE_COMPUTE) {
+ int offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+ compute[ring->idx].mec_hpd);
+ ring->eop_gpu_addr =
+ amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset);
+ }
+
+ switch (queue_type) {
+ case AMDGPU_RING_TYPE_GFX:
+ ring->funcs = adev->gfx.gfx_ring[0].funcs;
+ break;
+ case AMDGPU_RING_TYPE_COMPUTE:
+ ring->funcs = adev->gfx.compute_ring[0].funcs;
+ break;
+ case AMDGPU_RING_TYPE_SDMA:
+ ring->funcs = adev->sdma.instance[0].ring.funcs;
+ break;
+ default:
+ BUG();
+ }
+
+ r = amdgpu_ring_init(adev, ring, 1024, NULL, 0,
+ AMDGPU_RING_PRIO_DEFAULT, NULL);
+ if (r)
+ goto clean_up_memory;
+
+ amdgpu_mes_ring_to_queue_props(adev, ring, &qprops);
+
+ dma_fence_wait(gang->process->vm->last_update, false);
+ dma_fence_wait(ctx_data->meta_data_va->last_pt_update, false);
+ mutex_unlock(&adev->mes.mutex);
+
+ r = amdgpu_mes_add_hw_queue(adev, gang_id, &qprops, &queue_id);
+ if (r)
+ goto clean_up_ring;
+
+ ring->hw_queue_id = queue_id;
+ ring->doorbell_index = qprops.doorbell_off;
+
+ if (queue_type == AMDGPU_RING_TYPE_GFX)
+ sprintf(ring->name, "gfx_%d.%d.%d", pasid, gang_id, queue_id);
+ else if (queue_type == AMDGPU_RING_TYPE_COMPUTE)
+ sprintf(ring->name, "compute_%d.%d.%d", pasid, gang_id,
+ queue_id);
+ else if (queue_type == AMDGPU_RING_TYPE_SDMA)
+ sprintf(ring->name, "sdma_%d.%d.%d", pasid, gang_id,
+ queue_id);
+ else
+ BUG();
+
+ *out = ring;
+ return 0;
+
+clean_up_ring:
+ amdgpu_ring_fini(ring);
+clean_up_memory:
+ kfree(ring);
+ mutex_unlock(&adev->mes.mutex);
+ return r;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 36684416f277..1fe5c869f37e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -254,4 +254,9 @@ int amdgpu_mes_add_hw_queue(struct amdgpu_device *adev, int gang_id,
int *queue_id);
int amdgpu_mes_remove_hw_queue(struct amdgpu_device *adev, int queue_id);
+int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
+ int queue_type, int idx,
+ struct amdgpu_mes_ctx_data *ctx_data,
+ struct amdgpu_ring **out);
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
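The add-ring function above uses the kernel's labeled-cleanup idiom: each setup step gets its own exit label, and a failure jumps to the label that unwinds only what has already been set up, in reverse order. A minimal user-space sketch of that idiom follows; the names (`ring_setup`, `fail_step`) are hypothetical and stand in for `amdgpu_ring_init()` and `amdgpu_mes_add_hw_queue()`, and locking is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical two-step setup mirroring the clean_up_ring /
 * clean_up_memory unwind order in amdgpu_mes_add_ring. */
struct ring { int *buf; };

static int ring_setup(struct ring **out, int fail_step)
{
	struct ring *r = malloc(sizeof(*r));
	int err;

	if (!r)
		return -1;

	if (fail_step == 1) {            /* simulate amdgpu_ring_init() failing */
		err = -1;
		goto clean_up_memory;
	}

	r->buf = malloc(64);
	if (!r->buf || fail_step == 2) { /* simulate add_hw_queue() failing */
		err = -1;
		goto clean_up_ring;
	}

	*out = r;
	return 0;

clean_up_ring:                           /* undo the later step first */
	free(r->buf);
clean_up_memory:                         /* then the earlier one */
	free(r);
	return err;
}
```

Each exit path releases exactly the resources acquired before the failure, which is why the labels appear in reverse order of the setup steps.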
* [PATCH 58/73] drm/amdgpu/mes: implement removing mes ring
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (56 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 57/73] drm/amdgpu/mes: use ring for kernel queue submission Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 59/73] drm/amdgpu/mes: add helper functions to alloc/free ctx metadata Alex Deucher
` (14 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Remove the mes ring and its resources.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 11 +++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 2 ++
2 files changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 827391fcb2a3..fa43a7e3c9ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -846,3 +846,14 @@ int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
mutex_unlock(&adev->mes.mutex);
return r;
}
+
+void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
+ struct amdgpu_ring *ring)
+{
+ if (!ring)
+ return;
+
+ amdgpu_mes_remove_hw_queue(adev, ring->hw_queue_id);
+ amdgpu_ring_fini(ring);
+ kfree(ring);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 1fe5c869f37e..37232b396b06 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -258,5 +258,7 @@ int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
int queue_type, int idx,
struct amdgpu_mes_ctx_data *ctx_data,
struct amdgpu_ring **out);
+void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
+ struct amdgpu_ring *ring);
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
* [PATCH 59/73] drm/amdgpu/mes: add helper functions to alloc/free ctx metadata
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (57 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 58/73] drm/amdgpu/mes: implement removing mes ring Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 60/73] drm/amdgpu: skip kfd routines when mes enabled Alex Deucher
` (13 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add the helper functions to allocate/free context metadata.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 25 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 4 ++++
2 files changed, 29 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index fa43a7e3c9ab..6c01581e3a7b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -857,3 +857,28 @@ void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
amdgpu_ring_fini(ring);
kfree(ring);
}
+
+int amdgpu_mes_ctx_alloc_meta_data(struct amdgpu_device *adev,
+ struct amdgpu_mes_ctx_data *ctx_data)
+{
+ int r;
+
+ r = amdgpu_bo_create_kernel(adev,
+ sizeof(struct amdgpu_mes_ctx_meta_data),
+ PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
+ &ctx_data->meta_data_obj, NULL,
+ &ctx_data->meta_data_ptr);
+ if (!ctx_data->meta_data_obj)
+ return -ENOMEM;
+
+ memset(ctx_data->meta_data_ptr, 0,
+ sizeof(struct amdgpu_mes_ctx_meta_data));
+
+ return 0;
+}
+
+void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data)
+{
+ if (ctx_data->meta_data_obj)
+ amdgpu_bo_free_kernel(&ctx_data->meta_data_obj, NULL, NULL);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 37232b396b06..50d490e69cb7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -261,4 +261,8 @@ int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
struct amdgpu_ring *ring);
+int amdgpu_mes_ctx_alloc_meta_data(struct amdgpu_device *adev,
+ struct amdgpu_mes_ctx_data *ctx_data);
+void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data);
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
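The alloc/free helpers above form a symmetric pair: the allocator zeroes the metadata buffer, and the free side tolerates being called on paths where allocation never happened. A small user-space sketch of that pairing, with hypothetical names (`ctx_alloc_meta`, `ctx_free_meta`) standing in for the BO-backed helpers:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of amdgpu_mes_ctx_alloc/free_meta_data:
 * zero the buffer on allocation, and make the free side safe to
 * call whether or not allocation ever succeeded. */
struct ctx_data { unsigned char *meta; size_t size; };

static int ctx_alloc_meta(struct ctx_data *ctx, size_t size)
{
	ctx->meta = malloc(size);
	if (!ctx->meta)
		return -1;
	memset(ctx->meta, 0, size);  /* firmware expects zeroed metadata */
	ctx->size = size;
	return 0;
}

static void ctx_free_meta(struct ctx_data *ctx)
{
	if (ctx->meta)               /* no-op on paths that never allocated */
		free(ctx->meta);
	ctx->meta = NULL;
}
```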
* [PATCH 60/73] drm/amdgpu: skip kfd routines when mes enabled
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (58 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 59/73] drm/amdgpu/mes: add helper functions to alloc/free ctx metadata Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 61/73] drm/amdgpu: Enable KFD with MES enabled Alex Deucher
` (12 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Since KFD doesn't support MES yet, skip the KFD routines.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 25 ++++++++++++++--------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b2366d0d3047..728e7be54a59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2171,7 +2171,8 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
}
- amdgpu_amdkfd_device_probe(adev);
+ if (!adev->enable_mes)
+ amdgpu_amdkfd_device_probe(adev);
adev->pm.pp_feature = amdgpu_pp_feature_mask;
if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
@@ -2498,7 +2499,8 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
goto init_failed;
/* Don't init kfd if whole hive need to be reset during init */
- if (!adev->gmc.xgmi.pending_reset)
+ if (!adev->gmc.xgmi.pending_reset &&
+ !adev->enable_mes)
amdgpu_amdkfd_device_init(adev);
amdgpu_fru_get_product_info(adev);
@@ -2861,7 +2863,8 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
if (adev->gmc.xgmi.num_physical_nodes > 1)
amdgpu_xgmi_remove_device(adev);
- amdgpu_amdkfd_device_fini_sw(adev);
+ if (!adev->enable_mes)
+ amdgpu_amdkfd_device_fini_sw(adev);
for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
if (!adev->ip_blocks[i].status.sw)
@@ -4122,7 +4125,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
amdgpu_device_ip_suspend_phase1(adev);
- if (!adev->in_s0ix)
+ if (!adev->in_s0ix && !adev->enable_mes)
amdgpu_amdkfd_suspend(adev, adev->in_runpm);
amdgpu_device_evict_resources(adev);
@@ -4176,7 +4179,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
queue_delayed_work(system_wq, &adev->delayed_init_work,
msecs_to_jiffies(AMDGPU_RESUME_MS));
- if (!adev->in_s0ix) {
+ if (!adev->in_s0ix && !adev->enable_mes) {
r = amdgpu_amdkfd_resume(adev, adev->in_runpm);
if (r)
return r;
@@ -4459,7 +4462,8 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
int retry_limit = 0;
retry:
- amdgpu_amdkfd_pre_reset(adev);
+ if (!adev->enable_mes)
+ amdgpu_amdkfd_pre_reset(adev);
@@ -4497,7 +4501,9 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
if (!r) {
amdgpu_irq_gpu_reset_resume_helper(adev);
r = amdgpu_ib_ring_tests(adev);
- amdgpu_amdkfd_post_reset(adev);
+
+ if (!adev->enable_mes)
+ amdgpu_amdkfd_post_reset(adev);
}
error:
@@ -5142,7 +5148,7 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
cancel_delayed_work_sync(&tmp_adev->delayed_init_work);
- if (!amdgpu_sriov_vf(tmp_adev))
+ if (!amdgpu_sriov_vf(tmp_adev) && !adev->enable_mes)
amdgpu_amdkfd_pre_reset(tmp_adev);
/*
@@ -5265,7 +5271,8 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
skip_sched_resume:
list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
/* unlock kfd: SRIOV would do it separately */
- if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev))
+ if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev) &&
+ !adev->enable_mes)
amdgpu_amdkfd_post_reset(tmp_adev);
/* kfd_post_reset will do nothing if kfd device is not initialized,
--
2.35.1
* [PATCH 61/73] drm/amdgpu: Enable KFD with MES enabled
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (59 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 60/73] drm/amdgpu: skip kfd routines when mes enabled Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 62/73] drm/amdgpu: skip some checking for mes queue ib submission Alex Deucher
` (11 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx
Cc: Alex Deucher, Mukul Joshi, Oak Zeng, Harish Kasiviswanathan,
Felix Kuehling
From: Mukul Joshi <mukul.joshi@amd.com>
Enable KFD initialization with MES enabled.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++++++++--------------
1 file changed, 9 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 728e7be54a59..e582f1044c0f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2171,8 +2171,7 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
}
- if (!adev->enable_mes)
- amdgpu_amdkfd_device_probe(adev);
+ amdgpu_amdkfd_device_probe(adev);
adev->pm.pp_feature = amdgpu_pp_feature_mask;
if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
@@ -2499,8 +2498,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
goto init_failed;
/* Don't init kfd if whole hive need to be reset during init */
- if (!adev->gmc.xgmi.pending_reset &&
- !adev->enable_mes)
+ if (!adev->gmc.xgmi.pending_reset)
amdgpu_amdkfd_device_init(adev);
amdgpu_fru_get_product_info(adev);
@@ -2863,8 +2861,7 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
if (adev->gmc.xgmi.num_physical_nodes > 1)
amdgpu_xgmi_remove_device(adev);
- if (!adev->enable_mes)
- amdgpu_amdkfd_device_fini_sw(adev);
+ amdgpu_amdkfd_device_fini_sw(adev);
for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
if (!adev->ip_blocks[i].status.sw)
@@ -4125,7 +4122,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
amdgpu_device_ip_suspend_phase1(adev);
- if (!adev->in_s0ix && !adev->enable_mes)
+ if (!adev->in_s0ix)
amdgpu_amdkfd_suspend(adev, adev->in_runpm);
amdgpu_device_evict_resources(adev);
@@ -4179,7 +4176,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
queue_delayed_work(system_wq, &adev->delayed_init_work,
msecs_to_jiffies(AMDGPU_RESUME_MS));
- if (!adev->in_s0ix && !adev->enable_mes) {
+ if (!adev->in_s0ix) {
r = amdgpu_amdkfd_resume(adev, adev->in_runpm);
if (r)
return r;
@@ -4462,8 +4459,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
int retry_limit = 0;
retry:
- if (!adev->enable_mes)
- amdgpu_amdkfd_pre_reset(adev);
+ amdgpu_amdkfd_pre_reset(adev);
@@ -4502,8 +4498,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
amdgpu_irq_gpu_reset_resume_helper(adev);
r = amdgpu_ib_ring_tests(adev);
- if (!adev->enable_mes)
- amdgpu_amdkfd_post_reset(adev);
+ amdgpu_amdkfd_post_reset(adev);
}
error:
@@ -5148,7 +5143,7 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
cancel_delayed_work_sync(&tmp_adev->delayed_init_work);
- if (!amdgpu_sriov_vf(tmp_adev) && !adev->enable_mes)
+ if (!amdgpu_sriov_vf(tmp_adev))
amdgpu_amdkfd_pre_reset(tmp_adev);
/*
@@ -5271,8 +5266,7 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
skip_sched_resume:
list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
/* unlock kfd: SRIOV would do it separately */
- if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev) &&
- !adev->enable_mes)
+ if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev))
amdgpu_amdkfd_post_reset(tmp_adev);
/* kfd_post_reset will do nothing if kfd device is not initialized,
--
2.35.1
* [PATCH 62/73] drm/amdgpu: skip some checking for mes queue ib submission
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (60 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 61/73] drm/amdgpu: Enable KFD with MES enabled Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 63/73] drm/amdgpu: skip kiq ib tests if mes enabled Alex Deucher
` (10 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Skip some checking for mes queue ib submission.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index d583766ea392..d8354453cc29 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -155,12 +155,12 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
fence_ctx = 0;
}
- if (!ring->sched.ready) {
+ if (!ring->sched.ready && !ring->is_mes_queue) {
dev_err(adev->dev, "couldn't schedule ib on ring <%s>\n", ring->name);
return -EINVAL;
}
- if (vm && !job->vmid) {
+ if (vm && !job->vmid && !ring->is_mes_queue) {
dev_err(adev->dev, "VM IB without ID\n");
return -EINVAL;
}
--
2.35.1
* [PATCH 63/73] drm/amdgpu: skip kiq ib tests if mes enabled
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (61 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 62/73] drm/amdgpu: skip some checking for mes queue ib submission Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 64/73] drm/amdgpu: skip gds switch for mes queue Alex Deucher
` (9 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Since KIQ conflicts with MES, skip the KIQ IB tests.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index d8354453cc29..258cffe3c06a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -390,6 +390,10 @@ int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
if (!ring->sched.ready || !ring->funcs->test_ib)
continue;
+ if (adev->enable_mes &&
+ ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+ continue;
+
/* MM engine need more time */
if (ring->funcs->type == AMDGPU_RING_TYPE_UVD ||
ring->funcs->type == AMDGPU_RING_TYPE_VCE ||
--
2.35.1
* [PATCH 64/73] drm/amdgpu: skip gds switch for mes queue
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (62 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 63/73] drm/amdgpu: skip kiq ib tests if mes enabled Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 65/73] drm/amdgpu: kiq takes charge of all queues Alex Deucher
` (8 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Since MES manages GDS allocation, skip the GDS switch.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4736174f5e4d..7276b03ef970 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -639,7 +639,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
}
dma_fence_put(fence);
- if (ring->funcs->emit_gds_switch && gds_switch_needed) {
+ if (!ring->is_mes_queue && ring->funcs->emit_gds_switch &&
+ gds_switch_needed) {
id->gds_base = job->gds_base;
id->gds_size = job->gds_size;
id->gws_base = job->gws_base;
--
2.35.1
* [PATCH 65/73] drm/amdgpu: kiq takes charge of all queues
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (63 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 64/73] drm/amdgpu: skip gds switch for mes queue Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 66/73] drm/amdgpu/mes: map ctx metadata for mes self test Alex Deucher
` (7 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
To make kgq/kcq and mes queues co-exist, kiq needs to take charge
of all queues.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 28a736c507bb..40df1e04d682 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -535,6 +535,9 @@ int amdgpu_gfx_enable_kcq(struct amdgpu_device *adev)
return r;
}
+ if (adev->enable_mes)
+ queue_mask = ~0ULL;
+
kiq->pmf->kiq_set_resources(kiq_ring, queue_mask);
for (i = 0; i < adev->gfx.num_compute_rings; i++)
kiq->pmf->kiq_map_queues(kiq_ring, &adev->gfx.compute_ring[i]);
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
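In the `kiq_set_resources` call above, each bit of the 64-bit queue mask selects one hardware queue slot for KIQ; setting the mask to `~0ULL` claims every slot. A short user-space sketch of how such a mask is built (the function name is hypothetical; normally only one bit per kernel compute queue would be set):

```c
#include <assert.h>
#include <stdint.h>

/* Each bit of the 64-bit resource mask selects one hardware queue
 * slot. With MES enabled the driver claims all slots with ~0ULL;
 * otherwise one bit is set per kernel queue. */
static uint64_t build_queue_mask(int num_queues, int take_all)
{
	uint64_t mask = 0;
	int i;

	if (take_all)
		return ~0ULL;            /* all 64 slots */

	for (i = 0; i < num_queues && i < 64; i++)
		mask |= 1ULL << i;       /* one bit per kernel queue */

	return mask;
}
```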
* [PATCH 66/73] drm/amdgpu/mes: map ctx metadata for mes self test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (64 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 65/73] drm/amdgpu: kiq takes charge of all queues Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 67/73] drm/amdgpu/mes: create gang and queues " Alex Deucher
` (6 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Map ctx metadata for mes self test.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 37 +++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 6c01581e3a7b..b440b36dd98a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -882,3 +882,40 @@ void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data)
if (ctx_data->meta_data_obj)
amdgpu_bo_free_kernel(&ctx_data->meta_data_obj, NULL, NULL);
}
+
+static int amdgpu_mes_test_map_ctx_meta_data(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm,
+ struct amdgpu_mes_ctx_data *ctx_data)
+{
+ struct amdgpu_bo_va *meta_data_va = NULL;
+ uint64_t meta_data_addr = AMDGPU_VA_RESERVED_SIZE;
+ int r;
+
+ r = amdgpu_map_static_csa(adev, vm, ctx_data->meta_data_obj,
+ &meta_data_va, meta_data_addr,
+ sizeof(struct amdgpu_mes_ctx_meta_data));
+ if (r)
+ return r;
+
+ r = amdgpu_vm_bo_update(adev, meta_data_va, false);
+ if (r)
+ goto error;
+
+ r = amdgpu_vm_update_pdes(adev, vm, false);
+ if (r)
+ goto error;
+
+ dma_fence_wait(vm->last_update, false);
+ dma_fence_wait(meta_data_va->last_pt_update, false);
+
+ ctx_data->meta_data_gpu_addr = meta_data_addr;
+ ctx_data->meta_data_va = meta_data_va;
+
+ return 0;
+
+error:
+ BUG_ON(amdgpu_bo_reserve(ctx_data->meta_data_obj, true));
+ amdgpu_vm_bo_rmv(adev, meta_data_va);
+ amdgpu_bo_unreserve(ctx_data->meta_data_obj);
+ return r;
+}
--
2.35.1
* [PATCH 67/73] drm/amdgpu/mes: create gang and queues for mes self test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (65 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 66/73] drm/amdgpu/mes: map ctx metadata for mes self test Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 68/73] drm/amdgpu/mes: add ring/ib test " Alex Deucher
` (5 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Create gang and queues for mes self test.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 39 +++++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b440b36dd98a..027f3aae6025 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -919,3 +919,42 @@ static int amdgpu_mes_test_map_ctx_meta_data(struct amdgpu_device *adev,
amdgpu_bo_unreserve(ctx_data->meta_data_obj);
return r;
}
+
+static int amdgpu_mes_test_create_gang_and_queues(struct amdgpu_device *adev,
+ int pasid, int *gang_id,
+ int queue_type, int num_queue,
+ struct amdgpu_ring **added_rings,
+ struct amdgpu_mes_ctx_data *ctx_data)
+{
+ struct amdgpu_ring *ring;
+ struct amdgpu_mes_gang_properties gprops = {0};
+ int r, j;
+
+ /* create a gang for the process */
+ gprops.priority = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+ gprops.gang_quantum = adev->mes.default_gang_quantum;
+ gprops.inprocess_gang_priority = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+ gprops.priority_level = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+ gprops.global_priority_level = AMDGPU_MES_PRIORITY_LEVEL_NORMAL;
+
+ r = amdgpu_mes_add_gang(adev, pasid, &gprops, gang_id);
+ if (r) {
+ DRM_ERROR("failed to add gang\n");
+ return r;
+ }
+
+ /* create queues for the gang */
+ for (j = 0; j < num_queue; j++) {
+ r = amdgpu_mes_add_ring(adev, *gang_id, queue_type, j,
+ ctx_data, &ring);
+ if (r) {
+ DRM_ERROR("failed to add ring\n");
+ break;
+ }
+
+ DRM_INFO("ring %s was added\n", ring->name);
+ added_rings[j] = ring;
+ }
+
+ return 0;
+}
--
2.35.1
* [PATCH 68/73] drm/amdgpu/mes: add ring/ib test for mes self test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (66 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 67/73] drm/amdgpu/mes: create gang and queues " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 69/73] drm/amdgpu/mes: implement " Alex Deucher
` (4 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Run the ring test and ib test for mes self test.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 32 +++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 027f3aae6025..e2b1da08ab64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -958,3 +958,35 @@ static int amdgpu_mes_test_create_gang_and_queues(struct amdgpu_device *adev,
return 0;
}
+
+static int amdgpu_mes_test_queues(struct amdgpu_ring **added_rings)
+{
+ struct amdgpu_ring *ring;
+ int i, r;
+
+ for (i = 0; i < AMDGPU_MES_CTX_MAX_RINGS; i++) {
+ ring = added_rings[i];
+ if (!ring)
+ continue;
+
+ r = amdgpu_ring_test_ring(ring);
+ if (r) {
+ DRM_DEV_ERROR(ring->adev->dev,
+ "ring %s test failed (%d)\n",
+ ring->name, r);
+ return r;
+ } else
+ DRM_INFO("ring %s test pass\n", ring->name);
+
+ r = amdgpu_ring_test_ib(ring, 1000 * 10);
+ if (r) {
+ DRM_DEV_ERROR(ring->adev->dev,
+ "ring %s ib test failed (%d)\n",
+ ring->name, r);
+ return r;
+ } else
+ DRM_INFO("ring %s ib test pass\n", ring->name);
+ }
+
+ return 0;
+}
--
2.35.1
^ permalink raw reply related [flat|nested] 74+ messages in thread
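The test loop above walks a sparse array of added rings, skips empty slots, and returns on the first ring or IB test failure. A compact user-space sketch of that skip-and-fail-fast pattern; it is purely illustrative, with integer status codes standing in for ring pointers and test results (0 marks an empty slot, a negative value a failed test):

```c
#include <assert.h>

/* Sketch of the self-test loop: skip empty slots (like NULL ring
 * pointers) and return the first failing test's error code. */
static int test_rings(const int *ring_status, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		if (ring_status[i] == 0)  /* empty slot: nothing to test */
			continue;
		if (ring_status[i] < 0)   /* this ring's test failed */
			return ring_status[i];
	}
	return 0;                         /* all present rings passed */
}
```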
* [PATCH 69/73] drm/amdgpu/mes: implement mes self test
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
` (67 preceding siblings ...)
2022-04-29 17:46 ` [PATCH 68/73] drm/amdgpu/mes: add ring/ib test " Alex Deucher
@ 2022-04-29 17:46 ` Alex Deucher
2022-04-29 17:46 ` [PATCH 70/73] drm/amdgpu/mes10.1: add mes self test in late init Alex Deucher
` (3 subsequent siblings)
72 siblings, 0 replies; 74+ messages in thread
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add mes self test to verify its fundamental functionality by
running ring test and ib test of mes kernel queue.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 97 +++++++++++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 2 +
2 files changed, 99 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index e2b1da08ab64..c9516b3aa6d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -990,3 +990,100 @@ static int amdgpu_mes_test_queues(struct amdgpu_ring **added_rings)
return 0;
}
+
+int amdgpu_mes_self_test(struct amdgpu_device *adev)
+{
+ struct amdgpu_vm *vm = NULL;
+ struct amdgpu_mes_ctx_data ctx_data = {0};
+ struct amdgpu_ring *added_rings[AMDGPU_MES_CTX_MAX_RINGS] = { NULL };
+ int gang_ids[3] = {0};
+ int queue_types[][2] = { { AMDGPU_RING_TYPE_GFX,
+ AMDGPU_MES_CTX_MAX_GFX_RINGS},
+ { AMDGPU_RING_TYPE_COMPUTE,
+ AMDGPU_MES_CTX_MAX_COMPUTE_RINGS},
+ { AMDGPU_RING_TYPE_SDMA,
+ AMDGPU_MES_CTX_MAX_SDMA_RINGS } };
+ int i, r, pasid, k = 0;
+
+ pasid = amdgpu_pasid_alloc(16);
+ if (pasid < 0) {
+ dev_warn(adev->dev, "No more PASIDs available!");
+ pasid = 0;
+ }
+
+ vm = kzalloc(sizeof(*vm), GFP_KERNEL);
+ if (!vm) {
+ r = -ENOMEM;
+ goto error_pasid;
+ }
+
+ r = amdgpu_vm_init(adev, vm);
+ if (r) {
+ DRM_ERROR("failed to initialize vm\n");
+ goto error_pasid;
+ }
+
+ r = amdgpu_mes_ctx_alloc_meta_data(adev, &ctx_data);
+ if (r) {
+ DRM_ERROR("failed to alloc ctx meta data\n");
+ goto error_pasid;
+ }
+
+ r = amdgpu_mes_test_map_ctx_meta_data(adev, vm, &ctx_data);
+ if (r) {
+ DRM_ERROR("failed to map ctx meta data\n");
+ goto error_vm;
+ }
+
+ r = amdgpu_mes_create_process(adev, pasid, vm);
+ if (r) {
+ DRM_ERROR("failed to create MES process\n");
+ goto error_vm;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(queue_types); i++) {
+ r = amdgpu_mes_test_create_gang_and_queues(adev, pasid,
+ &gang_ids[i],
+ queue_types[i][0],
+ queue_types[i][1],
+ &added_rings[k],
+ &ctx_data);
+ if (r)
+ goto error_queues;
+
+ k += queue_types[i][1];
+ }
+
+ /* start ring test and ib test for MES queues */
+ amdgpu_mes_test_queues(added_rings);
+
+error_queues:
+ /* remove all queues */
+ for (i = 0; i < ARRAY_SIZE(added_rings); i++) {
+ if (!added_rings[i])
+ continue;
+ amdgpu_mes_remove_ring(adev, added_rings[i]);
+ }
+
+ for (i = 0; i < ARRAY_SIZE(gang_ids); i++) {
+ if (!gang_ids[i])
+ continue;
+ amdgpu_mes_remove_gang(adev, gang_ids[i]);
+ }
+
+ amdgpu_mes_destroy_process(adev, pasid);
+
+error_vm:
+ BUG_ON(amdgpu_bo_reserve(ctx_data.meta_data_obj, true));
+ amdgpu_vm_bo_rmv(adev, ctx_data.meta_data_va);
+ amdgpu_bo_unreserve(ctx_data.meta_data_obj);
+ amdgpu_vm_fini(adev, vm);
+
+error_pasid:
+ if (pasid)
+ amdgpu_pasid_free(pasid);
+
+ amdgpu_mes_ctx_free_meta_data(&ctx_data);
+ kfree(vm);
+ return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 50d490e69cb7..5c9e7932c7a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -265,4 +265,6 @@ int amdgpu_mes_ctx_alloc_meta_data(struct amdgpu_device *adev,
struct amdgpu_mes_ctx_data *ctx_data);
void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data);
+int amdgpu_mes_self_test(struct amdgpu_device *adev);
+
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
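[Editor's note: not part of the patch. The self test above walks a table of {ring type, max ring count} pairs and packs each type's created rings into one flat added_rings[] array, advancing an offset k by the ring count of each type. A hypothetical standalone sketch of that bookkeeping, with illustrative ring counts standing in for the AMDGPU_MES_CTX_MAX_* constants:]

```c
/* Hypothetical sketch (not driver code): mirrors how
 * amdgpu_mes_self_test() iterates its queue_types table and computes
 * where each ring type's entries start inside added_rings[].
 * The ring counts below are assumptions for illustration only. */
enum { RING_GFX, RING_COMPUTE, RING_SDMA };   /* stand-in ring types */
#define MAX_GFX_RINGS     1
#define MAX_COMPUTE_RINGS 4
#define MAX_SDMA_RINGS    2

static int first_ring_slot(int type)
{
    const int queue_types[][2] = {
        { RING_GFX,     MAX_GFX_RINGS },
        { RING_COMPUTE, MAX_COMPUTE_RINGS },
        { RING_SDMA,    MAX_SDMA_RINGS },
    };
    int i, k = 0;

    for (i = 0; i < 3; i++) {
        if (queue_types[i][0] == type)
            return k;
        k += queue_types[i][1];  /* same accumulation as the self test */
    }
    return -1;  /* unknown ring type */
}
```

With these assumed counts, compute rings start at slot 1 and SDMA rings at slot 5.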
* [PATCH 70/73] drm/amdgpu/mes10.1: add mes self test in late init
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Christian König, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Add MES self test in late init.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
index d468cb5a8854..622aa17b18e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.c
@@ -1142,8 +1142,18 @@ static int mes_v10_1_resume(void *handle)
return amdgpu_mes_resume(adev);
}
+static int mes_v10_0_late_init(void *handle)
+{
+ struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+ amdgpu_mes_self_test(adev);
+
+ return 0;
+}
+
static const struct amd_ip_funcs mes_v10_1_ip_funcs = {
.name = "mes_v10_1",
+ .late_init = mes_v10_0_late_init,
.sw_init = mes_v10_1_sw_init,
.sw_fini = mes_v10_1_sw_fini,
.hw_init = mes_v10_1_hw_init,
--
2.35.1
* [PATCH 71/73] drm/amdgpu/mes: fix vm csa update issue
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Need to reserve the VM buffers before updating the VM CSA.
v2: rebase fixes
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 81 ++++++++++++++++++-------
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 3 +
2 files changed, 62 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index c9516b3aa6d9..51a6f309ef22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -883,40 +883,76 @@ void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data)
amdgpu_bo_free_kernel(&ctx_data->meta_data_obj, NULL, NULL);
}
-static int amdgpu_mes_test_map_ctx_meta_data(struct amdgpu_device *adev,
- struct amdgpu_vm *vm,
- struct amdgpu_mes_ctx_data *ctx_data)
+int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm,
+ struct amdgpu_mes_ctx_data *ctx_data)
{
- struct amdgpu_bo_va *meta_data_va = NULL;
- uint64_t meta_data_addr = AMDGPU_VA_RESERVED_SIZE;
+ struct amdgpu_bo_va *bo_va;
+ struct ww_acquire_ctx ticket;
+ struct list_head list;
+ struct amdgpu_bo_list_entry pd;
+ struct ttm_validate_buffer csa_tv;
+ struct amdgpu_sync sync;
int r;
- r = amdgpu_map_static_csa(adev, vm, ctx_data->meta_data_obj,
- &meta_data_va, meta_data_addr,
- sizeof(struct amdgpu_mes_ctx_meta_data));
- if (r)
+ amdgpu_sync_create(&sync);
+ INIT_LIST_HEAD(&list);
+ INIT_LIST_HEAD(&csa_tv.head);
+
+ csa_tv.bo = &ctx_data->meta_data_obj->tbo;
+ csa_tv.num_shared = 1;
+
+ list_add(&csa_tv.head, &list);
+ amdgpu_vm_get_pd_bo(vm, &list, &pd);
+
+ r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
+ if (r) {
+ DRM_ERROR("failed to reserve meta data BO: err=%d\n", r);
return r;
+ }
- r = amdgpu_vm_bo_update(adev, meta_data_va, false);
- if (r)
+ bo_va = amdgpu_vm_bo_add(adev, vm, ctx_data->meta_data_obj);
+ if (!bo_va) {
+ ttm_eu_backoff_reservation(&ticket, &list);
+ DRM_ERROR("failed to create bo_va for meta data BO\n");
+ return -ENOMEM;
+ }
+
+ r = amdgpu_vm_bo_map(adev, bo_va, ctx_data->meta_data_gpu_addr, 0,
+ sizeof(struct amdgpu_mes_ctx_meta_data),
+ AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE |
+ AMDGPU_PTE_EXECUTABLE);
+
+ if (r) {
+ DRM_ERROR("failed to do bo_map on meta data, err=%d\n", r);
goto error;
+ }
- r = amdgpu_vm_update_pdes(adev, vm, false);
- if (r)
+ r = amdgpu_vm_bo_update(adev, bo_va, false);
+ if (r) {
+ DRM_ERROR("failed to do vm_bo_update on meta data\n");
goto error;
+ }
+ amdgpu_sync_fence(&sync, bo_va->last_pt_update);
- dma_fence_wait(vm->last_update, false);
- dma_fence_wait(meta_data_va->last_pt_update, false);
+ r = amdgpu_vm_update_pdes(adev, vm, false);
+ if (r) {
+ DRM_ERROR("failed to update pdes on meta data\n");
+ goto error;
+ }
+ amdgpu_sync_fence(&sync, vm->last_update);
- ctx_data->meta_data_gpu_addr = meta_data_addr;
- ctx_data->meta_data_va = meta_data_va;
+ amdgpu_sync_wait(&sync, false);
+ ttm_eu_backoff_reservation(&ticket, &list);
+ amdgpu_sync_free(&sync);
+ ctx_data->meta_data_va = bo_va;
return 0;
error:
- BUG_ON(amdgpu_bo_reserve(ctx_data->meta_data_obj, true));
- amdgpu_vm_bo_rmv(adev, meta_data_va);
- amdgpu_bo_unreserve(ctx_data->meta_data_obj);
+ amdgpu_vm_bo_del(adev, bo_va);
+ ttm_eu_backoff_reservation(&ticket, &list);
+ amdgpu_sync_free(&sync);
return r;
}
@@ -1029,7 +1065,8 @@ int amdgpu_mes_self_test(struct amdgpu_device *adev)
goto error_pasid;
}
- r = amdgpu_mes_test_map_ctx_meta_data(adev, vm, &ctx_data);
+ ctx_data.meta_data_gpu_addr = AMDGPU_VA_RESERVED_SIZE;
+ r = amdgpu_mes_ctx_map_meta_data(adev, vm, &ctx_data);
if (r) {
DRM_ERROR("failed to map ctx meta data\n");
goto error_vm;
@@ -1075,7 +1112,7 @@ int amdgpu_mes_self_test(struct amdgpu_device *adev)
error_vm:
BUG_ON(amdgpu_bo_reserve(ctx_data.meta_data_obj, true));
- amdgpu_vm_bo_rmv(adev, ctx_data.meta_data_va);
+ amdgpu_vm_bo_del(adev, ctx_data.meta_data_va);
amdgpu_bo_unreserve(ctx_data.meta_data_obj);
amdgpu_vm_fini(adev, vm);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 5c9e7932c7a9..a965ace0fd0e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -264,6 +264,9 @@ void amdgpu_mes_remove_ring(struct amdgpu_device *adev,
int amdgpu_mes_ctx_alloc_meta_data(struct amdgpu_device *adev,
struct amdgpu_mes_ctx_data *ctx_data);
void amdgpu_mes_ctx_free_meta_data(struct amdgpu_mes_ctx_data *ctx_data);
+int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm,
+ struct amdgpu_mes_ctx_data *ctx_data);
int amdgpu_mes_self_test(struct amdgpu_device *adev);
--
2.35.1
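[Editor's note: not part of the patch. The fix above enforces an ordering: reserve the buffers first, then map and update the VM, wait on the accumulated fences, and back off the reservation on every exit path. A hypothetical sketch of that control flow, with every amdgpu/ttm call replaced by a stub; it demonstrates only the goto-based cleanup pattern, not the real API:]

```c
/* Hypothetical sketch: reserve -> map -> update -> wait -> backoff,
 * with stubs in place of the driver calls. 'reserved' tracks that the
 * reservation is released on both the success and the error paths. */
static int step;      /* records the last step reached */
static int reserved;  /* 1 while the (stub) reservation is held */

static int reserve_buffers(void) { step = 1; reserved = 1; return 0; }
static int map_bo(int fail)      { step = 2; return fail ? -1 : 0; }
static int update_bo(void)       { step = 3; return 0; }
static int update_pdes(void)     { step = 4; return 0; }
static void wait_sync(void)      { step = 5; }
static void backoff(void)        { reserved = 0; }

static int map_meta_data(int fail_map)
{
    int r = reserve_buffers();  /* must come before any VM update */
    if (r)
        return r;

    r = map_bo(fail_map);
    if (r)
        goto error;
    r = update_bo();
    if (r)
        goto error;
    r = update_pdes();
    if (r)
        goto error;

    wait_sync();
    backoff();                  /* release only after the waits */
    return 0;

error:
    backoff();                  /* error paths release it too */
    return r;
}
```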
* [PATCH 72/73] drm/amdgpu/mes: disable mes sdma queue test
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx; +Cc: Alex Deucher, Jack Xiao, Hawking Zhang
From: Jack Xiao <Jack.Xiao@amd.com>
Disable the MES SDMA queue test on sienna cichlid and newer,
as the firmware does not yet support mapping SDMA queues.
The test can be re-enabled once the firmware supports it.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 51a6f309ef22..e23c864aca11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -1079,6 +1079,11 @@ int amdgpu_mes_self_test(struct amdgpu_device *adev)
}
for (i = 0; i < ARRAY_SIZE(queue_types); i++) {
+ /* On sienna cichlid+, fw hasn't supported to map sdma queue. */
+ if (adev->asic_type >= CHIP_SIENNA_CICHLID &&
+ i == AMDGPU_RING_TYPE_SDMA)
+ continue;
+
r = amdgpu_mes_test_create_gang_and_queues(adev, pasid,
&gang_ids[i],
queue_types[i][0],
--
2.35.1
* [PATCH 73/73] drm/amdgpu/mes: Update the doorbell function signatures
From: Alex Deucher @ 2022-04-29 17:46 UTC (permalink / raw)
To: amd-gfx
Cc: Mukul Joshi, Jack Xiao, Oak Zeng, Harish Kasiviswanathan,
Alex Deucher, Felix Kuehling
From: Mukul Joshi <mukul.joshi@amd.com>
Update the function signatures for process doorbell allocation
with MES enabled to make them more generic. KFD will need to
call these functions to allocate/free doorbells when MES is
enabled.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 37 +++++++++++++++----------
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 9 ++++++
2 files changed, 31 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index e23c864aca11..5be30bf68b0c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -29,31 +29,40 @@
#define AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS 1024
#define AMDGPU_ONE_DOORBELL_SIZE 8
-static int amdgpu_mes_doorbell_process_slice(struct amdgpu_device *adev)
+int amdgpu_mes_doorbell_process_slice(struct amdgpu_device *adev)
{
return roundup(AMDGPU_ONE_DOORBELL_SIZE *
AMDGPU_MES_MAX_NUM_OF_QUEUES_PER_PROCESS,
PAGE_SIZE);
}
-static int amdgpu_mes_alloc_process_doorbells(struct amdgpu_device *adev,
- struct amdgpu_mes_process *process)
+int amdgpu_mes_alloc_process_doorbells(struct amdgpu_device *adev,
+ unsigned int *doorbell_index)
{
int r = ida_simple_get(&adev->mes.doorbell_ida, 2,
adev->mes.max_doorbell_slices,
GFP_KERNEL);
if (r > 0)
- process->doorbell_index = r;
+ *doorbell_index = r;
return r;
}
-static void amdgpu_mes_free_process_doorbells(struct amdgpu_device *adev,
- struct amdgpu_mes_process *process)
+void amdgpu_mes_free_process_doorbells(struct amdgpu_device *adev,
+ unsigned int doorbell_index)
{
- if (process->doorbell_index)
- ida_simple_remove(&adev->mes.doorbell_ida,
- process->doorbell_index);
+ if (doorbell_index)
+ ida_simple_remove(&adev->mes.doorbell_ida, doorbell_index);
+}
+
+unsigned int amdgpu_mes_get_doorbell_dw_offset_in_bar(
+ struct amdgpu_device *adev,
+ uint32_t doorbell_index,
+ unsigned int doorbell_id)
+{
+ return ((doorbell_index *
+ amdgpu_mes_doorbell_process_slice(adev)) / sizeof(u32) +
+ doorbell_id * 2);
}
static int amdgpu_mes_queue_doorbell_get(struct amdgpu_device *adev,
@@ -79,10 +88,8 @@ static int amdgpu_mes_queue_doorbell_get(struct amdgpu_device *adev,
set_bit(found, process->doorbell_bitmap);
- *doorbell_index =
- (process->doorbell_index *
- amdgpu_mes_doorbell_process_slice(adev)) / sizeof(u32) +
- found * 2;
+ *doorbell_index = amdgpu_mes_get_doorbell_dw_offset_in_bar(adev,
+ process->doorbell_index, found);
return 0;
}
@@ -262,7 +269,7 @@ int amdgpu_mes_create_process(struct amdgpu_device *adev, int pasid,
memset(process->proc_ctx_cpu_ptr, 0, AMDGPU_MES_PROC_CTX_SIZE);
/* allocate the starting doorbell index of the process */
- r = amdgpu_mes_alloc_process_doorbells(adev, process);
+ r = amdgpu_mes_alloc_process_doorbells(adev, &process->doorbell_index);
if (r < 0) {
DRM_ERROR("failed to allocate doorbell for process\n");
goto clean_up_ctx;
@@ -338,7 +345,7 @@ void amdgpu_mes_destroy_process(struct amdgpu_device *adev, int pasid)
kfree(gang);
}
- amdgpu_mes_free_process_doorbells(adev, process);
+ amdgpu_mes_free_process_doorbells(adev, process->doorbell_index);
idr_remove(&adev->mes.pasid_idr, pasid);
amdgpu_bo_free_kernel(&process->proc_ctx_bo,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index a965ace0fd0e..ba039984e431 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -270,4 +270,13 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
int amdgpu_mes_self_test(struct amdgpu_device *adev);
+int amdgpu_mes_alloc_process_doorbells(struct amdgpu_device *adev,
+ unsigned int *doorbell_index);
+void amdgpu_mes_free_process_doorbells(struct amdgpu_device *adev,
+ unsigned int doorbell_index);
+unsigned int amdgpu_mes_get_doorbell_dw_offset_in_bar(
+ struct amdgpu_device *adev,
+ uint32_t doorbell_index,
+ unsigned int doorbell_id);
+int amdgpu_mes_doorbell_process_slice(struct amdgpu_device *adev);
#endif /* __AMDGPU_MES_H__ */
--
2.35.1
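[Editor's note: not part of the patch. The diff above factors the doorbell dword-offset computation into amdgpu_mes_get_doorbell_dw_offset_in_bar(): each process owns a page-aligned slice of the doorbell BAR, and each queue's doorbell is an 8-byte slot within it, so the dword offset advances by 2 per doorbell id. A hypothetical standalone sketch of that arithmetic, assuming a 4096-byte PAGE_SIZE and the constants from the file header:]

```c
/* Hypothetical sketch of the doorbell arithmetic. Assumptions:
 * PAGE_SIZE = 4096, 8 bytes per doorbell, 1024 queues per process
 * (the AMDGPU_* constants from amdgpu_mes.c), 4-byte dwords. */
#define SKETCH_PAGE_SIZE         4096u
#define ONE_DOORBELL_SIZE        8u
#define MAX_QUEUES_PER_PROCESS   1024u

static unsigned int roundup_to(unsigned int x, unsigned int align)
{
    return ((x + align - 1) / align) * align;
}

/* per-process slice of the doorbell BAR, page aligned */
static unsigned int doorbell_process_slice(void)
{
    /* 8 bytes * 1024 queues = 8192 bytes, already page aligned */
    return roundup_to(ONE_DOORBELL_SIZE * MAX_QUEUES_PER_PROCESS,
                      SKETCH_PAGE_SIZE);
}

/* dword offset of queue 'doorbell_id' in process 'doorbell_index' */
static unsigned int doorbell_dw_offset(unsigned int doorbell_index,
                                       unsigned int doorbell_id)
{
    /* / 4u: bytes to dwords, matching the driver's sizeof(u32) */
    return doorbell_index * doorbell_process_slice() / 4u
           + doorbell_id * 2;
}
```

For process index 2, the slice is 8192 bytes, so its first doorbell sits at dword 4096 and doorbell id 3 at dword 4102.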
Thread overview: 74+ messages
2022-04-29 17:45 [PATCH 00/73] MES support Alex Deucher
2022-04-29 17:45 ` [PATCH 01/73] drm/amdgpu: define MQD abstract layer for hw ip Alex Deucher
2022-04-29 17:45 ` [PATCH 02/73] drm/amdgpu: add helper function to initialize mqd from ring v4 Alex Deucher
2022-04-29 17:45 ` [PATCH 03/73] drm/amdgpu: add the per-context meta data v3 Alex Deucher
2022-04-29 17:45 ` [PATCH 04/73] drm/amdgpu: add mes ctx data in amdgpu_ring Alex Deucher
2022-04-29 17:45 ` [PATCH 05/73] drm/amdgpu: define ring structure to access rptr/wptr/fence Alex Deucher
2022-04-29 17:45 ` [PATCH 06/73] drm/amdgpu: use ring structure to access rptr/wptr v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 07/73] drm/amdgpu: initialize/finalize the ring for mes queue Alex Deucher
2022-04-29 17:45 ` [PATCH 08/73] drm/amdgpu: assign the cpu/gpu address of fence from ring Alex Deucher
2022-04-29 17:45 ` [PATCH 09/73] drm/amdgpu/gfx10: implement mqd functions of gfx/compute eng v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 10/73] drm/amdgpu/gfx10: use per ctx CSA for ce metadata Alex Deucher
2022-04-29 17:45 ` [PATCH 11/73] drm/amdgpu/gfx10: use per ctx CSA for de metadata Alex Deucher
2022-04-29 17:45 ` [PATCH 12/73] drm/amdgpu/gfx10: associate mes queue id with fence v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 13/73] drm/amdgpu/gfx10: inherit vmid from mqd Alex Deucher
2022-04-29 17:45 ` [PATCH 14/73] drm/amdgpu/gfx10: use INVALIDATE_TLBS to invalidate TLBs v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 15/73] drm/amdgpu/gmc10: skip emitting pasid mapping packet Alex Deucher
2022-04-29 17:45 ` [PATCH 16/73] drm/amdgpu: use the whole doorbell space for mes Alex Deucher
2022-04-29 17:45 ` [PATCH 17/73] drm/amdgpu: update mes process/gang/queue definitions Alex Deucher
2022-04-29 17:45 ` [PATCH 18/73] drm/amdgpu: add mes_kiq module parameter v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 19/73] drm/amdgpu: allocate doorbell index for mes kiq Alex Deucher
2022-04-29 17:45 ` [PATCH 20/73] drm/amdgpu/mes: extend mes framework to support multiple mes pipes Alex Deucher
2022-04-29 17:45 ` [PATCH 21/73] drm/amdgpu/gfx10: add mes queue fence handling Alex Deucher
2022-04-29 17:45 ` [PATCH 22/73] drm/amdgpu/gfx10: add mes support for gfx ib test Alex Deucher
2022-04-29 17:45 ` [PATCH 23/73] drm/amdgpu: don't use kiq to flush gpu tlb if mes enabled Alex Deucher
2022-04-29 17:45 ` [PATCH 24/73] drm/amdgpu/sdma: use per-ctx sdma csa address for mes sdma queue Alex Deucher
2022-04-29 17:45 ` [PATCH 25/73] drm/amdgpu/sdma5.2: initialize sdma mqd Alex Deucher
2022-04-29 17:45 ` [PATCH 26/73] drm/amdgpu/sdma5.2: associate mes queue id with fence Alex Deucher
2022-04-29 17:45 ` [PATCH 27/73] drm/amdgpu/sdma5.2: add mes queue fence handling Alex Deucher
2022-04-29 17:45 ` [PATCH 28/73] drm/amdgpu/sdma5.2: add mes support for sdma ring test Alex Deucher
2022-04-29 17:45 ` [PATCH 29/73] drm/amdgpu/sdma5.2: add mes support for sdma ib test Alex Deucher
2022-04-29 17:45 ` [PATCH 30/73] drm/amdgpu/sdma5: initialize sdma mqd Alex Deucher
2022-04-29 17:45 ` [PATCH 31/73] drm/amdgpu/sdma5: associate mes queue id with fence Alex Deucher
2022-04-29 17:45 ` [PATCH 32/73] drm/amdgpu/sdma5: add mes queue fence handling Alex Deucher
2022-04-29 17:45 ` [PATCH 33/73] drm/amdgpu/sdma5: add mes support for sdma ring test Alex Deucher
2022-04-29 17:45 ` [PATCH 34/73] drm/amdgpu/sdma5: add mes support for sdma ib test Alex Deucher
2022-04-29 17:45 ` [PATCH 35/73] drm/amdgpu: add mes kiq PSP GFX FW type Alex Deucher
2022-04-29 17:45 ` [PATCH 36/73] drm/amdgpu/mes: add mes kiq callback Alex Deucher
2022-04-29 17:45 ` [PATCH 37/73] drm/amdgpu: add mes kiq frontdoor loading support Alex Deucher
2022-04-29 17:45 ` [PATCH 38/73] drm/amdgpu: enable mes kiq N-1 test on sienna cichlid Alex Deucher
2022-04-29 17:45 ` [PATCH 39/73] drm/amdgpu/mes: manage mes doorbell allocation Alex Deucher
2022-04-29 17:45 ` [PATCH 40/73] drm/amdgpu: add mes queue id mask v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 41/73] drm/amdgpu/mes: initialize/finalize common mes structure v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 42/73] drm/amdgpu/mes: relocate status_fence slot allocation Alex Deucher
2022-04-29 17:45 ` [PATCH 43/73] drm/amdgpu/mes10.1: call general mes initialization Alex Deucher
2022-04-29 17:45 ` [PATCH 44/73] drm/amdgpu/mes10.1: add delay after mes engine enable Alex Deucher
2022-04-29 17:45 ` [PATCH 45/73] drm/amdgpu/mes10.1: implement the suspend/resume routine Alex Deucher
2022-04-29 17:45 ` [PATCH 46/73] drm/amdgpu/mes: implement creating mes process v2 Alex Deucher
2022-04-29 17:45 ` [PATCH 47/73] drm/amdgpu/mes: implement destroying mes process Alex Deucher
2022-04-29 17:45 ` [PATCH 48/73] drm/amdgpu/mes: implement adding mes gang Alex Deucher
2022-04-29 17:46 ` [PATCH 49/73] drm/amdgpu/mes: implement removing " Alex Deucher
2022-04-29 17:46 ` [PATCH 50/73] drm/amdgpu/mes: implement suspending all gangs Alex Deucher
2022-04-29 17:46 ` [PATCH 51/73] drm/amdgpu/mes: implement resuming " Alex Deucher
2022-04-29 17:46 ` [PATCH 52/73] drm/amdgpu/mes: initialize mqd from queue properties Alex Deucher
2022-04-29 17:46 ` [PATCH 53/73] drm/amdgpu/mes: implement adding mes queue Alex Deucher
2022-04-29 17:46 ` [PATCH 54/73] drm/amdgpu/mes: implement removing " Alex Deucher
2022-04-29 17:46 ` [PATCH 55/73] drm/amdgpu/mes: add helper function to convert ring to queue property Alex Deucher
2022-04-29 17:46 ` [PATCH 56/73] drm/amdgpu/mes: add helper function to get the ctx meta data offset Alex Deucher
2022-04-29 17:46 ` [PATCH 57/73] drm/amdgpu/mes: use ring for kernel queue submission Alex Deucher
2022-04-29 17:46 ` [PATCH 58/73] drm/amdgpu/mes: implement removing mes ring Alex Deucher
2022-04-29 17:46 ` [PATCH 59/73] drm/amdgpu/mes: add helper functions to alloc/free ctx metadata Alex Deucher
2022-04-29 17:46 ` [PATCH 60/73] drm/amdgpu: skip kfd routines when mes enabled Alex Deucher
2022-04-29 17:46 ` [PATCH 61/73] drm/amdgpu: Enable KFD with MES enabled Alex Deucher
2022-04-29 17:46 ` [PATCH 62/73] drm/amdgpu: skip some checking for mes queue ib submission Alex Deucher
2022-04-29 17:46 ` [PATCH 63/73] drm/amdgpu: skip kiq ib tests if mes enabled Alex Deucher
2022-04-29 17:46 ` [PATCH 64/73] drm/amdgpu: skip gds switch for mes queue Alex Deucher
2022-04-29 17:46 ` [PATCH 65/73] drm/amdgpu: kiq takes charge of all queues Alex Deucher
2022-04-29 17:46 ` [PATCH 66/73] drm/amdgpu/mes: map ctx metadata for mes self test Alex Deucher
2022-04-29 17:46 ` [PATCH 67/73] drm/amdgpu/mes: create gang and queues " Alex Deucher
2022-04-29 17:46 ` [PATCH 68/73] drm/amdgpu/mes: add ring/ib test " Alex Deucher
2022-04-29 17:46 ` [PATCH 69/73] drm/amdgpu/mes: implement " Alex Deucher
2022-04-29 17:46 ` [PATCH 70/73] drm/amdgpu/mes10.1: add mes self test in late init Alex Deucher
2022-04-29 17:46 ` [PATCH 71/73] drm/amdgpu/mes: fix vm csa update issue Alex Deucher
2022-04-29 17:46 ` [PATCH 72/73] drm/amdgpu/mes: disable mes sdma queue test Alex Deucher
2022-04-29 17:46 ` [PATCH 73/73] drm/amdgpu/mes: Update the doorbell function signatures Alex Deucher