* [PATCH v2] drm/amdgpu: use local xcc to flush tlb
From: Yiqing Yao @ 2024-06-12 9:36 UTC
To: amd-gfx, christian.koenig, alexander.deucher
Cc: owen.zhang2, haijun.chang, horace.chen, qing.ma, Yiqing Yao
When flushing the GPU TLB using KIQ for gfxhub, the KIQ ring is always
local because the KIQ instance is selected. Testing shows that the bits above
the lower 16 in the mmreg write data packet have no effect when flushing using KIQ on gfxhub.
Also, some variants have a policy that blocks higher offsets when req/ack is set
with extra bits, which can cause the flush to time out.
So keep the lower 16 bits only.
Remove redundant code.
Signed-off-by: Yiqing Yao <YiQing.Yao@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 350f6b6676f1..f3fe318e0c1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -853,8 +853,16 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
*/
if (adev->gfx.kiq[inst].ring.sched.ready &&
(amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
- uint32_t req = hub->vm_inv_eng0_req + hub->eng_distance * eng;
- uint32_t ack = hub->vm_inv_eng0_ack + hub->eng_distance * eng;
+
+ /*
+ * Select lower 16 bits to write in local xcc when flushing
+ * using kiq to write gfx as higher bits are always ignored
+ */
+ if (vmhub < AMDGPU_MMHUB0(0))
+ {
+ req = req & 0xffff;
+ ack = ack & 0xffff;
+ }
amdgpu_gmc_fw_reg_write_reg_wait(adev, req, ack, inv_req,
1 << vmid, inst);
--
2.34.1
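For readers following the thread, the masking that the patch proposes can be sketched outside the driver as below. This is only an illustration under stated assumptions: demo_local_xcc_offset, DEMO_GFXHUB0 and DEMO_MMHUB0 are hypothetical names that do not exist in amdgpu, and the hub index values are placeholders standing in for the vmhub < AMDGPU_MMHUB0(0) check in the hunk above.

```c
#include <stdint.h>

/*
 * Hypothetical hub indices for illustration only. The real values come
 * from the amdgpu headers and are not reproduced here.
 */
enum {
	DEMO_GFXHUB0 = 0,
	DEMO_MMHUB0 = 8,
};

/*
 * Keep only the lower 16 bits of a register offset when the hub is a
 * GFX hub, mirroring the "req & 0xffff" / "ack & 0xffff" lines in the
 * patch. Offsets for other hubs are returned unchanged.
 */
static uint32_t demo_local_xcc_offset(uint32_t vmhub, uint32_t reg)
{
	if (vmhub < DEMO_MMHUB0)
		return reg & 0xffff;

	return reg;
}
```

With a made-up offset such as 0x0002a1c4, the helper returns 0xa1c4 for a GFX hub and leaves the value untouched for an MM hub. An offset that already fits in 16 bits is unchanged, so the mask only matters when extra bits are actually present in req/ack.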
* Re: [PATCH v2] drm/amdgpu: use local xcc to flush tlb
From: Christian König @ 2024-06-12 10:23 UTC
To: Ma, Qing (Mark), Yao, Yiqing(James),
amd-gfx@lists.freedesktop.org, Deucher, Alexander
Cc: Zhang, Owen(SRDC), Chang, HaiJun, Chen, Horace
Well, there is still no explanation of why this patch is needed in the first
place.
If the higher bits are ignored by the KIQ, then it should already work.
Regards,
Christian.
On 12.06.24 at 11:57, Ma, Qing (Mark) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> @Deucher, Alexander @Koenig, Christian
> Can you help review this patch? It is on the critical path for the MI308 release on 6/20.
> The related doc is amended in the attached email.
> Thanks
>
> -----Original Message-----
> From: Yao, Yiqing(James) <YiQing.Yao@amd.com>
> Sent: Wednesday, June 12, 2024 5:37 PM
> To: amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>
> Cc: Zhang, Owen(SRDC) <Owen.Zhang2@amd.com>; Chang, HaiJun <HaiJun.Chang@amd.com>; Chen, Horace <Horace.Chen@amd.com>; Ma, Qing (Mark) <Qing.Ma@amd.com>; Yao, Yiqing(James) <YiQing.Yao@amd.com>
> Subject: [PATCH v2] drm/amdgpu: use local xcc to flush tlb
>
> When flushing the GPU TLB using KIQ for gfxhub, the KIQ ring is always local because the KIQ instance is selected. Testing shows that the bits above the lower 16 in the mmreg write data packet have no effect when flushing using KIQ on gfxhub.
>
> Also, some variants have a policy that blocks higher offsets when req/ack is set with extra bits, which can cause the flush to time out.
>
> So keep the lower 16 bits only.
>
> Remove redundant code.
>
> Signed-off-by: Yiqing Yao <YiQing.Yao@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 350f6b6676f1..f3fe318e0c1d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -853,8 +853,16 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
> */
> if (adev->gfx.kiq[inst].ring.sched.ready &&
> (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
> - uint32_t req = hub->vm_inv_eng0_req + hub->eng_distance * eng;
> - uint32_t ack = hub->vm_inv_eng0_ack + hub->eng_distance * eng;
> +
> + /*
> + * Select lower 16 bits to write in local xcc when flushing
> + * using kiq to write gfx as higher bits are always ignored
> + */
> + if (vmhub < AMDGPU_MMHUB0(0))
> + {
> + req = req & 0xffff;
> + ack = ack & 0xffff;
> + }
>
> amdgpu_gmc_fw_reg_write_reg_wait(adev, req, ack, inv_req,
> 1 << vmid, inst);
> --
> 2.34.1
>
* Re: [PATCH v2] drm/amdgpu: use local xcc to flush tlb
From: Lazar, Lijo @ 2024-06-12 10:58 UTC
To: Yiqing Yao, amd-gfx, christian.koenig, alexander.deucher
Cc: owen.zhang2, haijun.chang, horace.chen, qing.ma
On 6/12/2024 3:06 PM, Yiqing Yao wrote:
> When flushing the GPU TLB using KIQ for gfxhub, the KIQ ring is always
> local because the KIQ instance is selected. Testing shows that the bits above
> the lower 16 in the mmreg write data packet have no effect when flushing using KIQ on gfxhub.
>
> Also, some variants have a policy that blocks higher offsets when req/ack is set
> with extra bits, which can cause the flush to time out.
>
> So keep the lower 16 bits only.
>
> Remove redundant code.
>
> Signed-off-by: Yiqing Yao <YiQing.Yao@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 350f6b6676f1..f3fe318e0c1d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -853,8 +853,16 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
> */
> if (adev->gfx.kiq[inst].ring.sched.ready &&
> (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
> - uint32_t req = hub->vm_inv_eng0_req + hub->eng_distance * eng;
> - uint32_t ack = hub->vm_inv_eng0_ack + hub->eng_distance * eng;
> +
> + /*
> + * Select lower 16 bits to write in local xcc when flushing
> + * using kiq to write gfx as higher bits are always ignored
> + */
> + if (vmhub < AMDGPU_MMHUB0(0))
> + {
> + req = req & 0xffff;
> + ack = ack & 0xffff;
> + }
>
The issue is an incorrect mask passed by the host driver in the discovery table,
which results in incorrect register offsets. The fix should be in the
discovery table passed by the host driver; the RRMT mechanism will then take
care of it.
Thanks,
Lijo
> amdgpu_gmc_fw_reg_write_reg_wait(adev, req, ack, inv_req,
> 1 << vmid, inst);