From: stuartsummers
To:
Cc: intel-xe@lists.freedesktop.org, matthew.brost@intel.com,
	farah.kassabri@intel.com, Stuart Summers
Subject: [PATCH 6/9] drm/xe: Decouple TLB invalidations from GT
Date: Wed, 13 Aug 2025 19:48:03 +0000
Message-Id: <20250813194806.140500-7-stuart.summers@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250813194806.140500-1-stuart.summers@intel.com>
References: <20250813194806.140500-1-stuart.summers@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Matthew Brost

Decouple TLB invalidations from the GT by updating the TLB invalidation
layer to accept a `struct xe_tlb_inval` instead of a `struct xe_gt`.
Also, rename *gt_tlb* to *tlb*. The internals of the TLB invalidation
code still operate on a GT, but this is now hidden from the rest of the
driver.
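As a minimal caller-side sketch of the interface change (all names taken
from the xe_ggtt.c hunk below), invalidation entry points now take the
GT's embedded struct xe_tlb_inval rather than the GT itself:

  static void ggtt_invalidate_gt_tlb(struct xe_gt *gt)
  {
          int err;

          if (!gt)
                  return;

          /* Previously: err = xe_gt_tlb_inval_ggtt(gt); */
          err = xe_tlb_inval_ggtt(&gt->tlb_inval);
          xe_gt_WARN(gt, err, "Failed to invalidate GGTT (%pe)",
                     ERR_PTR(err));
  }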
Signed-off-by: Matthew Brost Signed-off-by: Stuart Summers Reviewed-by: Stuart Summers --- drivers/gpu/drm/xe/Makefile | 4 +- drivers/gpu/drm/xe/xe_ggtt.c | 4 +- drivers/gpu/drm/xe/xe_gt.c | 6 +- drivers/gpu/drm/xe/xe_gt_tlb_inval.h | 40 ----- drivers/gpu/drm/xe/xe_gt_tlb_inval_job.h | 34 ---- drivers/gpu/drm/xe/xe_gt_types.h | 2 +- drivers/gpu/drm/xe/xe_guc_ct.c | 2 +- drivers/gpu/drm/xe/xe_lmtt.c | 12 +- drivers/gpu/drm/xe/xe_migrate.h | 10 +- drivers/gpu/drm/xe/xe_pt.c | 63 ++++--- drivers/gpu/drm/xe/xe_svm.c | 1 - .../xe/{xe_gt_tlb_inval.c => xe_tlb_inval.c} | 146 +++++++++-------- drivers/gpu/drm/xe/xe_tlb_inval.h | 40 +++++ ..._gt_tlb_inval_job.c => xe_tlb_inval_job.c} | 154 +++++++++--------- drivers/gpu/drm/xe/xe_tlb_inval_job.h | 33 ++++ ...tlb_inval_types.h => xe_tlb_inval_types.h} | 35 ++-- drivers/gpu/drm/xe/xe_trace.h | 24 +-- drivers/gpu/drm/xe/xe_vm.c | 26 +-- 18 files changed, 331 insertions(+), 305 deletions(-) delete mode 100644 drivers/gpu/drm/xe/xe_gt_tlb_inval.h delete mode 100644 drivers/gpu/drm/xe/xe_gt_tlb_inval_job.h rename drivers/gpu/drm/xe/{xe_gt_tlb_inval.c => xe_tlb_inval.c} (79%) create mode 100644 drivers/gpu/drm/xe/xe_tlb_inval.h rename drivers/gpu/drm/xe/{xe_gt_tlb_inval_job.c => xe_tlb_inval_job.c} (51%) create mode 100644 drivers/gpu/drm/xe/xe_tlb_inval_job.h rename drivers/gpu/drm/xe/{xe_gt_tlb_inval_types.h => xe_tlb_inval_types.h} (56%) diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile index 0a36b2463434..e4a363489072 100644 --- a/drivers/gpu/drm/xe/Makefile +++ b/drivers/gpu/drm/xe/Makefile @@ -61,8 +61,6 @@ xe-y += xe_bb.o \ xe_gt_pagefault.o \ xe_gt_sysfs.o \ xe_gt_throttle.o \ - xe_gt_tlb_inval.o \ - xe_gt_tlb_inval_job.o \ xe_gt_topology.o \ xe_guc.o \ xe_guc_ads.o \ @@ -117,6 +115,8 @@ xe-y += xe_bb.o \ xe_sync.o \ xe_tile.o \ xe_tile_sysfs.o \ + xe_tlb_inval.o \ + xe_tlb_inval_job.o \ xe_trace.o \ xe_trace_bo.o \ xe_trace_guc.o \ diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c index c3e46c270117..71c7690a92b3 100644 --- a/drivers/gpu/drm/xe/xe_ggtt.c +++ b/drivers/gpu/drm/xe/xe_ggtt.c @@ -23,13 +23,13 @@ #include "xe_device.h" #include "xe_gt.h" #include "xe_gt_printk.h" -#include "xe_gt_tlb_inval.h" #include "xe_map.h" #include "xe_mmio.h" #include "xe_pm.h" #include "xe_res_cursor.h" #include "xe_sriov.h" #include "xe_tile_sriov_vf.h" +#include "xe_tlb_inval.h" #include "xe_wa.h" #include "xe_wopcm.h" @@ -438,7 +438,7 @@ static void ggtt_invalidate_gt_tlb(struct xe_gt *gt) if (!gt) return; - err = xe_gt_tlb_inval_ggtt(gt); + err = xe_tlb_inval_ggtt(>->tlb_inval); xe_gt_WARN(gt, err, "Failed to invalidate GGTT (%pe)", ERR_PTR(err)); } diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index 2d79eb1a1113..34505a6d93ed 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -37,7 +37,6 @@ #include "xe_gt_sriov_pf.h" #include "xe_gt_sriov_vf.h" #include "xe_gt_sysfs.h" -#include "xe_gt_tlb_inval.h" #include "xe_gt_topology.h" #include "xe_guc_exec_queue_types.h" #include "xe_guc_pc.h" @@ -58,6 +57,7 @@ #include "xe_sa.h" #include "xe_sched_job.h" #include "xe_sriov.h" +#include "xe_tlb_inval.h" #include "xe_tuning.h" #include "xe_uc.h" #include "xe_uc_fw.h" @@ -850,7 +850,7 @@ static int gt_reset(struct xe_gt *gt) xe_uc_stop(>->uc); - xe_gt_tlb_inval_reset(gt); + xe_tlb_inval_reset(>->tlb_inval); err = do_gt_reset(gt); if (err) @@ -1064,5 +1064,5 @@ void xe_gt_declare_wedged(struct xe_gt *gt) xe_gt_assert(gt, gt_to_xe(gt)->wedged.mode); 
xe_uc_declare_wedged(>->uc); - xe_gt_tlb_inval_reset(gt); + xe_tlb_inval_reset(>->tlb_inval); } diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_inval.h b/drivers/gpu/drm/xe/xe_gt_tlb_inval.h deleted file mode 100644 index 801d4ecf88f0..000000000000 --- a/drivers/gpu/drm/xe/xe_gt_tlb_inval.h +++ /dev/null @@ -1,40 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2023 Intel Corporation - */ - -#ifndef _XE_GT_TLB_INVAL_H_ -#define _XE_GT_TLB_INVAL_H_ - -#include - -#include "xe_gt_tlb_inval_types.h" - -struct xe_gt; -struct xe_guc; -struct xe_vm; -struct xe_vma; - -int xe_gt_tlb_inval_init_early(struct xe_gt *gt); - -void xe_gt_tlb_inval_reset(struct xe_gt *gt); -int xe_gt_tlb_inval_ggtt(struct xe_gt *gt); -void xe_gt_tlb_inval_vm(struct xe_gt *gt, struct xe_vm *vm); -int xe_gt_tlb_inval_all(struct xe_gt *gt, struct xe_gt_tlb_inval_fence *fence); -int xe_gt_tlb_inval_range(struct xe_gt *gt, - struct xe_gt_tlb_inval_fence *fence, - u64 start, u64 end, u32 asid); -int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 len); - -void xe_gt_tlb_inval_fence_init(struct xe_gt *gt, - struct xe_gt_tlb_inval_fence *fence, - bool stack); -void xe_gt_tlb_inval_fence_signal(struct xe_gt_tlb_inval_fence *fence); - -static inline void -xe_gt_tlb_inval_fence_wait(struct xe_gt_tlb_inval_fence *fence) -{ - dma_fence_wait(&fence->base, false); -} - -#endif /* _XE_GT_TLB_INVAL_ */ diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_inval_job.h b/drivers/gpu/drm/xe/xe_gt_tlb_inval_job.h deleted file mode 100644 index 883896194a34..000000000000 --- a/drivers/gpu/drm/xe/xe_gt_tlb_inval_job.h +++ /dev/null @@ -1,34 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2025 Intel Corporation - */ - -#ifndef _XE_GT_TLB_INVAL_JOB_H_ -#define _XE_GT_TLB_INVAL_JOB_H_ - -#include - -struct dma_fence; -struct drm_sched_job; -struct kref; -struct xe_exec_queue; -struct xe_gt; -struct xe_gt_tlb_inval_job; -struct xe_migrate; - -struct xe_gt_tlb_inval_job *xe_gt_tlb_inval_job_create(struct xe_exec_queue *q, - struct xe_gt *gt, - u64 start, u64 end, - u32 asid); - -int xe_gt_tlb_inval_job_alloc_dep(struct xe_gt_tlb_inval_job *job); - -struct dma_fence *xe_gt_tlb_inval_job_push(struct xe_gt_tlb_inval_job *job, - struct xe_migrate *m, - struct dma_fence *fence); - -void xe_gt_tlb_inval_job_get(struct xe_gt_tlb_inval_job *job); - -void xe_gt_tlb_inval_job_put(struct xe_gt_tlb_inval_job *job); - -#endif diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h index 7dc5a3f310f1..66158105aca5 100644 --- a/drivers/gpu/drm/xe/xe_gt_types.h +++ b/drivers/gpu/drm/xe/xe_gt_types.h @@ -12,12 +12,12 @@ #include "xe_gt_sriov_pf_types.h" #include "xe_gt_sriov_vf_types.h" #include "xe_gt_stats_types.h" -#include "xe_gt_tlb_inval_types.h" #include "xe_hw_engine_types.h" #include "xe_hw_fence_types.h" #include "xe_oa_types.h" #include "xe_reg_sr_types.h" #include "xe_sa_types.h" +#include "xe_tlb_inval_types.h" #include "xe_uc_types.h" struct xe_exec_queue_ops; diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c index 9131d121d941..5f38041cff4c 100644 --- a/drivers/gpu/drm/xe/xe_guc_ct.c +++ b/drivers/gpu/drm/xe/xe_guc_ct.c @@ -26,13 +26,13 @@ #include "xe_gt_sriov_pf_control.h" #include "xe_gt_sriov_pf_monitor.h" #include "xe_gt_sriov_printk.h" -#include "xe_gt_tlb_inval.h" #include "xe_guc.h" #include "xe_guc_log.h" #include "xe_guc_relay.h" #include "xe_guc_submit.h" #include "xe_map.h" #include "xe_pm.h" +#include "xe_tlb_inval.h" #include "xe_trace_guc.h" static void 
receive_g2h(struct xe_guc_ct *ct); diff --git a/drivers/gpu/drm/xe/xe_lmtt.c b/drivers/gpu/drm/xe/xe_lmtt.c index e5aba03ff8ac..f2bfbfa3efa1 100644 --- a/drivers/gpu/drm/xe/xe_lmtt.c +++ b/drivers/gpu/drm/xe/xe_lmtt.c @@ -11,7 +11,7 @@ #include "xe_assert.h" #include "xe_bo.h" -#include "xe_gt_tlb_inval.h" +#include "xe_tlb_inval.h" #include "xe_lmtt.h" #include "xe_map.h" #include "xe_mmio.h" @@ -228,8 +228,8 @@ void xe_lmtt_init_hw(struct xe_lmtt *lmtt) static int lmtt_invalidate_hw(struct xe_lmtt *lmtt) { - struct xe_gt_tlb_inval_fence fences[XE_MAX_GT_PER_TILE]; - struct xe_gt_tlb_inval_fence *fence = fences; + struct xe_tlb_inval_fence fences[XE_MAX_GT_PER_TILE]; + struct xe_tlb_inval_fence *fence = fences; struct xe_tile *tile = lmtt_to_tile(lmtt); struct xe_gt *gt; int result = 0; @@ -237,8 +237,8 @@ static int lmtt_invalidate_hw(struct xe_lmtt *lmtt) u8 id; for_each_gt_on_tile(gt, tile, id) { - xe_gt_tlb_inval_fence_init(gt, fence, true); - err = xe_gt_tlb_inval_all(gt, fence); + xe_tlb_inval_fence_init(>->tlb_inval, fence, true); + err = xe_tlb_inval_all(>->tlb_inval, fence); result = result ?: err; fence++; } @@ -252,7 +252,7 @@ static int lmtt_invalidate_hw(struct xe_lmtt *lmtt) */ fence = fences; for_each_gt_on_tile(gt, tile, id) - xe_gt_tlb_inval_fence_wait(fence++); + xe_tlb_inval_fence_wait(fence++); return result; } diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h index 8978d2cc1a75..4fad324b6253 100644 --- a/drivers/gpu/drm/xe/xe_migrate.h +++ b/drivers/gpu/drm/xe/xe_migrate.h @@ -15,7 +15,7 @@ struct ttm_resource; struct xe_bo; struct xe_gt; -struct xe_gt_tlb_inval_job; +struct xe_tlb_inval_job; struct xe_exec_queue; struct xe_migrate; struct xe_migrate_pt_update; @@ -94,13 +94,13 @@ struct xe_migrate_pt_update { /** @job: The job if a GPU page-table update. NULL otherwise */ struct xe_sched_job *job; /** - * @ijob: The GT TLB invalidation job for primary tile. NULL otherwise + * @ijob: The TLB invalidation job for primary GT. NULL otherwise */ - struct xe_gt_tlb_inval_job *ijob; + struct xe_tlb_inval_job *ijob; /** - * @mjob: The GT TLB invalidation job for media tile. NULL otherwise + * @mjob: The TLB invalidation job for media GT. 
NULL otherwise */ - struct xe_gt_tlb_inval_job *mjob; + struct xe_tlb_inval_job *mjob; /** @tile_id: Tile ID of the update */ u8 tile_id; }; diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c index f3a39e734a90..d70015c063ad 100644 --- a/drivers/gpu/drm/xe/xe_pt.c +++ b/drivers/gpu/drm/xe/xe_pt.c @@ -13,7 +13,7 @@ #include "xe_drm_client.h" #include "xe_exec_queue.h" #include "xe_gt.h" -#include "xe_gt_tlb_inval_job.h" +#include "xe_tlb_inval_job.h" #include "xe_migrate.h" #include "xe_pt_types.h" #include "xe_pt_walk.h" @@ -21,6 +21,7 @@ #include "xe_sched_job.h" #include "xe_sync.h" #include "xe_svm.h" +#include "xe_tlb_inval_job.h" #include "xe_trace.h" #include "xe_ttm_stolen_mgr.h" #include "xe_vm.h" @@ -1261,8 +1262,8 @@ static int op_add_deps(struct xe_vm *vm, struct xe_vma_op *op, } static int xe_pt_vm_dependencies(struct xe_sched_job *job, - struct xe_gt_tlb_inval_job *ijob, - struct xe_gt_tlb_inval_job *mjob, + struct xe_tlb_inval_job *ijob, + struct xe_tlb_inval_job *mjob, struct xe_vm *vm, struct xe_vma_ops *vops, struct xe_vm_pgtable_update_ops *pt_update_ops, @@ -1332,13 +1333,13 @@ static int xe_pt_vm_dependencies(struct xe_sched_job *job, if (job) { if (ijob) { - err = xe_gt_tlb_inval_job_alloc_dep(ijob); + err = xe_tlb_inval_job_alloc_dep(ijob); if (err) return err; } if (mjob) { - err = xe_gt_tlb_inval_job_alloc_dep(mjob); + err = xe_tlb_inval_job_alloc_dep(mjob); if (err) return err; } @@ -2338,6 +2339,15 @@ static const struct xe_migrate_pt_update_ops svm_migrate_ops = { static const struct xe_migrate_pt_update_ops svm_migrate_ops; #endif +static struct xe_dep_scheduler *to_dep_scheduler(struct xe_exec_queue *q, + struct xe_gt *gt) +{ + if (xe_gt_is_media_type(gt)) + return q->tlb_inval[XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT].dep_scheduler; + + return q->tlb_inval[XE_EXEC_QUEUE_TLB_INVAL_PRIMARY_GT].dep_scheduler; +} + /** * xe_pt_update_ops_run() - Run PT update operations * @tile: Tile of PT update operations @@ -2356,7 +2366,7 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops) struct xe_vm_pgtable_update_ops *pt_update_ops = &vops->pt_update_ops[tile->id]; struct dma_fence *fence, *ifence, *mfence; - struct xe_gt_tlb_inval_job *ijob = NULL, *mjob = NULL; + struct xe_tlb_inval_job *ijob = NULL, *mjob = NULL; struct dma_fence **fences = NULL; struct dma_fence_array *cf = NULL; struct xe_range_fence *rfence; @@ -2388,11 +2398,15 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops) #endif if (pt_update_ops->needs_invalidation) { - ijob = xe_gt_tlb_inval_job_create(pt_update_ops->q, - tile->primary_gt, - pt_update_ops->start, - pt_update_ops->last, - vm->usm.asid); + struct xe_exec_queue *q = pt_update_ops->q; + struct xe_dep_scheduler *dep_scheduler = + to_dep_scheduler(q, tile->primary_gt); + + ijob = xe_tlb_inval_job_create(q, &tile->primary_gt->tlb_inval, + dep_scheduler, + pt_update_ops->start, + pt_update_ops->last, + vm->usm.asid); if (IS_ERR(ijob)) { err = PTR_ERR(ijob); goto kill_vm_tile1; @@ -2400,11 +2414,14 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops) update.ijob = ijob; if (tile->media_gt) { - mjob = xe_gt_tlb_inval_job_create(pt_update_ops->q, - tile->media_gt, - pt_update_ops->start, - pt_update_ops->last, - vm->usm.asid); + dep_scheduler = to_dep_scheduler(q, tile->media_gt); + + mjob = xe_tlb_inval_job_create(q, + &tile->media_gt->tlb_inval, + dep_scheduler, + pt_update_ops->start, + pt_update_ops->last, + vm->usm.asid); if (IS_ERR(mjob)) { err = PTR_ERR(mjob); goto free_ijob; @@ 
-2455,13 +2472,13 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops) if (ijob) { struct dma_fence *__fence; - ifence = xe_gt_tlb_inval_job_push(ijob, tile->migrate, fence); + ifence = xe_tlb_inval_job_push(ijob, tile->migrate, fence); __fence = ifence; if (mjob) { fences[0] = ifence; - mfence = xe_gt_tlb_inval_job_push(mjob, tile->migrate, - fence); + mfence = xe_tlb_inval_job_push(mjob, tile->migrate, + fence); fences[1] = mfence; dma_fence_array_init(cf, 2, fences, @@ -2504,8 +2521,8 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops) if (pt_update_ops->needs_userptr_lock) up_read(&vm->userptr.notifier_lock); - xe_gt_tlb_inval_job_put(mjob); - xe_gt_tlb_inval_job_put(ijob); + xe_tlb_inval_job_put(mjob); + xe_tlb_inval_job_put(ijob); return fence; @@ -2514,8 +2531,8 @@ xe_pt_update_ops_run(struct xe_tile *tile, struct xe_vma_ops *vops) free_ijob: kfree(cf); kfree(fences); - xe_gt_tlb_inval_job_put(mjob); - xe_gt_tlb_inval_job_put(ijob); + xe_tlb_inval_job_put(mjob); + xe_tlb_inval_job_put(ijob); kill_vm_tile1: if (err != -EAGAIN && err != -ENODATA && tile->id) xe_vm_kill(vops->vm, false); diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c index d290e54134f3..c8febef4d679 100644 --- a/drivers/gpu/drm/xe/xe_svm.c +++ b/drivers/gpu/drm/xe/xe_svm.c @@ -7,7 +7,6 @@ #include "xe_bo.h" #include "xe_gt_stats.h" -#include "xe_gt_tlb_inval.h" #include "xe_migrate.h" #include "xe_module.h" #include "xe_pm.h" diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_inval.c b/drivers/gpu/drm/xe/xe_tlb_inval.c similarity index 79% rename from drivers/gpu/drm/xe/xe_gt_tlb_inval.c rename to drivers/gpu/drm/xe/xe_tlb_inval.c index 8d76de43f1da..bb7d8ef73888 100644 --- a/drivers/gpu/drm/xe/xe_gt_tlb_inval.c +++ b/drivers/gpu/drm/xe/xe_tlb_inval.c @@ -13,7 +13,7 @@ #include "xe_guc.h" #include "xe_guc_ct.h" #include "xe_gt_stats.h" -#include "xe_gt_tlb_inval.h" +#include "xe_tlb_inval.h" #include "xe_mmio.h" #include "xe_pm.h" #include "xe_sriov.h" @@ -38,42 +38,49 @@ static long tlb_timeout_jiffies(struct xe_gt *gt) return hw_tlb_timeout + 2 * delay; } -static void xe_gt_tlb_inval_fence_fini(struct xe_gt_tlb_inval_fence *fence) +static void xe_tlb_inval_fence_fini(struct xe_tlb_inval_fence *fence) { - if (WARN_ON_ONCE(!fence->gt)) + struct xe_gt *gt; + + if (WARN_ON_ONCE(!fence->tlb_inval)) return; - cancel_delayed_work(&fence->gt->tlb_invalidation.fence_tdr); + gt = fence->tlb_inval->private; + + cancel_delayed_work(>->tlb_inval.fence_tdr); - xe_pm_runtime_put(gt_to_xe(fence->gt)); - fence->gt = NULL; /* fini() should be called once */ + xe_pm_runtime_put(gt_to_xe(gt)); + fence->tlb_inval = NULL; /* fini() should be called once */ } static void -__inval_fence_signal(struct xe_device *xe, struct xe_gt_tlb_inval_fence *fence) +__inval_fence_signal(struct xe_device *xe, struct xe_tlb_inval_fence *fence) { bool stack = test_bit(FENCE_STACK_BIT, &fence->base.flags); - trace_xe_gt_tlb_inval_fence_signal(xe, fence); - xe_gt_tlb_inval_fence_fini(fence); + trace_xe_tlb_inval_fence_signal(xe, fence); + xe_tlb_inval_fence_fini(fence); dma_fence_signal(&fence->base); if (!stack) dma_fence_put(&fence->base); } static void -inval_fence_signal(struct xe_device *xe, struct xe_gt_tlb_inval_fence *fence) +inval_fence_signal(struct xe_device *xe, struct xe_tlb_inval_fence *fence) { list_del(&fence->link); __inval_fence_signal(xe, fence); } -void xe_gt_tlb_inval_fence_signal(struct xe_gt_tlb_inval_fence *fence) +void xe_tlb_inval_fence_signal(struct xe_tlb_inval_fence *fence) { - 
if (WARN_ON_ONCE(!fence->gt)) + struct xe_gt *gt; + + if (WARN_ON_ONCE(!fence->tlb_inval)) return; - __inval_fence_signal(gt_to_xe(fence->gt), fence); + gt = fence->tlb_inval->private; + __inval_fence_signal(gt_to_xe(gt), fence); } static void xe_gt_tlb_fence_timeout(struct work_struct *work) @@ -81,7 +88,7 @@ static void xe_gt_tlb_fence_timeout(struct work_struct *work) struct xe_gt *gt = container_of(work, struct xe_gt, tlb_inval.fence_tdr.work); struct xe_device *xe = gt_to_xe(gt); - struct xe_gt_tlb_inval_fence *fence, *next; + struct xe_tlb_inval_fence *fence, *next; LNL_FLUSH_WORK(>->uc.guc.ct.g2h_worker); @@ -94,7 +101,7 @@ static void xe_gt_tlb_fence_timeout(struct work_struct *work) if (msecs_to_jiffies(since_inval_ms) < tlb_timeout_jiffies(gt)) break; - trace_xe_gt_tlb_inval_fence_timeout(xe, fence); + trace_xe_tlb_inval_fence_timeout(xe, fence); xe_gt_err(gt, "TLB invalidation fence timeout, seqno=%d recv=%d", fence->seqno, gt->tlb_inval.seqno_recv); @@ -109,10 +116,10 @@ static void xe_gt_tlb_fence_timeout(struct work_struct *work) } /** - * xe_gt_tlb_inval_init_early - Initialize GT TLB invalidation state + * xe_tlb_inval_init_early - Initialize TLB invalidation state * @gt: GT structure * - * Initialize GT TLB invalidation state, purely software initialization, should + * Initialize TLB invalidation state, purely software initialization, should * be called once during driver load. * * Return: 0 on success, negative error code on error. @@ -122,6 +129,7 @@ int xe_gt_tlb_inval_init_early(struct xe_gt *gt) struct xe_device *xe = gt_to_xe(gt); int err; + gt->tlb_inval.private = gt; gt->tlb_inval.seqno = 1; INIT_LIST_HEAD(>->tlb_inval.pending_fences); spin_lock_init(>->tlb_inval.pending_lock); @@ -143,14 +151,15 @@ int xe_gt_tlb_inval_init_early(struct xe_gt *gt) } /** - * xe_gt_tlb_inval_reset - Initialize GT TLB invalidation reset - * @gt: GT structure + * xe_tlb_inval_reset - Initialize TLB invalidation reset + * @tlb_inval: TLB invalidation client * * Signal any pending invalidation fences, should be called during a GT reset */ -void xe_gt_tlb_inval_reset(struct xe_gt *gt) +void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval) { - struct xe_gt_tlb_inval_fence *fence, *next; + struct xe_gt *gt = tlb_inval->private; + struct xe_tlb_inval_fence *fence, *next; int pending_seqno; /* @@ -203,7 +212,7 @@ static bool tlb_inval_seqno_past(struct xe_gt *gt, int seqno) } static int send_tlb_inval(struct xe_guc *guc, - struct xe_gt_tlb_inval_fence *fence, + struct xe_tlb_inval_fence *fence, u32 *action, int len) { struct xe_gt *gt = guc_to_gt(guc); @@ -222,7 +231,7 @@ static int send_tlb_inval(struct xe_guc *guc, mutex_lock(>->tlb_inval.seqno_lock); seqno = gt->tlb_inval.seqno; fence->seqno = seqno; - trace_xe_gt_tlb_inval_fence_send(xe, fence); + trace_xe_tlb_inval_fence_send(xe, fence); action[1] = seqno; ret = xe_guc_ct_send(&guc->ct, action, len, G2H_LEN_DW_TLB_INVALIDATE, 1); @@ -267,7 +276,7 @@ static int send_tlb_inval(struct xe_guc *guc, XE_GUC_TLB_INVAL_FLUSH_CACHE) /** - * xe_gt_tlb_inval_guc - Issue a TLB invalidation on this GT for the GuC + * xe_tlb_inval_guc - Issue a TLB invalidation on this GT for the GuC * @gt: GT structure * @fence: invalidation fence which will be signal on TLB invalidation * completion @@ -277,8 +286,8 @@ static int send_tlb_inval(struct xe_guc *guc, * * Return: 0 on success, negative error code on error */ -static int xe_gt_tlb_inval_guc(struct xe_gt *gt, - struct xe_gt_tlb_inval_fence *fence) +static int xe_tlb_inval_guc(struct xe_gt *gt, + struct 
xe_tlb_inval_fence *fence) { u32 action[] = { XE_GUC_ACTION_TLB_INVALIDATION, @@ -299,30 +308,31 @@ static int xe_gt_tlb_inval_guc(struct xe_gt *gt, } /** - * xe_gt_tlb_inval_ggtt - Issue a TLB invalidation on this GT for the GGTT - * @gt: GT structure + * xe_tlb_inval_ggtt - Issue a TLB invalidation on this GT for the GGTT + * @tlb_inval: TLB invalidation client * * Issue a TLB invalidation for the GGTT. Completion of TLB invalidation is * synchronous. * * Return: 0 on success, negative error code on error */ -int xe_gt_tlb_inval_ggtt(struct xe_gt *gt) +int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval) { + struct xe_gt *gt = tlb_inval->private; struct xe_device *xe = gt_to_xe(gt); unsigned int fw_ref; if (xe_guc_ct_enabled(>->uc.guc.ct) && gt->uc.guc.submission_state.enabled) { - struct xe_gt_tlb_inval_fence fence; + struct xe_tlb_inval_fence fence; int ret; - xe_gt_tlb_inval_fence_init(gt, &fence, true); - ret = xe_gt_tlb_inval_guc(gt, &fence); + xe_tlb_inval_fence_init(tlb_inval, &fence, true); + ret = xe_tlb_inval_guc(gt, &fence); if (ret) return ret; - xe_gt_tlb_inval_fence_wait(&fence); + xe_tlb_inval_fence_wait(&fence); } else if (xe_device_uc_enabled(xe) && !xe_device_wedged(xe)) { struct xe_mmio *mmio = >->mmio; @@ -345,14 +355,17 @@ int xe_gt_tlb_inval_ggtt(struct xe_gt *gt) return 0; } -static int send_tlb_inval_all(struct xe_gt *gt, - struct xe_gt_tlb_inval_fence *fence) +static int send_tlb_inval_all(struct xe_tlb_inval *tlb_inval, + struct xe_tlb_inval_fence *fence) { u32 action[] = { XE_GUC_ACTION_TLB_INVALIDATION_ALL, 0, /* seqno, replaced in send_tlb_inval */ MAKE_INVAL_OP(XE_GUC_TLB_INVAL_FULL), }; + struct xe_gt *gt = tlb_inval->private; + + xe_gt_assert(gt, fence); return send_tlb_inval(>->uc.guc, fence, action, ARRAY_SIZE(action)); } @@ -360,19 +373,19 @@ static int send_tlb_inval_all(struct xe_gt *gt, /** * xe_gt_tlb_invalidation_all - Invalidate all TLBs across PF and all VFs. * @gt: the &xe_gt structure - * @fence: the &xe_gt_tlb_inval_fence to be signaled on completion + * @fence: the &xe_tlb_inval_fence to be signaled on completion * * Send a request to invalidate all TLBs across PF and all VFs. 
* * Return: 0 on success, negative error code on error */ -int xe_gt_tlb_inval_all(struct xe_gt *gt, struct xe_gt_tlb_inval_fence *fence) +int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval, + struct xe_tlb_inval_fence *fence) { + struct xe_gt *gt = tlb_inval->private; int err; - xe_gt_assert(gt, gt == fence->gt); - - err = send_tlb_inval_all(gt, fence); + err = send_tlb_inval_all(tlb_inval, fence); if (err) xe_gt_err(gt, "TLB invalidation request failed (%pe)", ERR_PTR(err)); @@ -387,9 +400,8 @@ int xe_gt_tlb_inval_all(struct xe_gt *gt, struct xe_gt_tlb_inval_fence *fence) #define MAX_RANGE_TLB_INVALIDATION_LENGTH (rounddown_pow_of_two(ULONG_MAX)) /** - * xe_gt_tlb_inval_range - Issue a TLB invalidation on this GT for an address range - * - * @gt: GT structure + * xe_tlb_inval_range - Issue a TLB invalidation on this GT for an address range + * @tlb_inval: TLB invalidation client * @fence: invalidation fence which will be signal on TLB invalidation * completion * @start: start address @@ -402,9 +414,11 @@ int xe_gt_tlb_inval_all(struct xe_gt *gt, struct xe_gt_tlb_inval_fence *fence) * * Return: Negative error code on error, 0 on success */ -int xe_gt_tlb_inval_range(struct xe_gt *gt, struct xe_gt_tlb_inval_fence *fence, - u64 start, u64 end, u32 asid) +int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval, + struct xe_tlb_inval_fence *fence, u64 start, u64 end, + u32 asid) { + struct xe_gt *gt = tlb_inval->private; struct xe_device *xe = gt_to_xe(gt); #define MAX_TLB_INVALIDATION_LEN 7 u32 action[MAX_TLB_INVALIDATION_LEN]; @@ -474,38 +488,38 @@ int xe_gt_tlb_inval_range(struct xe_gt *gt, struct xe_gt_tlb_inval_fence *fence, } /** - * xe_gt_tlb_inval_vm - Issue a TLB invalidation on this GT for a VM - * @gt: graphics tile + * xe_tlb_inval_vm - Issue a TLB invalidation on this GT for a VM + * @tlb_inval: TLB invalidation client * @vm: VM to invalidate * * Invalidate entire VM's address space */ -void xe_gt_tlb_inval_vm(struct xe_gt *gt, struct xe_vm *vm) +void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm) { - struct xe_gt_tlb_inval_fence fence; + struct xe_tlb_inval_fence fence; u64 range = 1ull << vm->xe->info.va_bits; int ret; - xe_gt_tlb_inval_fence_init(gt, &fence, true); + xe_tlb_inval_fence_init(tlb_inval, &fence, true); - ret = xe_gt_tlb_inval_range(gt, &fence, 0, range, vm->usm.asid); + ret = xe_tlb_inval_range(tlb_inval, &fence, 0, range, vm->usm.asid); if (ret < 0) return; - xe_gt_tlb_inval_fence_wait(&fence); + xe_tlb_inval_fence_wait(&fence); } /** - * xe_gt_tlb_inval_done_handler - GT TLB invalidation done handler + * xe_tlb_inval_done_handler - TLB invalidation done handler * @gt: gt * @seqno: seqno of invalidation that is done * - * Update recv seqno, signal any GT TLB invalidation fences, and restart TDR + * Update recv seqno, signal any TLB invalidation fences, and restart TDR */ -static void xe_gt_tlb_inval_done_handler(struct xe_gt *gt, int seqno) +static void xe_tlb_inval_done_handler(struct xe_gt *gt, int seqno) { struct xe_device *xe = gt_to_xe(gt); - struct xe_gt_tlb_inval_fence *fence, *next; + struct xe_tlb_inval_fence *fence, *next; unsigned long flags; /* @@ -533,7 +547,7 @@ static void xe_gt_tlb_inval_done_handler(struct xe_gt *gt, int seqno) list_for_each_entry_safe(fence, next, >->tlb_inval.pending_fences, link) { - trace_xe_gt_tlb_inval_fence_recv(xe, fence); + trace_xe_tlb_inval_fence_recv(xe, fence); if (!tlb_inval_seqno_past(gt, fence->seqno)) break; @@ -570,7 +584,7 @@ int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 
len) if (unlikely(len != 1)) return -EPROTO; - xe_gt_tlb_inval_done_handler(gt, msg[0]); + xe_tlb_inval_done_handler(gt, msg[0]); return 0; } @@ -593,19 +607,21 @@ static const struct dma_fence_ops inval_fence_ops = { }; /** - * xe_gt_tlb_inval_fence_init - Initialize TLB invalidation fence - * @gt: GT + * xe_tlb_inval_fence_init - Initialize TLB invalidation fence + * @tlb_inval: TLB invalidation client * @fence: TLB invalidation fence to initialize * @stack: fence is stack variable * - * Initialize TLB invalidation fence for use. xe_gt_tlb_inval_fence_fini + * Initialize TLB invalidation fence for use. xe_tlb_inval_fence_fini * will be automatically called when fence is signalled (all fences must signal), * even on error. */ -void xe_gt_tlb_inval_fence_init(struct xe_gt *gt, - struct xe_gt_tlb_inval_fence *fence, - bool stack) +void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval, + struct xe_tlb_inval_fence *fence, + bool stack) { + struct xe_gt *gt = tlb_inval->private; + xe_pm_runtime_get_noresume(gt_to_xe(gt)); spin_lock_irq(>->tlb_inval.lock); @@ -618,5 +634,5 @@ void xe_gt_tlb_inval_fence_init(struct xe_gt *gt, set_bit(FENCE_STACK_BIT, &fence->base.flags); else dma_fence_get(&fence->base); - fence->gt = gt; + fence->tlb_inval = tlb_inval; } diff --git a/drivers/gpu/drm/xe/xe_tlb_inval.h b/drivers/gpu/drm/xe/xe_tlb_inval.h new file mode 100644 index 000000000000..8c8d511a962d --- /dev/null +++ b/drivers/gpu/drm/xe/xe_tlb_inval.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2025 Intel Corporation + */ + +#ifndef _XE_TLB_INVAL_H_ +#define _XE_TLB_INVAL_H_ + +#include + +#include "xe_tlb_inval_types.h" + +struct xe_gt; +struct xe_guc; +struct xe_vm; + +int xe_gt_tlb_inval_init_early(struct xe_gt *gt); + +void xe_tlb_inval_reset(struct xe_tlb_inval *tlb_inval); +int xe_tlb_inval_ggtt(struct xe_tlb_inval *tlb_inval); +void xe_tlb_inval_vm(struct xe_tlb_inval *tlb_inval, struct xe_vm *vm); +int xe_tlb_inval_all(struct xe_tlb_inval *tlb_inval, + struct xe_tlb_inval_fence *fence); +int xe_tlb_inval_range(struct xe_tlb_inval *tlb_inval, + struct xe_tlb_inval_fence *fence, + u64 start, u64 end, u32 asid); +int xe_guc_tlb_inval_done_handler(struct xe_guc *guc, u32 *msg, u32 len); + +void xe_tlb_inval_fence_init(struct xe_tlb_inval *tlb_inval, + struct xe_tlb_inval_fence *fence, + bool stack); +void xe_tlb_inval_fence_signal(struct xe_tlb_inval_fence *fence); + +static inline void +xe_tlb_inval_fence_wait(struct xe_tlb_inval_fence *fence) +{ + dma_fence_wait(&fence->base, false); +} + +#endif /* _XE_TLB_INVAL_ */ diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_inval_job.c b/drivers/gpu/drm/xe/xe_tlb_inval_job.c similarity index 51% rename from drivers/gpu/drm/xe/xe_gt_tlb_inval_job.c rename to drivers/gpu/drm/xe/xe_tlb_inval_job.c index 41e0ea92ea5a..492def04a559 100644 --- a/drivers/gpu/drm/xe/xe_gt_tlb_inval_job.c +++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.c @@ -3,21 +3,22 @@ * Copyright © 2025 Intel Corporation */ +#include "xe_assert.h" #include "xe_dep_job_types.h" #include "xe_dep_scheduler.h" #include "xe_exec_queue.h" -#include "xe_gt.h" -#include "xe_gt_tlb_inval.h" -#include "xe_gt_tlb_inval_job.h" +#include "xe_gt_types.h" +#include "xe_tlb_inval.h" +#include "xe_tlb_inval_job.h" #include "xe_migrate.h" #include "xe_pm.h" -/** struct xe_gt_tlb_inval_job - GT TLB invalidation job */ -struct xe_gt_tlb_inval_job { +/** struct xe_tlb_inval_job - TLB invalidation job */ +struct xe_tlb_inval_job { /** @dep: base generic dependency Xe job */ struct xe_dep_job 
dep; - /** @gt: GT to invalidate */ - struct xe_gt *gt; + /** @tlb_inval: TLB invalidation client */ + struct xe_tlb_inval *tlb_inval; /** @q: exec queue issuing the invalidate */ struct xe_exec_queue *q; /** @refcount: ref count of this job */ @@ -37,63 +38,56 @@ struct xe_gt_tlb_inval_job { bool fence_armed; }; -static struct dma_fence *xe_gt_tlb_inval_job_run(struct xe_dep_job *dep_job) +static struct dma_fence *xe_tlb_inval_job_run(struct xe_dep_job *dep_job) { - struct xe_gt_tlb_inval_job *job = + struct xe_tlb_inval_job *job = container_of(dep_job, typeof(*job), dep); - struct xe_gt_tlb_inval_fence *ifence = + struct xe_tlb_inval_fence *ifence = container_of(job->fence, typeof(*ifence), base); - xe_gt_tlb_inval_range(job->gt, ifence, job->start, - job->end, job->asid); + xe_tlb_inval_range(job->tlb_inval, ifence, job->start, + job->end, job->asid); return job->fence; } -static void xe_gt_tlb_inval_job_free(struct xe_dep_job *dep_job) +static void xe_tlb_inval_job_free(struct xe_dep_job *dep_job) { - struct xe_gt_tlb_inval_job *job = + struct xe_tlb_inval_job *job = container_of(dep_job, typeof(*job), dep); - /* Pairs with get in xe_gt_tlb_inval_job_push */ - xe_gt_tlb_inval_job_put(job); + /* Pairs with get in xe_tlb_inval_job_push */ + xe_tlb_inval_job_put(job); } static const struct xe_dep_job_ops dep_job_ops = { - .run_job = xe_gt_tlb_inval_job_run, - .free_job = xe_gt_tlb_inval_job_free, + .run_job = xe_tlb_inval_job_run, + .free_job = xe_tlb_inval_job_free, }; -static int xe_gt_tlb_inval_context(struct xe_gt *gt) -{ - return xe_gt_is_media_type(gt) ? XE_EXEC_QUEUE_TLB_INVAL_MEDIA_GT : - XE_EXEC_QUEUE_TLB_INVAL_PRIMARY_GT; -} - /** - * xe_gt_tlb_inval_job_create() - GT TLB invalidation job create - * @gt: GT to invalidate + * xe_tlb_inval_job_create() - TLB invalidation job create * @q: exec queue issuing the invalidate + * @tlb_inval: TLB invalidation client + * @dep_scheduler: Dependency scheduler for job * @start: Start address to invalidate * @end: End address to invalidate * @asid: Address space ID to invalidate * - * Create a GT TLB invalidation job and initialize internal fields. The caller is + * Create a TLB invalidation job and initialize internal fields. The caller is * responsible for releasing the creation reference. 
* - * Return: GT TLB invalidation job object on success, ERR_PTR failure + * Return: TLB invalidation job object on success, ERR_PTR failure */ -struct xe_gt_tlb_inval_job *xe_gt_tlb_inval_job_create(struct xe_exec_queue *q, - struct xe_gt *gt, - u64 start, u64 end, - u32 asid) +struct xe_tlb_inval_job * +xe_tlb_inval_job_create(struct xe_exec_queue *q, struct xe_tlb_inval *tlb_inval, + struct xe_dep_scheduler *dep_scheduler, u64 start, + u64 end, u32 asid) { - struct xe_gt_tlb_inval_job *job; - struct xe_dep_scheduler *dep_scheduler = - q->tlb_inval[xe_gt_tlb_inval_context(gt)].dep_scheduler; + struct xe_tlb_inval_job *job; struct drm_sched_entity *entity = xe_dep_scheduler_entity(dep_scheduler); - struct xe_gt_tlb_inval_fence *ifence; + struct xe_tlb_inval_fence *ifence; int err; job = kmalloc(sizeof(*job), GFP_KERNEL); @@ -101,14 +95,14 @@ struct xe_gt_tlb_inval_job *xe_gt_tlb_inval_job_create(struct xe_exec_queue *q, return ERR_PTR(-ENOMEM); job->q = q; - job->gt = gt; + job->tlb_inval = tlb_inval; job->start = start; job->end = end; job->asid = asid; job->fence_armed = false; job->dep.ops = &dep_job_ops; kref_init(&job->refcount); - xe_exec_queue_get(q); /* Pairs with put in xe_gt_tlb_inval_job_destroy */ + xe_exec_queue_get(q); /* Pairs with put in xe_tlb_inval_job_destroy */ ifence = kmalloc(sizeof(*ifence), GFP_KERNEL); if (!ifence) { @@ -122,8 +116,8 @@ struct xe_gt_tlb_inval_job *xe_gt_tlb_inval_job_create(struct xe_exec_queue *q, if (err) goto err_fence; - /* Pairs with put in xe_gt_tlb_inval_job_destroy */ - xe_pm_runtime_get_noresume(gt_to_xe(job->gt)); + /* Pairs with put in xe_tlb_inval_job_destroy */ + xe_pm_runtime_get_noresume(gt_to_xe(q->gt)); return job; @@ -136,40 +130,40 @@ struct xe_gt_tlb_inval_job *xe_gt_tlb_inval_job_create(struct xe_exec_queue *q, return ERR_PTR(err); } -static void xe_gt_tlb_inval_job_destroy(struct kref *ref) +static void xe_tlb_inval_job_destroy(struct kref *ref) { - struct xe_gt_tlb_inval_job *job = container_of(ref, typeof(*job), - refcount); - struct xe_gt_tlb_inval_fence *ifence = + struct xe_tlb_inval_job *job = container_of(ref, typeof(*job), + refcount); + struct xe_tlb_inval_fence *ifence = container_of(job->fence, typeof(*ifence), base); - struct xe_device *xe = gt_to_xe(job->gt); struct xe_exec_queue *q = job->q; + struct xe_device *xe = gt_to_xe(q->gt); if (!job->fence_armed) kfree(ifence); else - /* Ref from xe_gt_tlb_inval_fence_init */ + /* Ref from xe_tlb_inval_fence_init */ dma_fence_put(job->fence); drm_sched_job_cleanup(&job->dep.drm); kfree(job); - xe_exec_queue_put(q); /* Pairs with get from xe_gt_tlb_inval_job_create */ - xe_pm_runtime_put(xe); /* Pairs with get from xe_gt_tlb_inval_job_create */ + xe_exec_queue_put(q); /* Pairs with get from xe_tlb_inval_job_create */ + xe_pm_runtime_put(xe); /* Pairs with get from xe_tlb_inval_job_create */ } /** - * xe_gt_tlb_inval_alloc_dep() - GT TLB invalidation job alloc dependency - * @job: GT TLB invalidation job to alloc dependency for + * xe_tlb_inval_alloc_dep() - TLB invalidation job alloc dependency + * @job: TLB invalidation job to alloc dependency for * - * Allocate storage for a dependency in the GT TLB invalidation fence. This + * Allocate storage for a dependency in the TLB invalidation fence. This * function should be called at most once per job and must be paired with - * xe_gt_tlb_inval_job_push being called with a real fence. + * xe_tlb_inval_job_push being called with a real fence. 
* * Return: 0 on success, -errno on failure */ -int xe_gt_tlb_inval_job_alloc_dep(struct xe_gt_tlb_inval_job *job) +int xe_tlb_inval_job_alloc_dep(struct xe_tlb_inval_job *job) { - xe_assert(gt_to_xe(job->gt), !xa_load(&job->dep.drm.dependencies, 0)); + xe_assert(gt_to_xe(job->q->gt), !xa_load(&job->dep.drm.dependencies, 0)); might_alloc(GFP_KERNEL); return drm_sched_job_add_dependency(&job->dep.drm, @@ -177,24 +171,24 @@ int xe_gt_tlb_inval_job_alloc_dep(struct xe_gt_tlb_inval_job *job) } /** - * xe_gt_tlb_inval_job_push() - GT TLB invalidation job push - * @job: GT TLB invalidation job to push + * xe_tlb_inval_job_push() - TLB invalidation job push + * @job: TLB invalidation job to push * @m: The migration object being used - * @fence: Dependency for GT TLB invalidation job + * @fence: Dependency for TLB invalidation job * - * Pushes a GT TLB invalidation job for execution, using @fence as a dependency. - * Storage for @fence must be preallocated with xe_gt_tlb_inval_job_alloc_dep + * Pushes a TLB invalidation job for execution, using @fence as a dependency. + * Storage for @fence must be preallocated with xe_tlb_inval_job_alloc_dep * prior to this call if @fence is not signaled. Takes a reference to the job’s * finished fence, which the caller is responsible for releasing, and return it * to the caller. This function is safe to be called in the path of reclaim. * * Return: Job's finished fence on success, cannot fail */ -struct dma_fence *xe_gt_tlb_inval_job_push(struct xe_gt_tlb_inval_job *job, - struct xe_migrate *m, - struct dma_fence *fence) +struct dma_fence *xe_tlb_inval_job_push(struct xe_tlb_inval_job *job, + struct xe_migrate *m, + struct dma_fence *fence) { - struct xe_gt_tlb_inval_fence *ifence = + struct xe_tlb_inval_fence *ifence = container_of(job->fence, typeof(*ifence), base); if (!dma_fence_is_signaled(fence)) { @@ -202,20 +196,20 @@ struct dma_fence *xe_gt_tlb_inval_job_push(struct xe_gt_tlb_inval_job *job, /* * Can be in path of reclaim, hence the preallocation of fence - * storage in xe_gt_tlb_inval_job_alloc_dep. Verify caller did + * storage in xe_tlb_inval_job_alloc_dep. Verify caller did * this correctly. */ - xe_assert(gt_to_xe(job->gt), + xe_assert(gt_to_xe(job->q->gt), xa_load(&job->dep.drm.dependencies, 0) == dma_fence_get_stub()); dma_fence_get(fence); /* ref released once dependency processed by scheduler */ ptr = xa_store(&job->dep.drm.dependencies, 0, fence, GFP_ATOMIC); - xe_assert(gt_to_xe(job->gt), !xa_is_err(ptr)); + xe_assert(gt_to_xe(job->q->gt), !xa_is_err(ptr)); } - xe_gt_tlb_inval_job_get(job); /* Pairs with put in free_job */ + xe_tlb_inval_job_get(job); /* Pairs with put in free_job */ job->fence_armed = true; /* @@ -225,8 +219,8 @@ struct dma_fence *xe_gt_tlb_inval_job_push(struct xe_gt_tlb_inval_job *job, */ xe_migrate_job_lock(m, job->q); - /* Creation ref pairs with put in xe_gt_tlb_inval_job_destroy */ - xe_gt_tlb_inval_fence_init(job->gt, ifence, false); + /* Creation ref pairs with put in xe_tlb_inval_job_destroy */ + xe_tlb_inval_fence_init(job->tlb_inval, ifence, false); dma_fence_get(job->fence); /* Pairs with put in DRM scheduler */ drm_sched_job_arm(&job->dep.drm); @@ -241,7 +235,7 @@ struct dma_fence *xe_gt_tlb_inval_job_push(struct xe_gt_tlb_inval_job *job, /* * Not using job->fence, as it has its own dma-fence context, which does - * not allow GT TLB invalidation fences on the same queue, GT tuple to + * not allow TLB invalidation fences on the same queue, GT tuple to * be squashed in dma-resv/DRM scheduler. 
Instead, we use the DRM scheduler * context and job's finished fence, which enables squashing. */ @@ -249,26 +243,26 @@ struct dma_fence *xe_gt_tlb_inval_job_push(struct xe_gt_tlb_inval_job *job, } /** - * xe_gt_tlb_inval_job_get() - Get a reference to GT TLB invalidation job - * @job: GT TLB invalidation job object + * xe_tlb_inval_job_get() - Get a reference to TLB invalidation job + * @job: TLB invalidation job object * - * Increment the GT TLB invalidation job's reference count + * Increment the TLB invalidation job's reference count */ -void xe_gt_tlb_inval_job_get(struct xe_gt_tlb_inval_job *job) +void xe_tlb_inval_job_get(struct xe_tlb_inval_job *job) { kref_get(&job->refcount); } /** - * xe_gt_tlb_inval_job_put() - Put a reference to GT TLB invalidation job - * @job: GT TLB invalidation job object + * xe_tlb_inval_job_put() - Put a reference to TLB invalidation job + * @job: TLB invalidation job object * - * Decrement the GT TLB invalidation job's reference count, call - * xe_gt_tlb_inval_job_destroy when reference count == 0. Skips decrement if + * Decrement the TLB invalidation job's reference count, call + * xe_tlb_inval_job_destroy when reference count == 0. Skips decrement if * input @job is NULL or IS_ERR. */ -void xe_gt_tlb_inval_job_put(struct xe_gt_tlb_inval_job *job) +void xe_tlb_inval_job_put(struct xe_tlb_inval_job *job) { if (!IS_ERR_OR_NULL(job)) - kref_put(&job->refcount, xe_gt_tlb_inval_job_destroy); + kref_put(&job->refcount, xe_tlb_inval_job_destroy); } diff --git a/drivers/gpu/drm/xe/xe_tlb_inval_job.h b/drivers/gpu/drm/xe/xe_tlb_inval_job.h new file mode 100644 index 000000000000..e63edcb26b50 --- /dev/null +++ b/drivers/gpu/drm/xe/xe_tlb_inval_job.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2025 Intel Corporation + */ + +#ifndef _XE_TLB_INVAL_JOB_H_ +#define _XE_TLB_INVAL_JOB_H_ + +#include + +struct dma_fence; +struct xe_dep_scheduler; +struct xe_exec_queue; +struct xe_tlb_inval; +struct xe_tlb_inval_job; +struct xe_migrate; + +struct xe_tlb_inval_job * +xe_tlb_inval_job_create(struct xe_exec_queue *q, struct xe_tlb_inval *tlb_inval, + struct xe_dep_scheduler *dep_scheduler, + u64 start, u64 end, u32 asid); + +int xe_tlb_inval_job_alloc_dep(struct xe_tlb_inval_job *job); + +struct dma_fence *xe_tlb_inval_job_push(struct xe_tlb_inval_job *job, + struct xe_migrate *m, + struct dma_fence *fence); + +void xe_tlb_inval_job_get(struct xe_tlb_inval_job *job); + +void xe_tlb_inval_job_put(struct xe_tlb_inval_job *job); + +#endif diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_inval_types.h b/drivers/gpu/drm/xe/xe_tlb_inval_types.h similarity index 56% rename from drivers/gpu/drm/xe/xe_gt_tlb_inval_types.h rename to drivers/gpu/drm/xe/xe_tlb_inval_types.h index 442f72b78ccf..6d14b9f17b91 100644 --- a/drivers/gpu/drm/xe/xe_gt_tlb_inval_types.h +++ b/drivers/gpu/drm/xe/xe_tlb_inval_types.h @@ -3,58 +3,57 @@ * Copyright © 2023 Intel Corporation */ -#ifndef _XE_GT_TLB_INVAL_TYPES_H_ -#define _XE_GT_TLB_INVAL_TYPES_H_ +#ifndef _XE_TLB_INVAL_TYPES_H_ +#define _XE_TLB_INVAL_TYPES_H_ #include #include -struct xe_gt; - /** struct xe_tlb_inval - TLB invalidation client */ struct xe_tlb_inval { + /** @private: Backend private pointer */ + void *private; /** @tlb_inval.seqno: TLB invalidation seqno, protected by CT lock */ #define TLB_INVALIDATION_SEQNO_MAX 0x100000 int seqno; /** @tlb_invalidation.seqno_lock: protects @tlb_invalidation.seqno */ struct mutex seqno_lock; /** - * @tlb_inval.seqno_recv: last received TLB invalidation seqno, - * protected by 
CT lock + * @seqno_recv: last received TLB invalidation seqno, protected by + * CT lock */ int seqno_recv; /** - * @tlb_inval.pending_fences: list of pending fences waiting TLB - * invaliations, protected by CT lock + * @pending_fences: list of pending fences waiting TLB invaliations, + * protected CT lock */ struct list_head pending_fences; /** - * @tlb_inval.pending_lock: protects @tlb_inval.pending_fences - * and updating @tlb_inval.seqno_recv. + * @pending_lock: protects @pending_fences and updating @seqno_recv. */ spinlock_t pending_lock; /** - * @tlb_inval.fence_tdr: schedules a delayed call to - * xe_gt_tlb_fence_timeout after the timeut interval is over. + * @fence_tdr: schedules a delayed call to xe_tlb_fence_timeout after + * the timeout interval is over. */ struct delayed_work fence_tdr; - /** @wtlb_invalidation.wq: schedules GT TLB invalidation jobs */ + /** @job_wq: schedules TLB invalidation jobs */ struct workqueue_struct *job_wq; /** @tlb_inval.lock: protects TLB invalidation fences */ spinlock_t lock; }; /** - * struct xe_gt_tlb_inval_fence - XE GT TLB invalidation fence + * struct xe_tlb_inval_fence - TLB invalidation fence * - * Optionally passed to xe_gt_tlb_inval and will be signaled upon TLB + * Optionally passed to xe_tlb_inval* functions and will be signaled upon TLB * invalidation completion. */ -struct xe_gt_tlb_inval_fence { +struct xe_tlb_inval_fence { /** @base: dma fence base */ struct dma_fence base; - /** @gt: GT which fence belong to */ - struct xe_gt *gt; + /** @tlb_inval: TLB invalidation client which fence belong to */ + struct xe_tlb_inval *tlb_inval; /** @link: link into list of pending tlb fences */ struct list_head link; /** @seqno: seqno of TLB invalidation to signal fence one */ diff --git a/drivers/gpu/drm/xe/xe_trace.h b/drivers/gpu/drm/xe/xe_trace.h index 36538f50d06f..314f42fcbcbd 100644 --- a/drivers/gpu/drm/xe/xe_trace.h +++ b/drivers/gpu/drm/xe/xe_trace.h @@ -14,10 +14,10 @@ #include "xe_exec_queue_types.h" #include "xe_gpu_scheduler_types.h" -#include "xe_gt_tlb_inval_types.h" #include "xe_gt_types.h" #include "xe_guc_exec_queue_types.h" #include "xe_sched_job.h" +#include "xe_tlb_inval_types.h" #include "xe_vm.h" #define __dev_name_xe(xe) dev_name((xe)->drm.dev) @@ -25,13 +25,13 @@ #define __dev_name_gt(gt) __dev_name_xe(gt_to_xe((gt))) #define __dev_name_eq(q) __dev_name_gt((q)->gt) -DECLARE_EVENT_CLASS(xe_gt_tlb_inval_fence, - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_inval_fence *fence), +DECLARE_EVENT_CLASS(xe_tlb_inval_fence, + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), TP_ARGS(xe, fence), TP_STRUCT__entry( __string(dev, __dev_name_xe(xe)) - __field(struct xe_gt_tlb_inval_fence *, fence) + __field(struct xe_tlb_inval_fence *, fence) __field(int, seqno) ), @@ -45,23 +45,23 @@ DECLARE_EVENT_CLASS(xe_gt_tlb_inval_fence, __get_str(dev), __entry->fence, __entry->seqno) ); -DEFINE_EVENT(xe_gt_tlb_inval_fence, xe_gt_tlb_inval_fence_send, - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_inval_fence *fence), +DEFINE_EVENT(xe_tlb_inval_fence, xe_tlb_inval_fence_send, + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), TP_ARGS(xe, fence) ); -DEFINE_EVENT(xe_gt_tlb_inval_fence, xe_gt_tlb_inval_fence_recv, - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_inval_fence *fence), +DEFINE_EVENT(xe_tlb_inval_fence, xe_tlb_inval_fence_recv, + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), TP_ARGS(xe, fence) ); -DEFINE_EVENT(xe_gt_tlb_inval_fence, xe_gt_tlb_inval_fence_signal, - 
TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_inval_fence *fence), +DEFINE_EVENT(xe_tlb_inval_fence, xe_tlb_inval_fence_signal, + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), TP_ARGS(xe, fence) ); -DEFINE_EVENT(xe_gt_tlb_inval_fence, xe_gt_tlb_inval_fence_timeout, - TP_PROTO(struct xe_device *xe, struct xe_gt_tlb_inval_fence *fence), +DEFINE_EVENT(xe_tlb_inval_fence, xe_tlb_inval_fence_timeout, + TP_PROTO(struct xe_device *xe, struct xe_tlb_inval_fence *fence), TP_ARGS(xe, fence) ); diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index d220d04721da..76402db022cc 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -28,7 +28,6 @@ #include "xe_drm_client.h" #include "xe_exec_queue.h" #include "xe_gt_pagefault.h" -#include "xe_gt_tlb_inval.h" #include "xe_migrate.h" #include "xe_pat.h" #include "xe_pm.h" @@ -38,6 +37,7 @@ #include "xe_res_cursor.h" #include "xe_svm.h" #include "xe_sync.h" +#include "xe_tlb_inval.h" #include "xe_trace_bo.h" #include "xe_wa.h" #include "xe_hmm.h" @@ -1892,7 +1892,7 @@ static void xe_vm_close(struct xe_vm *vm) xe_pt_clear(xe, vm->pt_root[id]); for_each_gt(gt, xe, id) - xe_gt_tlb_inval_vm(gt, vm); + xe_tlb_inval_vm(>->tlb_inval, vm); } } @@ -3878,7 +3878,7 @@ void xe_vm_unlock(struct xe_vm *vm) int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start, u64 end, u8 tile_mask) { - struct xe_gt_tlb_inval_fence + struct xe_tlb_inval_fence fence[XE_MAX_TILES_PER_DEVICE * XE_MAX_GT_PER_TILE]; struct xe_tile *tile; u32 fence_id = 0; @@ -3892,11 +3892,12 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start, if (!(tile_mask & BIT(id))) continue; - xe_gt_tlb_inval_fence_init(tile->primary_gt, - &fence[fence_id], true); + xe_tlb_inval_fence_init(&tile->primary_gt->tlb_inval, + &fence[fence_id], true); - err = xe_gt_tlb_inval_range(tile->primary_gt, &fence[fence_id], - start, end, vm->usm.asid); + err = xe_tlb_inval_range(&tile->primary_gt->tlb_inval, + &fence[fence_id], start, end, + vm->usm.asid); if (err) goto wait; ++fence_id; @@ -3904,11 +3905,12 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start, if (!tile->media_gt) continue; - xe_gt_tlb_inval_fence_init(tile->media_gt, - &fence[fence_id], true); + xe_tlb_inval_fence_init(&tile->media_gt->tlb_inval, + &fence[fence_id], true); - err = xe_gt_tlb_inval_range(tile->media_gt, &fence[fence_id], - start, end, vm->usm.asid); + err = xe_tlb_inval_range(&tile->media_gt->tlb_inval, + &fence[fence_id], start, end, + vm->usm.asid); if (err) goto wait; ++fence_id; @@ -3916,7 +3918,7 @@ int xe_vm_range_tilemask_tlb_inval(struct xe_vm *vm, u64 start, wait: for (id = 0; id < fence_id; ++id) - xe_gt_tlb_inval_fence_wait(&fence[id]); + xe_tlb_inval_fence_wait(&fence[id]); return err; } -- 2.34.1