From: Brian Nguyen
To: intel-xe@lists.freedesktop.org
Cc: tejas.upadhyay@intel.com, matthew.brost@intel.com,
 shuicheng.lin@intel.com, stuart.summers@intel.com
Subject: [PATCH v2 06/11] drm/xe: Create page reclaim list on unbind
Date: Thu, 27 Nov 2025 07:02:07 +0800
Message-ID: <20251126230201.3782788-19-brian3.nguyen@intel.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20251126230201.3782788-13-brian3.nguyen@intel.com>
References: <20251126230201.3782788-13-brian3.nguyen@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

The page reclaim list (PRL) is preparatory work for the page reclaim
feature. The PRL is owned by pt_update_ops, and all other page reclaim
operations point back to it. Entries are generated during the unbind
page walk, which appends to the PRL as it visits leaf PTEs. The PRL is
restricted to a single 4K page, so it holds at most 512 entries.

v2:
- Removed unused function. (Shuicheng)
- Compacted warning checks, updated commit message, fixed spelling,
  etc. (Shuicheng, Matthew B)
- Fixed kernel-doc comments
- Moved PRL max-entries overflow handling out of generate_reclaim_entry
  into the caller (Shuicheng)
- Added xe_page_reclaim_list_init() for clarity. (Matthew B)
- Changed xe_guc_page_reclaim_entry to use macros for greater
  flexibility. (Matthew B)
- Added a fallback for PTEs outside the 4K, 64K, and 2M page sizes
  supported by page reclaim (Matthew B)
- Invalidated the PRL when the page walk aborts early.
- Removed page-reclaim-related variables from the TLB fence (Matthew
  Brost)
- Removed error handling on *alloc_entries failure. (Matthew B)

Signed-off-by: Brian Nguyen
Cc: Matthew Brost
Cc: Shuicheng Lin
---
 drivers/gpu/drm/xe/Makefile           |   1 +
 drivers/gpu/drm/xe/regs/xe_gtt_defs.h |   1 +
 drivers/gpu/drm/xe/xe_page_reclaim.c  |  62 ++++++++++++++
 drivers/gpu/drm/xe/xe_page_reclaim.h  |  73 +++++++++++++++++
 drivers/gpu/drm/xe/xe_pt.c            | 112 +++++++++++++++++++++++++-
 drivers/gpu/drm/xe/xe_pt_types.h      |   5 ++
 6 files changed, 253 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.c
 create mode 100644 drivers/gpu/drm/xe/xe_page_reclaim.h

diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index c9b60e19cecc..cbce647f2f01 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -95,6 +95,7 @@ xe-y += xe_bb.o \
 	xe_oa.o \
 	xe_observation.o \
 	xe_pagefault.o \
+	xe_page_reclaim.o \
 	xe_pat.o \
 	xe_pci.o \
 	xe_pcode.o \
diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
index 4389e5a76f89..4d83461e538b 100644
--- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
@@ -9,6 +9,7 @@
 #define XELPG_GGTT_PTE_PAT0	BIT_ULL(52)
 #define XELPG_GGTT_PTE_PAT1	BIT_ULL(53)
 
+#define XE_PTE_ADDR_MASK	GENMASK_ULL(51, 12)
 #define GGTT_PTE_VFID		GENMASK_ULL(11, 2)
 
 #define GUC_GGTT_TOP	0xFEE00000
diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.c b/drivers/gpu/drm/xe/xe_page_reclaim.c
new file mode 100644
index 000000000000..63facea28213
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_page_reclaim.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include
+#include
+#include
+#include
+
+#include "xe_page_reclaim.h"
+
+#include "regs/xe_gt_regs.h"
+#include "xe_assert.h"
+#include "xe_macros.h"
+
+/**
+ * xe_page_reclaim_list_invalidate() - Mark a PRL as invalid
+ * @prl: Page reclaim list to reset
+ *
+ * Clears the entries pointer and marks the list as invalid so that
+ * future users know the PRL is unusable. The entries are expected to
+ * have already been released.
+ */
+void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl)
+{
+	prl->entries = NULL;
+	prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
+}
+
+/**
+ * xe_page_reclaim_list_init() - Initialize a page reclaim list
+ * @prl: Page reclaim list to initialize
+ *
+ * Invalidates the list on initialization so it is unusable until
+ * entries are allocated.
+ */
+void xe_page_reclaim_list_init(struct xe_page_reclaim_list *prl)
+{
+	xe_page_reclaim_list_invalidate(prl);
+}
+
+/**
+ * xe_page_reclaim_list_alloc_entries() - Allocate page reclaim list entries
+ * @prl: Page reclaim list to allocate entries for
+ *
+ * Allocate one zeroed 4K page to back the PRL entries. On failure,
+ * prl->entries is left NULL.
+ *
+ * Return: 0 on success, -ENOMEM if the page allocation fails.
+ */
+int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl)
+{
+	struct page *page;
+
+	if (XE_WARN_ON(prl->entries))
+		return 0;
+
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (page) {
+		prl->entries = page_address(page);
+		prl->num_entries = 0;
+	}
+
+	return page ? 0 : -ENOMEM;
+}
diff --git a/drivers/gpu/drm/xe/xe_page_reclaim.h b/drivers/gpu/drm/xe/xe_page_reclaim.h
new file mode 100644
index 000000000000..5ccff46d1b4e
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_page_reclaim.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_PAGE_RECLAIM_H_
+#define _XE_PAGE_RECLAIM_H_
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define XE_PAGE_RECLAIM_MAX_ENTRIES	512
+#define XE_PAGE_RECLAIM_LIST_MAX_SIZE	SZ_4K
+
+struct xe_guc_page_reclaim_entry {
+	u32 dw0;
+/* valid reclaim entry bit */
+#define XE_PAGE_RECLAIM_VALID		BIT(0)
+/*
+ * order of the page size to be reclaimed, relative to 4K:
+ * page_size = 1 << (XE_PTE_SHIFT + reclamation_size)
+ */
+#define XE_PAGE_RECLAIM_SIZE		GENMASK(6, 1)
+#define XE_PAGE_RECLAIM_RSVD_0		GENMASK(11, 7)
+/* lower 20 bits of the physical address */
+#define XE_PAGE_RECLAIM_ADDR_LO		GENMASK(31, 12)
+	u32 dw1;
+/* upper 20 bits of the physical address */
+#define XE_PAGE_RECLAIM_ADDR_HI		GENMASK(19, 0)
+#define XE_PAGE_RECLAIM_RSVD_1		GENMASK(31, 20)
+} __packed;
+
+struct xe_page_reclaim_list {
+	/** @entries: array of page reclaim entries, backed by one page */
+	struct xe_guc_page_reclaim_entry *entries;
+	/** @num_entries: number of entries */
+	int num_entries;
+#define XE_PAGE_RECLAIM_INVALID_LIST	-1
+};
+
+void xe_page_reclaim_list_invalidate(struct xe_page_reclaim_list *prl);
+void xe_page_reclaim_list_init(struct xe_page_reclaim_list *prl);
+int xe_page_reclaim_list_alloc_entries(struct xe_page_reclaim_list *prl);
+
+/**
+ * xe_page_reclaim_entries_get() - Increment the reference count of page reclaim entries
+ * @entries: Pointer to the array of page reclaim entries
+ *
+ * Increments the reference count of the backing page.
+ */
+static inline void xe_page_reclaim_entries_get(struct xe_guc_page_reclaim_entry *entries)
+{
+	if (entries)
+		get_page(virt_to_page(entries));
+}
+
+/**
+ * xe_page_reclaim_entries_put() - Decrement the reference count of page reclaim entries
+ * @entries: Pointer to the array of page reclaim entries
+ *
+ * Decrements the reference count of the backing page and frees it if
+ * the count reaches zero.
+ */
+static inline void xe_page_reclaim_entries_put(struct xe_guc_page_reclaim_entry *entries)
+{
+	if (entries)
+		put_page(virt_to_page(entries));
+}
+
+#endif /* _XE_PAGE_RECLAIM_H_ */
diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index 884127b4d97d..347b111dc097 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -12,6 +12,7 @@
 #include "xe_exec_queue.h"
 #include "xe_gt.h"
 #include "xe_migrate.h"
+#include "xe_page_reclaim.h"
 #include "xe_pt_types.h"
 #include "xe_pt_walk.h"
 #include "xe_res_cursor.h"
@@ -1535,6 +1536,9 @@ struct xe_pt_stage_unbind_walk {
 	/** @modified_end: Walk range start, modified like @modified_start. */
 	u64 modified_end;
 
+	/** @prl: Backing pointer to page reclaim list in pt_update_ops */
+	struct xe_page_reclaim_list *prl;
+
 	/* Output */
 	/* @wupd: Structure to track the page-table updates we're building */
 	struct xe_walk_update wupd;
@@ -1572,6 +1576,61 @@ static bool xe_pt_check_kill(u64 addr, u64 next, unsigned int level,
 	return false;
 }
 
+/* A huge 2MB leaf lives directly in a level-1 table and has no children */
+static bool is_2m_pte(struct xe_pt *pte)
+{
+	return pte->level == 1 && !pte->base.children;
+}
+
+/* page_size = 2^(reclamation_size + XE_PTE_SHIFT) */
+#define COMPUTE_RECLAIM_ADDRESS_MASK(page_size) \
+({ \
+	BUILD_BUG_ON(!__builtin_constant_p(page_size)); \
+	ilog2(page_size) - XE_PTE_SHIFT; \
+})
+
+static void generate_reclaim_entry(struct xe_tile *tile,
+				   struct xe_page_reclaim_list *prl,
+				   u64 pte, struct xe_pt *xe_child)
+{
+	struct xe_guc_page_reclaim_entry *reclaim_entries = prl->entries;
+	u64 phys_addr = pte & XE_PTE_ADDR_MASK;
+	int num_entries = prl->num_entries;
+	u32 reclamation_size;
+
+	xe_tile_assert(tile, xe_child->level <= MAX_HUGEPTE_LEVEL);
+	xe_tile_assert(tile, reclaim_entries);
+	xe_tile_assert(tile, num_entries < XE_PAGE_RECLAIM_MAX_ENTRIES - 1);
+
+	if (num_entries == XE_PAGE_RECLAIM_INVALID_LIST)
+		return;
+
+	/*
+	 * reclamation_size indicates the size of the page to be
+	 * invalidated and flushed from the non-coherent cache.
+	 * The page size is computed as 2^(reclamation_size + XE_PTE_SHIFT) bytes.
+	 * Only 4K, 64K (level 0), and 2M pages are supported by hardware
+	 * for page reclaim.
+	 */
+	if (xe_child->level == 0 && !(pte & XE_PTE_PS64)) {
+		reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_4K);	/* reclamation_size = 0 */
+	} else if (xe_child->level == 0) {
+		reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_64K); /* reclamation_size = 4 */
+	} else if (is_2m_pte(xe_child)) {
+		reclamation_size = COMPUTE_RECLAIM_ADDRESS_MASK(SZ_2M);	/* reclamation_size = 9 */
+	} else {
+		prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
+		return;
+	}
+
+	reclaim_entries[num_entries].dw0 =
+		FIELD_PREP(XE_PAGE_RECLAIM_VALID, 1) |
+		FIELD_PREP(XE_PAGE_RECLAIM_SIZE, reclamation_size) |
+		FIELD_PREP(XE_PAGE_RECLAIM_ADDR_LO, (u32)((phys_addr >> 12) & 0xFFFFF));
+	reclaim_entries[num_entries].dw1 =
+		FIELD_PREP(XE_PAGE_RECLAIM_ADDR_HI, (u32)((phys_addr >> 32) & 0xFFFFF));
+	prl->num_entries++;
+}
+
 static int xe_pt_stage_unbind_entry(struct xe_ptw *parent, pgoff_t offset,
				    unsigned int level, u64 addr, u64 next,
				    struct xe_ptw **child,
@@ -1579,11 +1638,39 @@
				    struct xe_pt_walk *walk)
 {
 	struct xe_pt *xe_child = container_of(*child, typeof(*xe_child), base);
+	struct xe_pt_stage_unbind_walk *xe_walk =
+		container_of(walk, typeof(*xe_walk), base);
+	struct xe_device *xe = tile_to_xe(xe_walk->tile);
 
 	XE_WARN_ON(!*child);
 	XE_WARN_ON(!level);
 
+	/* Check for leaf node */
+	if (xe_walk->prl && xe_walk->prl->num_entries != XE_PAGE_RECLAIM_INVALID_LIST &&
+	    !xe_child->base.children) {
+		struct iosys_map *leaf_map = &xe_child->bo->vmap;
+		pgoff_t first = xe_pt_offset(addr, 0, walk);
+		pgoff_t count = xe_pt_num_entries(addr, next, 0, walk);
+
+		for (pgoff_t i = 0; i < count; i++) {
+			u64 pte = xe_map_rd(xe, leaf_map, (first + i) * sizeof(u64), u64);
+
+			/* Account for the NULL-terminating entry on the end (-1) */
+			if (xe_walk->prl->num_entries < XE_PAGE_RECLAIM_MAX_ENTRIES - 1) {
+				generate_reclaim_entry(xe_walk->tile, xe_walk->prl,
						       pte, xe_child);
+			} else {
+				/* Overflow, mark the list as invalid */
+				xe_walk->prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
+				break;
+			}
+		}
+	}
 
-	xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk);
+	/*
+	 * If the page walk aborts early, invalidate the PRL since PTEs may
+	 * be dropped by the abort without being recorded.
+	 */
+	if (xe_pt_check_kill(addr, next, level - 1, xe_child, action, walk) &&
+	    xe_walk->prl && level > 1 && xe_child->base.children &&
+	    xe_child->num_live != 0) {
+		xe_walk->prl->num_entries = XE_PAGE_RECLAIM_INVALID_LIST;
+	}
 
 	return 0;
 }
@@ -1654,6 +1741,8 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
 {
 	u64 start = range ? xe_svm_range_start(range) : xe_vma_start(vma);
 	u64 end = range ? xe_svm_range_end(range) : xe_vma_end(vma);
+	struct xe_vm_pgtable_update_op *pt_update_op =
+		container_of(entries, struct xe_vm_pgtable_update_op, entries[0]);
 	struct xe_pt_stage_unbind_walk xe_walk = {
 		.base = {
			.ops = &xe_pt_stage_unbind_ops,
@@ -1665,6 +1754,7 @@ static unsigned int xe_pt_stage_unbind(struct xe_tile *tile,
 		.modified_start = start,
 		.modified_end = end,
 		.wupd.entries = entries,
+		.prl = pt_update_op->prl,
 	};
 	struct xe_pt *pt = vm->pt_root[tile->id];
@@ -1897,6 +1987,7 @@ static int unbind_op_prepare(struct xe_tile *tile,
			     struct xe_vm_pgtable_update_ops *pt_update_ops,
			     struct xe_vma *vma)
 {
+	struct xe_device *xe = tile_to_xe(tile);
 	u32 current_op = pt_update_ops->current_op;
 	struct xe_vm_pgtable_update_op *pt_op = &pt_update_ops->ops[current_op];
 	int err;
@@ -1914,6 +2005,11 @@ static int unbind_op_prepare(struct xe_tile *tile,
 	pt_op->vma = vma;
 	pt_op->bind = false;
 	pt_op->rebind = false;
+	/* Maintain one PRL in pt_update_ops that all unbind ops reference */
+	if (xe->info.has_page_reclaim_hw_assist && !pt_update_ops->prl.entries)
+		xe_page_reclaim_list_alloc_entries(&pt_update_ops->prl);
+
+	pt_op->prl = pt_update_ops->prl.entries ? &pt_update_ops->prl : NULL;
 
 	err = vma_reserve_fences(tile_to_xe(tile), vma);
 	if (err)
@@ -1921,6 +2017,13 @@
 	pt_op->num_entries = xe_pt_stage_unbind(tile, xe_vma_vm(vma),
						vma, NULL, pt_op->entries);
+	/* Free the PRL if the list was declared invalid during the walk */
+	if (pt_update_ops->prl.entries &&
+	    pt_update_ops->prl.num_entries == XE_PAGE_RECLAIM_INVALID_LIST) {
+		xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
+		xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
+		pt_op->prl = NULL;
+	}
 
 	xe_vm_dbg_print_entries(tile_to_xe(tile), pt_op->entries,
				pt_op->num_entries, false);
@@ -1979,6 +2082,7 @@ static int unbind_range_prepare(struct xe_vm *vm,
 	pt_op->vma = XE_INVALID_VMA;
 	pt_op->bind = false;
 	pt_op->rebind = false;
+	pt_op->prl = NULL;
 
 	pt_op->num_entries = xe_pt_stage_unbind(tile, vm, NULL, range,
						pt_op->entries);
@@ -2096,6 +2200,7 @@ xe_pt_update_ops_init(struct xe_vm_pgtable_update_ops *pt_update_ops)
 	init_llist_head(&pt_update_ops->deferred);
 	pt_update_ops->start = ~0x0ull;
 	pt_update_ops->last = 0x0ull;
+	xe_page_reclaim_list_init(&pt_update_ops->prl);
 }
 
 /**
@@ -2518,6 +2623,11 @@ void xe_pt_update_ops_fini(struct xe_tile *tile, struct xe_vma_ops *vops)
		&vops->pt_update_ops[tile->id];
 	int i;
 
+	if (pt_update_ops->prl.entries) {
+		xe_page_reclaim_entries_put(pt_update_ops->prl.entries);
+		xe_page_reclaim_list_invalidate(&pt_update_ops->prl);
+	}
+
 	lockdep_assert_held(&vops->vm->lock);
 	xe_vm_assert_held(vops->vm);
diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
index 881f01e14db8..88fabf8e2655 100644
--- a/drivers/gpu/drm/xe/xe_pt_types.h
+++ b/drivers/gpu/drm/xe/xe_pt_types.h
@@ -8,6 +8,7 @@
 
 #include
 
+#include "xe_page_reclaim.h"
 #include "xe_pt_walk.h"
 
 struct xe_bo;
@@ -79,6 +80,8 @@ struct xe_vm_pgtable_update_op {
 	struct xe_vm_pgtable_update entries[XE_VM_MAX_LEVEL * 2 + 1];
 	/** @vma: VMA for operation, operation not valid if NULL */
 	struct xe_vma *vma;
+	/** @prl: Backing pointer to the page reclaim list of pt_update_ops */
+	struct xe_page_reclaim_list *prl;
 	/** @num_entries: number of entries for this update operation */
 	u32 num_entries;
 	/** @bind: is a bind */
@@ -95,6 +98,8 @@ struct xe_vm_pgtable_update_ops {
 	struct llist_head deferred;
 	/** @q: exec queue for PT operations */
 	struct xe_exec_queue *q;
+	/** @prl: embedded page reclaim list */
+	struct xe_page_reclaim_list prl;
 	/** @start: start address of ops */
 	u64 start;
 	/** @last: last address of ops */
-- 
2.52.0