All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Somalapuram Amaranath" <Amaranath.Somalapuram@amd.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	dri-devel@lists.freedesktop.org
Subject: [PATCH v5 05/12] drm/ttm: Provide a generic LRU walker helper
Date: Tue, 18 Jun 2024 09:18:13 +0200	[thread overview]
Message-ID: <20240618071820.130917-6-thomas.hellstrom@linux.intel.com> (raw)
In-Reply-To: <20240618071820.130917-1-thomas.hellstrom@linux.intel.com>

Provide a generic LRU walker in TTM, in the spirit of drm_gem_lru_scan()
but building on the restartable TTM LRU functionality.

The LRU walker optionally supports locking objects as part of
a ww mutex locking transaction, to mimic to some extent the
current functionality in ttm. However any -EDEADLK return
is converted to -ENOMEM, so that the driver will need to back
off and possibly retry without being able to keep the
ticket.

v3:
- Move the helper to core ttm.
- Remove the drm_exec usage from it for now, it will be
  reintroduced later in the series.
v4:
- Handle the -EALREADY case if ticketlocking.

Cc: Christian König <christian.koenig@amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 145 ++++++++++++++++++++++++++++++
 include/drm/ttm/ttm_bo.h          |  32 +++++++
 2 files changed, 177 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 0b3f4267130c..45fcaf6f8644 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -768,3 +768,148 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
 	ttm_tt_destroy(bo->bdev, ttm);
 	return ret;
 }
+
+static bool ttm_lru_walk_trylock(struct ttm_lru_walk *walk,
+				 struct ttm_buffer_object *bo,
+				 bool *needs_unlock)
+{
+	struct ttm_operation_ctx *ctx = walk->ctx;
+
+	*needs_unlock = false;
+
+	if (dma_resv_trylock(bo->base.resv)) {
+		*needs_unlock = true;
+		return true;
+	}
+
+	if (bo->base.resv == ctx->resv && ctx->allow_res_evict) {
+		dma_resv_assert_held(bo->base.resv);
+		return true;
+	}
+
+	return false;
+}
+
+static int ttm_lru_walk_ticketlock(struct ttm_lru_walk *walk,
+				   struct ttm_buffer_object *bo,
+				   bool *needs_unlock)
+{
+	struct dma_resv *resv = bo->base.resv;
+	int ret;
+
+	if (walk->ctx->interruptible)
+		ret = dma_resv_lock_interruptible(resv, walk->ticket);
+	else
+		ret = dma_resv_lock(resv, walk->ticket);
+
+	if (!ret) {
+		*needs_unlock = true;
+		/* Only a single ticketlock per loop. */
+		walk->ticket = NULL;
+	} else if (ret == -EDEADLK) {
+		/* Caller needs to exit the ww transaction. */
+		ret = -ENOSPC;
+	}
+
+	return ret;
+}
+
+static void ttm_lru_walk_unlock(struct ttm_buffer_object *bo, bool locked)
+{
+	if (locked)
+		dma_resv_unlock(bo->base.resv);
+}
+
+/**
+ * ttm_lru_walk_for_evict() - Perform a LRU list walk, with actions taken on
+ * valid items.
+ * @walk: describe the walks and actions taken
+ * @bdev: The TTM device.
+ * @man: The struct ttm_resource manager whose LRU lists we're walking.
+ * @target: The end condition for the walk.
+ *
+ * The LRU lists of @man are walk, and for each struct ttm_resource encountered,
+ * the corresponding ttm_buffer_object is locked and taken a reference on, and
+ * the LRU lock is dropped. the LRU lock may be dropped before locking and, in
+ * that case, it's verified that the item actually remains on the LRU list after
+ * the lock, and that the buffer object didn't switch resource in between.
+ *
+ * With a locked object, the actions indicated by @walk->process_bo are
+ * performed, and after that, the bo is unlocked, the refcount dropped and the
+ * next struct ttm_resource is processed. Here, the walker relies on
+ * TTM's restartable LRU list implementation.
+ *
+ * Typically @walk->process_bo() would return the number of pages evicted,
+ * swapped or shrunken, so that when the total exceeds @target, or when the
+ * LRU list has been walked in full, iteration is terminated. It's also terminated
+ * on error. Note that the definition of @target is done by the caller, it
+ * could have a different meaning than the number of pages.
+ *
+ * Note that the way dma_resv individualization is done, locking needs to be done
+ * either with the LRU lock held (trylocking only) or with a reference on the
+ * object.
+ *
+ * Return: The progress made towards target or negative error code on error.
+ */
+long ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct ttm_device *bdev,
+			    struct ttm_resource_manager *man, long target)
+{
+	struct ttm_resource_cursor cursor;
+	struct ttm_resource *res;
+	long sofar = 0;
+	long lret;
+
+	spin_lock(&bdev->lru_lock);
+	ttm_resource_manager_for_each_res(man, &cursor, res) {
+		struct ttm_buffer_object *bo = res->bo;
+		bool bo_needs_unlock = false;
+		bool bo_locked = false;
+		int mem_type;
+
+		if (!bo || bo->resource != res)
+			continue;
+
+		if (ttm_lru_walk_trylock(walk, bo, &bo_needs_unlock))
+			bo_locked = true;
+		else if ((!walk->ticket) || walk->ctx->no_wait_gpu ||
+			 walk->trylock_only)
+			continue;
+
+		if (!ttm_bo_get_unless_zero(bo)) {
+			ttm_lru_walk_unlock(bo, bo_needs_unlock);
+			continue;
+		}
+
+		mem_type = res->mem_type;
+		spin_unlock(&bdev->lru_lock);
+
+		lret = 0;
+		if (!bo_locked && walk->ticket)
+			lret = ttm_lru_walk_ticketlock(walk, bo, &bo_needs_unlock);
+
+		/*
+		 * Note that in between the release of the lru lock and the
+		 * ticketlock, the bo may have switched resource,
+		 * and also memory type, since the resource may have been
+		 * freed and allocated again with a different memory type.
+		 * In that case, just skip it.
+		 */
+		if (!lret && bo->resource == res && res->mem_type == mem_type)
+			lret = walk->ops->process_bo(walk, bo);
+
+		ttm_lru_walk_unlock(bo, bo_needs_unlock);
+		ttm_bo_put(bo);
+		if (lret == -EBUSY || lret == -EALREADY)
+			lret = 0;
+		sofar = (lret < 0) ? lret : sofar + lret;
+		if (sofar < 0 || sofar >= target)
+			goto out;
+
+		cond_resched();
+		spin_lock(&bdev->lru_lock);
+	}
+	spin_unlock(&bdev->lru_lock);
+out:
+	ttm_resource_cursor_fini(&cursor);
+	return sofar;
+}
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 6ccf96c91f3a..8b032298d66e 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -190,6 +190,38 @@ struct ttm_operation_ctx {
 	uint64_t bytes_moved;
 };
 
+struct ttm_lru_walk;
+
+/** struct ttm_lru_walk_ops - Operations for a LRU walk. */
+struct ttm_lru_walk_ops {
+	/**
+	 * process_bo - Process this bo.
+	 * @walk: struct ttm_lru_walk describing the walk.
+	 * @bo: A locked and referenced buffer object.
+	 *
+	 * Return: Negative error code on error, Number of processed pages on
+	 * success. 0 also indicates success.
+	 */
+	long (*process_bo)(struct ttm_lru_walk *walk, struct ttm_buffer_object *bo);
+};
+
+/**
+ * struct ttm_lru_walk - Structure describing a LRU walk.
+ */
+struct ttm_lru_walk {
+	/** @ops: Pointer to the ops structure. */
+	const struct ttm_lru_walk_ops *ops;
+	/** @ctx: Pointer to the struct ttm_operation_ctx. */
+	struct ttm_operation_ctx *ctx;
+	/** @ticket: The struct ww_acquire_ctx if any. */
+	struct ww_acquire_ctx *ticket;
+	/** @tryock_only: Only use trylock for locking. */
+	bool trylock_only;
+};
+
+long ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct ttm_device *bdev,
+			    struct ttm_resource_manager *man, long target);
+
 /**
  * ttm_bo_get - reference a struct ttm_buffer_object
  *
-- 
2.44.0


  parent reply	other threads:[~2024-06-18  7:18 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-18  7:18 [PATCH v5 00/12] TTM shrinker helpers and xe buffer object shrinker Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 01/12] drm/ttm: Allow TTM LRU list nodes of different types Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 02/12] drm/ttm: Slightly clean up LRU list iteration Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 03/12] drm/ttm: Use LRU hitches Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 04/12] drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist moves Thomas Hellström
2024-06-19  3:37   ` Matthew Brost
2024-06-19  8:24     ` Thomas Hellström
2024-06-19 14:44       ` Matthew Brost
2024-06-18  7:18 ` Thomas Hellström [this message]
2024-06-18 22:11   ` [PATCH v5 05/12] drm/ttm: Provide a generic LRU walker helper Matthew Brost
2024-06-19  7:31     ` Thomas Hellström
2024-06-19 15:09       ` Matthew Brost
2024-06-18  7:18 ` [PATCH v5 06/12] drm/ttm: Use the LRU walker helper for swapping Thomas Hellström
2024-06-19  4:23   ` Matthew Brost
2024-06-19  8:36     ` Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 07/12] drm/ttm: Use the LRU walker for eviction Thomas Hellström
2024-06-19 22:52   ` Matthew Brost
2024-06-24  9:06     ` Thomas Hellström
2024-06-19 23:33   ` Matthew Brost
2024-06-24  9:16     ` Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 08/12] drm/ttm: Add a virtual base class for graphics memory backup Thomas Hellström
2024-06-20 15:17   ` Matthew Brost
2024-06-24  9:26     ` Thomas Hellström
2024-06-24 15:47       ` Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 09/12] drm/ttm/pool: Provide a helper to shrink pages Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 10/12] drm/ttm: Use fault-injection to test error paths Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 11/12] drm/ttm, drm/xe: Add a shrinker for xe bos Thomas Hellström
2024-06-18  7:18 ` [PATCH v5 12/12] drm/xe: Increase the XE_PL_TT watermark Thomas Hellström
2024-06-18  7:24 ` ✓ CI.Patch_applied: success for TTM shrinker helpers and xe buffer object shrinker (rev5) Patchwork
2024-06-18  7:24 ` ✗ CI.checkpatch: warning " Patchwork
2024-06-18  7:25 ` ✓ CI.KUnit: success " Patchwork
2024-06-18  7:37 ` ✓ CI.Build: " Patchwork
2024-06-18  7:39 ` ✗ CI.Hooks: failure " Patchwork
2024-06-18  7:41 ` ✗ CI.checksparse: warning " Patchwork
2024-06-18  8:03 ` ✓ CI.BAT: success " Patchwork
2024-06-18 19:08 ` ✗ CI.FULL: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240618071820.130917-6-thomas.hellstrom@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=Amaranath.Somalapuram@amd.com \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.brost@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.