All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	himal.prasad.ghimiray@intel.com, apopple@nvidia.com,
	airlied@gmail.com, "Simona Vetter" <simona.vetter@ffwll.ch>,
	felix.kuehling@amd.com, "Matthew Brost" <matthew.brost@intel.com>,
	"Christian König" <christian.koenig@amd.com>,
	dakr@kernel.org, "Mrozek, Michal" <michal.mrozek@intel.com>,
	"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>
Subject: [RFC PATCH 09/19] drm/pagemap_util: Add a utility to assign an owner to a set of interconnected gpus
Date: Wed, 12 Mar 2025 22:04:06 +0100	[thread overview]
Message-ID: <20250312210416.3120-10-thomas.hellstrom@linux.intel.com> (raw)
In-Reply-To: <20250312210416.3120-1-thomas.hellstrom@linux.intel.com>

The hmm_range_fault() and the migration helpers currently need a common
"owner" to identify pagemaps and clients with fast interconnect.
Add a drm_pagemap utility to setup such owners by registering
drm_pagemaps, in a registry, and for each new drm_pagemap,
query which existing drm_pagemaps have fast interconnects with the new
drm_pagemap.

The "owner" scheme is limited in that it is static at drm_pagemap creation.
Ideally one would want the owner to be adjusted at run-time, but that
requires changes to hmm. If the proposed scheme becomes too limited,
we need to revisit.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/Makefile           |   3 +-
 drivers/gpu/drm/drm_pagemap_util.c | 125 +++++++++++++++++++++++++++++
 include/drm/drm_pagemap_util.h     |  55 +++++++++++++
 3 files changed, 182 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/drm_pagemap_util.c
 create mode 100644 include/drm/drm_pagemap_util.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 6e3520bff769..bd7bdf973897 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -107,7 +107,8 @@ obj-$(CONFIG_DRM_GPUVM) += drm_gpuvm.o
 
 drm_gpusvm_helper-y := \
 	drm_gpusvm.o\
-	drm_pagemap.o
+	drm_pagemap.o\
+	drm_pagemap_util.o
 obj-$(CONFIG_DRM_GPUSVM) += drm_gpusvm_helper.o
 
 obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
diff --git a/drivers/gpu/drm/drm_pagemap_util.c b/drivers/gpu/drm/drm_pagemap_util.c
new file mode 100644
index 000000000000..ae8f78cde4a7
--- /dev/null
+++ b/drivers/gpu/drm/drm_pagemap_util.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0-only OR MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <linux/slab.h>
+
+#include <drm/drm_pagemap.h>
+#include <drm/drm_pagemap_util.h>
+
+/**
+ * struct drm_pagemap_owner - Device interconnect group
+ * @kref: Reference count.
+ *
+ * A struct drm_pagemap_owner identifies a device interconnect group.
+ */
+struct drm_pagemap_owner {
+	struct kref kref;
+};
+
+static void drm_pagemap_owner_release(struct kref *kref)
+{
+	kfree(container_of(kref, struct drm_pagemap_owner, kref));
+}
+
+/**
+ * drm_pagemap_release_owner() - Stop participating in an interconnect group
+ * @peer: Pointer to the struct drm_pagemap_peer used when joining the group
+ *
+ * Stop participating in an interconnect group. This function is typically
+ * called when a pagemap is removed to indicate that it doesn't need to
+ * be taken into account.
+ */
+void drm_pagemap_release_owner(struct drm_pagemap_peer *peer)
+{
+	struct drm_pagemap_owner_list *owner_list = peer->list;
+
+	if (!owner_list)
+		return;
+
+	mutex_lock(&owner_list->lock);
+	list_del(&peer->link);
+	kref_put(&peer->owner->kref, drm_pagemap_owner_release);
+	peer->owner = NULL;
+	mutex_unlock(&owner_list->lock);
+}
+EXPORT_SYMBOL(drm_pagemap_release_owner);
+
+/**
+ * typedef interconnect_fn - Callback function to identify fast interconnects
+ * @peer1: First endpoint.
+ * @peer2: Second endpont.
+ *
+ * The function returns %true iff @peer1 and @peer2 have a fast interconnect.
+ * Note that this is symmetrical. The function has no notion of client and provider,
+ * which may not be sufficient in some cases. However, since the callback is intended
+ * to guide in providing common pagemap owners, the notion of a common owner to
+ * indicate fast interconnects would then have to change as well.
+ *
+ * Return: %true iff @peer1 and @peer2 have a fast interconnect. Otherwise @false.
+ */
+typedef bool (*interconnect_fn)(struct drm_pagemap_peer *peer1, struct drm_pagemap_peer *peer2);
+
+/**
+ * drm_pagemap_acquire_owner() - Join an interconnect group
+ * @peer: A struct drm_pagemap_peer keeping track of the device interconnect
+ * @owner_list: Pointer to the owner_list, keeping track of all interconnects
+ * @has_interconnect: Callback function to determine whether two peers have a
+ * fast local interconnect.
+ *
+ * Repeatedly calls @has_interconnect for @peer and other peers on @owner_list to
+ * determine a set of peers for which @peer has a fast interconnect. That set will
+ * have common &struct drm_pagemap_owner, and upon successful return, @peer::owner
+ * will point to that struct, holding a reference, and @peer will be registered in
+ * @owner_list. If @peer doesn't have any fast interconnects to other @peers, a
+ * new unique &struct drm_pagemap_owner will be allocated for it, and that
+ * may be shared with other peers that, at a later point, are determined to have
+ * a fast interconnect with @peer.
+ *
+ * When @peer no longer participates in an interconnect group,
+ * drm_pagemap_release_owner() should be called to drop the reference on the
+ * struct drm_pagemap_owner.
+ *
+ * Return: %0 on success, negative error code on failure.
+ */
+int drm_pagemap_acquire_owner(struct drm_pagemap_peer *peer,
+			      struct drm_pagemap_owner_list *owner_list,
+			      interconnect_fn has_interconnect)
+{
+	struct drm_pagemap_peer *cur_peer;
+	struct drm_pagemap_owner *owner = NULL;
+	bool interconnect = false;
+
+	mutex_lock(&owner_list->lock);
+	might_alloc(GFP_KERNEL);
+	list_for_each_entry(cur_peer, &owner_list->peers, link) {
+		if (cur_peer->owner != owner) {
+			if (owner && interconnect)
+				break;
+			owner = cur_peer->owner;
+			interconnect = true;
+		}
+		if (interconnect && !has_interconnect(peer, cur_peer))
+			interconnect = false;
+	}
+
+	if (!interconnect) {
+		owner = kmalloc(sizeof(*owner), GFP_KERNEL);
+		if (!owner) {
+			mutex_unlock(&owner_list->lock);
+			return -ENOMEM;
+		}
+		kref_init(&owner->kref);
+		list_add_tail(&peer->link, &owner_list->peers);
+	} else {
+		kref_get(&owner->kref);
+		list_add_tail(&peer->link, &cur_peer->link);
+	}
+	peer->owner = owner;
+	peer->list = owner_list;
+	mutex_unlock(&owner_list->lock);
+
+	return 0;
+}
+EXPORT_SYMBOL(drm_pagemap_acquire_owner);
diff --git a/include/drm/drm_pagemap_util.h b/include/drm/drm_pagemap_util.h
new file mode 100644
index 000000000000..03731c79493f
--- /dev/null
+++ b/include/drm/drm_pagemap_util.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+#ifndef _DRM_PAGEMAP_UTIL_H_
+#define _DRM_PAGEMAP_UTIL_H_
+
+#include <linux/list.h>
+#include <linux/mutex.h>
+
+struct drm_pagemap;
+struct drm_pagemap_owner;
+
+/**
+ * struct drm_pagemap_peer - Structure representing a fast interconnect peer
+ * @list: Pointer to a &struct drm_pagemap_owner_list used to keep track of peers
+ * @link: List link for @list's list of peers.
+ * @owner: Pointer to a &struct drm_pagemap_owner, common for a set of peers having
+ * fast interconnects.
+ */
+struct drm_pagemap_peer {
+	struct drm_pagemap_owner_list *list;
+	struct list_head link;
+	struct drm_pagemap_owner *owner;
+};
+
+/**
+ * struct drm_pagemap_owner_list - Keeping track of peers and owners
+ * @lock: Mutex protecting the @peers list.
+ * @peer: List of peers.
+ *
+ * The owner list defines the scope where we identify peers having fast interconnects
+ * and a common owner. Typically a driver has a single global owner list to
+ * keep track of common owners for the driver's pagemaps.
+ */
+struct drm_pagemap_owner_list {
+	struct mutex lock;
+	struct list_head peers;
+};
+
+/*
+ * Convenience macro to define an owner list.
+ */
+#define DRM_PAGEMAP_OWNER_LIST_DEFINE(_name)	\
+	struct drm_pagemap_owner_list _name = {	\
+	.lock = __MUTEX_INITIALIZER(_name.lock), \
+	.peers = LIST_HEAD_INIT(_name.peers) }
+
+void drm_pagemap_release_owner(struct drm_pagemap_peer *peer);
+
+int drm_pagemap_acquire_owner(struct drm_pagemap_peer *peer,
+			      struct drm_pagemap_owner_list *owner_list,
+			      bool (*has_interconnect)(struct drm_pagemap_peer *peer1,
+						       struct drm_pagemap_peer *peer2));
+#endif
-- 
2.48.1


  parent reply	other threads:[~2025-03-12 21:05 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-12 21:03 [RFC PATCH 00/19] drm, drm/xe: Multi-device GPUSVM Thomas Hellström
2025-03-12 21:03 ` [RFC PATCH 01/19] drm/xe: Introduce CONFIG_DRM_XE_GPUSVM Thomas Hellström
2025-03-12 21:03 ` [RFC PATCH 02/19] drm/xe/svm: Fix a potential bo UAF Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 03/19] drm/gpusvm, drm/pagemap: Move migration functionality to drm_pagemap Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 04/19] drm/pagemap: Add a populate_mm op Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 05/19] drm/xe: Implement and use the drm_pagemap " Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 06/19] drm/pagemap, drm/xe: Add refcounting to struct drm_pagemap and manage lifetime Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 07/19] drm/pagemap: Get rid of the struct drm_pagemap_zdd::device_private_page_owner field Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 08/19] drm/xe/bo: Add a bo remove callback Thomas Hellström
2025-03-14 13:05   ` Thomas Hellström
2025-03-12 21:04 ` Thomas Hellström [this message]
2025-03-12 21:04 ` [RFC PATCH 10/19] drm/gpusvm, drm/xe: Move the device private owner to the drm_gpusvm_ctx Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 11/19] drm/xe: Use the drm_pagemap_util helper to get a svm pagemap owner Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 12/19] drm/xe: Make the PT code handle placement per PTE rather than per vma / range Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 13/19] drm/gpusvm: Allow mixed mappings Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 14/19] drm/xe: Add a preferred dpagemap Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 15/19] drm/pagemap/util: Add file descriptors pointing to struct drm_pagemap Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 16/19] drm/xe/migrate: Allow xe_migrate_vram() also on non-pagefault capable devices Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 17/19] drm/xe/uapi: Add the devmem_open ioctl Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 18/19] drm/xe/uapi: HAX: Add the xe_madvise_prefer_devmem IOCTL Thomas Hellström
2025-03-12 21:04 ` [RFC PATCH 19/19] drm/xe: HAX: Use pcie p2p dma to test fast interconnect Thomas Hellström
2025-03-12 21:10 ` ✓ CI.Patch_applied: success for drm, drm/xe: Multi-device GPUSVM Patchwork
2025-03-12 21:11 ` ✗ CI.checkpatch: warning " Patchwork
2025-03-12 21:12 ` ✓ CI.KUnit: success " Patchwork
2025-03-12 21:29 ` ✓ CI.Build: " Patchwork
2025-03-12 21:31 ` ✗ CI.Hooks: failure " Patchwork
2025-03-12 21:33 ` ✓ CI.checksparse: success " Patchwork
2025-03-12 22:06 ` ✗ Xe.CI.BAT: failure " Patchwork
2025-03-13 10:19 ` [RFC PATCH 00/19] " Christian König
2025-03-13 12:50   ` Thomas Hellström
2025-03-13 12:57     ` Christian König
2025-03-13 15:55       ` Thomas Hellström
2025-03-17  9:20       ` Thomas Hellström
2025-03-13 13:24 ` ✗ Xe.CI.Full: failure for " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250312210416.3120-10-thomas.hellstrom@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=airlied@gmail.com \
    --cc=apopple@nvidia.com \
    --cc=christian.koenig@amd.com \
    --cc=dakr@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=felix.kuehling@amd.com \
    --cc=himal.prasad.ghimiray@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=matthew.brost@intel.com \
    --cc=michal.mrozek@intel.com \
    --cc=simona.vetter@ffwll.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.