Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Oak Zeng <oak.zeng@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: himal.prasad.ghimiray@intel.com, krishnaiah.bommu@intel.com,
	matthew.brost@intel.com, Thomas.Hellstrom@linux.intel.com,
	brian.welty@intel.com
Subject: [v2 02/31] drm/xe/svm: Add SVM document
Date: Tue,  9 Apr 2024 16:17:13 -0400	[thread overview]
Message-ID: <20240409201742.3042626-3-oak.zeng@intel.com> (raw)
In-Reply-To: <20240409201742.3042626-1-oak.zeng@intel.com>

Add shared virtual memory document.

Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
---
 Documentation/gpu/xe/index.rst  |   1 +
 Documentation/gpu/xe/xe_svm.rst |   8 +++
 drivers/gpu/drm/xe/xe_svm_doc.h | 121 ++++++++++++++++++++++++++++++++
 3 files changed, 130 insertions(+)
 create mode 100644 Documentation/gpu/xe/xe_svm.rst
 create mode 100644 drivers/gpu/drm/xe/xe_svm_doc.h

diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
index c224ecaee81e..106b60aba1f0 100644
--- a/Documentation/gpu/xe/index.rst
+++ b/Documentation/gpu/xe/index.rst
@@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver.
    xe_firmware
    xe_tile
    xe_debugging
+   xe_svm
diff --git a/Documentation/gpu/xe/xe_svm.rst b/Documentation/gpu/xe/xe_svm.rst
new file mode 100644
index 000000000000..62954ba1c6f8
--- /dev/null
+++ b/Documentation/gpu/xe/xe_svm.rst
@@ -0,0 +1,8 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+=============
+Shared virtual memory
+=============
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_svm_doc.h
+   :doc: Shared virtual memory
diff --git a/drivers/gpu/drm/xe/xe_svm_doc.h b/drivers/gpu/drm/xe/xe_svm_doc.h
new file mode 100644
index 000000000000..de38ee3585e4
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_svm_doc.h
@@ -0,0 +1,121 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef _XE_SVM_DOC_H_
+#define _XE_SVM_DOC_H_
+
+/**
+ * DOC: Shared virtual memory
+ *
+ * Shared Virtual Memory (SVM) allows the programmer to use a single virtual
+ * address space shared between threads executing on CPUs and GPUs. It abstracts
+ * away from the user the location of the backing memory, and hence simplifies
+ * the user programming model. In a non-SVM memory model, user need to explicitly
+ * decide memory placement such as device or system memory, also user need to
+ * explicitly migrate memory b/t device and system memory.
+ *
+ * Interface
+ * =========
+ *
+ * SVM makes use of default OS memory allocation and mapping interface such as
+ * malloc() and mmap(). The pointer returned from malloc() and mmap() can be
+ * directly used on both CPU and GPU program.
+ *
+ * SVM also provides API to set virtual address range based memory attributes
+ * such as preferred memory location, memory migration granularity, and memory
+ * atomic attributes etc. This is similar to Linux madvise API.
+ *
+ * Basic implementation
+ * ==============
+ *
+ * XeKMD implementation is based on Linux kernel Heterogeneous Memory Management
+ * (HMM) framework. HMM’s address space mirroring support allows sharing of the
+ * address space by duplicating sections of CPU page tables in the device page
+ * tables. This enables both CPU and GPU access a physical memory location using
+ * the same virtual address.
+ *
+ * Linux kernel also provides the ability to plugin device memory to the system
+ * (as a special ZONE_DEVICE type) and allocates struct page for each device memory
+ * page.
+ *
+ * HMM also provides a mechanism to migrate pages from host to device memory and
+ * vice versa.
+ *
+ * More information on HMM can be found here.
+ * https://www.kernel.org/doc/Documentation/vm/hmm.rst
+ *
+ * Unlike the non-SVM memory allocator (such as gem_create, vm_bind etc), there
+ * is no buffer object (BO, such as struct ttm_buffer_object, struct drm_gem_object),
+ * in our SVM implementation. We delibrately choose this implementation option
+ * to achieve page granularity memory placement, validation, eviction and migration.
+ *
+ * The SVM layer directly allocate device memory from drm buddy subsystem. The
+ * memory is organized as many blocks each of which has 2^n pages. SVM subsystem
+ * then mark the usage of each page using a simple bitmap. When all pages in a
+ * block are not used anymore, SVM return this block back to drm buddy subsystem.
+ *
+ * There are 3 events which can trigger SVM subsystem in actions:
+ *
+ * 1. A mmu notifier callback
+ *
+ * Since SVM need to mirror the program's CPU virtual address space from GPU side,
+ * when program's CPU address space changes, SVM need to make an identical change
+ * from GPU side. SVM/hmm use mmu interval notifier to achieve this. SVM register
+ * a mmu interval notifier call back function to core mm, and whenever a CPU side
+ * virtual address space is changed (i.e., when a virtual address range is unmapped
+ * from CPU calling munmap), the registered callback function will be called from
+ * core mm. SVM then mirror the CPU address space change from GPU side, i.e., unmap
+ * or invalidate the virtual address range from GPU page table.
+ *
+ * 2. A GPU page fault
+ *
+ * At the very beginning of a process's life, no virtual address of the process
+ * is mapped on GPU page table. So when GPU access any virtual address of the process
+ * a GPU page fault is triggered. SVM then decide the best memory location of the
+ * fault address (mainly from performance consideration. Some times also consider
+ * correctness requirement such as whether GPU can perform atomics operation to
+ * certain memory location), migrate memory if necessary, and map the fault address
+ * to GPU page table.
+ *
+ * 3. A CPU page fault
+ *
+ * A CPU page fault is usually managed by Linux core mm. But in a CPU and GPU
+ * mix programming environment, the backing store of a virtual address range
+ * can be in GPU's local memory which is not visible to CPU (DEVICE_PRIVATE),
+ * so CPU page fault handler need to migrate such pages to system memory for
+ * CPU to be able to access them. Such memory migration is device specific.
+ * HMM has a callback function (migrate_to_ram function of the dev_pagemap_ops)
+ * for device driver to implement.
+ *
+ *
+ * Memory hints: TBD
+ * =================
+ *
+ * Memory eviction: TBD
+ * ===============
+ *
+ * Lock design
+ * ===========
+ *
+ * https://www.kernel.org/doc/Documentation/vm/hmm.rst section "Address space mirroring
+ * implemenation and API" described the locking scheme that driver writer has to
+ * respect. There are 3 lock mechanism involved in this scheme:
+ *
+ * 1. Use mmp_read/write_lock to protect VMA, cpu page table operations.  Operation such
+ * as munmap/mmap, page table update during numa balance must hold this lock. Hmm_range_fault
+ * is a helper function provided by HMM to populate the CPU page table, so it must be called
+ * with this lock
+ *
+ * 2. Use xe_svm::mutex to protect device side page table operation. Any attempt to bind an
+ * address range to GPU, or invalidate an address range from GPU, should hold this device lock
+ *
+ * 3. In the GPU page fault handler, during device page table update, we hold a xe_svm::mutex,
+ * but we don't hold the mmap_read/write_lock. So programm's address space can change during
+ * the GPU page table update. mmu notifier seq# is used to determine whether unmap happened
+ * during during device page table update, if yes, then retry.
+ *
+ */
+
+#endif
-- 
2.26.3


  parent reply	other threads:[~2024-04-09 20:05 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-09 20:17 [v2 00/31] Basic system allocator support in xe driver Oak Zeng
2024-04-09 20:17 ` [v2 01/31] drm/xe: Refactor vm_bind Oak Zeng
2024-04-09 20:17 ` Oak Zeng [this message]
2024-04-09 20:17 ` [v2 03/31] drm/xe: Invalidate userptr VMA on page pin fault Oak Zeng
2024-04-09 20:17 ` [v2 04/31] drm/xe: Drop unused arguments from vm_bind_ioctl_ops_parse Oak Zeng
2024-04-09 20:17 ` [v2 05/31] drm/xe: Fix op->tile_mask for fault mode Oak Zeng
2024-04-09 20:17 ` [v2 06/31] drm/xe/uapi: Add DRM_XE_VM_BIND_FLAG_SYSTEM_ALLOCATOR flag Oak Zeng
2024-04-09 20:17 ` [v2 07/31] drm/xe: Create userptr if page fault occurs on system_allocator VMA Oak Zeng
2024-04-09 20:17 ` [v2 08/31] drm/xe: Add faulted userptr VMA garbage collector Oak Zeng
2024-04-09 20:17 ` [v2 09/31] drm/xe: Introduce helper to populate userptr Oak Zeng
2024-04-09 20:17 ` [v2 10/31] drm/xe: Introduce a helper to free sg table Oak Zeng
2024-04-09 20:17 ` [v2 11/31] drm/xe: Use hmm_range_fault to populate user pages Oak Zeng
2024-04-09 20:17 ` [v2 12/31] drm/xe/svm: Remap and provide memmap backing for GPU vram Oak Zeng
2024-04-10 21:09   ` Matthew Brost
2024-04-16 19:01   ` Matthew Brost
2024-04-09 20:17 ` [v2 13/31] drm/xe/svm: Introduce DRM_XE_SVM kernel config Oak Zeng
2024-04-10 21:13   ` Matthew Brost
2024-06-04 18:57     ` Zeng, Oak
2024-04-09 20:17 ` [v2 14/31] drm/xe: Introduce helper to get tile from memory region Oak Zeng
2024-04-10 21:17   ` Matthew Brost
2024-04-09 20:17 ` [v2 15/31] drm/xe: Introduce a helper to get dpa from pfn Oak Zeng
2024-04-10 21:35   ` Matthew Brost
2024-04-09 20:17 ` [v2 16/31] drm/xe/svm: Get xe memory region from page Oak Zeng
2024-04-10 21:38   ` Matthew Brost
2024-04-09 20:17 ` [v2 17/31] drm/xe: Get xe_vma from xe_userptr Oak Zeng
2024-04-10 21:42   ` Matthew Brost
2024-04-09 20:17 ` [v2 18/31] drm/xe/svm: Build userptr sg table for device pages Oak Zeng
2024-04-10 21:52   ` Matthew Brost
2024-04-09 20:17 ` [v2 19/31] drm/xe/svm: Determine a vma is backed by device memory Oak Zeng
2024-04-10 21:56   ` Matthew Brost
2024-06-05  2:29     ` Zeng, Oak
2024-04-09 20:17 ` [v2 20/31] drm/xe: add xe lock document Oak Zeng
2024-04-09 20:17 ` [v2 21/31] drm/xe/svm: Introduce svm migration function Oak Zeng
2024-04-10 22:06   ` Matthew Brost
2024-04-09 20:17 ` [v2 22/31] drm/xe/svm: implement functions to allocate and free device memory Oak Zeng
2024-04-10 22:23   ` Matthew Brost
2024-04-15 20:13     ` Zeng, Oak
2024-04-15 21:19       ` Matthew Brost
2024-06-05 22:16     ` Zeng, Oak
2024-06-05 23:37       ` Matthew Brost
2024-06-06  3:30         ` Zeng, Oak
2024-06-06  4:44           ` Matthew Brost
2024-04-17 20:55   ` Matthew Brost
2024-04-09 20:17 ` [v2 23/31] drm/xe/svm: Trace buddy block allocation and free Oak Zeng
2024-04-09 20:17 ` [v2 24/31] drm/xe/svm: Create and destroy xe svm Oak Zeng
2024-04-10 22:25   ` Matthew Brost
2024-04-09 20:17 ` [v2 25/31] drm/xe/svm: Add vm to xe_svm process Oak Zeng
2024-04-09 20:17 ` [v2 26/31] drm/xe: Make function lookup_vma public Oak Zeng
2024-04-10 22:26   ` Matthew Brost
2024-04-09 20:17 ` [v2 27/31] drm/xe/svm: Handle CPU page fault Oak Zeng
2024-04-11  2:07   ` Matthew Brost
2024-04-12 17:24     ` Zeng, Oak
2024-04-12 18:10       ` Matthew Brost
2024-04-12 18:39         ` Zeng, Oak
2024-06-07  4:44         ` Zeng, Oak
2024-06-07  4:30     ` Zeng, Oak
2024-04-09 20:17 ` [v2 28/31] drm/xe/svm: Introduce helper to migrate vma to vram Oak Zeng
2024-04-11  2:49   ` Matthew Brost
2024-04-12 21:21     ` Zeng, Oak
2024-04-15 19:40       ` Matthew Brost
2024-06-07 17:12         ` Zeng, Oak
2024-06-07 17:56           ` Matthew Brost
2024-06-07 18:10             ` Matthew Brost
2024-04-09 20:17 ` [v2 29/31] drm/xe/svm: trace svm migration Oak Zeng
2024-04-09 20:17 ` [v2 30/31] drm/xe/svm: Add a helper to determine a vma is fault userptr Oak Zeng
2024-04-11  2:50   ` Matthew Brost
2024-04-09 20:17 ` [v2 31/31] drm/xe/svm: Migration from sram to vram for system allocator Oak Zeng
2024-04-11  2:55   ` Matthew Brost
2024-06-07 17:22     ` Zeng, Oak
2024-06-07 18:18       ` Matthew Brost
2024-06-07 18:23         ` Matthew Brost
2024-04-09 20:52 ` ✗ CI.Patch_applied: failure for Basic system allocator support in xe driver Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240409201742.3042626-3-oak.zeng@intel.com \
    --to=oak.zeng@intel.com \
    --cc=Thomas.Hellstrom@linux.intel.com \
    --cc=brian.welty@intel.com \
    --cc=himal.prasad.ghimiray@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=krishnaiah.bommu@intel.com \
    --cc=matthew.brost@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox