* [PATCH 0/8] Use hmm_range_fault to populate user page
@ 2024-03-19 2:55 Oak Zeng
0 siblings, 0 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-19 2:55 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
This is an effort to unify hmmptr (system allocator) and userptr.
A helper, xe_hmm_populate_range, is created to populate user
pages using hmm_range_fault instead of get_user_pages_fast.
This helper is then used to replace some of the userptr code.
The same helper will be used later for hmmptr.
This is part of the hmmptr (system allocator) code. Since this
part can be merged separately, it is sent out for CI and review first.
It will be followed by the rest of the hmmptr code.
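As a rough sketch of the change in call pattern (illustrative only;
the locals below are hypothetical, and the real helper also builds the
sg table and records the notifier sequence number):
    /* Before: pin the pages directly. */
    ret = get_user_pages_fast(start, npages,
                              write ? FOLL_WRITE : 0, pages);
    /* After: read the CPU page tables through hmm, without pinning. */
    struct hmm_range range = {
        .notifier = &userptr->notifier,
        .start = start,
        .end = end,
        .hmm_pfns = pfns,
        .default_flags = HMM_PFN_REQ_FAULT |
                         (write ? HMM_PFN_REQ_WRITE : 0),
    };
    range.notifier_seq = mmu_interval_read_begin(&userptr->notifier);
    ret = hmm_range_fault(&range); /* returns -EBUSY if a retry is needed */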
Oak Zeng (8):
drm/xe/svm: Remap and provide memmap backing for GPU vram
drm/xe/svm: Add DRM_XE_SVM kernel config entry
drm/xe: Helper to get tile from memory region
drm/xe: Introduce a helper to get dpa from pfn
drm/xe/svm: Get xe memory region from page
drm/xe: Helper to populate a userptr or hmmptr
drm/xe: Introduce a helper to free sg table
drm/xe: Use hmm_range_fault to populate user pages
drivers/gpu/drm/xe/Kconfig | 21 ++
drivers/gpu/drm/xe/Makefile | 2 +
drivers/gpu/drm/xe/xe_device.h | 5 +
drivers/gpu/drm/xe/xe_device_types.h | 8 +
drivers/gpu/drm/xe/xe_hmm.c | 282 +++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_hmm.h | 25 +++
drivers/gpu/drm/xe/xe_mmio.c | 6 +
drivers/gpu/drm/xe/xe_svm.h | 49 +++++
drivers/gpu/drm/xe/xe_svm_devmem.c | 89 +++++++++
drivers/gpu/drm/xe/xe_tile.c | 8 +
drivers/gpu/drm/xe/xe_vm.c | 122 ++----------
11 files changed, 507 insertions(+), 110 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_hmm.c
create mode 100644 drivers/gpu/drm/xe/xe_hmm.h
create mode 100644 drivers/gpu/drm/xe/xe_svm.h
create mode 100644 drivers/gpu/drm/xe/xe_svm_devmem.c
--
2.26.3
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 0/8] Use hmm_range_fault to populate user page
@ 2024-03-20 3:24 Oak Zeng
0 siblings, 0 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:24 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
This is an effort to unify hmmptr (system allocator) and userptr.
A helper, xe_hmm_populate_range, is created to populate user
pages using hmm_range_fault instead of get_user_pages_fast.
This helper is then used to replace some of the userptr code.
The same helper will be used later for hmmptr.
This is part of the hmmptr (system allocator) code. Since this
part can be merged separately, it is sent out for CI and review first.
It will be followed by the rest of the hmmptr code.
Oak Zeng (8):
drm/xe/svm: Remap and provide memmap backing for GPU vram
drm/xe/svm: Introduce DRM_XE_SVM kernel config
drm/xe: Introduce helper to get tile from memory region
drm/xe: Introduce a helper to get dpa from pfn
drm/xe/svm: Get xe memory region from page
drm/xe: Introduce helper to populate userptr
drm/xe: Introduce a helper to free sg table
drm/xe: Use hmm_range_fault to populate user pages
drivers/gpu/drm/xe/Kconfig | 21 +++
drivers/gpu/drm/xe/Makefile | 2 +
drivers/gpu/drm/xe/xe_device.h | 9 +
drivers/gpu/drm/xe/xe_device_types.h | 8 +
drivers/gpu/drm/xe/xe_hmm.c | 250 +++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_hmm.h | 11 ++
drivers/gpu/drm/xe/xe_mmio.c | 6 +
drivers/gpu/drm/xe/xe_svm.h | 49 ++++++
drivers/gpu/drm/xe/xe_svm_devmem.c | 89 ++++++++++
drivers/gpu/drm/xe/xe_tile.c | 6 +
drivers/gpu/drm/xe/xe_vm.c | 122 ++-----------
11 files changed, 463 insertions(+), 110 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_hmm.c
create mode 100644 drivers/gpu/drm/xe/xe_hmm.h
create mode 100644 drivers/gpu/drm/xe/xe_svm.h
create mode 100644 drivers/gpu/drm/xe/xe_svm_devmem.c
--
2.26.3
^ permalink raw reply [flat|nested] 18+ messages in thread
* ✓ CI.Patch_applied: success for Use hmm_range_fault to populate user page
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
@ 2024-03-20 3:37 ` Patchwork
2024-03-20 3:37 ` ✗ CI.checkpatch: warning " Patchwork
` (9 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2024-03-20 3:37 UTC (permalink / raw)
To: Oak Zeng; +Cc: intel-xe
== Series Details ==
Series: Use hmm_range_fault to populate user page
URL : https://patchwork.freedesktop.org/series/131351/
State : success
== Summary ==
=== Applying kernel patches on branch 'drm-tip' with base: ===
Base commit: 38507920a9f9 drm-tip: 2024y-03m-19d-19h-38m-53s UTC integration manifest
=== git am output follows ===
.git/rebase-apply/patch:182: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
.git/rebase-apply/patch:251: new blank line at EOF.
+
warning: 1 line adds whitespace errors.
Applying: drm/xe/svm: Remap and provide memmap backing for GPU vram
Applying: drm/xe/svm: Introduce DRM_XE_SVM kernel config
Applying: drm/xe: Introduce helper to get tile from memory region
Applying: drm/xe: Introduce a helper to get dpa from pfn
Applying: drm/xe/svm: Get xe memory region from page
Applying: drm/xe: Introduce helper to populate userptr
Applying: drm/xe: Introduce a helper to free sg table
Applying: drm/xe: Use hmm_range_fault to populate user pages
^ permalink raw reply [flat|nested] 18+ messages in thread
* ✗ CI.checkpatch: warning for Use hmm_range_fault to populate user page
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
2024-03-20 3:37 ` ✓ CI.Patch_applied: success for " Patchwork
@ 2024-03-20 3:37 ` Patchwork
2024-03-20 3:38 ` ✗ CI.KUnit: failure " Patchwork
` (8 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2024-03-20 3:37 UTC (permalink / raw)
To: Oak Zeng; +Cc: intel-xe
== Series Details ==
Series: Use hmm_range_fault to populate user page
URL : https://patchwork.freedesktop.org/series/131351/
State : warning
== Summary ==
+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
a9eb1ac8298ef9f9146567c29fa762d8e9efa1ef
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit 142f5506f801a64b671f7aec0d29fbd3c61661e9
Author: Oak Zeng <oak.zeng@intel.com>
Date: Tue Mar 19 23:44:25 2024 -0400
drm/xe: Use hmm_range_fault to populate user pages
This is an effort to unify the hmmptr (aka system allocator)
and userptr code. hmm_range_fault is used to populate
a virtual address range for both hmmptr and userptr,
instead of hmmptr using hmm_range_fault while userptr
uses get_user_pages_fast.
This also aligns with the AMD gpu driver's behavior. In the
long term, we plan to move some common helpers in this
area to the drm layer so they can be re-used by different
vendors.
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
+ /mt/dim checkpatch 38507920a9f9bd379ae008903e42fd7419b346c3 drm-intel
b9d01b42a96c drm/xe/svm: Remap and provide memmap backing for GPU vram
Traceback (most recent call last):
File "scripts/spdxcheck.py", line 6, in <module>
from ply import lex, yacc
ModuleNotFoundError: No module named 'ply'
-:105: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#105:
new file mode 100644
-:110: WARNING:SPDX_LICENSE_TAG: Improper SPDX comment style for 'drivers/gpu/drm/xe/xe_svm.h', please use '/*' instead
#110: FILE: drivers/gpu/drm/xe/xe_svm.h:1:
+// SPDX-License-Identifier: MIT
-:110: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#110: FILE: drivers/gpu/drm/xe/xe_svm.h:1:
+// SPDX-License-Identifier: MIT
-:142: CHECK:LINE_SPACING: Please don't use multiple blank lines
#142: FILE: drivers/gpu/drm/xe/xe_svm_devmem.c:12:
+
+
-:193: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#193: FILE: drivers/gpu/drm/xe/xe_svm_devmem.c:63:
+ drm_err(&xe->drm, "Failed to remap tile %d memory, errno %d\n",
+ tile->id, ret);
-:199: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#199: FILE: drivers/gpu/drm/xe/xe_svm_devmem.c:69:
+ drm_info(&xe->drm, "Added tile %d memory [%llx-%llx] to devm, remapped to %pr\n",
+ tile->id, mr->io_start, mr->io_start + mr->usable_size, res);
-:216: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#216: FILE: drivers/gpu/drm/xe/xe_svm_devmem.c:86:
+ devm_release_mem_region(dev, mr->pagemap.range.start,
+ mr->pagemap.range.end - mr->pagemap.range.start + 1);
total: 0 errors, 3 warnings, 4 checks, 165 lines checked
d157ed39a29f drm/xe/svm: Introduce DRM_XE_SVM kernel config
fcb1be70a47c drm/xe: Introduce helper to get tile from memory region
a86e27df3ece drm/xe: Introduce a helper to get dpa from pfn
802a6a8b45a3 drm/xe/svm: Get xe memory region from page
b7d81fae3ed5 drm/xe: Introduce helper to populate userptr
Traceback (most recent call last):
File "scripts/spdxcheck.py", line 6, in <module>
from ply import lex, yacc
ModuleNotFoundError: No module named 'ply'
-:55: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#55:
new file mode 100644
-:140: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#140: FILE: drivers/gpu/drm/xe/xe_hmm.c:81:
+static int xe_build_sg(struct xe_device *xe, struct hmm_range *range,
+ struct sg_table *st, bool write)
-:158: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#158: FILE: drivers/gpu/drm/xe/xe_hmm.c:99:
+ ret = sg_alloc_table_from_pages_segment(st, pages, npages, 0,
+ npages << PAGE_SHIFT, xe_sg_segment_size(dev), GFP_KERNEL);
-:163: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#163: FILE: drivers/gpu/drm/xe/xe_hmm.c:104:
+ ret = dma_map_sgtable(dev, st, write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_NO_KERNEL_MAPPING);
-:291: WARNING:SPDX_LICENSE_TAG: Improper SPDX comment style for 'drivers/gpu/drm/xe/xe_hmm.h', please use '/*' instead
#291: FILE: drivers/gpu/drm/xe/xe_hmm.h:1:
+// SPDX-License-Identifier: MIT
-:291: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#291: FILE: drivers/gpu/drm/xe/xe_hmm.h:1:
+// SPDX-License-Identifier: MIT
total: 0 errors, 3 warnings, 3 checks, 242 lines checked
a29fe8f5ab71 drm/xe: Introduce a helper to free sg table
-:48: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#48: FILE: drivers/gpu/drm/xe/xe_hmm.c:132:
+ dma_unmap_sgtable(dev, userptr->sg,
+ write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE, 0);
total: 0 errors, 0 warnings, 1 checks, 52 lines checked
142f5506f801 drm/xe: Use hmm_range_fault to populate user pages
^ permalink raw reply [flat|nested] 18+ messages in thread
* ✗ CI.KUnit: failure for Use hmm_range_fault to populate user page
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
2024-03-20 3:37 ` ✓ CI.Patch_applied: success for " Patchwork
2024-03-20 3:37 ` ✗ CI.checkpatch: warning " Patchwork
@ 2024-03-20 3:38 ` Patchwork
2024-03-20 3:44 ` [PATCH 1/8] drm/xe/svm: Remap and provide memmap backing for GPU vram Oak Zeng
` (7 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2024-03-20 3:38 UTC (permalink / raw)
To: Oak Zeng; +Cc: intel-xe
== Series Details ==
Series: Use hmm_range_fault to populate user page
URL : https://patchwork.freedesktop.org/series/131351/
State : failure
== Summary ==
+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
ERROR:root:../arch/x86/um/user-offsets.c:17:6: warning: no previous prototype for ‘foo’ [-Wmissing-prototypes]
17 | void foo(void)
| ^~~
In file included from ../arch/um/kernel/asm-offsets.c:1:
../arch/x86/um/shared/sysdep/kernel-offsets.h:9:6: warning: no previous prototype for ‘foo’ [-Wmissing-prototypes]
9 | void foo(void)
| ^~~
../arch/x86/um/bugs_64.c:9:6: warning: no previous prototype for ‘arch_check_bugs’ [-Wmissing-prototypes]
9 | void arch_check_bugs(void)
| ^~~~~~~~~~~~~~~
../arch/x86/um/bugs_64.c:13:6: warning: no previous prototype for ‘arch_examine_signal’ [-Wmissing-prototypes]
13 | void arch_examine_signal(int sig, struct uml_pt_regs *regs)
| ^~~~~~~~~~~~~~~~~~~
../arch/x86/um/fault.c:18:5: warning: no previous prototype for ‘arch_fixup’ [-Wmissing-prototypes]
18 | int arch_fixup(unsigned long address, struct uml_pt_regs *regs)
| ^~~~~~~~~~
../arch/x86/um/vdso/um_vdso.c:16:5: warning: no previous prototype for ‘__vdso_clock_gettime’ [-Wmissing-prototypes]
16 | int __vdso_clock_gettime(clockid_t clock, struct __kernel_old_timespec *ts)
| ^~~~~~~~~~~~~~~~~~~~
../arch/x86/um/vdso/um_vdso.c:30:5: warning: no previous prototype for ‘__vdso_gettimeofday’ [-Wmissing-prototypes]
30 | int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
| ^~~~~~~~~~~~~~~~~~~
../arch/x86/um/vdso/um_vdso.c:44:21: warning: no previous prototype for ‘__vdso_time’ [-Wmissing-prototypes]
44 | __kernel_old_time_t __vdso_time(__kernel_old_time_t *t)
| ^~~~~~~~~~~
../arch/x86/um/vdso/um_vdso.c:57:1: warning: no previous prototype for ‘__vdso_getcpu’ [-Wmissing-prototypes]
57 | __vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused)
| ^~~~~~~~~~~~~
../arch/x86/um/os-Linux/registers.c:146:15: warning: no previous prototype for ‘get_thread_reg’ [-Wmissing-prototypes]
146 | unsigned long get_thread_reg(int reg, jmp_buf *buf)
| ^~~~~~~~~~~~~~
../arch/x86/um/os-Linux/mcontext.c:7:6: warning: no previous prototype for ‘get_regs_from_mc’ [-Wmissing-prototypes]
7 | void get_regs_from_mc(struct uml_pt_regs *regs, mcontext_t *mc)
| ^~~~~~~~~~~~~~~~
../arch/um/os-Linux/skas/process.c:107:6: warning: no previous prototype for ‘wait_stub_done’ [-Wmissing-prototypes]
107 | void wait_stub_done(int pid)
| ^~~~~~~~~~~~~~
../arch/um/os-Linux/skas/process.c:683:6: warning: no previous prototype for ‘__switch_mm’ [-Wmissing-prototypes]
683 | void __switch_mm(struct mm_id *mm_idp)
| ^~~~~~~~~~~
../arch/um/kernel/skas/process.c:36:12: warning: no previous prototype for ‘start_uml’ [-Wmissing-prototypes]
36 | int __init start_uml(void)
| ^~~~~~~~~
../arch/x86/um/ptrace_64.c:111:5: warning: no previous prototype for ‘poke_user’ [-Wmissing-prototypes]
111 | int poke_user(struct task_struct *child, long addr, long data)
| ^~~~~~~~~
../arch/x86/um/ptrace_64.c:171:5: warning: no previous prototype for ‘peek_user’ [-Wmissing-prototypes]
171 | int peek_user(struct task_struct *child, long addr, long data)
| ^~~~~~~~~
../arch/um/kernel/skas/mmu.c:17:5: warning: no previous prototype for ‘init_new_context’ [-Wmissing-prototypes]
17 | int init_new_context(struct task_struct *task, struct mm_struct *mm)
| ^~~~~~~~~~~~~~~~
../arch/um/kernel/skas/mmu.c:60:6: warning: no previous prototype for ‘destroy_context’ [-Wmissing-prototypes]
60 | void destroy_context(struct mm_struct *mm)
| ^~~~~~~~~~~~~~~
../arch/x86/um/signal.c:560:6: warning: no previous prototype for ‘sys_rt_sigreturn’ [-Wmissing-prototypes]
560 | long sys_rt_sigreturn(void)
| ^~~~~~~~~~~~~~~~
../arch/um/os-Linux/main.c:187:7: warning: no previous prototype for ‘__wrap_malloc’ [-Wmissing-prototypes]
187 | void *__wrap_malloc(int size)
| ^~~~~~~~~~~~~
../arch/um/os-Linux/main.c:208:7: warning: no previous prototype for ‘__wrap_calloc’ [-Wmissing-prototypes]
208 | void *__wrap_calloc(int n, int size)
| ^~~~~~~~~~~~~
../arch/um/os-Linux/main.c:222:6: warning: no previous prototype for ‘__wrap_free’ [-Wmissing-prototypes]
222 | void __wrap_free(void *ptr)
| ^~~~~~~~~~~
../arch/um/os-Linux/mem.c:28:6: warning: no previous prototype for ‘kasan_map_memory’ [-Wmissing-prototypes]
28 | void kasan_map_memory(void *start, size_t len)
| ^~~~~~~~~~~~~~~~
../arch/um/os-Linux/mem.c:212:13: warning: no previous prototype for ‘check_tmpexec’ [-Wmissing-prototypes]
212 | void __init check_tmpexec(void)
| ^~~~~~~~~~~~~
../arch/um/os-Linux/signal.c:75:6: warning: no previous prototype for ‘sig_handler’ [-Wmissing-prototypes]
75 | void sig_handler(int sig, struct siginfo *si, mcontext_t *mc)
| ^~~~~~~~~~~
../arch/um/os-Linux/signal.c:111:6: warning: no previous prototype for ‘timer_alarm_handler’ [-Wmissing-prototypes]
111 | void timer_alarm_handler(int sig, struct siginfo *unused_si, mcontext_t *mc)
| ^~~~~~~~~~~~~~~~~~~
../arch/um/os-Linux/start_up.c:301:12: warning: no previous prototype for ‘parse_iomem’ [-Wmissing-prototypes]
301 | int __init parse_iomem(char *str, int *add)
| ^~~~~~~~~~~
../arch/um/kernel/mem.c:202:8: warning: no previous prototype for ‘pgd_alloc’ [-Wmissing-prototypes]
202 | pgd_t *pgd_alloc(struct mm_struct *mm)
| ^~~~~~~~~
../arch/um/kernel/mem.c:215:7: warning: no previous prototype for ‘uml_kmalloc’ [-Wmissing-prototypes]
215 | void *uml_kmalloc(int size, int flags)
| ^~~~~~~~~~~
../arch/x86/um/syscalls_64.c:48:6: warning: no previous prototype for ‘arch_switch_to’ [-Wmissing-prototypes]
48 | void arch_switch_to(struct task_struct *to)
| ^~~~~~~~~~~~~~
../arch/um/kernel/process.c:51:5: warning: no previous prototype for ‘pid_to_processor_id’ [-Wmissing-prototypes]
51 | int pid_to_processor_id(int pid)
| ^~~~~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:87:7: warning: no previous prototype for ‘__switch_to’ [-Wmissing-prototypes]
87 | void *__switch_to(struct task_struct *from, struct task_struct *to)
| ^~~~~~~~~~~
../arch/um/kernel/process.c:140:6: warning: no previous prototype for ‘fork_handler’ [-Wmissing-prototypes]
140 | void fork_handler(void)
| ^~~~~~~~~~~~
../arch/um/kernel/process.c:217:6: warning: no previous prototype for ‘arch_cpu_idle’ [-Wmissing-prototypes]
217 | void arch_cpu_idle(void)
| ^~~~~~~~~~~~~
../arch/um/kernel/process.c:253:5: warning: no previous prototype for ‘copy_to_user_proc’ [-Wmissing-prototypes]
253 | int copy_to_user_proc(void __user *to, void *from, int size)
| ^~~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:263:5: warning: no previous prototype for ‘clear_user_proc’ [-Wmissing-prototypes]
263 | int clear_user_proc(void __user *buf, int size)
| ^~~~~~~~~~~~~~~
../arch/um/kernel/process.c:271:6: warning: no previous prototype for ‘set_using_sysemu’ [-Wmissing-prototypes]
271 | void set_using_sysemu(int value)
| ^~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:278:5: warning: no previous prototype for ‘get_using_sysemu’ [-Wmissing-prototypes]
278 | int get_using_sysemu(void)
| ^~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:316:12: warning: no previous prototype for ‘make_proc_sysemu’ [-Wmissing-prototypes]
316 | int __init make_proc_sysemu(void)
| ^~~~~~~~~~~~~~~~
../arch/um/kernel/process.c:348:15: warning: no previous prototype for ‘arch_align_stack’ [-Wmissing-prototypes]
348 | unsigned long arch_align_stack(unsigned long sp)
| ^~~~~~~~~~~~~~~~
../arch/um/kernel/reboot.c:45:6: warning: no previous prototype for ‘machine_restart’ [-Wmissing-prototypes]
45 | void machine_restart(char * __unused)
| ^~~~~~~~~~~~~~~
../arch/um/kernel/reboot.c:51:6: warning: no previous prototype for ‘machine_power_off’ [-Wmissing-prototypes]
51 | void machine_power_off(void)
| ^~~~~~~~~~~~~~~~~
../arch/um/kernel/reboot.c:57:6: warning: no previous prototype for ‘machine_halt’ [-Wmissing-prototypes]
57 | void machine_halt(void)
| ^~~~~~~~~~~~
../arch/um/kernel/tlb.c:579:6: warning: no previous prototype for ‘flush_tlb_mm_range’ [-Wmissing-prototypes]
579 | void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
| ^~~~~~~~~~~~~~~~~~
../arch/um/kernel/tlb.c:594:6: warning: no previous prototype for ‘force_flush_all’ [-Wmissing-prototypes]
594 | void force_flush_all(void)
| ^~~~~~~~~~~~~~~
../arch/um/kernel/um_arch.c:408:19: warning: no previous prototype for ‘read_initrd’ [-Wmissing-prototypes]
408 | int __init __weak read_initrd(void)
| ^~~~~~~~~~~
../arch/um/kernel/um_arch.c:461:7: warning: no previous prototype for ‘text_poke’ [-Wmissing-prototypes]
461 | void *text_poke(void *addr, const void *opcode, size_t len)
| ^~~~~~~~~
../arch/um/kernel/um_arch.c:473:6: warning: no previous prototype for ‘text_poke_sync’ [-Wmissing-prototypes]
473 | void text_poke_sync(void)
| ^~~~~~~~~~~~~~
../arch/um/kernel/kmsg_dump.c:60:12: warning: no previous prototype for ‘kmsg_dumper_stdout_init’ [-Wmissing-prototypes]
60 | int __init kmsg_dumper_stdout_init(void)
| ^~~~~~~~~~~~~~~~~~~~~~~
../lib/iomap.c:156:5: warning: no previous prototype for ‘ioread64_lo_hi’ [-Wmissing-prototypes]
156 | u64 ioread64_lo_hi(const void __iomem *addr)
| ^~~~~~~~~~~~~~
../lib/iomap.c:163:5: warning: no previous prototype for ‘ioread64_hi_lo’ [-Wmissing-prototypes]
163 | u64 ioread64_hi_lo(const void __iomem *addr)
| ^~~~~~~~~~~~~~
../lib/iomap.c:170:5: warning: no previous prototype for ‘ioread64be_lo_hi’ [-Wmissing-prototypes]
170 | u64 ioread64be_lo_hi(const void __iomem *addr)
| ^~~~~~~~~~~~~~~~
../lib/iomap.c:178:5: warning: no previous prototype for ‘ioread64be_hi_lo’ [-Wmissing-prototypes]
178 | u64 ioread64be_hi_lo(const void __iomem *addr)
| ^~~~~~~~~~~~~~~~
../lib/iomap.c:264:6: warning: no previous prototype for ‘iowrite64_lo_hi’ [-Wmissing-prototypes]
264 | void iowrite64_lo_hi(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~
../lib/iomap.c:272:6: warning: no previous prototype for ‘iowrite64_hi_lo’ [-Wmissing-prototypes]
272 | void iowrite64_hi_lo(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~
../lib/iomap.c:280:6: warning: no previous prototype for ‘iowrite64be_lo_hi’ [-Wmissing-prototypes]
280 | void iowrite64be_lo_hi(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~~~
../lib/iomap.c:288:6: warning: no previous prototype for ‘iowrite64be_hi_lo’ [-Wmissing-prototypes]
288 | void iowrite64be_hi_lo(u64 val, void __iomem *addr)
| ^~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/xe_tile.c: In function ‘xe_tile_init_noalloc’:
../drivers/gpu/drm/xe/xe_tile.c:182:39: error: ‘xe’ undeclared (first use in this function)
182 | if (IS_ENABLED(CONFIG_DRM_XE_SVM) && xe->info.has_usm)
| ^~
../drivers/gpu/drm/xe/xe_tile.c:182:39: note: each undeclared identifier is reported only once for each function it appears in
make[7]: *** [../scripts/Makefile.build:243: drivers/gpu/drm/xe/xe_tile.o] Error 1
make[7]: *** Waiting for unfinished jobs....
make[6]: *** [../scripts/Makefile.build:481: drivers/gpu/drm/xe] Error 2
make[5]: *** [../scripts/Makefile.build:481: drivers/gpu/drm] Error 2
make[4]: *** [../scripts/Makefile.build:481: drivers/gpu] Error 2
make[3]: *** [../scripts/Makefile.build:481: drivers] Error 2
make[2]: *** [/kernel/Makefile:1921: .] Error 2
make[1]: *** [/kernel/Makefile:240: __sub-make] Error 2
make: *** [Makefile:240: __sub-make] Error 2
[03:37:54] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[03:37:59] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=48
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 0/8] Use hmm_range_fault to populate user page
@ 2024-03-20 3:44 Oak Zeng
2024-03-20 3:37 ` ✓ CI.Patch_applied: success for " Patchwork
` (10 more replies)
0 siblings, 11 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
This is an effort to unify hmmptr (system allocator) and userptr.
A helper, xe_hmm_populate_range, is created to populate user
pages using hmm_range_fault instead of get_user_pages_fast.
This helper is then used to replace some of the userptr code.
The same helper will be used later for hmmptr.
This is part of the hmmptr (system allocator) code. Since this
part can be merged separately, it is sent out for CI and review first.
It will be followed by the rest of the hmmptr code.
Oak Zeng (8):
drm/xe/svm: Remap and provide memmap backing for GPU vram
drm/xe/svm: Introduce DRM_XE_SVM kernel config
drm/xe: Introduce helper to get tile from memory region
drm/xe: Introduce a helper to get dpa from pfn
drm/xe/svm: Get xe memory region from page
drm/xe: Introduce helper to populate userptr
drm/xe: Introduce a helper to free sg table
drm/xe: Use hmm_range_fault to populate user pages
drivers/gpu/drm/xe/Kconfig | 21 +++
drivers/gpu/drm/xe/Makefile | 2 +
drivers/gpu/drm/xe/xe_device.h | 9 +
drivers/gpu/drm/xe/xe_device_types.h | 8 +
drivers/gpu/drm/xe/xe_hmm.c | 255 +++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_hmm.h | 11 ++
drivers/gpu/drm/xe/xe_mmio.c | 6 +
drivers/gpu/drm/xe/xe_svm.h | 49 +++++
drivers/gpu/drm/xe/xe_svm_devmem.c | 89 ++++++++++
drivers/gpu/drm/xe/xe_tile.c | 6 +
drivers/gpu/drm/xe/xe_vm.c | 122 ++-----------
11 files changed, 468 insertions(+), 110 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_hmm.c
create mode 100644 drivers/gpu/drm/xe/xe_hmm.h
create mode 100644 drivers/gpu/drm/xe/xe_svm.h
create mode 100644 drivers/gpu/drm/xe/xe_svm_devmem.c
--
2.26.3
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 1/8] drm/xe/svm: Remap and provide memmap backing for GPU vram
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
` (2 preceding siblings ...)
2024-03-20 3:38 ` ✗ CI.KUnit: failure " Patchwork
@ 2024-03-20 3:44 ` Oak Zeng
2024-03-20 3:44 ` [PATCH 2/8] drm/xe/svm: Introduce DRM_XE_SVM kernel config Oak Zeng
` (6 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
Memory remap GPU vram using devm_memremap_pages, so that each GPU vram
page is backed by a struct page.
Those struct pages are created to allow hmm to migrate buffers between
GPU vram and CPU system memory using the existing Linux migration
mechanism (i.e., the one used to migrate between CPU system memory and
hard disk).
This is preparatory work to enable svm (shared virtual memory) through
the Linux kernel hmm framework. The memory remap's page map type is set
to MEMORY_DEVICE_PRIVATE for now. This means that even though each GPU
vram page gets a struct page and can be mapped in the CPU page table,
such pages are treated as the GPU's private resource, so the CPU can't
access them. If the CPU accesses such a page, a page fault is triggered
and the page will be migrated to system memory.
For GPU devices which support a coherent memory protocol between the CPU
and GPU (such as the CXL and CAPI protocols), we can remap device memory
as MEMORY_DEVICE_COHERENT. This is TBD.
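For reference, a minimal sketch of what the coherent variant would
change, assuming a device/interconnect that supports it (everything
else in xe_devm_add below would stay the same):
    /* Sketch only: with a CXL/CAPI-class link, the pagemap type
     * becomes MEMORY_DEVICE_COHERENT instead of MEMORY_DEVICE_PRIVATE,
     * and CPU access then works in place rather than faulting and
     * migrating the page to system memory.
     */
    mr->pagemap.type = MEMORY_DEVICE_COHERENT;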
v1:
Changes per code review feedback from Matt:
change .o order in Makefile
fix indentation
change code order in mmio_fini
remove unnecessary header file
uniform xe_svm_devm_add/_remove parameter
use tile (vs dev) as pagemap.owner during memremap
only remap vram for platform that support usm
Changes per review feedback from Brian:
s/xe_svm_devm_add/xe_devm_add
s/xe_svm_devm_remove/xe_devm_remove
move calling of xe_devm_add to xe_tile.c
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
---
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/xe_device_types.h | 8 +++
drivers/gpu/drm/xe/xe_mmio.c | 6 ++
drivers/gpu/drm/xe/xe_svm.h | 15 +++++
drivers/gpu/drm/xe/xe_svm_devmem.c | 89 ++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_tile.c | 4 ++
6 files changed, 123 insertions(+)
create mode 100644 drivers/gpu/drm/xe/xe_svm.h
create mode 100644 drivers/gpu/drm/xe/xe_svm_devmem.c
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 3c3e67885559..e2ec6d1375c0 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -128,6 +128,7 @@ xe-y += xe_bb.o \
xe_sa.o \
xe_sched_job.o \
xe_step.o \
+ xe_svm_devmem.o \
xe_sync.o \
xe_tile.o \
xe_tile_sysfs.o \
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 9785eef2e5a4..607b61326c9a 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -99,6 +99,14 @@ struct xe_mem_region {
resource_size_t actual_physical_size;
/** @mapping: pointer to VRAM mappable space */
void __iomem *mapping;
+ /** @pagemap: Used to remap device memory as ZONE_DEVICE */
+ struct dev_pagemap pagemap;
+ /**
+ * @hpa_base: base host physical address
+ *
+ * This is generated when remap device memory as ZONE_DEVICE
+ */
+ resource_size_t hpa_base;
};
/**
diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
index 1de9de4f94b6..bee3c89bc380 100644
--- a/drivers/gpu/drm/xe/xe_mmio.c
+++ b/drivers/gpu/drm/xe/xe_mmio.c
@@ -22,6 +22,7 @@
#include "xe_module.h"
#include "xe_sriov.h"
#include "xe_tile.h"
+#include "xe_svm.h"
#define XEHP_MTCFG_ADDR XE_REG(0x101800)
#define TILE_COUNT REG_GENMASK(15, 8)
@@ -354,6 +355,11 @@ void xe_mmio_probe_tiles(struct xe_device *xe)
static void mmio_fini(struct drm_device *drm, void *arg)
{
struct xe_device *xe = arg;
+ struct xe_tile *tile;
+ u8 id;
+
+ for_each_tile(tile, xe, id)
+ xe_devm_remove(tile, &tile->mem.vram);
pci_iounmap(to_pci_dev(xe->drm.dev), xe->mmio.regs);
if (xe->mem.vram.mapping)
diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
new file mode 100644
index 000000000000..e944971cfc6d
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_svm.h
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#ifndef __XE_SVM_H
+#define __XE_SVM_H
+
+struct xe_tile;
+struct xe_mem_region;
+
+int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr);
+void xe_devm_remove(struct xe_tile *tile, struct xe_mem_region *mr);
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_svm_devmem.c b/drivers/gpu/drm/xe/xe_svm_devmem.c
new file mode 100644
index 000000000000..31af56e8285a
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_svm_devmem.c
@@ -0,0 +1,89 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include <linux/mm_types.h>
+#include <linux/sched/mm.h>
+
+#include "xe_device_types.h"
+#include "xe_svm.h"
+
+
+static vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf)
+{
+ return 0;
+}
+
+static void xe_devm_page_free(struct page *page)
+{
+}
+
+static const struct dev_pagemap_ops xe_devm_pagemap_ops = {
+ .page_free = xe_devm_page_free,
+ .migrate_to_ram = xe_devm_migrate_to_ram,
+};
+
+/**
+ * xe_devm_add() - Remap and provide memmap backing for device memory
+ * @tile: tile that the memory region belongs to
+ * @mr: memory region to remap
+ *
+ * This remaps device memory into the host physical address space
+ * and creates struct pages to back the device memory
+ *
+ * Return: 0 on success, standard error code otherwise
+ */
+int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr)
+{
+ struct xe_device *xe = tile_to_xe(tile);
+ struct device *dev = &to_pci_dev(xe->drm.dev)->dev;
+ struct resource *res;
+ void *addr;
+ int ret;
+
+ res = devm_request_free_mem_region(dev, &iomem_resource,
+ mr->usable_size);
+ if (IS_ERR(res)) {
+ ret = PTR_ERR(res);
+ return ret;
+ }
+
+ mr->pagemap.type = MEMORY_DEVICE_PRIVATE;
+ mr->pagemap.range.start = res->start;
+ mr->pagemap.range.end = res->end;
+ mr->pagemap.nr_range = 1;
+ mr->pagemap.ops = &xe_devm_pagemap_ops;
+ mr->pagemap.owner = xe;
+ addr = devm_memremap_pages(dev, &mr->pagemap);
+ if (IS_ERR(addr)) {
+ devm_release_mem_region(dev, res->start, resource_size(res));
+ ret = PTR_ERR(addr);
+ drm_err(&xe->drm, "Failed to remap tile %d memory, errno %d\n",
+ tile->id, ret);
+ return ret;
+ }
+ mr->hpa_base = res->start;
+
+ drm_info(&xe->drm, "Added tile %d memory [%llx-%llx] to devm, remapped to %pr\n",
+ tile->id, mr->io_start, mr->io_start + mr->usable_size, res);
+ return 0;
+}
+
+/**
+ * xe_devm_remove() - Unmap device memory and free resources
+ * @tile: xe tile
+ * @mr: memory region to remove
+ */
+void xe_devm_remove(struct xe_tile *tile, struct xe_mem_region *mr)
+{
+ struct device *dev = &to_pci_dev(tile->xe->drm.dev)->dev;
+
+ /* FIXME: Does the below cause a kernel hang during module remove? */
+ if (mr->hpa_base) {
+ devm_memunmap_pages(dev, &mr->pagemap);
+ devm_release_mem_region(dev, mr->pagemap.range.start,
+ mr->pagemap.range.end - mr->pagemap.range.start + 1);
+ }
+}
+
diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
index 0650b2fa75ef..f1c4f9de51df 100644
--- a/drivers/gpu/drm/xe/xe_tile.c
+++ b/drivers/gpu/drm/xe/xe_tile.c
@@ -14,6 +14,7 @@
#include "xe_tile_sysfs.h"
#include "xe_ttm_vram_mgr.h"
#include "xe_wa.h"
+#include "xe_svm.h"
/**
* DOC: Multi-tile Design
@@ -158,6 +159,7 @@ static int tile_ttm_mgr_init(struct xe_tile *tile)
*/
int xe_tile_init_noalloc(struct xe_tile *tile)
{
+ struct xe_device *xe = tile_to_xe(tile);
int err;
xe_device_mem_access_get(tile_to_xe(tile));
@@ -175,6 +177,8 @@ int xe_tile_init_noalloc(struct xe_tile *tile)
xe_tile_sysfs_init(tile);
+ if (xe->info.has_usm)
+ xe_devm_add(tile, &tile->mem.vram);
err_mem_access:
xe_device_mem_access_put(tile_to_xe(tile));
return err;
--
2.26.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 2/8] drm/xe/svm: Introduce DRM_XE_SVM kernel config
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
` (3 preceding siblings ...)
2024-03-20 3:44 ` [PATCH 1/8] drm/xe/svm: Remap and provide memmap backing for GPU vram Oak Zeng
@ 2024-03-20 3:44 ` Oak Zeng
2024-03-20 3:44 ` [PATCH 3/8] drm/xe: Introduce helper to get tile from memory region Oak Zeng
` (5 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
Introduce a DRM_XE_SVM kernel config entry for the
xe svm feature. The xe svm feature allows sharing a
virtual address space between CPU and GPU programs.
v1: Improve commit message (Thomas)
Avoid using #if directive (Thomas)
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
---
drivers/gpu/drm/xe/Kconfig | 21 +++++++++++++++++++++
drivers/gpu/drm/xe/xe_tile.c | 4 +++-
2 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
index 1a556d087e63..e244165459c5 100644
--- a/drivers/gpu/drm/xe/Kconfig
+++ b/drivers/gpu/drm/xe/Kconfig
@@ -83,6 +83,27 @@ config DRM_XE_FORCE_PROBE
4571.
Use "!*" to block the probe of the driver for all known devices.
+config DRM_XE_SVM
+ bool "Enable Shared Virtual Memory support in xe"
+ depends on DRM_XE
+ depends on ARCH_ENABLE_MEMORY_HOTPLUG
+ depends on ARCH_ENABLE_MEMORY_HOTREMOVE
+ depends on MEMORY_HOTPLUG
+ depends on MEMORY_HOTREMOVE
+ depends on ARCH_HAS_PTE_DEVMAP
+ depends on SPARSEMEM_VMEMMAP
+ depends on ZONE_DEVICE
+ depends on DEVICE_PRIVATE
+ depends on MMU
+ select HMM_MIRROR
+ select MMU_NOTIFIER
+ default y
+ help
+ Choose this option if you want Shared Virtual Memory (SVM)
+ support in xe. With SVM, the virtual address space is shared
+ between the CPU and GPU. This means any virtual address, such
+ as one returned by malloc or mmap, a variable on the stack, or
+ a global memory pointer, can be used by the GPU transparently.
menu "drm/Xe Debugging"
depends on DRM_XE
diff --git a/drivers/gpu/drm/xe/xe_tile.c b/drivers/gpu/drm/xe/xe_tile.c
index f1c4f9de51df..a484799364c9 100644
--- a/drivers/gpu/drm/xe/xe_tile.c
+++ b/drivers/gpu/drm/xe/xe_tile.c
@@ -159,7 +159,9 @@ static int tile_ttm_mgr_init(struct xe_tile *tile)
*/
int xe_tile_init_noalloc(struct xe_tile *tile)
{
+#if IS_ENABLED(CONFIG_DRM_XE_SVM)
struct xe_device *xe = tile_to_xe(tile);
+#endif
int err;
xe_device_mem_access_get(tile_to_xe(tile));
@@ -177,7 +179,7 @@ int xe_tile_init_noalloc(struct xe_tile *tile)
xe_tile_sysfs_init(tile);
- if (xe->info.has_usm)
+ if (IS_ENABLED(CONFIG_DRM_XE_SVM) && xe->info.has_usm)
xe_devm_add(tile, &tile->mem.vram);
err_mem_access:
xe_device_mem_access_put(tile_to_xe(tile));
--
2.26.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 3/8] drm/xe: Introduce helper to get tile from memory region
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
` (4 preceding siblings ...)
2024-03-20 3:44 ` [PATCH 2/8] drm/xe/svm: Introduce DRM_XE_SVM kernel config Oak Zeng
@ 2024-03-20 3:44 ` Oak Zeng
2024-03-20 3:44 ` [PATCH 4/8] drm/xe: Introduce a helper to get dpa from pfn Oak Zeng
` (4 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
Introduce a simple helper to retrieve the tile from a memory region.
v1: move the function to xe_device.h (Matt)
improve commit message, add kerneldoc (Thomas)
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index d413bc2c6be5..3f74c460bb14 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -176,4 +176,13 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p);
u64 xe_device_canonicalize_addr(struct xe_device *xe, u64 address);
u64 xe_device_uncanonicalize_addr(struct xe_device *xe, u64 address);
+/**
+ * xe_mem_region_to_tile() - retrieve tile from memory region
+ * @mr: the memory region we retrieve tile from
+ */
+static inline struct xe_tile *xe_mem_region_to_tile(struct xe_mem_region *mr)
+{
+ return container_of(mr, struct xe_tile, mem.vram);
+}
+
#endif
--
2.26.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 4/8] drm/xe: Introduce a helper to get dpa from pfn
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
` (5 preceding siblings ...)
2024-03-20 3:44 ` [PATCH 3/8] drm/xe: Introduce helper to get tile from memory region Oak Zeng
@ 2024-03-20 3:44 ` Oak Zeng
2024-03-20 21:41 ` Welty, Brian
2024-03-20 3:44 ` [PATCH 5/8] drm/xe/svm: Get xe memory region from page Oak Zeng
` (3 subsequent siblings)
10 siblings, 1 reply; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
Since we now create struct page backing for each vram page,
each vram page now also has a pfn, just like system memory.
This allows us to calculate the device physical address from the pfn.
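As an illustrative example with made-up numbers (PAGE_SHIFT == 12;
the values of mr->hpa_base and mr->dpa_base here are hypothetical):
    u64 pfn = 0x200010; /* host pfn of a vram page */
    /* hpa = pfn << PAGE_SHIFT = 0x200010000
     * offset = hpa - mr->hpa_base (0x200000000) = 0x10000
     * dpa = mr->dpa_base (0x0) + offset = 0x10000
     */
    u64 dpa = xe_mem_region_pfn_to_dpa(mr, pfn);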
v1: move the function to xe_svm.h (Matt)
s/vram_pfn_to_dpa/xe_mem_region_pfn_to_dpa (Matt)
add kernel document for the helper (Thomas)
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
---
drivers/gpu/drm/xe/xe_svm.h | 27 +++++++++++++++++++++++++--
1 file changed, 25 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
index e944971cfc6d..8a34429eb674 100644
--- a/drivers/gpu/drm/xe/xe_svm.h
+++ b/drivers/gpu/drm/xe/xe_svm.h
@@ -6,8 +6,31 @@
#ifndef __XE_SVM_H
#define __XE_SVM_H
-struct xe_tile;
-struct xe_mem_region;
+#include "xe_device_types.h"
+#include "xe_device.h"
+#include "xe_assert.h"
+
+/**
+ * xe_mem_region_pfn_to_dpa() - Calculate page's dpa from pfn
+ *
+ * @mr: The memory region that page resides in
+ * @pfn: page frame number of the page
+ *
+ * Returns: the device physical address of the page
+ */
+static inline u64 xe_mem_region_pfn_to_dpa(struct xe_mem_region *mr, u64 pfn)
+{
+ u64 dpa;
+ struct xe_tile *tile = xe_mem_region_to_tile(mr);
+ struct xe_device *xe = tile_to_xe(tile);
+ u64 offset;
+
+ xe_assert(xe, (pfn << PAGE_SHIFT) >= mr->hpa_base);
+ offset = (pfn << PAGE_SHIFT) - mr->hpa_base;
+ dpa = mr->dpa_base + offset;
+
+ return dpa;
+}
int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr);
void xe_devm_remove(struct xe_tile *tile, struct xe_mem_region *mr);
--
2.26.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 5/8] drm/xe/svm: Get xe memory region from page
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
` (6 preceding siblings ...)
2024-03-20 3:44 ` [PATCH 4/8] drm/xe: Introduce a helper to get dpa from pfn Oak Zeng
@ 2024-03-20 3:44 ` Oak Zeng
2024-03-20 3:44 ` [PATCH 6/8] drm/xe: Introduce helper to populate userptr Oak Zeng
` (2 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
For a gpu vram page, we now have a struct page backing
it. The struct page's pgmap points to the xe_memory_region's
pagemap. This allows us to retrieve the xe_memory_region
from a struct page.
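Taken together with the helpers from the previous two patches, a
hedged usage sketch for a page known to live in vram (error handling
omitted):
    struct xe_mem_region *mr = xe_page_to_mem_region(page);
    struct xe_tile *tile = xe_mem_region_to_tile(mr);
    u64 dpa = xe_mem_region_pfn_to_dpa(mr, page_to_pfn(page));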
v1: move the function to xe_svm.h
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
---
drivers/gpu/drm/xe/xe_svm.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
index 8a34429eb674..624c1581f8ba 100644
--- a/drivers/gpu/drm/xe/xe_svm.h
+++ b/drivers/gpu/drm/xe/xe_svm.h
@@ -6,6 +6,7 @@
#ifndef __XE_SVM_H
#define __XE_SVM_H
+#include <linux/mm_types.h>
#include "xe_device_types.h"
#include "xe_device.h"
#include "xe_assert.h"
@@ -35,4 +36,14 @@ static inline u64 xe_mem_region_pfn_to_dpa(struct xe_mem_region *mr, u64 pfn)
int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr);
void xe_devm_remove(struct xe_tile *tile, struct xe_mem_region *mr);
+/**
+ * xe_page_to_mem_region() - Get a page's memory region
+ *
+ * @page: a struct page pointer pointing to a page in vram memory region
+ */
+static inline struct xe_mem_region *xe_page_to_mem_region(struct page *page)
+{
+ return container_of(page->pgmap, struct xe_mem_region, pagemap);
+}
+
#endif
--
2.26.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 6/8] drm/xe: Introduce helper to populate userptr
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
` (7 preceding siblings ...)
2024-03-20 3:44 ` [PATCH 5/8] drm/xe/svm: Get xe memory region from page Oak Zeng
@ 2024-03-20 3:44 ` Oak Zeng
2024-03-20 19:19 ` Welty, Brian
2024-03-20 3:44 ` [PATCH 7/8] drm/xe: Introduce a helper to free sg table Oak Zeng
2024-03-20 3:44 ` [PATCH 8/8] drm/xe: Use hmm_range_fault to populate user pages Oak Zeng
10 siblings, 1 reply; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
Introduce a helper function xe_userptr_populate_range to populate
a userptr range. This function calls hmm_range_fault to read the
CPU page tables and populate all pfns/pages of this virtual address
range.
If a populated page is a system memory page, dma-mapping is performed
to get a dma-address which can later be used by the GPU to access the
page.
If a populated page is a device private page, we calculate the dpa
(device physical address) of the page. This will be handled in future
patches.
The dma-address or dpa is then saved in the userptr's sg table. This is
preparatory work to replace the get_user_pages_fast code in the userptr
code path.
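A sketch of the intended call site (this mirrors what the last patch
of this series does; error handling elided):
    mmap_read_lock(userptr->notifier.mm);
    ret = xe_userptr_populate_range(uvma);
    mmap_read_unlock(userptr->notifier.mm);
    /* On success, userptr->sg holds the dma-mapped entries later used
     * to program the GPU page table.
     */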
v1: Address review comments:
separate a npage_in_range function (Matt)
reparameterize function xe_userptr_populate_range function (Matt)
move mmu_interval_read_begin() call into while loop (Thomas)
s/mark_range_accessed/xe_mark_range_accessed (Thomas)
use set_page_dirty_lock (vs set_page_dirty) (Thomas)
move a few checking in xe_vma_userptr_pin_pages to hmm.c (Matt)
v2: Remove device private page support. Only support system
pages for now. use dma-map-sg rather than dma-map-page (Matt/Thomas)
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
---
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/xe_hmm.c | 225 ++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_hmm.h | 10 ++
3 files changed, 236 insertions(+)
create mode 100644 drivers/gpu/drm/xe/xe_hmm.c
create mode 100644 drivers/gpu/drm/xe/xe_hmm.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index e2ec6d1375c0..52300de1e86f 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -101,6 +101,7 @@ xe-y += xe_bb.o \
xe_guc_pc.o \
xe_guc_submit.o \
xe_heci_gsc.o \
+ xe_hmm.o \
xe_hw_engine.o \
xe_hw_engine_class_sysfs.o \
xe_hw_fence.o \
diff --git a/drivers/gpu/drm/xe/xe_hmm.c b/drivers/gpu/drm/xe/xe_hmm.c
new file mode 100644
index 000000000000..e5719cab0c3d
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_hmm.c
@@ -0,0 +1,225 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#include <linux/mmu_notifier.h>
+#include <linux/dma-mapping.h>
+#include <linux/memremap.h>
+#include <linux/swap.h>
+#include <linux/hmm.h>
+#include <linux/mm.h>
+#include "xe_hmm.h"
+#include "xe_svm.h"
+#include "xe_vm.h"
+#include "xe_bo.h"
+
+static u64 xe_npages_in_range(unsigned long start, unsigned long end)
+{
+ return (PAGE_ALIGN(end) - PAGE_ALIGN_DOWN(start)) >> PAGE_SHIFT;
+}
+
+/**
+ * xe_mark_range_accessed() - mark a range as accessed, so the core mm
+ * has this information for memory eviction or write-back to
+ * hard disk
+ *
+ * @range: the range to mark
+ * @write: if we wrote to this range, mark the pages in this range
+ * as dirty
+ */
+static void xe_mark_range_accessed(struct hmm_range *range, bool write)
+{
+ struct page *page;
+ u64 i, npages;
+
+ npages = xe_npages_in_range(range->start, range->end);
+ for (i = 0; i < npages; i++) {
+ page = hmm_pfn_to_page(range->hmm_pfns[i]);
+ if (write)
+ set_page_dirty_lock(page);
+
+ mark_page_accessed(page);
+ }
+}
+
+/**
+ * xe_build_sg() - build a scatter gather table for all the physical pages/pfns
+ * in a hmm_range. dma-map the pages if necessary. The dma-addresses are saved
+ * in the sg table and will later be used to program the GPU page table.
+ *
+ * @xe: the xe device that will access the dma-addresses in the sg table
+ * @range: the hmm range that we build the sg table from. range->hmm_pfns[]
+ * has the pfns of the pages that back this hmm address range.
+ * @st: pointer to the sg table.
+ * @write: whether we write to this range. This decides the dma map direction
+ * for system pages. If we write, we map bidirectionally; otherwise
+ * DMA_TO_DEVICE
+ *
+ * All the contiguous pfns will be collapsed into one entry in
+ * the scatter gather table. This is for the purpose of efficiently
+ * programming the GPU page table.
+ *
+ * The dma_address in the sg table will later be used by the GPU to
+ * access memory. So if the memory is system memory, we need to
+ * do a dma-mapping so it can be accessed by the GPU/DMA.
+ *
+ * FIXME: This function currently only supports pages in system
+ * memory. If the memory is GPU local memory (of the GPU that
+ * is going to access it), we need the gpu dpa (device physical
+ * address), and there is no need for dma-mapping. This is TBD.
+ *
+ * FIXME: dma-mapping for a peer gpu device to access a remote gpu's
+ * memory. Add this when p2p is supported.
+ *
+ * This function allocates the storage of the sg table. It is the
+ * caller's responsibility to free it by calling sg_free_table.
+ *
+ * Returns 0 if successful; -ENOMEM if it fails to allocate memory
+ */
+static int xe_build_sg(struct xe_device *xe, struct hmm_range *range,
+ struct sg_table *st, bool write)
+{
+ struct device *dev = xe->drm.dev;
+ struct page **pages;
+ u64 i, npages;
+ int ret;
+
+ npages = xe_npages_in_range(range->start, range->end);
+ pages = kvmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
+ if (!pages)
+ return -ENOMEM;
+
+ for (i = 0; i < npages; i++) {
+ pages[i] = hmm_pfn_to_page(range->hmm_pfns[i]);
+ xe_assert(xe, !is_device_private_page(pages[i]));
+ }
+
+ ret = sg_alloc_table_from_pages_segment(st, pages, npages, 0,
+ npages << PAGE_SHIFT, xe_sg_segment_size(dev), GFP_KERNEL);
+ if (ret)
+ goto free_pages;
+
+ ret = dma_map_sgtable(dev, st, write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_NO_KERNEL_MAPPING);
+
+free_pages:
+ kvfree(pages);
+ return ret;
+}
+
+/**
+ * xe_userptr_populate_range() - Populate physical pages of a virtual
+ * address range
+ *
+ * @uvma: userptr vma which has the information of the range to populate.
+ *
+ * This function populates the physical pages of a virtual
+ * address range. The populated physical pages are saved in
+ * the userptr's sg table. It is similar to get_user_pages but calls
+ * hmm_range_fault.
+ *
+ * This function also reads the mmu notifier sequence # (
+ * mmu_interval_read_begin), for the purpose of later
+ * comparison (through mmu_interval_read_retry).
+ *
+ * This must be called with the mmap read or write lock held.
+ *
+ * This function allocates the storage of the userptr sg table.
+ * It is the caller's responsibility to free it by calling sg_free_table.
+ *
+ * Returns: 0 for success; negative error code on failure
+ */
+int xe_userptr_populate_range(struct xe_userptr_vma *uvma)
+{
+ unsigned long timeout =
+ jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
+ unsigned long *pfns, flags = HMM_PFN_REQ_FAULT;
+ struct xe_userptr *userptr;
+ struct xe_vma *vma = &uvma->vma;
+ u64 start = xe_vma_userptr(vma);
+ u64 end = start + xe_vma_size(vma);
+ struct xe_vm *vm = xe_vma_vm(vma);
+ struct hmm_range hmm_range;
+ bool write = !xe_vma_read_only(vma);
+ bool in_kthread = !current->mm;
+ unsigned long notifier_seq;
+ u64 npages;
+ int ret;
+
+ userptr = &uvma->userptr;
+ mmap_assert_locked(userptr->notifier.mm);
+
+ if (vma->gpuva.flags & XE_VMA_DESTROYED)
+ return 0;
+
+ notifier_seq = mmu_interval_read_begin(&userptr->notifier);
+ if (notifier_seq == userptr->notifier_seq)
+ return 0;
+
+ npages = xe_npages_in_range(start, end);
+ pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
+ if (unlikely(!pfns))
+ return -ENOMEM;
+
+ if (write)
+ flags |= HMM_PFN_REQ_WRITE;
+
+ if (in_kthread) {
+ if (!mmget_not_zero(userptr->notifier.mm)) {
+ ret = -EFAULT;
+ goto free_pfns;
+ }
+ kthread_use_mm(userptr->notifier.mm);
+ }
+
+ memset64((u64 *)pfns, (u64)flags, npages);
+ hmm_range.hmm_pfns = pfns;
+ hmm_range.notifier = &userptr->notifier;
+ hmm_range.start = start;
+ hmm_range.end = end;
+ hmm_range.pfn_flags_mask = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE;
+ /*
+ * FIXME:
+ * Setting dev_private_owner can prevent hmm_range_fault from faulting
+ * in the device private pages owned by the caller. See function
+ * hmm_vma_handle_pte. In the multiple GPU case, this should be set to
+ * the device owner of the best migration destination. e.g., device0/vm0
+ * has a page fault, but we have determined that the best placement of
+ * the fault address should be on device1; we should set the below to
+ * device1 instead of device0.
+ */
+ hmm_range.dev_private_owner = vm->xe;
+
+ while (true) {
+ hmm_range.notifier_seq = mmu_interval_read_begin(&userptr->notifier);
+ ret = hmm_range_fault(&hmm_range);
+ if (time_after(jiffies, timeout))
+ break;
+
+ if (ret == -EBUSY)
+ continue;
+ break;
+ }
+
+ if (in_kthread) {
+ kthread_unuse_mm(userptr->notifier.mm);
+ mmput(userptr->notifier.mm);
+ }
+
+ if (ret)
+ goto free_pfns;
+
+ ret = xe_build_sg(vm->xe, &hmm_range, &userptr->sgt, write);
+ if (ret)
+ goto free_pfns;
+
+ xe_mark_range_accessed(&hmm_range, write);
+ userptr->sg = &userptr->sgt;
+ userptr->notifier_seq = hmm_range.notifier_seq;
+
+free_pfns:
+ kvfree(pfns);
+ return ret;
+}
+
diff --git a/drivers/gpu/drm/xe/xe_hmm.h b/drivers/gpu/drm/xe/xe_hmm.h
new file mode 100644
index 000000000000..fa5ddc11f10b
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_hmm.h
@@ -0,0 +1,10 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2024 Intel Corporation
+ */
+
+#include <linux/types.h>
+
+struct xe_userptr_vma;
+
+int xe_userptr_populate_range(struct xe_userptr_vma *uvma);
--
2.26.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 7/8] drm/xe: Introduce a helper to free sg table
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
` (8 preceding siblings ...)
2024-03-20 3:44 ` [PATCH 6/8] drm/xe: Introduce helper to populate userptr Oak Zeng
@ 2024-03-20 3:44 ` Oak Zeng
2024-03-20 3:44 ` [PATCH 8/8] drm/xe: Use hmm_range_fault to populate user pages Oak Zeng
10 siblings, 0 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
Introduce the xe_userptr_free_sg helper to dma-unmap all
addresses in the userptr's sg table and free the sg table.
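A short usage sketch, mirroring the repin and destroy paths later in
this series:
    if (userptr->sg) /* set by xe_userptr_populate_range() */
        xe_userptr_free_sg(uvma);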
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_hmm.c | 30 ++++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_hmm.h | 1 +
2 files changed, 31 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_hmm.c b/drivers/gpu/drm/xe/xe_hmm.c
index e5719cab0c3d..5c0c3d277096 100644
--- a/drivers/gpu/drm/xe/xe_hmm.c
+++ b/drivers/gpu/drm/xe/xe_hmm.c
@@ -3,6 +3,7 @@
* Copyright © 2024 Intel Corporation
*/
+#include <linux/scatterlist.h>
#include <linux/mmu_notifier.h>
#include <linux/dma-mapping.h>
#include <linux/memremap.h>
@@ -108,6 +109,32 @@ static int xe_build_sg(struct xe_device *xe, struct hmm_range *range,
return ret;
}
+/**
+ * xe_userptr_free_sg() - Free the scatter gather table of userptr
+ *
+ * @uvma: the userptr vma which holds the scatter gather table
+ *
+ * In xe_userptr_populate_range, we allocate the storage of
+ * the userptr sg table. This is a helper function to free this
+ * sg table and dma-unmap the addresses in the table.
+ */
+void xe_userptr_free_sg(struct xe_userptr_vma *uvma)
+{
+ struct xe_userptr *userptr = &uvma->userptr;
+ struct xe_vma *vma = &uvma->vma;
+ bool write = !xe_vma_read_only(vma);
+ struct xe_vm *vm = xe_vma_vm(vma);
+ struct xe_device *xe = vm->xe;
+ struct device *dev = xe->drm.dev;
+
+ xe_assert(xe, userptr->sg);
+ dma_unmap_sgtable(dev, userptr->sg,
+ write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE, 0);
+
+ sg_free_table(userptr->sg);
+ userptr->sg = NULL;
+}
+
/**
* xe_userptr_populate_range() - Populate physical pages of a virtual
* address range
@@ -157,6 +184,9 @@ int xe_userptr_populate_range(struct xe_userptr_vma *uvma)
if (notifier_seq == userptr->notifier_seq)
return 0;
+ if (userptr->sg)
+ xe_userptr_free_sg(uvma);
+
npages = xe_npages_in_range(start, end);
pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
if (unlikely(!pfns))
diff --git a/drivers/gpu/drm/xe/xe_hmm.h b/drivers/gpu/drm/xe/xe_hmm.h
index fa5ddc11f10b..037ed3bf76da 100644
--- a/drivers/gpu/drm/xe/xe_hmm.h
+++ b/drivers/gpu/drm/xe/xe_hmm.h
@@ -8,3 +8,4 @@
struct xe_userptr_vma;
int xe_userptr_populate_range(struct xe_userptr_vma *uvma);
+void xe_userptr_free_sg(struct xe_userptr_vma *uvma);
--
2.26.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 8/8] drm/xe: Use hmm_range_fault to populate user pages
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
` (9 preceding siblings ...)
2024-03-20 3:44 ` [PATCH 7/8] drm/xe: Introduce a helper to free sg table Oak Zeng
@ 2024-03-20 3:44 ` Oak Zeng
10 siblings, 0 replies; 18+ messages in thread
From: Oak Zeng @ 2024-03-20 3:44 UTC (permalink / raw)
To: intel-xe
Cc: thomas.hellstrom, matthew.brost, brian.welty,
himal.prasad.ghimiray
This is an effort to unify the hmmptr (aka system allocator)
and userptr code. hmm_range_fault is used to populate
a virtual address range for both hmmptr and userptr,
instead of hmmptr using hmm_range_fault while userptr
uses get_user_pages_fast.
This also aligns with the AMD gpu driver's behavior. In the
long term, we plan to move some common helpers in this
area to the drm layer so they can be re-used by different
vendors.
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
---
drivers/gpu/drm/xe/xe_vm.c | 122 ++++---------------------------------
1 file changed, 12 insertions(+), 110 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 51d62323d9ee..11a4bb9d5415 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -38,6 +38,7 @@
#include "xe_sync.h"
#include "xe_trace.h"
#include "xe_wa.h"
+#include "xe_hmm.h"
static struct drm_gem_object *xe_vm_obj(struct xe_vm *vm)
{
@@ -65,113 +66,21 @@ int xe_vma_userptr_check_repin(struct xe_userptr_vma *uvma)
int xe_vma_userptr_pin_pages(struct xe_userptr_vma *uvma)
{
- struct xe_userptr *userptr = &uvma->userptr;
struct xe_vma *vma = &uvma->vma;
struct xe_vm *vm = xe_vma_vm(vma);
struct xe_device *xe = vm->xe;
- const unsigned long num_pages = xe_vma_size(vma) >> PAGE_SHIFT;
- struct page **pages;
- bool in_kthread = !current->mm;
- unsigned long notifier_seq;
- int pinned, ret, i;
- bool read_only = xe_vma_read_only(vma);
+ struct xe_userptr *userptr;
+ int ret;
lockdep_assert_held(&vm->lock);
xe_assert(xe, xe_vma_is_userptr(vma));
-retry:
- if (vma->gpuva.flags & XE_VMA_DESTROYED)
- return 0;
-
- notifier_seq = mmu_interval_read_begin(&userptr->notifier);
- if (notifier_seq == userptr->notifier_seq)
- return 0;
-
- pages = kvmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL);
- if (!pages)
- return -ENOMEM;
-
- if (userptr->sg) {
- dma_unmap_sgtable(xe->drm.dev,
- userptr->sg,
- read_only ? DMA_TO_DEVICE :
- DMA_BIDIRECTIONAL, 0);
- sg_free_table(userptr->sg);
- userptr->sg = NULL;
- }
- pinned = ret = 0;
- if (in_kthread) {
- if (!mmget_not_zero(userptr->notifier.mm)) {
- ret = -EFAULT;
- goto mm_closed;
- }
- kthread_use_mm(userptr->notifier.mm);
- }
-
- while (pinned < num_pages) {
- ret = get_user_pages_fast(xe_vma_userptr(vma) +
- pinned * PAGE_SIZE,
- num_pages - pinned,
- read_only ? 0 : FOLL_WRITE,
- &pages[pinned]);
- if (ret < 0)
- break;
+ userptr = &uvma->userptr;
+ mmap_read_lock(userptr->notifier.mm);
+ ret = xe_userptr_populate_range(uvma);
+ mmap_read_unlock(userptr->notifier.mm);
- pinned += ret;
- ret = 0;
- }
-
- if (in_kthread) {
- kthread_unuse_mm(userptr->notifier.mm);
- mmput(userptr->notifier.mm);
- }
-mm_closed:
- if (ret)
- goto out;
-
- ret = sg_alloc_table_from_pages_segment(&userptr->sgt, pages,
- pinned, 0,
- (u64)pinned << PAGE_SHIFT,
- xe_sg_segment_size(xe->drm.dev),
- GFP_KERNEL);
- if (ret) {
- userptr->sg = NULL;
- goto out;
- }
- userptr->sg = &userptr->sgt;
-
- ret = dma_map_sgtable(xe->drm.dev, userptr->sg,
- read_only ? DMA_TO_DEVICE :
- DMA_BIDIRECTIONAL,
- DMA_ATTR_SKIP_CPU_SYNC |
- DMA_ATTR_NO_KERNEL_MAPPING);
- if (ret) {
- sg_free_table(userptr->sg);
- userptr->sg = NULL;
- goto out;
- }
-
- for (i = 0; i < pinned; ++i) {
- if (!read_only) {
- lock_page(pages[i]);
- set_page_dirty(pages[i]);
- unlock_page(pages[i]);
- }
-
- mark_page_accessed(pages[i]);
- }
-
-out:
- release_pages(pages, pinned);
- kvfree(pages);
-
- if (!(ret < 0)) {
- userptr->notifier_seq = notifier_seq;
- if (xe_vma_userptr_check_repin(uvma) == -EAGAIN)
- goto retry;
- }
-
- return ret < 0 ? ret : 0;
+ return ret;
}
static bool preempt_fences_waiting(struct xe_vm *vm)
@@ -917,8 +826,6 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
static void xe_vma_destroy_late(struct xe_vma *vma)
{
struct xe_vm *vm = xe_vma_vm(vma);
- struct xe_device *xe = vm->xe;
- bool read_only = xe_vma_read_only(vma);
if (vma->ufence) {
xe_sync_ufence_put(vma->ufence);
@@ -926,16 +833,11 @@ static void xe_vma_destroy_late(struct xe_vma *vma)
}
if (xe_vma_is_userptr(vma)) {
- struct xe_userptr *userptr = &to_userptr_vma(vma)->userptr;
+ struct xe_userptr_vma *uvma = to_userptr_vma(vma);
+ struct xe_userptr *userptr = &uvma->userptr;
- if (userptr->sg) {
- dma_unmap_sgtable(xe->drm.dev,
- userptr->sg,
- read_only ? DMA_TO_DEVICE :
- DMA_BIDIRECTIONAL, 0);
- sg_free_table(userptr->sg);
- userptr->sg = NULL;
- }
+ if (userptr->sg)
+ xe_userptr_free_sg(uvma);
/*
* Since userptr pages are not pinned, we can't remove
--
2.26.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 6/8] drm/xe: Introduce helper to populate userptr
2024-03-20 3:44 ` [PATCH 6/8] drm/xe: Introduce helper to populate userptr Oak Zeng
@ 2024-03-20 19:19 ` Welty, Brian
2024-03-20 20:02 ` Zeng, Oak
0 siblings, 1 reply; 18+ messages in thread
From: Welty, Brian @ 2024-03-20 19:19 UTC (permalink / raw)
To: Oak Zeng, intel-xe; +Cc: thomas.hellstrom, matthew.brost, himal.prasad.ghimiray
On 3/19/2024 8:44 PM, Oak Zeng wrote:
> Introduce a helper function xe_userptr_populate_range to populate
> a userptr range. This function calls hmm_range_fault to read
> CPU page tables and populate all pfns/pages of this virtual address
> range.
>
> If the populated page is a system memory page, dma-mapping is performed
> to get a dma-address which can later be used by the GPU to access pages.
>
> If the populated page is a device private page, we calculate the dpa
> (device physical address) of the page. This will be handled in future
> patches.
>
> The dma-address or dpa is then saved in the userptr's sg table. This is
> preparatory work to replace the get_user_pages_fast code in the userptr
> code path.
>
> v1: Address review comments:
> separate a npage_in_range function (Matt)
> reparameterize function xe_userptr_populate_range (Matt)
> move mmu_interval_read_begin() call into while loop (Thomas)
> s/mark_range_accessed/xe_mark_range_accessed (Thomas)
> use set_page_dirty_lock (vs set_page_dirty) (Thomas)
> move a few checks from xe_vma_userptr_pin_pages to hmm.c (Matt)
> v2: Remove device private page support. Only support system
> pages for now. use dma-map-sg rather than dma-map-page (Matt/Thomas)
Are the prior patches 3, 4, 5 of this series no longer used?
I searched, and it looks like they were used in an older version of this
series, when you had the device private page support. Are they unused now?
For the small helpers which are only a few lines of code, in my opinion,
it's nicer to squash them into the patch where they are first used. That
makes it easier to scrutinize them based on first introduction/usage.
>
> Signed-off-by: Oak Zeng <oak.zeng@intel.com>
> Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@intel.com>
> Cc: Brian Welty <brian.welty@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 1 +
> drivers/gpu/drm/xe/xe_hmm.c | 225 ++++++++++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_hmm.h | 10 ++
> 3 files changed, 236 insertions(+)
> create mode 100644 drivers/gpu/drm/xe/xe_hmm.c
> create mode 100644 drivers/gpu/drm/xe/xe_hmm.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index e2ec6d1375c0..52300de1e86f 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -101,6 +101,7 @@ xe-y += xe_bb.o \
> xe_guc_pc.o \
> xe_guc_submit.o \
> xe_heci_gsc.o \
> + xe_hmm.o \
> xe_hw_engine.o \
> xe_hw_engine_class_sysfs.o \
> xe_hw_fence.o \
> diff --git a/drivers/gpu/drm/xe/xe_hmm.c b/drivers/gpu/drm/xe/xe_hmm.c
> new file mode 100644
> index 000000000000..e5719cab0c3d
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_hmm.c
> @@ -0,0 +1,225 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2024 Intel Corporation
> + */
> +
> +#include <linux/mmu_notifier.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/memremap.h>
> +#include <linux/swap.h>
> +#include <linux/hmm.h>
> +#include <linux/mm.h>
> +#include "xe_hmm.h"
> +#include "xe_svm.h"
> +#include "xe_vm.h"
> +#include "xe_bo.h"
> +
> +static u64 xe_npages_in_range(unsigned long start, unsigned long end)
> +{
> + return (PAGE_ALIGN(end) - PAGE_ALIGN_DOWN(start)) >> PAGE_SHIFT;
> +}
> +
> +/**
> + * xe_mark_range_accessed() - mark a range as accessed, so the core mm
> + * has such information for memory eviction or write-back to
> + * hard disk
> + *
> + * @range: the range to mark
> + * @write: if we write to this range, mark the pages in this range
> + * as dirty
> + */
> +static void xe_mark_range_accessed(struct hmm_range *range, bool write)
> +{
> + struct page *page;
> + u64 i, npages;
> +
> + npages = xe_npages_in_range(range->start, range->end);
> + for (i = 0; i < npages; i++) {
> + page = hmm_pfn_to_page(range->hmm_pfns[i]);
> + if (write)
> + set_page_dirty_lock(page);
> +
> + mark_page_accessed(page);
> + }
> +}
> +
> +/**
> + * xe_build_sg() - build a scatter gather table for all the physical pages/pfns
> + * in a hmm_range. dma-map pages if necessary. The dma-addresses are saved in
> + * the sg table and will be used to program the GPU page table later.
> + *
> + * @xe: the xe device that will access the dma-addresses in the sg table
> + * @range: the hmm range that we build the sg table from. range->hmm_pfns[]
> + * has the pfn numbers of the pages that back this hmm address range.
> + * @st: pointer to the sg table.
> + * @write: whether we write to this range. This decides the dma map direction
> + * for system pages: if we write, we map it bidirectional; otherwise
> + * DMA_TO_DEVICE
> + *
> + * All the contiguous pfns are collapsed into one entry in
> + * the scatter gather table, for the purpose of efficiently
> + * programming the GPU page table.
> + *
> + * The dma_address in the sg table will later be used by the GPU to
> + * access memory. So if the memory is system memory, we need to
> + * do a dma-mapping so it can be accessed by the GPU/DMA.
> + *
> + * FIXME: This function currently only supports pages in system
> + * memory. If the memory is GPU local memory (of the GPU that
> + * is going to access the memory), we need the gpu dpa (device physical
> + * address), and there is no need for dma-mapping. This is TBD.
> + *
> + * FIXME: dma-mapping for a peer gpu device to access a remote gpu's
> + * memory. Add this when p2p is supported.
> + *
> + * This function allocates the storage of the sg table. It is the
> + * caller's responsibility to free it by calling sg_free_table.
> + *
> + * Returns 0 if successful; -ENOMEM if it fails to allocate memory
> + */
> +static int xe_build_sg(struct xe_device *xe, struct hmm_range *range,
> + struct sg_table *st, bool write)
> +{
> + struct device *dev = xe->drm.dev;
> + struct page **pages;
> + u64 i, npages;
> + int ret;
> +
> + npages = xe_npages_in_range(range->start, range->end);
> + pages = kvmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
> + if (!pages)
> + return -ENOMEM;
> +
> + for (i = 0; i < npages; i++) {
> + pages[i] = hmm_pfn_to_page(range->hmm_pfns[i]);
> + xe_assert(xe, !is_device_private_page(pages[i]));
> + }
> +
> + ret = sg_alloc_table_from_pages_segment(st, pages, npages, 0,
> + npages << PAGE_SHIFT, xe_sg_segment_size(dev), GFP_KERNEL);
> + if (ret)
> + goto free_pages;
> +
> + ret = dma_map_sgtable(dev, st, write ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE,
> + DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_NO_KERNEL_MAPPING);
> +
> +free_pages:
> + kvfree(pages);
> + return ret;
> +}
> +
> +/**
> + * xe_userptr_populate_range() - Populate physical pages of a virtual
> + * address range
> + *
> + * @uvma: userptr vma which has information of the range to populate.
> + *
> + * This function populates the physical pages of a virtual
> + * address range. The populated physical pages are saved in
> + * the userptr's sg table. It is similar to get_user_pages but calls
> + * hmm_range_fault.
> + *
> + * This function also reads the mmu notifier sequence number
> + * (mmu_interval_read_begin), for the purpose of later
> + * comparison (through mmu_interval_read_retry).
> + *
> + * This must be called with the mmap read or write lock held.
> + *
> + * This function allocates the storage of the userptr sg table.
> + * It is the caller's responsibility to free it by calling sg_free_table.
> + *
> + * Returns: 0 for success; negative error number on failure
> + */
> +int xe_userptr_populate_range(struct xe_userptr_vma *uvma)
> +{
> + unsigned long timeout =
> + jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
> + unsigned long *pfns, flags = HMM_PFN_REQ_FAULT;
> + struct xe_userptr *userptr;
> + struct xe_vma *vma = &uvma->vma;
> + u64 start = xe_vma_userptr(vma);
> + u64 end = start + xe_vma_size(vma);
> + struct xe_vm *vm = xe_vma_vm(vma);
> + struct hmm_range hmm_range;
> + bool write = !xe_vma_read_only(vma);
> + bool in_kthread = !current->mm;
> + unsigned long notifier_seq;
> + u64 npages;
> + int ret;
> +
> + userptr = &uvma->userptr;
> + mmap_assert_locked(userptr->notifier.mm);
> +
> + if (vma->gpuva.flags & XE_VMA_DESTROYED)
> + return 0;
> +
> + notifier_seq = mmu_interval_read_begin(&userptr->notifier);
> + if (notifier_seq == userptr->notifier_seq)
> + return 0;
> +
> + npages = xe_npages_in_range(start, end);
> + pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL);
> + if (unlikely(!pfns))
> + return -ENOMEM;
> +
> + if (write)
> + flags |= HMM_PFN_REQ_WRITE;
> +
> + if (in_kthread) {
> + if (!mmget_not_zero(userptr->notifier.mm)) {
> + ret = -EFAULT;
> + goto free_pfns;
> + }
> + kthread_use_mm(userptr->notifier.mm);
> + }
> +
> + memset64((u64 *)pfns, (u64)flags, npages);
> + hmm_range.hmm_pfns = pfns;
> + hmm_range.notifier = &userptr->notifier;
> + hmm_range.start = start;
> + hmm_range.end = end;
> + hmm_range.pfn_flags_mask = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE;
> + /*
> + * FIXME:
> + * Setting dev_private_owner prevents hmm_range_fault from faulting
> + * in the device private pages owned by the caller. See function
> + * hmm_vma_handle_pte. In the multi-GPU case, this should be set to
> + * the device owner of the best migration destination, e.g., if
> + * device0/vm0 has a page fault but we have determined that the best
> + * placement of the fault address is on device1, we should set this
> + * to device1 instead of device0.
> + */
> + hmm_range.dev_private_owner = vm->xe;
> +
> + while (true) {
> + hmm_range.notifier_seq = mmu_interval_read_begin(&userptr->notifier);
> + ret = hmm_range_fault(&hmm_range);
> + if (time_after(jiffies, timeout))
> + break;
> +
> + if (ret == -EBUSY)
> + continue;
> + break;
> + }
> +
> + if (in_kthread) {
> + kthread_unuse_mm(userptr->notifier.mm);
> + mmput(userptr->notifier.mm);
> + }
> +
> + if (ret)
> + goto free_pfns;
> +
> + ret = xe_build_sg(vm->xe, &hmm_range, &userptr->sgt, write);
> + if (ret)
> + goto free_pfns;
> +
> + xe_mark_range_accessed(&hmm_range, write);
> + userptr->sg = &userptr->sgt;
> + userptr->notifier_seq = hmm_range.notifier_seq;
> +
> +free_pfns:
> + kvfree(pfns);
> + return ret;
> +}
> +
> diff --git a/drivers/gpu/drm/xe/xe_hmm.h b/drivers/gpu/drm/xe/xe_hmm.h
> new file mode 100644
> index 000000000000..fa5ddc11f10b
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_hmm.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2024 Intel Corporation
> + */
> +
> +#include <linux/types.h>
> +
> +struct xe_userptr_vma;
> +
> +int xe_userptr_populate_range(struct xe_userptr_vma *uvma);
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH 6/8] drm/xe: Introduce helper to populate userptr
2024-03-20 19:19 ` Welty, Brian
@ 2024-03-20 20:02 ` Zeng, Oak
0 siblings, 0 replies; 18+ messages in thread
From: Zeng, Oak @ 2024-03-20 20:02 UTC (permalink / raw)
To: Welty, Brian, intel-xe@lists.freedesktop.org
Cc: Hellstrom, Thomas, Brost, Matthew, Ghimiray, Himal Prasad
Hi Brian,
> -----Original Message-----
> From: Welty, Brian <brian.welty@intel.com>
> Sent: Wednesday, March 20, 2024 3:19 PM
> To: Zeng, Oak <oak.zeng@intel.com>; intel-xe@lists.freedesktop.org
> Cc: Hellstrom, Thomas <thomas.hellstrom@intel.com>; Brost, Matthew
> <matthew.brost@intel.com>; Ghimiray, Himal Prasad
> <himal.prasad.ghimiray@intel.com>
> Subject: Re: [PATCH 6/8] drm/xe: Introduce helper to populate userptr
>
>
>
> On 3/19/2024 8:44 PM, Oak Zeng wrote:
> > Introduce a helper function xe_userptr_populate_range to populate
> > a userptr range. This function calls hmm_range_fault to read
> > CPU page tables and populate all pfns/pages of this virtual address
> > range.
> >
> > If the populated page is a system memory page, dma-mapping is performed
> > to get a dma-address which can later be used by the GPU to access pages.
> >
> > If the populated page is a device private page, we calculate the dpa
> > (device physical address) of the page. This will be handled in future
> > patches.
> >
> > The dma-address or dpa is then saved in the userptr's sg table. This is
> > preparatory work to replace the get_user_pages_fast code in the userptr
> > code path.
> >
> > v1: Address review comments:
> > separate a npage_in_range function (Matt)
> > reparameterize function xe_userptr_populate_range (Matt)
> > move mmu_interval_read_begin() call into while loop (Thomas)
> > s/mark_range_accessed/xe_mark_range_accessed (Thomas)
> > use set_page_dirty_lock (vs set_page_dirty) (Thomas)
> > move a few checks from xe_vma_userptr_pin_pages to hmm.c (Matt)
> > v2: Remove device private page support. Only support system
> > pages for now. use dma-map-sg rather than dma-map-page (Matt/Thomas)
>
> Are the prior patches 3, 4, 5 of this series no longer used?
>
> I searched, and it looks like they were used in an older version of this
> series, when you had the device private page support. Are they unused now?
>
> For the small helpers which are only a few lines of code, in my opinion,
> it's nicer to squash them into the patch where they are first used. That
> makes it easier to scrutinize them based on first introduction/usage.
You are right. In the original series, I tried to make the xe_userptr_populate_range function work for both userptr and hmmptr. Patches 3, 4, 5 are for hmmptr.
But during the review, I agreed with Matt to make this series only for userptr and leave the hmmptr part for the future. From this perspective, patches 1 and 2 are also not necessary. Only 6, 7, 8 are needed.
I will rework the series.
Oak
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 4/8] drm/xe: Introduce a helper to get dpa from pfn
2024-03-20 3:44 ` [PATCH 4/8] drm/xe: Introduce a helper to get dpa from pfn Oak Zeng
@ 2024-03-20 21:41 ` Welty, Brian
2024-03-20 22:34 ` Zeng, Oak
0 siblings, 1 reply; 18+ messages in thread
From: Welty, Brian @ 2024-03-20 21:41 UTC (permalink / raw)
To: Oak Zeng, intel-xe; +Cc: thomas.hellstrom, matthew.brost, himal.prasad.ghimiray
On 3/19/2024 8:44 PM, Oak Zeng wrote:
> Since we now create struct page backing for each vram page,
> each vram page now also has a pfn, just like system memory.
> This allows us to calculate the device physical address from the pfn.
>
> v1: move the function to xe_svm.h (Matt)
> s/vram_pfn_to_dpa/xe_mem_region_pfn_to_dpa (Matt)
> add kernel document for the helper (Thomas)
>
> Signed-off-by: Oak Zeng <oak.zeng@intel.com>
> ---
> drivers/gpu/drm/xe/xe_svm.h | 27 +++++++++++++++++++++++++--
> 1 file changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> index e944971cfc6d..8a34429eb674 100644
> --- a/drivers/gpu/drm/xe/xe_svm.h
> +++ b/drivers/gpu/drm/xe/xe_svm.h
> @@ -6,8 +6,31 @@
> #ifndef __XE_SVM_H
> #define __XE_SVM_H
>
> -struct xe_tile;
> -struct xe_mem_region;
> +#include "xe_device_types.h"
> +#include "xe_device.h"
> +#include "xe_assert.h"
> +
> +/**
> + * xe_mem_region_pfn_to_dpa() - Calculate page's dpa from pfn
> + *
> + * @mr: The memory region that page resides in
> + * @pfn: page frame number of the page
> + *
> + * Returns: the device physical address of the page
> + */
> +static inline u64 xe_mem_region_pfn_to_dpa(struct xe_mem_region *mr, u64 pfn)
> +{
> + u64 dpa;
> + struct xe_tile *tile = xe_mem_region_to_tile(mr);
I believe you are dropping this patch for now, but I wanted to comment.
Is the above the only usage of xe_mem_region_to_tile()?
As mentioned in an earlier review, it's better to avoid the tight
association between xe_mem_region and tile. For instance, for the UMA
case we talked about before, it's nice to have the flexibility for an
xe_mem_region to be associated with multiple tiles. But I guess this is
all future/pending, so it can be left to solve in the future if you prefer.
I'm not sure you have other usages of xe_mem_region_to_tile(). But here,
it looks like it is only used for getting the tile->xe pointer.
Can the caller of xe_mem_region_pfn_to_dpa() just pass in the xe device pointer?
-Brian
> + struct xe_device *xe = tile_to_xe(tile);
> + u64 offset;
> +
> + xe_assert(xe, (pfn << PAGE_SHIFT) >= mr->hpa_base);
> + offset = (pfn << PAGE_SHIFT) - mr->hpa_base;
> + dpa = mr->dpa_base + offset;
> +
> + return dpa;
> +}
>
> int xe_devm_add(struct xe_tile *tile, struct xe_mem_region *mr);
> void xe_devm_remove(struct xe_tile *tile, struct xe_mem_region *mr);
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH 4/8] drm/xe: Introduce a helper to get dpa from pfn
2024-03-20 21:41 ` Welty, Brian
@ 2024-03-20 22:34 ` Zeng, Oak
0 siblings, 0 replies; 18+ messages in thread
From: Zeng, Oak @ 2024-03-20 22:34 UTC (permalink / raw)
To: Welty, Brian, intel-xe@lists.freedesktop.org
Cc: Hellstrom, Thomas, Brost, Matthew, Ghimiray, Himal Prasad
Hi Brian,
> -----Original Message-----
> From: Welty, Brian <brian.welty@intel.com>
> Sent: Wednesday, March 20, 2024 5:42 PM
> To: Zeng, Oak <oak.zeng@intel.com>; intel-xe@lists.freedesktop.org
> Cc: Hellstrom, Thomas <thomas.hellstrom@intel.com>; Brost, Matthew
> <matthew.brost@intel.com>; Ghimiray, Himal Prasad
> <himal.prasad.ghimiray@intel.com>
> Subject: Re: [PATCH 4/8] drm/xe: Introduce a helper to get dpa from pfn
>
>
>
> On 3/19/2024 8:44 PM, Oak Zeng wrote:
> > Since we now create struct page backing for each vram page,
> > each vram page now also has a pfn, just like system memory.
> > This allows us to calculate the device physical address from the pfn.
> >
> > v1: move the function to xe_svm.h (Matt)
> > s/vram_pfn_to_dpa/xe_mem_region_pfn_to_dpa (Matt)
> > add kernel document for the helper (Thomas)
> >
> > Signed-off-by: Oak Zeng <oak.zeng@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_svm.h | 27 +++++++++++++++++++++++++--
> > 1 file changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> > index e944971cfc6d..8a34429eb674 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.h
> > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > @@ -6,8 +6,31 @@
> > #ifndef __XE_SVM_H
> > #define __XE_SVM_H
> >
> > -struct xe_tile;
> > -struct xe_mem_region;
> > +#include "xe_device_types.h"
> > +#include "xe_device.h"
> > +#include "xe_assert.h"
> > +
> > +/**
> > + * xe_mem_region_pfn_to_dpa() - Calculate page's dpa from pfn
> > + *
> > + * @mr: The memory region that page resides in
> > + * @pfn: page frame number of the page
> > + *
> > + * Returns: the device physical address of the page
> > + */
> > +static inline u64 xe_mem_region_pfn_to_dpa(struct xe_mem_region *mr,
> u64 pfn)
> > +{
> > + u64 dpa;
> > + struct xe_tile *tile = xe_mem_region_to_tile(mr);
>
> I believe you are dropping this patch for now, but I wanted to comment.
>
> Is the above the only usage of xe_mem_region_to_tile()?
>
> As mentioned in an earlier review, it's better to avoid the tight
> association between xe_mem_region and tile. For instance, for the UMA
> case we talked about before, it's nice to have the flexibility for an
> xe_mem_region to be associated with multiple tiles. But I guess this is
> all future/pending, so it can be left to solve in the future if you prefer.
>
> I'm not sure you have other usages of xe_mem_region_to_tile(). But here,
> it looks like it is only used for getting the tile->xe pointer.
> Can the caller of xe_mem_region_pfn_to_dpa() just pass in the xe device pointer?
Yes, in this case we can pass in the xe pointer. That is a good idea.
I agree with what you said above. Coupling the memory region with the tile is a bad idea. I will consider this perspective in the coming series.
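Something like this untested sketch is what I have in mind, with the
same math but taking the xe pointer directly instead of deriving it
from the tile:

	static inline u64 xe_mem_region_pfn_to_dpa(struct xe_device *xe,
						   struct xe_mem_region *mr,
						   u64 pfn)
	{
		u64 offset;

		xe_assert(xe, (pfn << PAGE_SHIFT) >= mr->hpa_base);
		/* Offset of the page within this region's host physical window. */
		offset = (pfn << PAGE_SHIFT) - mr->hpa_base;

		return mr->dpa_base + offset;
	}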
Oak
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2024-03-20 22:34 UTC | newest]
Thread overview: 18+ messages
2024-03-20 3:44 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
2024-03-20 3:37 ` ✓ CI.Patch_applied: success for " Patchwork
2024-03-20 3:37 ` ✗ CI.checkpatch: warning " Patchwork
2024-03-20 3:38 ` ✗ CI.KUnit: failure " Patchwork
2024-03-20 3:44 ` [PATCH 1/8] drm/xe/svm: Remap and provide memmap backing for GPU vram Oak Zeng
2024-03-20 3:44 ` [PATCH 2/8] drm/xe/svm: Introduce DRM_XE_SVM kernel config Oak Zeng
2024-03-20 3:44 ` [PATCH 3/8] drm/xe: Introduce helper to get tile from memory region Oak Zeng
2024-03-20 3:44 ` [PATCH 4/8] drm/xe: Introduce a helper to get dpa from pfn Oak Zeng
2024-03-20 21:41 ` Welty, Brian
2024-03-20 22:34 ` Zeng, Oak
2024-03-20 3:44 ` [PATCH 5/8] drm/xe/svm: Get xe memory region from page Oak Zeng
2024-03-20 3:44 ` [PATCH 6/8] drm/xe: Introduce helper to populate userptr Oak Zeng
2024-03-20 19:19 ` Welty, Brian
2024-03-20 20:02 ` Zeng, Oak
2024-03-20 3:44 ` [PATCH 7/8] drm/xe: Introduce a helper to free sg table Oak Zeng
2024-03-20 3:44 ` [PATCH 8/8] drm/xe: Use hmm_range_fault to populate user pages Oak Zeng
-- strict thread matches above, loose matches on Subject: below --
2024-03-20 3:24 [PATCH 0/8] Use hmm_range_fault to populate user page Oak Zeng
2024-03-19 2:55 Oak Zeng