From: Matthew Brost <matthew.brost@intel.com>
To: "Hajda, Andrzej" <andrzej.hajda@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>,
<intel-xe@lists.freedesktop.org>,
Maciej Patelczyk <maciej.patelczyk@intel.com>,
Jonathan Cavitt <jonathan.cavitt@intel.com>
Subject: Re: [PATCH 12/18] drm/xe/eudebug: implement userptr_vma access
Date: Fri, 25 Oct 2024 18:23:40 +0000 [thread overview]
Message-ID: <ZxviLOv75ISykjK8@DUT025-TGLU.fm.intel.com> (raw)
In-Reply-To: <447c0757-e987-4714-a720-16191be2416c@intel.com>
On Fri, Oct 25, 2024 at 03:20:18PM +0200, Hajda, Andrzej wrote:
>
> W dniu 24.10.2024 o 18:06, Matthew Brost pisze:
> > On Wed, Oct 23, 2024 at 01:32:53PM +0200, Hajda, Andrzej wrote:
> > > W dniu 22.10.2024 o 00:34, Matthew Brost pisze:
> > > > On Mon, Oct 21, 2024 at 11:54:30AM +0200, Hajda, Andrzej wrote:
> > > > > W dniu 20.10.2024 o 20:16, Matthew Brost pisze:
> > > > > > On Tue, Oct 01, 2024 at 05:43:00PM +0300, Mika Kuoppala wrote:
> > > > > > > From: Andrzej Hajda <andrzej.hajda@intel.com>
> > > > > > >
> > > > > > > Debugger needs to read/write program's vmas including
> > > > > > > userptr_vma. Since hmm_range_fault is used to pin userptr
> > > > > > > vmas, it is possible to map those vmas from debugger
> > > > > > > context.
> > > > > > >
> > > > > > > v2: pin pages vs notifier, move to vm.c (Matthew)
> > > > > > >
> > > > > > > Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
> > > > > > > Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
> > > > > > > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > > > > > > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/xe/xe_eudebug.c |  2 +-
> > > > > > >  drivers/gpu/drm/xe/xe_vm.c      | 47 +++++++++++++++++++++++++++++++++
> > > > > > >  drivers/gpu/drm/xe/xe_vm.h      |  3 +++
> > > > > > >  3 files changed, 51 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
> > > > > > > index edad6d533d0b..b09d7414cfe3 100644
> > > > > > > --- a/drivers/gpu/drm/xe/xe_eudebug.c
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_eudebug.c
> > > > > > > @@ -3023,7 +3023,7 @@ static int xe_eudebug_vma_access(struct xe_vma *vma, u64 offset,
> > > > > > >  		return ret;
> > > > > > >  	}
> > > > > > >
> > > > > > > -	return -EINVAL;
> > > > > > > +	return xe_uvma_access(to_userptr_vma(vma), offset, buf, bytes, write);
> > > > > > >  }
> > > > > > >
> > > > > > >  static int xe_eudebug_vm_access(struct xe_vm *vm, u64 offset,
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > > > > > index a836dfc5a86f..5f891e76993b 100644
> > > > > > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > > > > > @@ -3421,3 +3421,50 @@ void xe_vm_snapshot_free(struct xe_vm_snapshot *snap)
> > > > > > >  	}
> > > > > > >  	kvfree(snap);
> > > > > > >  }
> > > > > > > +
> > > > > > > +int xe_uvma_access(struct xe_userptr_vma *uvma, u64 offset,
> > > > > > > +		   void *buf, u64 len, bool write)
> > > > > > > +{
> > > > > > Maybe a dumb question, but are we overthinking this here?
> > > > > >
> > > > > > Can we just use kthread_use_mm, copy_to_user, copy_from_user?
> > > > > >
> > > > > > If not then my previous comments still apply here.
> > > > > This function is called from the debugger process context, and
> > > > > kthread_use_mm is allowed only from a kthread. Spawning a kthread
> > > > > just for this is an option, but it looks odd and suboptimal; it
> > > > > could be a kind of last resort, or not?
> > > > >
> > > > > Other options:
> > > > > 1. Keep a reference to the remote task in xe_userptr and
> > > > > use access_process_vm(up->task, ...).
> > > > >
> > > > I think remote refs are generally a bad idea, but I admittedly don't
> > > > fully understand what this would look like.
> > > >
> > > > > 2. Pass the xe_eudebug.target_task reference down from the eudebug
> > > > > framework to this helper and use access_process_vm. The current call
> > > > > chain is:
> > > > >   __xe_eudebug_vm_access - has access to xe_eudebug.target_task
> > > > >   -> __vm_read_write
> > > > >   --> xe_eudebug_vm_access
> > > > >   ---> xe_eudebug_vm_access
> > > > >   ----> xe_eudebug_vma_access
> > > > >   -----> xe_vm_userptr_access
> > > > > So multiple changes are required to achieve this, but maybe it is a
> > > > > valid path to go? One potential issue with 1 and 2 is that multiple
> > > > > UMD tests were failing when access_process_vm/access_remote_vm were
> > > > > used; the failures were not investigated, as this approach was
> > > > > dropped for other reasons.
> > > > >
> > > > > 3. Continue the approach from this patch, but with a corrected page
> > > > > iterator over the up->sg sg list [1]. This was nacked by you(?) [2],
> > > > > but I have a problem understanding why. I see a lot of code in the
> > > > > kernel mapping sg pages: linux$ git grep ' kmap.*sg' | wc -l
> > > > I looked; every example I found maps and accesses 1 page at a
> > > > time, not maps 1 page and accesses many.
> > > >
> > > > > 61. Is it incorrect? Or is our case different?
> > > > >
> > > > An sglist's segments are dma-addresses (virtual addresses), so every
> > > > 4k in a segment can be a different physical page.
> > > An sglist is also a list of pages and their lengths (consecutive
> > > pages are glued together), i.e. exactly what we need. And this is
> > > done in xe_build_sg: it calls sg_alloc_table_from_pages_segment, which
> > > is documented as follows:
> > > ...
> > > * Allocate and initialize an sg table from a list of pages. Contiguous
> > > * ranges of the pages are squashed into a single scatterlist node up to the
> > > * maximum size specified in @max_segment. A user may provide an offset at a
> > > * start and a size of valid data in a buffer specified by the page array.
> > > ...
> > > So the sglist contains the same information as hmm_range->hmm_pfns[i],
> > > and a little more.
> > >
> > > So my concern is whether the mapping operation can destroy this info,
> > > but looking at the code this does not seem to be the case. For example,
> > > the iommu_dma_map_sg docs explicitly say it will "preserve the original
> > > offsets and sizes for the caller".
> > >
> > >
> > > > i.e., look at this snippet:
> > > >
> > > > +		void *ptr = kmap_local_page(sg_page(cur.sgl)) + cur.start;
> > > > +
> > > > +		cur_len = min(cur.size, cur.remaining);
> > > > +		if (write)
> > > > +			memcpy(ptr, buf, cur_len);
> > > > +		else
> > > > +			memcpy(buf, ptr, cur_len);
> > > > +		kunmap_local(ptr);
> > > >
> > > > If 'cur.start' > 4k, then you are potentially pointing to an incorrect
> > > > page and corrupting memory.
> > > With the added possibility to iterate over sgl pages in xe_res_cursor
> > > [1], this does not seem to be true.
> > > Why? cur.start is limited by the length of the segment (cur.sgl->length);
> > > if it happens to be more than 4k, it means sg_page(cur.sgl) points to
> > > consecutive pages and cur.start is correct.
> > >
> > I suppose if the cursor is changed to walk the pages rather than the dma
> > addresses, yeah, I guess this would work. Still, my much preferred way
> > would be to just call hmm_range_fault, or optionally save off the pages in
> > xe_vma_userptr_pin_pages, given at some point we will ditch SG tables for
> > userptr in favor of drm gpusvm, which is page based.
>
> Both alternatives seem suboptimal to me, as their result is what we already
> have in the sg table, at the price of an extra call (hmm_range_fault) and
> extra allocations (both). Most importantly, they complicate the code without
> a clear benefit.
>
Again, the SG table is eventually going to be dropped in favor of a page
array in GPU SVM. Also, the general consensus in the Linux community is that
SG tables are terrible, and I wouldn't be surprised if they eventually get
pulled out / deprecated from the kernel entirely. Fine if you want to use an
SG table, I guess, as we will just rewrite this once GPU SVM based userptr
lands.
> I will try to implement a version with hmm_range_fault, but I am still
> confused about why we need to complicate things.
>
Correctness vs convenience...
It is a bit unfortunate that our hmm_range_fault wrapper is built around a
VMA argument - I suggested a more generic interface there to make this easier
during review, but that feedback was ignored.
Matt
>
> Regards
>
> Andrzej
>
>
> >
> > Matt
> >
> > > > Likewise if 'cur_len' > 4k, then you are potentially pointing to an
> > > > incorrect page and corrupting memory.
> > > Again, in the case of consecutive pages it should be in range.
> > >
> > > Anyway, if there is an issue with consecutive pages that I am not aware
> > > of, we can always build the sg list with segments pointing to 4K pages
> > > by modifying xe_build_sg to call sg_alloc_table_from_pages_segment with
> > > a 4K max segment size.
> > >
> > > [1]: https://lore.kernel.org/intel-xe/20241011-xe_res_cursor_add_page_iterator-v3-1-0f8b8d3ab021@intel.com/
> > >
> > > Regards
> > > Andrzej
> > >
> > > > This loop would have to be changed to something like below, which
> > > > kmaps and accesses 1 page at a time...
> > > >
> > > > 	for (xe_res_first_sg(up->sg, offset, len, &cur); cur.remaining;
> > > > 	     xe_res_next(&cur, cur_len)) {
> > > > 		int segment_len;
> > > > 		int remain;
> > > >
> > > > 		cur_len = min(cur.size, cur.remaining);
> > > > 		remain = cur_len;
> > > >
> > > > 		for (i = 0; i < cur_len; i += segment_len) {
> > > > 			phys_addr_t phys = iommu_iova_to_phys(sg_dma_address(cur.sgl) +
> > > > 							      i + cur.start);
> > > > 			struct page *page = phys_to_page(phys);
> > > > 			void *ptr = kmap_local_page(page);
> > > > 			int ptr_offset = offset & ~PAGE_MASK;
> > > >
> > > > 			segment_len = min(remain, PAGE_SIZE - ptr_offset);
> > > >
> > > > 			if (write)
> > > > 				memcpy(ptr + ptr_offset, buf + i, segment_len);
> > > > 			else
> > > > 				memcpy(buf + i, ptr + ptr_offset, segment_len);
> > > > 			kunmap_local(ptr);
> > > >
> > > > 			offset += segment_len;
> > > > 			remain -= segment_len;
> > > > 		}
> > > > 		buf += cur_len;
> > > > 	}
> > > >
> > > > > 4. As you suggested in [3](?), modify xe_hmm_userptr_populate_range
> > > > > to keep hmm_range.hmm_pfns (or something similar) in xe_userptr and
> > > > > use it later (instead of up->sg) to iterate over pages.
> > > > >
> > > > Or just call hmm_range_fault here and operate on the returned pages
> > > > directly.
> > > >
> > > > BTW, eventually all the userptr stuff is going to change and be
> > > > based on GPU SVM [4]. Calling hmm_range_fault directly will always
> > > > work, though, and is likely the safest option.
> > > >
> > > > Matt
> > > >
> > > > [4] https://patchwork.freedesktop.org/patch/619809/?series=137870&rev=2
> > > >
> > > > > [1]: https://lore.kernel.org/intel-xe/20241011-xe_res_cursor_add_page_iterator-v3-1-0f8b8d3ab021@intel.com/
> > > > > [2]: https://lore.kernel.org/intel-xe/Zw32fauoUmB6Iojk@DUT025-TGLU.fm.intel.com/
> > > > > [3]: https://patchwork.freedesktop.org/patch/617481/?series=136572&rev=2#comment_1126527
> > > > >
> > > > > Regards
> > > > > Andrzej
> > > > >
> > > > > > Matt
> > > > > >
> > > > > > > +	struct xe_vm *vm = xe_vma_vm(&uvma->vma);
> > > > > > > +	struct xe_userptr *up = &uvma->userptr;
> > > > > > > +	struct xe_res_cursor cur = {};
> > > > > > > +	int cur_len, ret = 0;
> > > > > > > +
> > > > > > > +	while (true) {
> > > > > > > +		down_read(&vm->userptr.notifier_lock);
> > > > > > > +		if (!xe_vma_userptr_check_repin(uvma))
> > > > > > > +			break;
> > > > > > > +
> > > > > > > +		spin_lock(&vm->userptr.invalidated_lock);
> > > > > > > +		list_del_init(&uvma->userptr.invalidate_link);
> > > > > > > +		spin_unlock(&vm->userptr.invalidated_lock);
> > > > > > > +
> > > > > > > +		up_read(&vm->userptr.notifier_lock);
> > > > > > > +		ret = xe_vma_userptr_pin_pages(uvma);
> > > > > > > +		if (ret)
> > > > > > > +			return ret;
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	if (!up->sg) {
> > > > > > > +		ret = -EINVAL;
> > > > > > > +		goto out_unlock_notifier;
> > > > > > > +	}
> > > > > > > +
> > > > > > > +	for (xe_res_first_sg(up->sg, offset, len, &cur); cur.remaining;
> > > > > > > +	     xe_res_next(&cur, cur_len)) {
> > > > > > > +		void *ptr = kmap_local_page(sg_page(cur.sgl)) + cur.start;
> > > > > > > +
> > > > > > > +		cur_len = min(cur.size, cur.remaining);
> > > > > > > +		if (write)
> > > > > > > +			memcpy(ptr, buf, cur_len);
> > > > > > > +		else
> > > > > > > +			memcpy(buf, ptr, cur_len);
> > > > > > > +		kunmap_local(ptr);
> > > > > > > +		buf += cur_len;
> > > > > > > +	}
> > > > > > > +	ret = len;
> > > > > > > +
> > > > > > > +out_unlock_notifier:
> > > > > > > +	up_read(&vm->userptr.notifier_lock);
> > > > > > > +	return ret;
> > > > > > > +}
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
> > > > > > > index c864dba35e1d..99b9a9b011de 100644
> > > > > > > --- a/drivers/gpu/drm/xe/xe_vm.h
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_vm.h
> > > > > > > @@ -281,3 +281,6 @@ struct xe_vm_snapshot *xe_vm_snapshot_capture(struct xe_vm *vm);
> > > > > > >  void xe_vm_snapshot_capture_delayed(struct xe_vm_snapshot *snap);
> > > > > > >  void xe_vm_snapshot_print(struct xe_vm_snapshot *snap, struct drm_printer *p);
> > > > > > >  void xe_vm_snapshot_free(struct xe_vm_snapshot *snap);
> > > > > > > +
> > > > > > > +int xe_uvma_access(struct xe_userptr_vma *uvma, u64 offset,
> > > > > > > +		   void *buf, u64 len, bool write);
> > > > > > > --
> > > > > > > 2.34.1
> > > > > > >
Thread overview: 40+ messages
2024-10-01 14:42 [PATCH 00/18] GPU debug support (eudebug) v2 Mika Kuoppala
2024-10-01 14:42 ` [PATCH 01/18] ptrace: export ptrace_may_access Mika Kuoppala
2024-10-01 14:42 ` [PATCH 02/18] drm/xe/eudebug: Introduce eudebug support Mika Kuoppala
2024-10-01 14:42 ` [PATCH 03/18] drm/xe/eudebug: Introduce discovery for resources Mika Kuoppala
2024-10-03 0:41 ` Matthew Brost
2024-10-03 7:27 ` Mika Kuoppala
2024-10-03 16:32 ` Matthew Brost
2024-10-01 14:42 ` [PATCH 04/18] drm/xe/eudebug: Introduce exec_queue events Mika Kuoppala
2024-10-01 14:42 ` [PATCH 05/18] drm/xe/eudebug: hw enablement for eudebug Mika Kuoppala
2024-10-01 14:42 ` [PATCH 06/18] drm/xe: Add EUDEBUG_ENABLE exec queue property Mika Kuoppala
2024-10-01 14:42 ` [PATCH 07/18] drm/xe/eudebug: Introduce per device attention scan worker Mika Kuoppala
2024-10-01 14:42 ` [PATCH 08/18] drm/xe/eudebug: Introduce EU control interface Mika Kuoppala
2024-10-01 14:42 ` [PATCH 09/18] drm/xe/eudebug: Add vm bind and vm bind ops Mika Kuoppala
2024-10-01 14:42 ` [PATCH 10/18] drm/xe/eudebug: Add UFENCE events with acks Mika Kuoppala
2024-10-01 14:42 ` [PATCH 11/18] drm/xe/eudebug: vm open/pread/pwrite Mika Kuoppala
2024-10-01 14:43 ` [PATCH 12/18] drm/xe/eudebug: implement userptr_vma access Mika Kuoppala
2024-10-05 3:48 ` Matthew Brost
2024-10-12 2:39 ` Matthew Brost
2024-10-12 2:55 ` Matthew Brost
2024-10-20 18:16 ` Matthew Brost
2024-10-21 9:54 ` Hajda, Andrzej
2024-10-21 22:34 ` Matthew Brost
2024-10-23 11:32 ` Hajda, Andrzej
2024-10-24 16:06 ` Matthew Brost
2024-10-25 13:20 ` Hajda, Andrzej
2024-10-25 18:23 ` Matthew Brost [this message]
2024-10-28 16:19 ` [PATCH 1/2] drm/xe: keep list of system pages in xe_userptr Andrzej Hajda
2024-10-28 16:19 ` [PATCH 2/2] drm/xe/eudebug: implement userptr_vma access Andrzej Hajda
2024-10-28 16:55 ` Hajda, Andrzej
2024-10-01 14:43 ` [PATCH 13/18] drm/xe: Debug metadata create/destroy ioctls Mika Kuoppala
2024-10-01 16:25 ` Matthew Auld
2024-10-02 13:59 ` Grzegorzek, Dominik
2024-10-01 14:43 ` [PATCH 14/18] drm/xe: Attach debug metadata to vma Mika Kuoppala
2024-10-01 14:43 ` [PATCH 15/18] drm/xe/eudebug: Add debug metadata support for xe_eudebug Mika Kuoppala
2024-10-01 14:43 ` [PATCH 16/18] drm/xe/eudebug: Implement vm_bind_op discovery Mika Kuoppala
2024-10-01 14:43 ` [PATCH 17/18] drm/xe/eudebug: Dynamically toggle debugger functionality Mika Kuoppala
2024-10-01 14:43 ` [PATCH 18/18] drm/xe/eudebug_test: Introduce xe_eudebug wa kunit test Mika Kuoppala
2024-10-01 15:02 ` ✓ CI.Patch_applied: success for GPU debug support (eudebug) (rev2) Patchwork
2024-10-01 15:03 ` ✗ CI.checkpatch: warning " Patchwork
2024-10-01 15:03 ` ✗ CI.KUnit: failure " Patchwork