Intel-XE Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: Simona Vetter <simona.vetter@ffwll.ch>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Matthew Brost" <matthew.brost@intel.com>,
	"Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Huang Rui" <ray.huang@amd.com>,
	intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	matthew.auld@intel.com, "David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>
Subject: Re: [PATCH v6 2/8] drm/ttm: Add ttm_bo_access
Date: Mon, 11 Nov 2024 12:34:12 +0100
Message-ID: <a1ffb3f7-77bc-41ff-a98a-5cd889f081fa@amd.com>
In-Reply-To: <ZzHYB4MBJmVjk-AR@phenom.ffwll.local>

Am 11.11.24 um 11:10 schrieb Simona Vetter:
> On Mon, Nov 11, 2024 at 10:00:17AM +0200, Joonas Lahtinen wrote:
>> Back from some time off and will try to answer below.
>>
>> Adding Dave and Sima as this topic has been previously discussed to some
>> extent and will be good to reach common understanding about what the
>> series is trying to do and what is the difference to the AMD debugging
>> model.
> I chatted about this thread a bit on irc with folks, and I think an
> orthogonal issue is the question, what should be in ttm-utils? I've asked
> Matt to type up a DOC patch once we have some consensus, since imo the
> somewhat lackluster documentation situation for ttm is also somewhat a
> cause for these big threads on various different topics. Aside from the
> fact that gpu memory management is just hard.
>
> On the uapi/design aspect, I think this would serve well with a patch to
> drm-uapi.rst that adds a debugging section? At least once we have some
> rough consensus across drivers, and more importantly userspace in the form
> of gdb upstream (at least I'm not aware of any other upstream debugger
> patches, I think amd's rocm stuff is also gdb-only).

Yeah, that seems to be a really good idea. Similar design ideas came up
internally at AMD as well, but were dropped after pointing people to
pidfd_getfd().
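
For readers unfamiliar with that system call, here is a minimal sketch of how a debugger-like process could duplicate a file descriptor (for example a render node fd) out of another process. The helper name dup_fd_from_pid is hypothetical and not part of any patch in this thread; the fallback syscall numbers are the generic ones shared by most architectures and are an assumption for older libc headers. The call is subject to the usual ptrace attach permission check, so it can fail with EPERM.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older headers; generic syscall numbers (assumption:
 * not valid on every architecture, e.g. alpha uses different ones). */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438
#endif

/*
 * Hypothetical helper: duplicate file descriptor `target_fd` of process
 * `pid` into the calling process. Returns the new fd, or -1 with errno
 * set. The kernel applies a ptrace-attach style permission check.
 */
int dup_fd_from_pid(pid_t pid, int target_fd)
{
	int pidfd, fd, saved_errno;

	/* Obtain a pidfd referring to the target process. */
	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd < 0)
		return -1;

	/* Duplicate the target's descriptor; the flags argument must be 0. */
	fd = (int)syscall(SYS_pidfd_getfd, pidfd, target_fd, 0);

	/* Close the pidfd without clobbering errno from pidfd_getfd(). */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;

	return fd;
}
```

The returned descriptor refers to the same open file description as the one in the target process, which is what makes mapping BOs or submitting commands on behalf of the application possible from the debugger side.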

But the bigger problem seems to be that the design doesn't seem to take
the dma_fence requirements into account.

In other words, attaching gdb to a pid seems to stop the GPU threads of
this pid without waiting for either the Xe preemption fence or the
end-of-operation fence.

I mean if the GPU threads are preempted that could work, but yeah not 
like this :)

Regards,
Christian.

>
> Some wash-up thoughts from me below, but consider them fairly irrelevant
> since I think the main driver for these big questions here should be
> gdb/userspace.
>
>> Quoting Christian König (2024-11-07 11:44:33)
>>> Am 06.11.24 um 18:00 schrieb Matthew Brost:
>>>
>>>      [SNIP]
>>>
>>>      This is not a generic interface that anyone can freely access. The same
>>>      permissions used by ptrace are checked when opening such an interface.
>>>      See [1] [2].
>>>
>>>      [1] https://patchwork.freedesktop.org/patch/617470/?series=136572&rev=2
>>>      [2] https://patchwork.freedesktop.org/patch/617471/?series=136572&rev=2
>>>
>>>
>>> Thanks a lot for those pointers, that is exactly what I was looking for.
>>>
>>> And yeah, it is what I feared. You are re-implementing existing functionality,
>>> but see below.
>> Could you elaborate on what this "existing functionality" exactly is?
>> I do not think this functionality exists at this time.
>>
>> The EU debugging architecture for Xe specifically avoids the need for GDB
>> to attach with ptrace to the CPU process or interfere with the CPU process for
>> the debugging via parasitic threads or so.
>>
>> A debugger connection is opened to the DRM driver for a given PID (which uses the
>> ptrace_may_access check for now), after which all DRM clients of that
>> PID are exposed to the debugger process.
>>
>> What we want to expose via that debugger connection is the ability for GDB to
>> read/write the different GPU VM address spaces (ppGTT for Intel GPUs) just like
>> the EU threads would see them. Note that the layout of the ppGTT is
>> completely up to the userspace driver to set up and usually matches the
>> CPU address space only partially.
>>
>> Specifically as part of reading/writing the ppGTT for debugging purposes,
>> there are deep flushes needed: for example flushing instruction cache
>> when adding/removing breakpoints.
>>
>> Maybe that will explain the background. I elaborate on this at the end some more.
>>
>>>              kmap/vmap are used everywhere in the DRM subsystem to access BOs, so I’m
>>>              failing to see the problem with adding a simple helper based on existing
>>>              code.
>>>
>>>          What's possible and often done is to do kmap/vmap if you need to implement a
>>>          CPU copy for scanout for example or for copying/validating command buffers.
>>>          But that usually requires accessing the whole BO and has separate security
>>>          checks.
>>>
>>>          When you want to access only a few bytes of a BO that sounds massively like
>>>          a peek/poke like interface and we have already rejected that more than once.
>>>          There even used to be standardized GEM IOCTLs for that which have been
>>>          removed by now.
>> Referring to the explanation at the top: these IOCTLs are not for the debugging target
>> process to issue. The peek/poke interface is specifically for GDB only,
>> to facilitate the emulation of memory reads/writes on the GPU address
>> space as if they were done by the EUs themselves. And to recap: for modifying
>> instructions, for example (add/remove breakpoint), an extra level of cache flushing is
>> needed which is not available to regular userspace.
>>
>> I specifically discussed with Sima on the difference before moving forward with this
>> design originally. If something has changed since then, I'm of course happy to rediscuss.
>>
>> However, if this code can't be added, not sure how we would ever be able
>> to implement core dumps for GPU threads/memory?
>>
>>>          If you need to access BOs which are placed in not CPU accessible memory then
>>>          implement the access callback for ptrace, see amdgpu_ttm_access_memory for
>>>          an example how to do this.
>> As also mentioned above, we don't work via ptrace at all when it comes
>> to debugging the EUs. The only thing used for now is the ptrace_may_access to
>> implement similar access restrictions as ptrace has. This can be changed
>> to something else if needed.
>>
>>>      Ptrace access via vm_operations_struct.access → ttm_bo_vm_access.
>>>
>>>      This series renames ttm_bo_vm_access to ttm_bo_access, with no code changes.
>>>
>>>      The above function accesses a BO via kmap if it is in SYSTEM / TT,
>>>      which is existing code.
>>>
>>>      This function is only exposed to user space via ptrace permissions.
>> Maybe this sentence is what caused the confusion.
>>
>> Userspace is never exposed with peek/poke interface, only the debugger
>> connection which is its own FD.
>>
>>>      In this series, we implement a function [3] similar to
>>>      amdgpu_ttm_access_memory for the TTM vfunc access_memory. What is
>>>      missing is non-visible CPU memory access, similar to
>>>      amdgpu_ttm_access_memory_sdma. This will be addressed in a follow-up and
>>>      was omitted in this series given its complexity.
>>>
>>>      So, this looks more or less identical to AMD's ptrace implementation,
>>>      but in GPU address space. Again, I fail to see what the problem is here.
>>>      What am I missing?
>>>
>>>
>>> The main question is why can't you use the existing interfaces directly?
>> We're not working on the CPU address space or BOs. We're working
>> strictly on the GPU address space as would be seen by an EU thread if it
>> accessed address X.
>>
>>> Additional to the peek/poke interface of ptrace Linux has the pidfd_getfd
>>> system call, see here https://man7.org/linux/man-pages/man2/pidfd_getfd.2.html.
>>>
>>> The pidfd_getfd() allows to dup() the render node file descriptor into your gdb
>>> process. That in turn gives you all the access you need from gdb, including
>>> mapping BOs and command submission on behalf of the application.
>> We're not operating on the CPU address space nor are we operating on BOs
>> (there is no concept of BO in the EU debug interface). Each VMA in the VM
>> could come from anywhere, only the start address and size matter. And
>> neither do we need to interfere with the command submission of the
>> process under debug.
>>
>>> As far as I can see that allows for the same functionality as the eudebug
>>> interface, just without any driver specific code messing with ptrace
>>> permissions and peek/poke interfaces.
>>>
>>> So the question is still why do you need the whole eudebug interface in the
>>> first place? I might be missing something, but that seems to be superfluous
>>> from a high level view.
>> Recapping from above: it is to allow the debugging of EU threads per DRM
>> client, completely independent of the CPU process. If ptrace_may_access
>> is the sore point, we could consider other permission checks, too. There
>> is no other connection to ptrace in this architecture than a single
>> permission check to know if a PID is fair game to access by the debugger
>> process.
>>
>> Why no parasitic thread or ptrace: Going forward, binding the EU debugging to
>> the DRM client would also pave way for being able to extend core kernel generated
>> core dump with each DRM client's EU thread/memory dump. We have similar
>> feature called "Offline core dump" enabled in the downstream public
>> trees for i915, where we currently attach the EU thread dump to i915 error state
>> and then later combine i915 error state with CPU core dump file with a
>> tool.
>>
>> This is a relatively small amount of extra code, as this baseline series
>> already gives GDB the ability to perform the necessary actions.
>> It's just the matter of kernel driver calling: "stop all threads", then
>> copying the memory map and memory contents for GPU threads, just like is
>> done for CPU threads.
>>
>> With parasitic thread injection, I am not sure there is such a way forward,
>> as it would seem to require injecting quite a bit more logic into the core kernel?
>>
>>> It's true that the AMD KFD part has still similar functionality, but that is
>>> because of the broken KFD design of tying driver state to the CPU process
>>> (which makes it inaccessible for gdb even with imported render node fd).
>>>
>>> Both Sima and I (and partially Dave as well) have pushed back on the KFD
>>> approach. And the long term plan is to get rid of such device driver specific
>>> interface which re-implement existing functionality just differently.
>> Recapping, this series is not adding it back. The debugger connection
>> is a separate FD from the DRM one, with a separate IOCTL set. We don't allow
>> the DRM FD any new operations based on whether ptrace is attached or not. We
>> don't ever even do that check.
>>
>> We only restrict the opening of the debugger connection to given PID with
>> ptrace_may_access check for now. That can be changed to something else,
>> if necessary.
> Yeah I think unnecessarily tying gpu processes to cpu processes is a bad
> thing, not least because even today all the svm discussions we have still hit
> clear use-cases, where a 1:1 match is not wanted (like multiple gpu svm
> sections with offsets). Not even speaking of all the gpu usecases where
> the gpu vm space is still entirely independent of the cpu side.
>
> So that's why I think this entirely separate approach looks like the right
> one, with ptrace_may_access as the access control check to make sure we
> match ptrace on the cpu side.
>
> But there's very obviously a bikeshed to be had on what the actual uapi
> should look like, especially how gdb opens up a gpu debug access fd. But I
> also think that's not much on drm to decide, but whatever gdb wants. And
> then we aim for some consistency on that lookup/access control part
> (ideally, I might be missing some reasons why this is a bad idea) across
> drm drivers.
>
>>> So you need to have a really really good explanation why the eudebug interface
>>> is actually necessary.
>> TL;DR The main point is to decouple the debugging of the EU workloads from the
>> debugging of the CPU process. This avoids the interference with the CPU process with
>> parasitic thread injection. Further this also allows generating a core dump
>> without any GDB connected. There are also many other smaller pros/cons
>> which can be discussed but for the context of this patch, this is the
>> main one.
>>
>> So unlike parasitic thread injection, we don't unlock any special IOCTL for
>> the process under debug to be performed by the parasitic thread, but we
>> allow the minimal set of operations to be performed by GDB as if those were
>> done on the EUs themselves.
>>
>> One can think of it like the minimal subset of ptrace but for EU threads,
>> not the CPU threads. And thus, building on this it's possible to extend
>> the core kernel generated core dumps with DRM specific extension which
>> would contain the EU thread/memory dump.
> It might be good to document (in that debugging doc patch probably) why
> thread injection is not a great option, and why the tradeoffs for
> debugging are different than for checkpoint/restore, where with CRIU
> we landed on doing most of this in userspace, and often requiring
> injection threads to make it all work.
>
> Cheers, Sima
>
>> Regards, Joonas
>>
>>> Regards,
>>> Christian.
>>>
>>>
>>>
>>>      Matt
>>>
>>>      [3] https://patchwork.freedesktop.org/patch/622520/?series=140200&rev=6
>>>
>>>
>>>          Regards,
>>>          Christian.
>>>
>>>
>>>              Matt
>>>
>>>
>>>                  Regards,
>>>                  Christian.
>>>

