From: "Christian König" <christian.koenig@amd.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>,
"Christian König" <ckoenig.leichtzumerken@gmail.com>,
"Matthew Brost" <matthew.brost@intel.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>,
Rodrigo Vivi <rodrigo.vivi@intel.com>,
Huang Rui <ray.huang@amd.com>,
intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
matthew.auld@intel.com, David Airlie <airlied@gmail.com>,
Simona Vetter <simona@ffwll.ch>
Subject: Re: [PATCH v6 2/8] drm/ttm: Add ttm_bo_access
Date: Wed, 13 Nov 2024 09:37:04 +0100 [thread overview]
Message-ID: <6ab9e754-6f4a-48ec-b143-02585bc40291@amd.com> (raw)
In-Reply-To: <d469ab3509a1b875ad0c44d01569675682d6c61c.camel@linux.intel.com>
[-- Attachment #1: Type: text/plain, Size: 26965 bytes --]
Am 12.11.24 um 17:33 schrieb Thomas Hellström:
> [SNIP]
>>>> This has been extensively discussed already, but was expected to
>>>> really
>>>> only be needed for low-on-memory scenarios. However it now seems
>>>> like
>>>> the need is much earlier due to the random userptr page joining
>>>> by
>>>> core
>>>> mm.
>>> Just to clarify here:
>>>
>>> In Long-Running mode with recoverable pagefaults enabled we don't
>>> have
>>> any preempt-fences, but rather just zap the PTEs pointing to the
>>> affected memory and flush TLB. So from a memory resource POW a
>>> breakpoint should be safe, and no mmu notifier nor shrinker will be
>>> blocked.
>> That sounds like a HMM based approach which would clearly work.
>>
>> But where is that? I don't see any HMM based approach anywhere in the
>> XE
>> driver.
> This is a mode that uses recoverable pagefaults to fault either full
> userptr or full bos, and used with DRM_XE_VM_CRATE_FLAG_FAULT_MODE.
> (not SVM)!
>
> userptrs in xe are bo-less, and using the vm's resv, but otherwise
> using hmm similar to amdgpu: xe_hmm.c
Yeah, I have seen that one.
> fault servicing:
> xe_gt_pagefault.c
>
> PTE zapping on eviction and notifier:
> xe_vm_invalidate_vma(), xe_vm.c
Ah, that was the stuff I was missing.
So the implementation in xe_preempt_fence.c is just for graphics
submissions? That would make the whole thing much easier to handle.
The only remaining question I can then see is if long running
submissions with DRM_XE_VM_CRATE_FLAG_FAULT_MODE could potentially block
graphics submissions without this flag from accessing the hardware?
Thanks a lot for pointing this out,
Christian.
>
> Thanks,
> Thomas
>
>> Regards,
>> Christian.
>>
>>> Nor will there be any jobs with published dma-fences depending on
>>> the
>>> job blocked either temporarily by a pagefault or long-term by a
>>> debugger breakpoint.
>>>
>>> /Thomas
>>>
>>>
>>>> If that is done and the memory pre-empt fence is serviced even
>>>> for
>>>> debuggable contexts, do you have further concerns with the
>>>> presented
>>>> approach
>>>> from dma-buf and drm/sched perspective?
>>>>
>>>> Regards, Joonas
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>> This means that a breakpoint or core dump doesn't halt
>>>>> GPU
>>>>> threads, but
>>>>> rather suspends them. E.g. all running wave data is
>>>>> collected into a state
>>>>> bag which can be restored later on.
>>>>>
>>>>> I was under the impression that those long running
>>>>> compute
>>>>> threads do
>>>>> exactly that, but when the hardware can't switch out
>>>>> the
>>>>> GPU thread/process
>>>>> while in a break then that isn't the case.
>>>>>
>>>>> As long as you don't find a way to avoid that this
>>>>> patch
>>>>> set is a pretty
>>>>> clear NAK from my side as DMA-buf and TTM maintainer.
>>>>>
>>>>>
>>>>> I believe this is addressed above.
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>> What might work is to keep the submission on the
>>>>> hardware
>>>>> in the break state
>>>>> but forbid any memory access. This way you can signal
>>>>> your
>>>>> preemption fence
>>>>> even when the hardware isn't made available.
>>>>>
>>>>> Before you continue XE setups a new pre-emption fence
>>>>> and
>>>>> makes sure that
>>>>> all page tables etc... are up to date.
>>>>>
>>>>> Could be tricky to get this right if completion fence
>>>>> based
>>>>> submissions are
>>>>> mixed in as well, but that gives you at least a
>>>>> direction
>>>>> you could
>>>>> potentially go.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>> Regards, Joonas
>>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>> Some wash-up thoughts from me below, but
>>>>> consider them fairly irrelevant
>>>>> since I think the main driver for these
>>>>> big
>>>>> questions here should be
>>>>> gdb/userspace.
>>>>>
>>>>>
>>>>> Quoting Christian König (2024-11-07
>>>>> 11:44:33)
>>>>>
>>>>> Am 06.11.24 um 18:00 schrieb
>>>>> Matthew
>>>>> Brost:
>>>>>
>>>>> [SNIP]
>>>>>
>>>>> This is not a generic
>>>>> interface
>>>>> that anyone can freely access. The same
>>>>> permissions used by ptrace
>>>>> are
>>>>> checked when opening such an interface.
>>>>> See [1] [2].
>>>>>
>>>>>
>>>>> [1]
>>>>> https://patchwork.freedesktop.org/patch/617470/?series=136572&r
>>>>> e
>>>>> v=2
>>>>>
>>>>> [2]
>>>>> https://patchwork.freedesktop.org/patch/617471/?series=136572&r
>>>>> e
>>>>> v=2
>>>>>
>>>>>
>>>>> Thanks a lot for those pointers,
>>>>> that
>>>>> is exactly what I was looking for.
>>>>>
>>>>> And yeah, it is what I feared. You
>>>>> are
>>>>> re-implementing existing functionality,
>>>>> but see below.
>>>>>
>>>>> Could you elaborate on what this
>>>>> "existing
>>>>> functionality" exactly is?
>>>>> I do not think this functionality
>>>>> exists at
>>>>> this time.
>>>>>
>>>>> The EU debugging architecture for Xe
>>>>> specifically avoids the need for GDB
>>>>> to attach with ptrace to the CPU
>>>>> process or
>>>>> interfere with the CPU process for
>>>>> the debugging via parasitic threads or
>>>>> so.
>>>>>
>>>>> Debugger connection is opened to the
>>>>> DRM
>>>>> driver for given PID (which uses the
>>>>> ptrace may access check for now) after
>>>>> which the all DRM client of that
>>>>> PID are exposed to the debugger
>>>>> process.
>>>>>
>>>>> What we want to expose via that
>>>>> debugger
>>>>> connection is the ability for GDB to
>>>>> read/write the different GPU VM
>>>>> address
>>>>> spaces (ppGTT for Intel GPUs) just like
>>>>> the EU threads would see them. Note
>>>>> that
>>>>> the layout of the ppGTT is
>>>>> completely up to the userspace driver
>>>>> to
>>>>> setup and is mostly only partially
>>>>> equal to the CPU address space.
>>>>>
>>>>> Specifically as part of
>>>>> reading/writing the
>>>>> ppGTT for debugging purposes,
>>>>> there are deep flushes needed: for
>>>>> example
>>>>> flushing instruction cache
>>>>> when adding/removing breakpoints.
>>>>>
>>>>> Maybe that will explain the
>>>>> background. I
>>>>> elaborate on this at the end some more.
>>>>>
>>>>>
>>>>> kmap/vmap are used
>>>>> everywhere in the DRM subsystem to access BOs, so I’m
>>>>> failing to see the
>>>>> problem with adding a simple helper based on existing
>>>>> code.
>>>>>
>>>>> What#s possible and
>>>>> often
>>>>> done is to do kmap/vmap if you need to implement a
>>>>> CPU copy for scanout for
>>>>> example or for copying/validating command buffers.
>>>>> But that usually
>>>>> requires
>>>>> accessing the whole BO and has separate security
>>>>> checks.
>>>>>
>>>>> When you want to access
>>>>> only
>>>>> a few bytes of a BO that sounds massively like
>>>>> a peek/poke like
>>>>> interface
>>>>> and we have already rejected that more than once.
>>>>> There even used to be
>>>>> standardized GEM IOCTLs for that which have been
>>>>> removed by now.
>>>>>
>>>>> Referring to the explanation at top:
>>>>> These
>>>>> IOCTL are not for the debugging target
>>>>> process to issue. The peek/poke
>>>>> interface
>>>>> is specifically for GDB only
>>>>> to facilitate the emulation of memory
>>>>> reads/writes on the GPU address
>>>>> space as they were done by EUs
>>>>> themselves.
>>>>> And to recap: for modifying
>>>>> instructions for example (add/remove
>>>>> breakpoint), extra level of cache flushing is
>>>>> needed which is not available to
>>>>> regular
>>>>> userspace.
>>>>>
>>>>> I specifically discussed with Sima on
>>>>> the
>>>>> difference before moving forward with this
>>>>> design originally. If something has
>>>>> changed
>>>>> since then, I'm of course happy to rediscuss.
>>>>>
>>>>> However, if this code can't be added,
>>>>> not
>>>>> sure how we would ever be able
>>>>> to implement core dumps for GPU
>>>>> threads/memory?
>>>>>
>>>>>
>>>>> If you need to access
>>>>> BOs
>>>>> which are placed in not CPU accessible memory then
>>>>> implement the access
>>>>> callback
>>>>> for ptrace, see amdgpu_ttm_access_memory for
>>>>> an example how to do
>>>>> this.
>>>>>
>>>>> As also mentioned above, we don't work
>>>>> via
>>>>> ptrace at all when it comes
>>>>> to debugging the EUs. The only thing
>>>>> used
>>>>> for now is the ptrace_may_access to
>>>>> implement similar access restrictions
>>>>> as
>>>>> ptrace has. This can be changed
>>>>> to something else if needed.
>>>>>
>>>>>
>>>>> Ptrace access via
>>>>> vm_operations_struct.access → ttm_bo_vm_access.
>>>>>
>>>>> This series renames
>>>>> ttm_bo_vm_access to ttm_bo_access, with no code changes.
>>>>>
>>>>> The above function accesses
>>>>> a BO
>>>>> via kmap if it is in SYSTEM / TT,
>>>>> which is existing code.
>>>>>
>>>>> This function is only
>>>>> exposed to
>>>>> user space via ptrace permissions.
>>>>>
>>>>> Maybe this sentence is what caused the
>>>>> confusion.
>>>>>
>>>>> Userspace is never exposed with
>>>>> peek/poke
>>>>> interface, only the debugger
>>>>> connection which is its own FD.
>>>>>
>>>>>
>>>>> In this series, we implement
>>>>> a
>>>>> function [3] similar to
>>>>> amdgpu_ttm_access_memory for
>>>>> the
>>>>> TTM vfunc access_memory. What is
>>>>> missing is non-visible CPU
>>>>> memory
>>>>> access, similar to
>>>>>
>>>>> amdgpu_ttm_access_memory_sdma.
>>>>> This will be addressed in a follow-up and
>>>>> was omitted in this series
>>>>> given
>>>>> its complexity.
>>>>>
>>>>> So, this looks more or less
>>>>> identical to AMD's ptrace implementation,
>>>>> but in GPU address space.
>>>>> Again,
>>>>> I fail to see what the problem is here.
>>>>> What am I missing?
>>>>>
>>>>>
>>>>> The main question is why can't you
>>>>> use
>>>>> the existing interfaces directly?
>>>>>
>>>>> We're not working on the CPU address
>>>>> space
>>>>> or BOs. We're working
>>>>> strictly on the GPU address space as
>>>>> would
>>>>> be seen by an EU thread if it
>>>>> accessed address X.
>>>>>
>>>>>
>>>>> Additional to the peek/poke
>>>>> interface
>>>>> of ptrace Linux has the pidfd_getfd
>>>>> system call, see
>>>>> herehttps://man7.org/linux/man-pages/man2/pidfd_getfd.2.html.
>>>>>
>>>>> The pidfd_getfd() allows to dup()
>>>>> the
>>>>> render node file descriptor into your gdb
>>>>> process. That in turn gives you
>>>>> all the
>>>>> access you need from gdb, including
>>>>> mapping BOs and command submission
>>>>> on
>>>>> behalf of the application.
>>>>>
>>>>> We're not operating on the CPU address
>>>>> space nor are we operating on BOs
>>>>> (there is no concept of BO in the EU
>>>>> debug
>>>>> interface). Each VMA in the VM
>>>>> could come from anywhere, only the
>>>>> start
>>>>> address and size matter. And
>>>>> neither do we need to interfere with
>>>>> the
>>>>> command submission of the
>>>>> process under debug.
>>>>>
>>>>>
>>>>> As far as I can see that allows
>>>>> for the
>>>>> same functionality as the eudebug
>>>>> interface, just without any driver
>>>>> specific code messing with ptrace
>>>>> permissions and peek/poke
>>>>> interfaces.
>>>>>
>>>>> So the question is still why do
>>>>> you
>>>>> need the whole eudebug interface in the
>>>>> first place? I might be missing
>>>>> something, but that seems to be superfluous
>>>>> from a high level view.
>>>>>
>>>>> Recapping from above. It is to allow
>>>>> the
>>>>> debugging of EU threads per DRM
>>>>> client, completely independent of the
>>>>> CPU
>>>>> process. If ptrace_may_acces
>>>>> is the sore point, we could consider
>>>>> other
>>>>> permission checks, too. There
>>>>> is no other connection to ptrace in
>>>>> this
>>>>> architecture as single
>>>>> permission check to know if PID is
>>>>> fair
>>>>> game to access by debugger
>>>>> process.
>>>>>
>>>>> Why no parasitic thread or ptrace:
>>>>> Going
>>>>> forward, binding the EU debugging to
>>>>> the DRM client would also pave way for
>>>>> being able to extend core kernel generated
>>>>> core dump with each DRM client's EU
>>>>> thread/memory dump. We have similar
>>>>> feature called "Offline core dump"
>>>>> enabled
>>>>> in the downstream public
>>>>> trees for i915, where we currently
>>>>> attach
>>>>> the EU thread dump to i915 error state
>>>>> and then later combine i915 error
>>>>> state
>>>>> with CPU core dump file with a
>>>>> tool.
>>>>>
>>>>> This is relatively little amount of
>>>>> extra
>>>>> code, as this baseline series
>>>>> already introduces GDB the ability to
>>>>> perform the necessary actions.
>>>>> It's just the matter of kernel driver
>>>>> calling: "stop all threads", then
>>>>> copying the memory map and memory
>>>>> contents
>>>>> for GPU threads, just like is
>>>>> done for CPU threads.
>>>>>
>>>>> With parasitic thread injection, not
>>>>> sure
>>>>> if there is such way forward,
>>>>> as it would seem to require to inject
>>>>> quite
>>>>> abit more logic to core kernel?
>>>>>
>>>>>
>>>>> It's true that the AMD KFD part
>>>>> has
>>>>> still similar functionality, but that is
>>>>> because of the broken KFD design
>>>>> of
>>>>> tying driver state to the CPU process
>>>>> (which makes it inaccessible for
>>>>> gdb
>>>>> even with imported render node fd).
>>>>>
>>>>> Both Sima and I (and partially
>>>>> Dave as
>>>>> well) have pushed back on the KFD
>>>>> approach. And the long term plan
>>>>> is to
>>>>> get rid of such device driver specific
>>>>> interface which re-implement
>>>>> existing
>>>>> functionality just differently.
>>>>>
>>>>> Recapping, this series is not adding
>>>>> it
>>>>> back. The debugger connection
>>>>> is a separate FD from the DRM one,
>>>>> with
>>>>> separate IOCTL set. We don't allow
>>>>> the DRM FD any new operations based on
>>>>> ptrace is attached or not. We
>>>>> don't ever do that check even.
>>>>>
>>>>> We only restrict the opening of the
>>>>> debugger connection to given PID with
>>>>> ptrace_may_access check for now. That
>>>>> can
>>>>> be changed to something else,
>>>>> if necessary.
>>>>>
>>>>> Yeah I think unnecessarily tying gpu
>>>>> processes
>>>>> to cpu processes is a bad
>>>>> thing, least because even today all the
>>>>> svm
>>>>> discussions we have still hit
>>>>> clear use-cases, where a 1:1 match is not
>>>>> wanted (like multiple gpu svm
>>>>> sections with offsets). Not even speaking
>>>>> of
>>>>> all the gpu usecases where
>>>>> the gpu vm space is still entirely
>>>>> independent
>>>>> of the cpu side.
>>>>>
>>>>> So that's why I think this entirely
>>>>> separate
>>>>> approach looks like the right
>>>>> one, with ptrace_may_access as the access
>>>>> control check to make sure we
>>>>> match ptrace on the cpu side.
>>>>>
>>>>> But there's very obviously a bikeshed to
>>>>> be had
>>>>> on what the actual uapi
>>>>> should look like, especially how gdb opens
>>>>> up a
>>>>> gpu debug access fd. But I
>>>>> also think that's not much on drm to
>>>>> decide,
>>>>> but whatever gdb wants. And
>>>>> then we aim for some consistency on that
>>>>> lookup/access control part
>>>>> (ideally, I might be missing some reasons
>>>>> why
>>>>> this is a bad idea) across
>>>>> drm drivers.
>>>>>
>>>>>
>>>>> So you need to have a really
>>>>> really
>>>>> good explanation why the eudebug interface
>>>>> is actually necessary.
>>>>>
>>>>> TL;DR The main point is to decouple
>>>>> the
>>>>> debugging of the EU workloads from the
>>>>> debugging of the CPU process. This
>>>>> avoids
>>>>> the interference with the CPU process with
>>>>> parasitic thread injection. Further
>>>>> this
>>>>> also allows generating a core dump
>>>>> without any GDB connected. There are
>>>>> also
>>>>> many other smaller pros/cons
>>>>> which can be discussed but for the
>>>>> context
>>>>> of this patch, this is the
>>>>> main one.
>>>>>
>>>>> So unlike parasitic thread injection,
>>>>> we
>>>>> don't unlock any special IOCTL for
>>>>> the process under debug to be
>>>>> performed by
>>>>> the parasitic thread, but we
>>>>> allow the minimal set of operations to
>>>>> be
>>>>> performed by GDB as if those were
>>>>> done on the EUs themselves.
>>>>>
>>>>> One can think of it like the minimal
>>>>> subset
>>>>> of ptrace but for EU threads,
>>>>> not the CPU threads. And thus,
>>>>> building on
>>>>> this it's possible to extend
>>>>> the core kernel generated core dumps
>>>>> with
>>>>> DRM specific extension which
>>>>> would contain the EU thread/memory
>>>>> dump.
>>>>>
>>>>> It might be good to document (in that
>>>>> debugging
>>>>> doc patch probably) why
>>>>> thread injection is not a great option,
>>>>> and why
>>>>> the tradeoffs for
>>>>> debugging are different than for for
>>>>> checkpoint/restore, where with CRIU
>>>>> we landed on doing most of this in
>>>>> userspace,
>>>>> and often requiring
>>>>> injection threads to make it all work.
>>>>>
>>>>> Cheers, Sima
>>>>>
>>>>>
>>>>> Regards, Joonas
>>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>> [3]
>>>>> https://patchwork.freedesktop.org/patch/622520/?series=140200&r
>>>>> e
>>>>> v=6
>>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>>
[-- Attachment #2: Type: text/html, Size: 45048 bytes --]
next prev parent reply other threads:[~2024-11-13 8:37 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-31 18:10 [PATCH v6 0/8] Fix non-contiguous VRAM BO access in Xe Matthew Brost
2024-10-31 18:10 ` [PATCH v6 1/8] drm/xe: Add xe_bo_vm_access Matthew Brost
2024-10-31 18:10 ` [PATCH v6 2/8] drm/ttm: Add ttm_bo_access Matthew Brost
2024-10-31 23:43 ` Matthew Brost
2024-11-04 17:34 ` Rodrigo Vivi
2024-11-04 19:28 ` Christian König
2024-11-04 21:49 ` Matthew Brost
2024-11-05 7:41 ` Christian König
2024-11-05 18:35 ` Matthew Brost
2024-11-06 9:48 ` Christian König
2024-11-06 15:25 ` Matthew Brost
2024-11-06 15:44 ` Christian König
2024-11-06 17:00 ` Matthew Brost
2024-11-07 9:44 ` Christian König
2024-11-11 8:00 ` Joonas Lahtinen
2024-11-11 10:10 ` Simona Vetter
2024-11-11 11:34 ` Christian König
2024-11-11 14:00 ` Joonas Lahtinen
2024-11-11 15:54 ` Christian König
2024-11-11 22:45 ` Matthew Brost
2024-11-12 9:23 ` Christian König
2024-11-12 13:41 ` Joonas Lahtinen
2024-11-12 16:22 ` Thomas Hellström
2024-11-12 16:25 ` Christian König
2024-11-12 16:33 ` Thomas Hellström
2024-11-13 8:37 ` Christian König [this message]
2024-11-13 10:44 ` Thomas Hellström
2024-11-13 11:42 ` Christian König
2024-11-15 18:27 ` Matthew Brost
2024-11-25 15:29 ` Matthew Brost
2024-11-25 16:19 ` Christian König
2024-11-25 17:27 ` Matthew Brost
2024-11-26 8:19 ` Christian König
2024-11-26 17:49 ` Matthew Brost
2024-11-27 13:21 ` Christian König
2024-11-12 8:28 ` Simona Vetter
2024-11-12 8:58 ` Christian König
2024-11-12 13:30 ` Joonas Lahtinen
2024-11-11 11:27 ` Christian König
2024-11-04 19:47 ` Christian König
2024-11-04 21:30 ` Matthew Brost
2024-11-04 22:26 ` Rodrigo Vivi
2024-10-31 18:10 ` [PATCH v6 3/8] drm/xe: Add xe_ttm_access_memory Matthew Brost
2024-10-31 18:10 ` [PATCH v6 4/8] drm/xe: Take PM ref in delayed snapshot capture worker Matthew Brost
2024-10-31 18:10 ` [PATCH v6 5/8] drm/xe/display: Update intel_bo_read_from_page to use ttm_bo_access Matthew Brost
2024-10-31 18:10 ` [PATCH v6 6/8] drm/xe: Use ttm_bo_access in xe_vm_snapshot_capture_delayed Matthew Brost
2024-10-31 18:10 ` [PATCH v6 7/8] drm/xe: Set XE_BO_FLAG_PINNED in migrate selftest BOs Matthew Brost
2024-10-31 18:10 ` [PATCH v6 8/8] drm/xe: Only allow contiguous BOs to use xe_bo_vmap Matthew Brost
2024-10-31 18:15 ` ✓ CI.Patch_applied: success for Fix non-contiguous VRAM BO access in Xe (rev6) Patchwork
2024-10-31 18:15 ` ✗ CI.checkpatch: warning " Patchwork
2024-10-31 18:17 ` ✓ CI.KUnit: success " Patchwork
2024-10-31 18:28 ` ✓ CI.Build: " Patchwork
2024-10-31 18:31 ` ✓ CI.Hooks: " Patchwork
2024-10-31 18:32 ` ✗ CI.checksparse: warning " Patchwork
2024-10-31 18:57 ` ✓ CI.BAT: success " Patchwork
2024-10-31 21:27 ` ✗ CI.FULL: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6ab9e754-6f4a-48ec-b143-02585bc40291@amd.com \
--to=christian.koenig@amd.com \
--cc=airlied@gmail.com \
--cc=ckoenig.leichtzumerken@gmail.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-xe@lists.freedesktop.org \
--cc=joonas.lahtinen@linux.intel.com \
--cc=matthew.auld@intel.com \
--cc=matthew.brost@intel.com \
--cc=ray.huang@amd.com \
--cc=rodrigo.vivi@intel.com \
--cc=simona.vetter@ffwll.ch \
--cc=simona@ffwll.ch \
--cc=thomas.hellstrom@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox