From: Dave Airlie <airlied@gmail.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@redhat.com>,
	Matthew Brost <matthew.brost@intel.com>,
	daniel@ffwll.ch, corbet@lwn.net, dri-devel@lists.freedesktop.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	mripard@kernel.org, bskeggs@redhat.com, jason@jlekstrand.net,
	nouveau@lists.freedesktop.org, airlied@redhat.com
Subject: Re: [Nouveau] [PATCH drm-next 05/14] drm/nouveau: new VM_BIND uapi interfaces
Date: Wed, 1 Feb 2023 18:10:24 +1000	[thread overview]
Message-ID: <CAPM=9txON8VCb3H7vDY_DOgtUg2Ad3mBvYVxgSMyZ1noOu-rBQ@mail.gmail.com> (raw)
In-Reply-To: <1840e9fb-fd1b-79b7-4238-54ae97333d0b@amd.com>

On Mon, 30 Jan 2023 at 23:02, Christian König <christian.koenig@amd.com> wrote:
>
> Am 29.01.23 um 19:46 schrieb Danilo Krummrich:
> > On 1/27/23 22:09, Danilo Krummrich wrote:
> >> On 1/27/23 16:17, Christian König wrote:
> >>> Am 27.01.23 um 15:44 schrieb Danilo Krummrich:
> >>>> [SNIP]
> >>>>>>>
> >>>>>>> What you want is one component for tracking the VA allocations
> >>>>>>> (drm_mm based) and a different component/interface for tracking
> >>>>>>> the VA mappings (probably rb tree based).
> >>>>>>
> >>>>>> That's what the GPUVA manager is doing. There are gpuva_regions
> >>>>>> which correspond to VA allocations and gpuvas which represent the
> >>>>>> mappings. Both are tracked separately (currently both with a
> >>>>>> separate drm_mm, though). However, the GPUVA manager needs to
> >>>>>> take regions into account when dealing with mappings to make sure
> >>>>>> it doesn't propose that drivers merge across region
> >>>>>> boundaries. From a userspace PoV, the kernel wouldn't
> >>>>>> merge mappings from different VKBuffer objects even if they're
> >>>>>> virtually and physically contiguous.
> >>>>>
> >>>>> Those are two completely different things and shouldn't be handled
> >>>>> in a single component.
> >>>>
> >>>> They are different things, but they're related in a way that for
> >>>> handling the mappings (in particular merging and sparse) the GPUVA
> >>>> manager needs to know the VA allocation (or region) boundaries.
> >>>>
> >>>> I have the feeling there might be a misunderstanding. Userspace is
> >>>> in charge of actually allocating a portion of VA space and managing it.
> >>>> The GPUVA manager just needs to know about those VA space
> >>>> allocations and hence keeps track of them.
> >>>>
> >>>> The GPUVA manager is not meant to be an allocator in the sense of
> >>>> finding and providing a hole for a given request.
> >>>>
> >>>> Maybe the non-ideal choice of using drm_mm was implying something
> >>>> else.
> >>>
> >>> Uff, well long story short that doesn't even remotely match the
> >>> requirements. This way the GPUVA manager won't be usable for a whole
> >>> bunch of use cases.
> >>>
> >>> What we have are mappings which say X needs to point to Y with this
> >>> and hw dependent flags.
> >>>
> >>> The whole idea of having ranges is not going to fly. Not with
> >>> AMD GPUs, and I strongly suspect not with Intel's Xe either.
> >>
> >> A range in the sense of the GPUVA manager simply represents a VA
> >> space allocation (which in case of Nouveau is taken in userspace).
> >> Userspace allocates the portion of VA space and lets the kernel know
> >> about it. The current implementation needs that for the named
> >> reasons. So, I think there is no reason why this would work with one
> >> GPU, but not with another. It's just part of the design choice of the
> >> manager.
> >>
> >> And I'm absolutely happy to discuss the details of the manager
> >> implementation though.
> >>
> >>>
> >>>>> We should probably talk about the design of the GPUVA manager once
> >>>>> more when this should be applicable to all GPU drivers.
> >>>>
> >>>> That's what I try to figure out with this RFC, how to make it
> >>>> applicable for all GPU drivers, so I'm happy to discuss this. :-)
> >>>
> >>> Yeah, that was a really good idea :) The proposal here is really far
> >>> away from the actual requirements.
> >>>
> >>
> >> And those are the ones I'm looking for. Do you mind sharing the
> >> requirements for amdgpu in particular?
> >>
> >>>>>> For sparse residency the kernel also needs to know the region
> >>>>>> boundaries to make sure that it keeps sparse mappings around.
> >>>>>
> >>>>> What?
> >>>>
> >>>> When userspace creates a new VKBuffer with the
> >>>> VK_BUFFER_CREATE_SPARSE_BINDING_BIT the kernel may need to create
> >>>> sparse mappings in order to ensure that using this buffer without
> >>>> any memory backed mappings doesn't fault the GPU.
> >>>>
> >>>> Currently, the implementation does this the following way:
> >>>>
> >>>> 1. Userspace creates a new VKBuffer and hence allocates a portion
> >>>> of the VA space for it. It calls into the kernel indicating the new
> >>>> VA space region and the fact that the region is sparse.
> >>>>
> >>>> 2. The kernel picks up the region and stores it in the GPUVA
> >>>> manager, the driver creates the corresponding sparse mappings /
> >>>> page table entries.
> >>>>
> >>>> 3. Userspace might ask the driver to create a couple of memory
> >>>> backed mappings for this particular VA region. The GPUVA manager
> >>>> stores the mapping parameters, the driver creates the corresponding
> >>>> page table entries.
> >>>>
> >>>> 4. Userspace might ask to unmap all the memory backed mappings from
> >>>> this particular VA region. The GPUVA manager removes the mapping
> >>>> parameters, the driver cleans up the corresponding page table
> >>>> entries. However, the driver also needs to re-create the sparse
> >>>> mappings, since it's a sparse buffer, and hence needs to know the
> >>>> boundaries of the region it needs to create the sparse mappings in.
> >>>
> >>> Again, this is not how things work. First of all, the kernel
> >>> absolutely should *NOT* know about those regions.
> >>>
> >>> What we have inside the kernel is the information what happens if an
> >>> address X is accessed. On AMD HW this can be:
> >>>
> >>> 1. Route to the PCIe bus because the mapped BO is stored in system
> >>> memory.
> >>> 2. Route to the internal MC because the mapped BO is stored in local
> >>> memory.
> >>> 3. Route to other GPUs in the same hive.
> >>> 4. Route to some doorbell to kick off other work.
> >>> ...
> >>> x. Ignore writes, return 0 on reads (this is what is used for sparse
> >>> mappings).
> >>> x+1. Trigger a recoverable page fault. This is used for things like
> >>> SVA.
> >>> x+2. Trigger a non-recoverable page fault. This is used for things
> >>> like unmapped regions where access is illegal.
> >>>
> >>> All this is plus some hw specific caching flags.
> >>>
> >>> When Vulkan allocates a sparse VKBuffer what should happen is the
> >>> following:
> >>>
> >>> 1. The Vulkan driver somehow figures out a VA region A..B for the
> >>> buffer. This can be in userspace (libdrm_amdgpu) or kernel (drm_mm),
> >>> but essentially is currently driver specific.
> >>
> >> Right, for Nouveau we have this in userspace as well.
> >>
> >>>
> >>> 2. The kernel gets a request to map the VA range A..B as sparse,
> >>> meaning that it updates the page tables from A..B with the sparse
> >>> setting.
> >>>
> >>> 3. User space asks kernel to map a couple of memory backings at
> >>> location A+1, A+10, A+15 etc....
> >>>
> >>> 4. The VKBuffer is de-allocated, userspace asks the kernel to update
> >>> region A..B to not map anything (usually triggers a non-recoverable
> >>> fault).
> >>
> >> Until here this seems to be identical to what I'm doing.
> >>
> >> It'd be interesting to know how amdgpu handles everything that
> >> potentially happens between your 3) and 4). More specifically, how
> >> are the page tables changed when memory backed mappings are mapped on
> >> a sparse range? What happens when the memory backed mappings are
> >> unmapped, but the VKBuffer isn't de-allocated, and hence sparse
> >> mappings need to be re-deployed?
> >>
> >> Let's assume the sparse VKBuffer (and hence the VA space allocation)
> >> is pretty large. In Nouveau the corresponding PTEs would have a
> >> rather huge page size to cover this. Now, if small memory backed
> >> mappings are mapped to this huge sparse buffer, in Nouveau we'd
> >> allocate a new PT with a corresponding smaller page size overlaying
> >> the sparse mappings PTEs.
> >>
> >> What would this look like in amdgpu?
> >>
> >>>
> >>> When you want to unify this between hw drivers I strongly suggest to
> >>> completely start from scratch once more.
> >>>
> >
> > I just took some time digging into amdgpu and, surprisingly, aside
> > from the gpuva_regions it seems like amdgpu basically does exactly the
> > same as I do in the GPU VA manager. As explained, those region
> > boundaries are needed for merging only and, depending on the driver,
> > might be useful for sparse mappings.
> >
> > For drivers that don't intend to merge at all and (somehow) are
> > capable of dealing with sparse regions without knowing the sparse
> > region's boundaries, it'd be easy to make those gpuva_regions optional.
>
> Yeah, but this then defeats the purpose of having the same
> hw-independent interface/implementation for all drivers.

I think you are running a few steps ahead here. The plan isn't to have
an independent interface, it's to provide a set of routines and
tracking that will be consistent across drivers, so that all drivers
using them will operate in mostly the same fashion with respect
to GPU VA tracking and VA/BO lifetimes. Already in the tree we have
amdgpu and freedreno, which I think end up operating slightly
differently around lifetimes. I'd like to save future driver writers
the effort of dealing with those decisions, and this should drive
their user API design so as to enable Vulkan sparse bindings.

Now if merging is a feature that makes sense to one driver, maybe it
makes sense to all. However, there may be reasons amdgpu gets away
without merging that other drivers might not benefit from, and there
might also be a benefit to amdgpu from merging that you haven't looked
at yet. So I think we could leave merging as an optional extra driver
knob here. The userspace API should operate the same; it would just be
the GPU page tables that would end up different sizes.

Dave.

