From: Boris Brezillon <boris.brezillon@collabora.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: "Alyssa Rosenzweig" <alyssa@rosenzweig.io>,
"Steven Price" <steven.price@arm.com>,
"Liviu Dudau" <liviu.dudau@arm.com>,
"Adrián Larumbe" <adrian.larumbe@collabora.com>,
lima@lists.freedesktop.org, "Qiang Yu" <yuq825@gmail.com>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
dri-devel@lists.freedesktop.org,
"Dmitry Osipenko" <dmitry.osipenko@collabora.com>,
kernel@collabora.com,
"Faith Ekstrand" <faith.ekstrand@collabora.com>
Subject: Re: [PATCH v3 0/8] drm: Introduce sparse GEM shmem
Date: Fri, 11 Apr 2025 14:02:54 +0200
Message-ID: <20250411140254.042f9862@collabora.com>
In-Reply-To: <9fd6fb8c-7dbb-467d-a759-eec852e1f006@amd.com>

On Fri, 11 Apr 2025 12:55:57 +0200
Christian König <christian.koenig@amd.com> wrote:
> Am 11.04.25 um 10:38 schrieb Boris Brezillon:
> > On Fri, 11 Apr 2025 10:04:07 +0200
> > Christian König <christian.koenig@amd.com> wrote:
> >
> >> Am 10.04.25 um 20:41 schrieb Boris Brezillon:
> >>> On Thu, 10 Apr 2025 14:01:03 -0400
> >>> Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:
> >>>
> >>>>>>> In Panfrost and Lima, we don't have this concept of
> >>>>>>> "incremental rendering", so when we fail the allocation, we
> >>>>>>> just fail the GPU job with an unhandled GPU fault.
> >>>>>> To be honest I think that this is enough to mark those two
> >>>>>> drivers as broken. It's documented that this approach is a
> >>>>>> no-go for upstream drivers.
> >>>>>>
> >>>>>> How widely is that used?
> >>>>> It exists in lima and panfrost, and I wouldn't be surprised if a
> >>>>> similar mechanism was used in other drivers for tiler-based GPUs
> >>>>> (etnaviv, freedreno, powervr, ...), because ultimately that's
> >>>>> how tilers work: the amount of memory needed to store per-tile
> >>>>> primitives (and metadata) depends on what the geometry pipeline
> >>>>> feeds the tiler with, and that can't be predicted. If you
> >>>>> over-provision, that's memory the system won't be able to use
> >>>>> while rendering takes place, even though only a small portion
> >>>>> might actually be used by the GPU. If your allocation is too
> >>>>> small, it will either trigger a GPU fault (for HW not supporting
> >>>>> an "incremental rendering" mode) or under-perform (because
> >>>>> flushing primitives has a huge cost on tilers).
> >>>> Yes and no.
> >>>>
> >>>> Although we can't allocate more memory for /this/ frame, we know
> >>>> the required size is probably constant across its lifetime. That
> >>>> gives a simple heuristic to manage the tiler heap efficiently
> >>>> without allocations - even fallible ones - in the fence signal
> >>>> path:
> >>>>
> >>>> * Start with a small fixed size tiler heap
> >>>> * Try to render, let incremental rendering kick in when it's too
> >>>> small.
> >>>> * When cleaning up the job, check if we used incremental
> >>>> rendering.
> >>>> * If we did - double the size of the heap the next time we submit
> >>>> work.
> >>>>
> >>>> The tiler heap still grows dynamically - it just does so over the
> >>>> span of a couple frames. In practice that means a tiny hit to
> >>>> startup time as we dynamically figure out the right size,
> >>>> incurring extra flushing at the start, without needing any
> >>>> "grow-on-page-fault" heroics.
> >>>>
> >>>> This should solve the problem completely for CSF/panthor. So it's
> >>>> only hardware that architecturally cannot do incremental
> >>>> rendering (older Mali: panfrost/lima) where we need this mess.
> >>>>
> >>> OTOH, if we need something
> >>> for Utgard(Lima)/Midgard/Bifrost/Valhall(Panfrost), why not use
> >>> the same thing for CSF, since CSF is arguably the sanest of all
> >>> the HW architectures listed above: allocation can fail/be
> >>> non-blocking, because there's a fallback to incremental rendering
> >>> when it fails.
> >> Yeah that is a rather interesting point Alyssa noted here.
> >>
> >> So basically you could as well implement it like this:
> >> 1. Userspace makes a submission.
> >> 2. HW finds buffer is not large enough, sets an error code and
> >>    completes submission.
> >> 3. Userspace detects error, re-allocates buffer with increased
> >>    size.
> >> 4. Userspace re-submits to incrementally complete the submission.
> >> 5. Eventually repeat until fully completed.
> >>
> >> That would work but is likely just not the most performant
> >> solution. So faulting in memory on demand is basically just an
> >> optimization and that is ok as far as I can see.
> > Yeah, Alyssa's suggestion got me thinking too, and I think I can
> > come up with a plan where we try non-blocking allocation first, and
> > if it fails, we trigger incremental rendering, and queue a blocking
> > heap-chunk allocation on a separate workqueue, such that next time the
> > tiler heap hits an OOM, it has a chunk (or multiple chunks) readily
> > available if the blocking allocation completed in the meantime.
> > That's basically what Alyssa suggested, with an optimization if the
> > system is not under memory pressure, and without userspace being
> > involved (so no uAPI changes).
>
> That sounds like it most likely won't work. In an OOM situation the
> blocking allocation would just cause more pressure to complete your
> rendering to free up memory.
Right. It could be deferred to the next job submission instead of being
queued immediately. My point being, userspace doesn't have to know
about it, because the kernel knows when a tiler OOM happened, and can
flag the buffer as "probably needs more space when you have an
opportunity to alloc more". It wouldn't be different from reporting it
to userspace and letting userspace explicitly grow the buffer, and it
avoids introducing a gap between old and new mesa.
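
To make the deferred variant concrete, here is a rough userspace model of
what I have in mind. It is only a sketch, not driver code: struct tiler_heap,
heap_note_tiler_oom(), heap_prepare_for_submit() and the doubling policy are
all made-up names and assumptions for illustration (the doubling comes from
Alyssa's heuristic), and real allocation is replaced by bumping a size field.
The only point is the split: the completion path records the OOM and never
allocates, while the growth happens at the next submission.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a tiler heap, not an actual panthor/panfrost struct. */
struct tiler_heap {
	size_t size;      /* bytes currently backing the heap */
	size_t max_size;  /* upper bound we are willing to grow to */
	bool needs_grow;  /* set when the previous job hit a tiler OOM */
};

/* Job-completion side: only record the event, never allocate here. */
static void heap_note_tiler_oom(struct tiler_heap *heap)
{
	heap->needs_grow = true;
}

/* Submission side: this is where a blocking allocation would be allowed. */
static void heap_prepare_for_submit(struct tiler_heap *heap)
{
	size_t new_size;

	assert(heap->size > 0 && heap->size <= heap->max_size);

	if (!heap->needs_grow)
		return;

	heap->needs_grow = false;

	/* Assumed policy: double the heap, clamped to max_size. */
	new_size = heap->size * 2;
	if (new_size > heap->max_size)
		new_size = heap->max_size;

	heap->size = new_size;
}

/* Simulated job: returns true if incremental rendering had to kick in. */
static bool run_job(struct tiler_heap *heap, size_t required)
{
	bool oom = required > heap->size;

	if (oom)
		heap_note_tiler_oom(heap);

	return oom;
}

/* Returns how many of the nframes jobs needed incremental rendering. */
static int simulate_frames(struct tiler_heap *heap, size_t required,
			   int nframes)
{
	int flushes = 0;

	for (int i = 0; i < nframes; i++) {
		heap_prepare_for_submit(heap);
		if (run_job(heap, required))
			flushes++;
	}

	return flushes;
}
```

In this model, a heap that starts too small pays the incremental-rendering
cost for a few jobs and then settles, all without userspace being told
anything.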
>
> > I guess this leaves older GPUs that don't support incremental
> > rendering in a bad place though.
>
> Well what's the handling there currently? Just crash when you're OOM?
It's "alloc(GFP_KERNEL) and crash if it fails or times out", yes.
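
For the sake of comparison, here is a toy model of the possible outcomes on a
tiler OOM, depending on whether the HW can fall back to incremental
rendering. Again, this is a userspace sketch under my own assumptions: the
sim_* types, try_alloc_chunk() and handle_tiler_oom() are hypothetical names,
and memory pressure is reduced to a counter of chunks that can be handed out.
It does not model the blocking/reclaim behaviour of a GFP_KERNEL allocation
or the timeout, only what happens once the allocation has failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum oom_outcome {
	OOM_HEAP_GROWN,         /* allocation succeeded, job continues */
	OOM_INCREMENTAL_FLUSH,  /* allocation failed, HW flushes and continues */
	OOM_JOB_FAILED,         /* allocation failed, no fallback: GPU fault */
};

/* Hypothetical stand-in for system memory state. */
struct sim_system {
	size_t free_chunks; /* chunks that can still be allocated */
};

/* Hypothetical stand-in for a GPU generation. */
struct sim_gpu {
	bool has_incremental_rendering;
};

/* Allocation that either succeeds immediately or fails, never waits. */
static bool try_alloc_chunk(struct sim_system *sys)
{
	if (sys->free_chunks == 0)
		return false;

	sys->free_chunks--;
	return true;
}

static enum oom_outcome handle_tiler_oom(const struct sim_gpu *gpu,
					 struct sim_system *sys)
{
	assert(gpu && sys);

	if (try_alloc_chunk(sys))
		return OOM_HEAP_GROWN;

	/* Only HW with an incremental rendering mode can survive a failure. */
	if (gpu->has_incremental_rendering)
		return OOM_INCREMENTAL_FLUSH;

	return OOM_JOB_FAILED;
}
```

The last branch is the situation the older GPUs are in: once the allocation
fails there is nothing left to fall back to, so the job is failed.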