From: Danilo Krummrich <dakr@kernel.org>
To: Rob Clark <robdclark@gmail.com>
Cc: "Connor Abbott" <cwabbott0@gmail.com>,
	"Rob Clark" <robdclark@chromium.org>,
	phasta@kernel.org, dri-devel@lists.freedesktop.org,
	freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"open list" <linux-kernel@vger.kernel.org>,
	"Boris Brezillon" <boris.brezillon@collabora.com>
Subject: Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit
Date: Tue, 20 May 2025 18:54:32 +0200
Message-ID: <aCyzyAPbQ1SYbo4q@pollux>
In-Reply-To: <CAF6AEGspvuTHU0t9z__p_HkdRNi=cXir3t453AbR6DFNzDpgvw@mail.gmail.com>

On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > On Thu, May 15, 2025 at 12:56:38PM -0700, Rob Clark wrote:
> > > On Thu, May 15, 2025 at 11:56 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > >
> > > > On Thu, May 15, 2025 at 10:40:15AM -0700, Rob Clark wrote:
> > > > > On Thu, May 15, 2025 at 10:30 AM Danilo Krummrich <dakr@kernel.org> wrote:
> > > > > >
> > > > > > (Cc: Boris)
> > > > > >
> > > > > > On Thu, May 15, 2025 at 12:22:18PM -0400, Connor Abbott wrote:
> > > > > > > For some context, other drivers have the concept of a "synchronous"
> > > > > > > VM_BIND ioctl which completes immediately, and drivers implement it by
> > > > > > > waiting for the whole thing to finish before returning.
> > > > > >
> > > > > > Nouveau implements sync by issuing a normal async VM_BIND and subsequently
> > > > > > waiting for the out-fence synchronously.
> > > > >
> > > > > As Connor mentioned, we'd prefer it to be async rather than blocking
> > > > > in the normal cases; otherwise, with drm native context (using the
> > > > > native UMD in a guest VM), you'd be blocking the single host/VMM
> > > > > virglrenderer thread.
> > > > >
> > > > > The key is that we want to keep it async in the normal cases, and not
> > > > > have weird edge-case CTS tests blow up from being _too_ async ;-)
> > > >
> > > > I really wonder why they don't blow up in Nouveau, which also supports
> > > > fully asynchronous VM_BIND. Mind sharing which tests blow up? :)
> > >
> > > Maybe it was dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24,
> >
> > The test above is part of the smoke testing I do for nouveau, but I haven't
> > seen such issues there yet.
> 
> nouveau is probably not using async binds for everything?  Or maybe
> I'm just pointing to the wrong test.

Let me double check later on.

> > > but I might be mixing that up; I'd have to back out this patch and see
> > > where things blow up, which would take many hours.
> >
> > Well, you said that you never had this issue with "real" workloads, but only
> > with VK CTS, so I really think we should know what we are trying to fix here.
> >
> > We can't just add new generic infrastructure without reasonable and *well
> > understood* justification.
> 
> What is not well understood about this?  We need to pre-allocate pagetable
> memory that we most likely won't end up needing.
> 
> In the worst case, a large # of async PAGE_SIZE binds, you end up needing
> to pre-allocate 3 pgtable pages (with a 4-level pgtable) per single page of
> mapping.  Queue up enough of those and you can explode your memory usage.

Well, the general principle of how this can OOM is well understood, sure.
What's not well understood is how we actually run into this case. I think we
should also understand which test causes the issue and why other drivers are
not affected (yet).
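
Just to put numbers on the worst case you describe (an illustrative sketch
only, assuming 4 KiB pages and a 4-level pgtable; the helper name is made up):

  #include <linux/mm.h>

  /*
   * Worst case: each single-page mapping may need up to 3 new pgtable
   * pages (one per intermediate level) pre-allocated at submit time.
   */
  static unsigned long worst_case_prealloc_bytes(unsigned long n_binds)
  {
          return n_binds * 3 * PAGE_SIZE;
  }

  /* e.g. 262144 binds (1 GiB of 4 KiB mappings) -> 3 GiB pre-allocated */

So yes, a deep queue of tiny binds can pre-allocate several times more memory
than it will ever map; that part is not in question.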

> > > There definitely was one where I was seeing >5k VM_BIND jobs pile up,
> > > so absolutely throttling like this is needed.
> >
> > I still don't understand why the kernel must throttle this. If userspace uses
> > async VM_BIND, it obviously can't spam the kernel indefinitely without
> > running into an OOM case.
> 
> It is a valid question whether the kernel or userspace should be the one to
> do the throttling.
> 
> I went for doing it in the kernel because the kernel has better
> knowledge of how much it needs to pre-allocate.
> 
> (There is also the side point that this pre-allocated memory is not charged
> to the calling process from a memory accounting PoV.  With that in mind it
> seems like a good idea for the kernel to throttle memory usage.)

That's a very valid point; maybe we should investigate addressing this
directly, rather than trying to work around it in the scheduler, where we can
only set an arbitrary credit limit.
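
For instance (just a sketch on my side, not something I've tried): if the
pgtable pre-allocations were done with __GFP_ACCOUNT, they would be charged to
the submitting process' memory cgroup at allocation time, so the existing memcg
limits would apply; the helper below is hypothetical.

  #include <linux/gfp.h>
  #include <linux/mm.h>

  /*
   * Hypothetical sketch: charge pre-allocated pgtable pages to the
   * caller's memcg instead of leaving them unaccounted.
   * GFP_KERNEL_ACCOUNT is GFP_KERNEL | __GFP_ACCOUNT.
   */
  static struct page *prealloc_pgtable_page(void)
  {
          return alloc_page(GFP_KERNEL_ACCOUNT);
  }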

> > But let's assume we agree that we want to prevent userspace from ever OOMing
> > itself through async VM_BIND; then the proposed solution seems wrong:
> >
> > Do we really want the driver developer to set an arbitrary limit on the number
> > of jobs that can be submitted before *async* VM_BIND blocks and becomes
> > semi-sync?
> >
> > How do we choose this number of jobs? A very small number to be safe, which
> > scales badly on powerful machines? A large number that scales well on powerful
> > machines, but OOMs on weaker ones?
> 
> The way I am using it in msm, the credit amount and limit are in units of
> pre-allocated pages in flight.  I set the enqueue_credit_limit to 1024 pages;
> once the jobs queued up exceed that limit, further submissions start blocking.
> 
> The number of _jobs_ is irrelevant; what counts is the # of pre-allocated
> pages in flight.

That doesn't change my question, though. How do you know 1024 pages is a good
value? How do we scale it for machines with different capabilities?

If you have a powerful machine with lots of memory, we might throttle userspace
for no reason, no?

If the machine has very limited resources, it might already be too much?
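
If we keep a limit in the scheduler at all, I'd at least expect it to be
derived from the machine rather than hard-coded. Purely to illustrate the
scaling concern (hypothetical helper, made-up ratio and bounds; only
totalram_pages() and clamp() are real APIs here):

  #include <linux/minmax.h>
  #include <linux/mm.h>

  /*
   * Hypothetical: scale the enqueue credit limit (in units of
   * pre-allocated pages in flight) with system memory, e.g. 1/1024th
   * of RAM clamped to some bounds, instead of a fixed 1024 pages.
   */
  static unsigned long example_enqueue_credit_limit(void)
  {
          return clamp(totalram_pages() >> 10, 256UL, 16384UL);
  }

But even then, picking the ratio and the bounds is guesswork, which is why I'd
rather see the accounting issue addressed than a magic number in the scheduler.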
