Intel-GFX Archive on lore.kernel.org
* [PATCH 0/4] Nonblocking maps
@ 2011-09-22 23:27 Ben Widawsky
  2011-09-22 23:27 ` [PATCH] drm/i915: IOCTL to query the cache level of a BO Ben Widawsky
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Ben Widawsky @ 2011-09-22 23:27 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kilarski, Bernard R, Daniel Vetter

After going back and forth many times, I think Daniel and I have agreed
on the solution for non-blocking maps.

This new interface adds a new call to map buffers non-blocking if
possible. In actuality it may block, but it will track if the buffer
needs flushing or not and does the right thing for the use. This relies
on a new IOCTL to determine the cache type of the object.

With the posted benchmark, there are significant improvements on Gen5
with a very synthetic test meant to test unnecessary blocking on
architectures. The test is posted in the series for reference, but isn't
actually useful in its current form other than to prove there is a
potential performance improvement. The test shows no performance
regression on Gen6.

Ben

* [PATCH 0/5 v3] Nonblocking maps
@ 2011-09-26  1:35 Ben Widawsky
  2011-09-26  1:35 ` [PATCH] intel: non-blocking mmaps on the cheap Ben Widawsky
  0 siblings, 1 reply; 19+ messages in thread
From: Ben Widawsky @ 2011-09-26  1:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

After going back and forth many times, I think Daniel and I have agreed
on the solution for non-blocking maps.

This new interface adds a new call to map buffers non-blocking if
possible. In actuality it may block, but it will track if the buffer
needs flushing or not and does the right thing for the use. This relies
on a new IOCTL to determine the cache type of the object.

With the posted benchmark, there are significant improvements on Gen5
with a very synthetic test meant to test unnecessary blocking on
architectures. The test is posted in the series for reference, but isn't
actually useful in its current form other than to prove there is a
potential performance improvement. The test shows no performance
regression on Gen6.

v2: 
Some cleanups to libdrm, and mesa patch was messed up.
Lot of reworks on mesa (cleanup + usage of nonblocking)
Removed the gpu-tools test case since it was just reference

v3:
Mesa cleanups were total bonghits
Added a piglit test

Ben

* Re: [PATCH 1/6] RFCish: write only mappings (aka non-blocking)
@ 2011-09-20 19:19 Daniel Vetter
  2011-09-21  8:19 ` [PATCH] intel: non-blocking mmaps on the cheap Daniel Vetter
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Vetter @ 2011-09-20 19:19 UTC (permalink / raw)
  To: Eric Anholt; +Cc: Ben Widawsky, intel-gfx

On Tue, Sep 20, 2011 at 10:17:25AM -0700, Eric Anholt wrote:
> On Tue, 20 Sep 2011 13:06:43 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> > - Why do we need any patches for gtt non-blocking mmaps? I've re-read our
> >   code, and afaics we're only calling wait_rendering from gem_fault if
> >   obj->gtt_space == NULL. I.e. there's no way the gpu is currently using
> >   the data and hence no way for us to block on it. I think the only thing
> >   needed is a small libdrm batch to enable non-blocking gtt mmaps
> > 
> >   void drm_intel_enable_non_blocking_gtt_mmap(obj)
> > 
> >   which sets a bit somewhere and moves the obj (once) into the gtt domain.
> >   And a corresponding change in gtt_mmap to disable the set_domain call.
> >   This only works as long as no one else accesses the object from the cpu
> >   domain, but afaics we'll use non-blocking mmaps only for unshared
> >   buffers, so that should be fine.
> > 
> >   I might also just be dense and not see the issue ...
> 
> This was what I was looking for.  Ben was concerned that while warming
> up towards steady state, the page faults for new pages of the giant
> vertex buffer (for example) would end up blocking in the fault handler.
> I really have a hard time caring about that case.

Well, that can easily be handled by just prefaulting the full range on the
enable_non_blocking call. The thing I was concerned about was when we need
to move around the bo in the gtt to make some space and shoot down the
mappings to do so: On the first fault the bo is naturally not busy, but on
subsequent faults on other parts of the bo the gpu might already be using
it. But I've double-checked, and it looks like with your revert (commit
e92d03bf) we should be safe.
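
To make the prefaulting idea concrete, here is a minimal user-space sketch
of touching every page of an already-mapped range. The helper name
prefault_range is hypothetical (it is not a libdrm or kernel function), and
whether the real prefaulting would happen in libdrm or in the kernel is not
settled by the discussion above. It assumes the range starts on a page
boundary, as an mmap result does.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only: read one byte from every
 * page of an already-mapped range so that each page is faulted in up
 * front rather than on first use. Assumes addr is page-aligned.
 * Returns the number of pages touched.
 */
static size_t prefault_range(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = addr;
    size_t touched = 0;

    assert(addr != NULL && page > 0);

    for (size_t off = 0; off < len; off += (size_t)page) {
        (void)p[off];   /* the volatile read forces an access to this page */
        touched++;
    }
    return touched;
}
```

Calling this once over the whole mapping, right after it is set up, would
move the page faults out of the later upload path.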

Now for cpu coherent mmaps on machines/kernels that support llc caching: Would
you prefer libdrm to transparently use that for non-blocking maps if
available, or is an explicit feature-check with a separate cpu map
function preferred? I'm thinking of adding a new map_non_blocking
function to libdrm that either uses gtt mmaps, or cpu mmaps if
they're coherent. The risk I'm seeing with that approach is that future
hw gens might have slightly different semantics for these (e.g. funny
games with swizzling) so transparently using one instead of the other may
end up in headaches.  Otoh for untiled buffers to upload vertices, pixels,
and whatnot else, we should be fairly safe. And cpu mmaps for tiled buffers
are broken already, thanks to bit17 swizzling.
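
As a rough sketch of the transparent variant, the choice between the two
mapping paths could look like the following. Every name here (nb_map_kind,
bo_info, pick_non_blocking_map) is made up for illustration and is not
existing libdrm API; the two flags stand in for whatever the cache-level
query and the tiling state would actually report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical and exist only for this sketch. */
enum nb_map_kind { NB_MAP_GTT, NB_MAP_CPU };

struct bo_info {
    bool llc_coherent;  /* cpu mmaps of this bo are coherent (llc caching) */
    bool tiled;         /* bo uses a tiled layout */
};

/*
 * Pick the mapping path for a non-blocking map: use a cpu mmap only
 * when it is coherent and the buffer is untiled (cpu mmaps of tiled
 * buffers are broken by bit17 swizzling); otherwise use a gtt mmap.
 */
static enum nb_map_kind pick_non_blocking_map(const struct bo_info *bo)
{
    assert(bo != NULL);

    if (bo->llc_coherent && !bo->tiled)
        return NB_MAP_CPU;
    return NB_MAP_GTT;
}
```

An explicit feature-check variant would instead expose the llc_coherent
answer to the caller and leave the choice of map function to it.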

I think I can eke out a bit of time and whip up an rfc patchset in the
coming days.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48


Thread overview: 19+ messages
2011-09-22 23:27 [PATCH 0/4] Nonblocking maps Ben Widawsky
2011-09-22 23:27 ` [PATCH] drm/i915: IOCTL to query the cache level of a BO Ben Widawsky
2011-09-23  8:04   ` Daniel Vetter
2011-09-23  9:00   ` Chris Wilson
2011-09-22 23:27 ` [PATCH] intel: non-blocking mmaps on the cheap Ben Widawsky
2011-09-22 23:35   ` Ben Widawsky
2011-09-23  8:52     ` Daniel Vetter
2011-10-06 20:55   ` Eric Anholt
2011-10-06 22:56     ` Ben Widawsky
2011-09-22 23:27 ` [PATCH] i965: use nonblocking maps MapRangeBuffer Ben Widawsky
2011-09-23 17:15   ` Eric Anholt
2011-09-23 18:46     ` Ben Widawsky
2011-09-23 18:56       ` [Intel-gfx] " Ben Widawsky
2011-09-23 19:21         ` Ben Widawsky
2011-09-22 23:27 ` [PATCH] gpu-tools: nonblocking map test Ben Widawsky
  -- strict thread matches above, loose matches on Subject: below --
2011-09-26  1:35 [PATCH 0/5 v3] Nonblocking maps Ben Widawsky
2011-09-26  1:35 ` [PATCH] intel: non-blocking mmaps on the cheap Ben Widawsky
2011-09-20 19:19 [PATCH 1/6] RFCish: write only mappings (aka non-blocking) Daniel Vetter
2011-09-21  8:19 ` [PATCH] intel: non-blocking mmaps on the cheap Daniel Vetter
2011-09-21 18:11   ` Eric Anholt
2011-09-21 19:19     ` Daniel Vetter
