public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: oscar.mateo@intel.com
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 00/50] Execlists v2
Date: Tue, 13 May 2014 15:48:24 +0200	[thread overview]
Message-ID: <20140513134824.GH3908@phenom.ffwll.local> (raw)
In-Reply-To: <1399637360-4277-1-git-send-email-oscar.mateo@intel.com>

On Fri, May 09, 2014 at 01:08:30PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> For a description of this patchset, please check the previous cover letter [1].
> 
> Together with this patchset, I'm also submitting an IGT test: gem_execlist [2].
> 
> v2:
> - Use the same context struct for all the different engines (suggested by Brad Volkin).
> - Rename write_tail to submit (suggested by Brad).
> - Simplify hardware submission id creation by using LRCA[31:11] as hwCtxId[18:0].
> - Non-render contexts are only two pages long (suggested by Damien Lespiau).
> - Disable HWSTAM, as no one listens to it anyway (suggested by Damien).
> - Do not write PDPs in the context every time, doing it at context creation time is enough.
> - Various kmalloc changes in gen8_switch_context_queue (suggested by Damien).
> - Module parameter to disable Execlists (as per Damien's patches).
> - Update the HW read pointer in CONTEXT_STATUS_PTR (suggested by Damien).
> - Fixed gpu reset and basic error reporting (verified by the new gem_error_capture test).
> - Fix for unexpected full preemption in some scenarios (instead of lite restore).
> - Ack the context switch interrupts as soon as possible (fix by Bob Beckett).
> - Move default context backing object creation to intel_init_ring.
> - Take into account the second BSD ring.
> - Help out the ctx switch interrupt handler by sharing the burden of squashing requests
>   together.
> 
> What I haven't done in this release:
> 
> - Get the context sizes from the CXT_SIZE registers, as suggested by Damien: the BSpec is full 
>   of holes with regards to the various CXT_SIZE registers, but the hardcoded values seem pretty
>   clear.
> - Allocate the ringbuffer together with the context, as suggested by Damien: now that every
>   context has NUM_RINGS ringbuffers on it, the advantage of this is not clear anymore.
> - Damien pointed out that we are missing the RS context restore, but I don't see any RS values
>   that are needed on the first execution (the first save should take care of these).
> - I have added a comment to clarify how the context population takes place (MI_LOAD_REGISTER_IMM
>   plus <reg,value> pairs) but I haven't provided names for each position (as Jeff Mcgee suggested)
>   or created an OUT_BATCH_REG_WRITE(reg, value) (as Daniel Vetter suggested).
> 
> [1]
> http://lists.freedesktop.org/archives/intel-gfx/2014-March/042563.html
> [2]
> http://lists.freedesktop.org/archives/intel-gfx/2014-May/044846.html

I've done a very cursory read of this, and my original comment from my
original high-level review on the internal list still stands: I'm freaked
out by how invasive this is into the existing ring code. All the changes
in i915_dma.c look very suspicious, since that code is for the legacy ums
crap and will _never_ run on bdw. Nor even on anything more modern than
g4x platforms (gen4).

Apparently that review has been lost/ignored, so I'll quote it in full:

"In reading through your patches the big thing which jumped out
is how the new execlist code is intermingled with the old legacy gen8
framebuffer stuff. Imo those two modes don't match at all, and we need to
resolve this mismatch with another abstraction layer ;-)

"I'm thinking of dev_priv->gt.do_execbuf which takes care of all the
lower-level request tracking and command submission. Feel free to massively
bikeshed the name. I'm thinking that we should move everything from
i915_gem_execbuffer_move_to_gpu to i915_gem_execbuffer_retire_commands into
that callback. With the current code that'd include the active list tracking,
maybe we should move that part out again. Otoh if we go wild with scheduling
and preemption, active bo tracking _will_ be rather different from previous
platforms. To support execlist we might need some more vfuncs in the
ringbuffer struct to support execlist specific stuff (submit execlist, enable
context switch interrupts), but a lot of the existing stuff will be redudant.
At the end (once things settles) we should then document which kind of
do_execbuf uses which kinds of low-level ring interfaces.

"With that abstraction:
- We can separate gen8 execlist from legacy gen8 code, and so should avoid
regressions (and so blocking mesa).
- Play around with different execlist approaches (guc, deferred execution,
whatever, ...) since it'll just be differen copy&pasta.

"Maybe we also need a similar abstraction for a wait_seqno/request interface.
But since has/fulsim can't simulate interrupts properly that discussion is a
bit moot for now.

"Finally I think our immediate focus for execlist enabling should be to get
multi-context execlists going, so that we can validate whether that'll work
together with mesa/hw contexts. If it doesn't, not much point in bothering.
The simplest way is to just block in the ->do_execbuf callback if we can't
submit the new context right away. It'll suck a bit perf-wise, but will get
the hw going.

"In summary execlists looks like a big&invasive feature. My aim here with those
ideas is purely risk mitigation: I want to avoid committing resources for a
(potentially) dead-end design, while still giving us enough flexibility to do
the necessary prototyping to figure out the right answers."

That mail was from Mar 25th, 2013.

So essentially what I'd prefer is we keep all the existing ringbuffer code
as-is, and throw in a complete new set (with fairly new datastructures) in
for execlists. Then only interaction points would be:
- Ring init either calls into legacy ring init or new fancy execlist ring
  init.
- Execbuf calls ring->do_submit with ring/engine, ctx object, batch state
  and otherwise doesn't care one bit how it will all get submitted.
- Context state needs to be frobbed a bit so that we create the correct
  backing object (i.e. legacy hw state or execlist ring+ctx). To make this
  feasible it's probably best to switch the implicit per-fd ctx to be
  per-ring. That way we still have the fixed hw-contxt->ring/engine
  relationship and don't need to play tricks with lazy context allocation
  (because those beats are so big with execlists).

Yes this means that a bunch of things like e.g. seqno emission and flusing
in intel_ringbuffer.c will be duplicated into intel_lrc.c. But imo that's
a feature, not a bug - I would be massively suprised if there's not
subtile differences here we need to be able to cope with.

Cheers, Daniel
> 
> Ben Widawsky (13):
>   drm/i915: s/for_each_ring/for_each_active_ring
>   drm/i915: for_each_ring
>   drm/i915: Extract trivial parts of ring init (early init)
>   drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring
>     Contexts)
>   drm/i915/bdw: Rework init code for Logical Ring Contexts
>   drm/i915/bdw: A bit more advanced context init/fini
>   drm/i915/bdw: Populate LR contexts (somewhat)
>   drm/i915/bdw: Status page for LR contexts
>   drm/i915/bdw: Enable execlists in the hardware
>   drm/i915/bdw: LR context ring init
>   drm/i915/bdw: GEN8 new ring flush
>   drm/i915/bdw: Implement context switching (somewhat)
>   drm/i915/bdw: Print context state in debugfs
> 
> Michel Thierry (1):
>   drm/i915/bdw: Get prepared for a two-stage execlist submit process
> 
> Oscar Mateo (33):
>   drm/i915: Simplify a couple of functions thanks to for_each_ring
>   drm/i915: Extract ringbuffer destroy, make destroy & alloc outside
>     accesible
>   drm/i915: s/intel_ring_buffer/intel_engine
>   drm/i915: Split the ringbuffers and the rings
>   drm/i915: Rename functions that mention ringbuffers (meaning rings)
>   drm/i915: Plumb the context everywhere in the execbuffer path
>   drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit
>   drm/i915: Write a new set of context-aware ringbuffer management
>     functions
>   drm/i915: Final touches to ringbuffer and context plumbing and
>     refactoring
>   drm/i915: s/write_tail/submit
>   drm/i915: Introduce one context backing object per engine
>   drm/i915: Make i915_gem_create_context outside accessible
>   drm/i915: Option to skip backing object allocation during context
>     creation
>   drm/i915: Extract context backing object allocation
>   drm/i915/bdw: New file for Logical Ring Contexts and Execlists
>   drm/i915/bdw: Allocate ringbuffer backing objects for default global
>     LRC
>   drm/i915/bdw: Allocate ringbuffer for user-created LRCs
>   drm/i915/bdw: Deferred creation of user-created LRCs
>   drm/i915/bdw: Allow non-default, non-render, user-created LRCs
>   drm/i915/bdw: Execlists ring tail writing
>   drm/i915/bdw: Set the request context information correctly in the LRC
>     case
>   drm/i915/bdw: Always write seqno to default context
>   drm/i915/bdw: Write the tail pointer, LRC style
>   drm/i915/bdw: Don't write PDP in the legacy way when using LRCs
>   drm/i915/bdw: Start queueing contexts to be submitted
>   drm/i915/bdw: Display execlists info in debugfs
>   drm/i915/bdw: Display context backing obj & ringbuffer info in debugfs
>   drm/i915/bdw: Document execlists and logical ring contexts
>   drm/i915/bdw: Avoid non-lite-restore preemptions
>   drm/i915/bdw: Make sure gpu reset still works with Execlists
>   drm/i915/bdw: Make sure error capture keeps working with Execlists
>   drm/i915/bdw: Help out the ctx switch interrupt handler
>   drm/i915/bdw: Enable logical ring contexts
> 
> Thomas Daniel (3):
>   drm/i915/bdw: Add forcewake lock around ELSP writes
>   drm/i915/bdw: LR context switch interrupts
>   drm/i915/bdw: Handle context switch events
> 
>  drivers/gpu/drm/i915/Makefile              |   1 +
>  drivers/gpu/drm/i915/i915_cmd_parser.c     |  16 +-
>  drivers/gpu/drm/i915/i915_debugfs.c        | 180 ++++++-
>  drivers/gpu/drm/i915/i915_dma.c            |  48 +-
>  drivers/gpu/drm/i915/i915_drv.h            |  97 +++-
>  drivers/gpu/drm/i915/i915_gem.c            | 172 ++++---
>  drivers/gpu/drm/i915/i915_gem_context.c    | 220 +++++---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 ++--
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.h        |   2 +-
>  drivers/gpu/drm/i915/i915_gpu_error.c      |  19 +-
>  drivers/gpu/drm/i915/i915_irq.c            | 102 ++--
>  drivers/gpu/drm/i915/i915_params.c         |   6 +
>  drivers/gpu/drm/i915/i915_reg.h            |  11 +
>  drivers/gpu/drm/i915/i915_trace.h          |  26 +-
>  drivers/gpu/drm/i915/intel_display.c       |  26 +-
>  drivers/gpu/drm/i915/intel_drv.h           |   4 +-
>  drivers/gpu/drm/i915/intel_lrc.c           | 729 ++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_overlay.c       |  12 +-
>  drivers/gpu/drm/i915/intel_pm.c            |  18 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.c    | 792 +++++++++++++++++++----------
>  drivers/gpu/drm/i915/intel_ringbuffer.h    | 196 ++++---
>  22 files changed, 2107 insertions(+), 696 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/intel_lrc.c
> 
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

  parent reply	other threads:[~2014-05-13 13:48 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-09 12:08 [PATCH 00/50] Execlists v2 oscar.mateo
2014-05-09 12:08 ` [PATCH 01/50] drm/i915: s/for_each_ring/for_each_active_ring oscar.mateo
2014-05-09 12:08 ` [PATCH 02/50] drm/i915: for_each_ring oscar.mateo
2014-05-13 13:25   ` Daniel Vetter
2014-05-19 16:33   ` Volkin, Bradley D
2014-05-19 16:36     ` Mateo Lozano, Oscar
2014-05-09 12:08 ` [PATCH 03/50] drm/i915: Simplify a couple of functions thanks to for_each_ring oscar.mateo
2014-05-09 12:08 ` [PATCH 04/50] drm/i915: Extract trivial parts of ring init (early init) oscar.mateo
2014-05-13 13:26   ` Daniel Vetter
2014-05-13 13:47     ` Chris Wilson
2014-05-14 11:53     ` Mateo Lozano, Oscar
2014-05-14 12:28       ` Daniel Vetter
2014-05-09 12:08 ` [PATCH 05/50] drm/i915: Extract ringbuffer destroy, make destroy & alloc outside accesible oscar.mateo
2014-05-09 12:08 ` [PATCH 06/50] drm/i915: s/intel_ring_buffer/intel_engine oscar.mateo
2014-05-13 13:28   ` Daniel Vetter
2014-05-14 13:26     ` Damien Lespiau
2014-05-15 14:17       ` Mateo Lozano, Oscar
2014-05-15 20:52         ` Daniel Vetter
2014-05-19 10:02           ` Mateo Lozano, Oscar
2014-05-19 12:20             ` Daniel Vetter
2014-05-19 13:41               ` Mateo Lozano, Oscar
2014-05-19 13:52                 ` Daniel Vetter
2014-05-19 14:43                   ` Mateo Lozano, Oscar
2014-05-19 15:11                     ` Daniel Vetter
2014-05-19 15:26                       ` Mateo Lozano, Oscar
2014-05-19 15:49                         ` Daniel Vetter
2014-05-19 16:12                           ` Mateo Lozano, Oscar
2014-05-19 16:24                             ` Volkin, Bradley D
2014-05-19 16:33                               ` Mateo Lozano, Oscar
2014-05-19 16:40                                 ` Volkin, Bradley D
2014-05-19 16:49                                   ` Mateo Lozano, Oscar
2014-05-19 17:00                                     ` Volkin, Bradley D
2014-05-20  8:11                             ` Daniel Vetter
2014-05-09 12:08 ` [PATCH 07/50] drm/i915: Split the ringbuffers and the rings oscar.mateo
2014-05-09 12:08 ` [PATCH 08/50] drm/i915: Rename functions that mention ringbuffers (meaning rings) oscar.mateo
2014-05-09 12:08 ` [PATCH 09/50] drm/i915: Plumb the context everywhere in the execbuffer path oscar.mateo
2014-05-16 11:04   ` Chris Wilson
2014-05-16 11:11     ` Mateo Lozano, Oscar
2014-05-16 11:31       ` Chris Wilson
2014-05-09 12:08 ` [PATCH 10/50] drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit oscar.mateo
2014-05-09 12:08 ` [PATCH 11/50] drm/i915: Write a new set of context-aware ringbuffer management functions oscar.mateo
2014-05-09 12:08 ` [PATCH 12/50] drm/i915: Final touches to ringbuffer and context plumbing and refactoring oscar.mateo
2014-05-09 12:08 ` [PATCH 13/50] drm/i915: s/write_tail/submit oscar.mateo
2014-05-09 12:08 ` [PATCH 14/50] drm/i915: Introduce one context backing object per engine oscar.mateo
2014-05-09 12:08 ` [PATCH 15/50] drm/i915: Make i915_gem_create_context outside accessible oscar.mateo
2014-05-09 12:08 ` [PATCH 16/50] drm/i915: Option to skip backing object allocation during context creation oscar.mateo
2014-05-09 12:08 ` [PATCH 17/50] drm/i915: Extract context backing object allocation oscar.mateo
2014-05-09 12:08 ` [PATCH 18/50] drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring Contexts) oscar.mateo
2014-05-09 12:08 ` [PATCH 19/50] drm/i915/bdw: New file for Logical Ring Contexts and Execlists oscar.mateo
2014-05-09 12:08 ` [PATCH 20/50] drm/i915/bdw: Rework init code for Logical Ring Contexts oscar.mateo
2014-05-09 12:08 ` [PATCH 21/50] drm/i915/bdw: A bit more advanced context init/fini oscar.mateo
2014-05-09 12:08 ` [PATCH 22/50] drm/i915/bdw: Allocate ringbuffer backing objects for default global LRC oscar.mateo
2014-05-09 12:08 ` [PATCH 23/50] drm/i915/bdw: Allocate ringbuffer for user-created LRCs oscar.mateo
2014-05-09 12:08 ` [PATCH 24/50] drm/i915/bdw: Populate LR contexts (somewhat) oscar.mateo
2014-05-09 13:36   ` Damien Lespiau
2014-05-12 17:00   ` [PATCH v2 " oscar.mateo
2014-05-09 12:08 ` [PATCH 25/50] drm/i915/bdw: Deferred creation of user-created LRCs oscar.mateo
2014-05-09 12:08 ` [PATCH 26/50] drm/i915/bdw: Allow non-default, non-render, " oscar.mateo
2014-05-13 13:35   ` Daniel Vetter
2014-05-14 11:38     ` Mateo Lozano, Oscar
2014-05-09 12:08 ` [PATCH 27/50] drm/i915/bdw: Status page for LR contexts oscar.mateo
2014-05-09 12:08 ` [PATCH 28/50] drm/i915/bdw: Enable execlists in the hardware oscar.mateo
2014-05-09 12:08 ` [PATCH 29/50] drm/i915/bdw: Execlists ring tail writing oscar.mateo
2014-05-09 12:09 ` [PATCH 30/50] drm/i915/bdw: LR context ring init oscar.mateo
2014-05-09 12:09 ` [PATCH 31/50] drm/i915/bdw: Set the request context information correctly in the LRC case oscar.mateo
2014-05-09 12:09 ` [PATCH 32/50] drm/i915/bdw: GEN8 new ring flush oscar.mateo
2014-05-09 12:09 ` [PATCH 33/50] drm/i915/bdw: Always write seqno to default context oscar.mateo
2014-05-09 12:09 ` [PATCH 34/50] drm/i915/bdw: Implement context switching (somewhat) oscar.mateo
2014-05-09 12:09 ` [PATCH 35/50] drm/i915/bdw: Add forcewake lock around ELSP writes oscar.mateo
2014-05-09 12:09 ` [PATCH 36/50] drm/i915/bdw: Write the tail pointer, LRC style oscar.mateo
2014-05-09 12:09 ` [PATCH 37/50] drm/i915/bdw: Don't write PDP in the legacy way when using LRCs oscar.mateo
2014-05-09 12:09 ` [PATCH 38/50] drm/i915/bdw: LR context switch interrupts oscar.mateo
2014-05-09 12:09 ` [PATCH 39/50] drm/i915/bdw: Get prepared for a two-stage execlist submit process oscar.mateo
2014-05-09 12:09 ` [PATCH 40/50] drm/i915/bdw: Handle context switch events oscar.mateo
2014-06-11 11:52   ` Daniel Vetter
2014-06-11 12:02     ` Mateo Lozano, Oscar
2014-06-11 15:23       ` Mateo Lozano, Oscar
2014-06-12  6:53         ` Daniel Vetter
2014-05-09 12:09 ` [PATCH 41/50] drm/i915/bdw: Start queueing contexts to be submitted oscar.mateo
2014-05-09 12:09 ` [PATCH 42/50] drm/i915/bdw: Display execlists info in debugfs oscar.mateo
2014-05-09 12:09 ` [PATCH 43/50] drm/i915/bdw: Display context backing obj & ringbuffer " oscar.mateo
2014-05-09 12:09 ` [PATCH 44/50] drm/i915/bdw: Print context state " oscar.mateo
2014-05-09 12:09 ` [PATCH 45/50] drm/i915/bdw: Document execlists and logical ring contexts oscar.mateo
2014-05-09 12:09 ` [PATCH 46/50] drm/i915/bdw: Avoid non-lite-restore preemptions oscar.mateo
2014-05-09 12:09 ` [PATCH 47/50] drm/i915/bdw: Make sure gpu reset still works with Execlists oscar.mateo
2014-05-09 12:09 ` [PATCH 48/50] drm/i915/bdw: Make sure error capture keeps working " oscar.mateo
2014-05-09 12:09 ` [PATCH 49/50] drm/i915/bdw: Help out the ctx switch interrupt handler oscar.mateo
2014-06-11 11:50   ` Daniel Vetter
2014-06-11 12:01     ` Mateo Lozano, Oscar
2014-06-11 13:57       ` Daniel Vetter
2014-06-11 14:26         ` Mateo Lozano, Oscar
2014-05-09 12:09 ` [PATCH 50/50] drm/i915/bdw: Enable logical ring contexts oscar.mateo
2014-05-12 17:04 ` [PATCH 49.1/50] drm/i915/bdw: Do not call intel_runtime_pm_get() in an interrupt oscar.mateo
2014-05-13 13:48 ` Daniel Vetter [this message]
  -- strict thread matches above, loose matches on Subject: below --
2014-05-15 14:15 [PATCH 00/50] Execlists v2 Mateo Lozano, Oscar
2014-05-15 21:09 ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140513134824.GH3908@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=oscar.mateo@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox