public inbox for intel-gfx@lists.freedesktop.org
* Re: [PATCH 00/50] Execlists v2
@ 2014-05-15 14:15 Mateo Lozano, Oscar
  2014-05-15 21:09 ` Daniel Vetter
  0 siblings, 1 reply; 4+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-15 14:15 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx@lists.freedesktop.org

> I've done a very cursory read of this, and my original comment from my
> original high-level review on the internal list still stands: I'm freaked out by how
> invasive this is into the existing ring code. All the changes in i915_dma.c look
> very suspicious, since that code is for the legacy ums crap and will _never_ run
> on bdw. Nor even on anything more modern than g4x platforms (gen4).
> 
> Apparently that review has been lost/ignored, so I'll quote it in full:

Back in March 2013 I wasn't involved in Execlists (or, FWIW, i915) at all so, yes, I'm afraid it got lost. Sorry :(

> "In reading through your patches the big thing which jumped out is how the
> new execlist code is intermingled with the old legacy gen8 framebuffer stuff.
> Imo those two modes don't match at all, and we need to resolve this mismatch
> with another abstraction layer ;-)

IMHO, I have reduced the number of conflicts quite a lot (very few "if execlists enabled" checks remain scattered over the legacy code). Rather than "being intermingled", I would say it now "smoothly rides" on the legacy stuff.
If you give it a look with fresh eyes, maybe you will like it?

> "I'm thinking of dev_priv->gt.do_execbuf which takes care of all the lower-level
> request tracking and command submission. Feel free to massively bikeshed the
> name. I'm thinking that we should move everything from
> i915_gem_execbuffer_move_to_gpu to
> i915_gem_execbuffer_retire_commands into that callback. With the current
> code that'd include the active list tracking, maybe we should move that part out
> again. Otoh if we go wild with scheduling and preemption, active bo tracking
> _will_ be rather different from previous platforms. To support execlist we
> might need some more vfuncs in the ringbuffer struct to support execlist
> specific stuff (submit execlist, enable context switch interrupts), but a lot of the
> existing stuff will be redundant.
> At the end (once things settle) we should then document which kind of
> do_execbuf uses which kinds of low-level ring interfaces.

I'm afraid I would still need all the currently existing low-level ring interfaces, plus a bunch of new ones :(
(do_execbuf is not the only user of low-level ring interfaces, see below).
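
To make the proposal concrete, the vfunc split under discussion could be sketched roughly as follows. This is a standalone illustration, not driver code: struct gt_backend, struct submit_req, pick_backend() and the counters are all hypothetical names, and only do_execbuf comes from the quoted review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types: none of these names exist in i915. */
struct submit_req {
	int ctx_id;          /* context the batch belongs to */
	unsigned int batch;  /* stand-in for the batch buffer address */
};

struct gt_backend {
	const char *name;
	int (*init_ring)(void);
	int (*do_execbuf)(const struct submit_req *req);
};

/* Counters so the sketch can show which path was taken. */
static int legacy_submitted, execlists_submitted;

static int legacy_init_ring(void) { return 0; }
static int legacy_do_execbuf(const struct submit_req *req)
{
	assert(req != NULL);
	legacy_submitted++;      /* would write commands into the legacy ring */
	return 0;
}

static int execlists_init_ring(void) { return 0; }
static int execlists_do_execbuf(const struct submit_req *req)
{
	assert(req != NULL);
	execlists_submitted++;   /* would queue the context for submission */
	return 0;
}

static const struct gt_backend legacy_backend = {
	.name = "legacy",
	.init_ring = legacy_init_ring,
	.do_execbuf = legacy_do_execbuf,
};

static const struct gt_backend execlists_backend = {
	.name = "execlists",
	.init_ring = execlists_init_ring,
	.do_execbuf = execlists_do_execbuf,
};

/* Chosen once at init time; callers never test the mode again. */
static const struct gt_backend *pick_backend(bool has_execlists)
{
	return has_execlists ? &execlists_backend : &legacy_backend;
}
```

With a table like this, the execbuffer path would call backend->do_execbuf(&req) without any "if execlists enabled" checks. It does nothing for callers that write to the ring directly, which is the concern raised further down.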

> "With that abstraction:
> - We can separate gen8 execlist from legacy gen8 code, and so should avoid
> regressions (and so blocking mesa).
> - Play around with different execlist approaches (guc, deferred execution,
> whatever, ...) since it'll just be different copy&pasta.

Ok, yes, I understand what you say. Playing around with other submission approaches will become harder and harder if we reuse too much legacy code. But can't we do it as a cleanup on top of this code?

> "Finally I think our immediate focus for execlist enabling should be to get multi-
> context execlists going, so that we can validate whether that'll work together
> with mesa/hw contexts. If it doesn't, not much point in bothering.
> The simplest way is to just block in the ->do_execbuf callback if we can't submit
> the new context right away. It'll suck a bit perf-wise, but will get the hw going.

I think this is proved by now (QED!).

> So essentially what I'd prefer is we keep all the existing ringbuffer code as-is,
> and throw in a complete new set (with fairly new datastructures) in for
> execlists. Then only interaction points would be:
> - Ring init either calls into legacy ring init or new fancy execlist ring
>   init.
> - Execbuf calls ring->do_submit with ring/engine, ctx object, batch state
>   and otherwise doesn't care one bit how it will all get submitted.

That's easy for execbuffer, but what about the code that puts things directly into the ringbuffer? I am referring to constructs like:

intel_ring_begin()
intel_ring_emit()
...
intel_ring_emit()
intel_ring_advance()

And also others like direct calls to i915_add_request outside the execbuffer path.
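
For readers who do not know this pattern, here is a minimal standalone mock of the begin/emit/advance sequence. The mock_ring type and the mock_ring_* functions are hypothetical stand-ins, not the driver's intel_ring_* implementations, and the mock deliberately ignores free-space tracking and waiting for the hardware.

```c
#include <assert.h>
#include <stdint.h>

#define RING_DWORDS 64u  /* power of two, so the tail can wrap with a mask */

/* Hypothetical mock; the real ring state lives in the driver's structs. */
struct mock_ring {
	uint32_t buf[RING_DWORDS];
	uint32_t tail;       /* next dword the CPU will write */
	uint32_t submitted;  /* tail value last made visible to the "hardware" */
	uint32_t reserved;   /* dwords still owed from the last begin() */
};

/* Reserve room for num_dwords. A real ring would also wait for free space. */
static int mock_ring_begin(struct mock_ring *ring, uint32_t num_dwords)
{
	if (num_dwords == 0 || num_dwords >= RING_DWORDS)
		return -1;
	ring->reserved = num_dwords;
	return 0;
}

/* Write one dword at the tail and move the tail forward, wrapping around. */
static void mock_ring_emit(struct mock_ring *ring, uint32_t data)
{
	assert(ring->reserved > 0);
	ring->buf[ring->tail] = data;
	ring->tail = (ring->tail + 1) & (RING_DWORDS - 1);
	ring->reserved--;
}

/* Publish the new tail; every reserved dword must have been emitted. */
static void mock_ring_advance(struct mock_ring *ring)
{
	assert(ring->reserved == 0);
	ring->submitted = ring->tail;
}
```

The point of the mock is that each caller of this sequence assumes it knows which ring buffer it is writing to, so any new submission path has to keep these call sites working as well.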

> - Context state needs to be frobbed a bit so that we create the correct
>   backing object (i.e. legacy hw state or execlist ring+ctx). To make this
>   feasible it's probably best to switch the implicit per-fd ctx to be
>   per-ring. That way we still have the fixed hw-context->ring/engine
>   relationship and don't need to play tricks with lazy context allocation
>   (because those beasts are so big with execlists).

Sorry, I don't get this: no more implicit per-fd ctx? So everybody uses the Aliasing PPGTT by default again?

* [PATCH 00/50] Execlists v2
@ 2014-05-09 12:08 oscar.mateo
  2014-05-13 13:48 ` Daniel Vetter
  0 siblings, 1 reply; 4+ messages in thread
From: oscar.mateo @ 2014-05-09 12:08 UTC (permalink / raw)
  To: intel-gfx

From: Oscar Mateo <oscar.mateo@intel.com>

For a description of this patchset, please check the previous cover letter [1].

Together with this patchset, I'm also submitting an IGT test: gem_execlist [2].

v2:
- Use the same context struct for all the different engines (suggested by Brad Volkin).
- Rename write_tail to submit (suggested by Brad).
- Simplify hardware submission id creation by using LRCA[31:12] as hwCtxId[19:0].
- Non-render contexts are only two pages long (suggested by Damien Lespiau).
- Disable HWSTAM, as no one listens to it anyway (suggested by Damien).
- Do not write PDPs in the context every time, doing it at context creation time is enough.
- Various kmalloc changes in gen8_switch_context_queue (suggested by Damien).
- Module parameter to disable Execlists (as per Damien's patches).
- Update the HW read pointer in CONTEXT_STATUS_PTR (suggested by Damien).
- Fixed gpu reset and basic error reporting (verified by the new gem_error_capture test).
- Fix for unexpected full preemption in some scenarios (instead of lite restore).
- Ack the context switch interrupts as soon as possible (fix by Bob Beckett).
- Move default context backing object creation to intel_init_ring.
- Take into account the second BSD ring.
- Help out the ctx switch interrupt handler by sharing the burden of squashing requests
  together.
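
As an illustration of the submission id item above, deriving the ID from the LRCA could look like the sketch below. It assumes the context image is 4 KiB aligned, so the low 12 bits of the LRCA are zero and the page number fits in 20 bits. LRCA_PAGE_SHIFT and lrca_to_hw_ctx_id() are hypothetical names, not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

#define LRCA_PAGE_SHIFT 12u  /* assumption: 4 KiB-aligned context image */

/*
 * Hypothetical helper: use the page number of the Logical Ring Context
 * Address as the hardware context ID. Two live contexts cannot share a
 * page, so the resulting ID is unique without any extra allocator.
 */
static uint32_t lrca_to_hw_ctx_id(uint32_t lrca)
{
	/* The low bits must be clear for a page-aligned address. */
	assert((lrca & ((1u << LRCA_PAGE_SHIFT) - 1u)) == 0);
	return lrca >> LRCA_PAGE_SHIFT;
}
```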

What I haven't done in this release:

- Get the context sizes from the CXT_SIZE registers, as suggested by Damien: the BSpec is full
  of holes with regard to the various CXT_SIZE registers, but the hardcoded values seem pretty
  clear.
- Allocate the ringbuffer together with the context, as suggested by Damien: now that every
  context has NUM_RINGS ringbuffers on it, the advantage of this is not clear anymore.
- Damien pointed out that we are missing the RS context restore, but I don't see any RS values
  that are needed on the first execution (the first save should take care of these).
- I have added a comment to clarify how the context population takes place (MI_LOAD_REGISTER_IMM
  plus <reg,value> pairs) but I haven't provided names for each position (as Jeff McGee suggested)
  or created an OUT_BATCH_REG_WRITE(reg, value) (as Daniel Vetter suggested).
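
The population scheme mentioned in the last item (an MI_LOAD_REGISTER_IMM header followed by <reg,value> pairs) can be sketched standalone as below. The header encoding (opcode 0x22 in bits 28:23, length field 2*n-1) is what I understand the driver's macro to use. struct reg_write and emit_lri() are hypothetical, and the register offsets used with them are placeholders rather than real context offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed MI command layout: opcode in bits 28:23, length in the low bits. */
#define MI_INSTR(opcode, flags)  (((uint32_t)(opcode) << 23) | (flags))
#define MI_LOAD_REGISTER_IMM(n)  MI_INSTR(0x22, 2 * (n) - 1)

/* Hypothetical pair type: one register offset and the value to load. */
struct reg_write {
	uint32_t reg;
	uint32_t value;
};

/*
 * Hypothetical helper: write one LRI header followed by n <reg,value>
 * pairs into dst. Returns the number of dwords written, or 0 if n is
 * zero or dst is too small.
 */
static size_t emit_lri(uint32_t *dst, size_t dst_len,
		       const struct reg_write *writes, size_t n)
{
	size_t needed = 1 + 2 * n;
	size_t pos = 0;
	size_t i;

	if (n == 0 || needed > dst_len)
		return 0;

	dst[pos++] = MI_LOAD_REGISTER_IMM((uint32_t)n);
	for (i = 0; i < n; i++) {
		dst[pos++] = writes[i].reg;
		dst[pos++] = writes[i].value;
	}
	return pos;
}
```

A per-pair helper along the lines of the suggested OUT_BATCH_REG_WRITE(reg, value) would amount to the body of the loop above.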

[1]
http://lists.freedesktop.org/archives/intel-gfx/2014-March/042563.html
[2]
http://lists.freedesktop.org/archives/intel-gfx/2014-May/044846.html

Ben Widawsky (13):
  drm/i915: s/for_each_ring/for_each_active_ring
  drm/i915: for_each_ring
  drm/i915: Extract trivial parts of ring init (early init)
  drm/i915/bdw: Macro and module parameter for LRCs (Logical Ring
    Contexts)
  drm/i915/bdw: Rework init code for Logical Ring Contexts
  drm/i915/bdw: A bit more advanced context init/fini
  drm/i915/bdw: Populate LR contexts (somewhat)
  drm/i915/bdw: Status page for LR contexts
  drm/i915/bdw: Enable execlists in the hardware
  drm/i915/bdw: LR context ring init
  drm/i915/bdw: GEN8 new ring flush
  drm/i915/bdw: Implement context switching (somewhat)
  drm/i915/bdw: Print context state in debugfs

Michel Thierry (1):
  drm/i915/bdw: Get prepared for a two-stage execlist submit process

Oscar Mateo (33):
  drm/i915: Simplify a couple of functions thanks to for_each_ring
  drm/i915: Extract ringbuffer destroy, make destroy & alloc outside
    accesible
  drm/i915: s/intel_ring_buffer/intel_engine
  drm/i915: Split the ringbuffers and the rings
  drm/i915: Rename functions that mention ringbuffers (meaning rings)
  drm/i915: Plumb the context everywhere in the execbuffer path
  drm/i915: s/__intel_ring_advance/intel_ringbuffer_advance_and_submit
  drm/i915: Write a new set of context-aware ringbuffer management
    functions
  drm/i915: Final touches to ringbuffer and context plumbing and
    refactoring
  drm/i915: s/write_tail/submit
  drm/i915: Introduce one context backing object per engine
  drm/i915: Make i915_gem_create_context outside accessible
  drm/i915: Option to skip backing object allocation during context
    creation
  drm/i915: Extract context backing object allocation
  drm/i915/bdw: New file for Logical Ring Contexts and Execlists
  drm/i915/bdw: Allocate ringbuffer backing objects for default global
    LRC
  drm/i915/bdw: Allocate ringbuffer for user-created LRCs
  drm/i915/bdw: Deferred creation of user-created LRCs
  drm/i915/bdw: Allow non-default, non-render, user-created LRCs
  drm/i915/bdw: Execlists ring tail writing
  drm/i915/bdw: Set the request context information correctly in the LRC
    case
  drm/i915/bdw: Always write seqno to default context
  drm/i915/bdw: Write the tail pointer, LRC style
  drm/i915/bdw: Don't write PDP in the legacy way when using LRCs
  drm/i915/bdw: Start queueing contexts to be submitted
  drm/i915/bdw: Display execlists info in debugfs
  drm/i915/bdw: Display context backing obj & ringbuffer info in debugfs
  drm/i915/bdw: Document execlists and logical ring contexts
  drm/i915/bdw: Avoid non-lite-restore preemptions
  drm/i915/bdw: Make sure gpu reset still works with Execlists
  drm/i915/bdw: Make sure error capture keeps working with Execlists
  drm/i915/bdw: Help out the ctx switch interrupt handler
  drm/i915/bdw: Enable logical ring contexts

Thomas Daniel (3):
  drm/i915/bdw: Add forcewake lock around ELSP writes
  drm/i915/bdw: LR context switch interrupts
  drm/i915/bdw: Handle context switch events

 drivers/gpu/drm/i915/Makefile              |   1 +
 drivers/gpu/drm/i915/i915_cmd_parser.c     |  16 +-
 drivers/gpu/drm/i915/i915_debugfs.c        | 180 ++++++-
 drivers/gpu/drm/i915/i915_dma.c            |  48 +-
 drivers/gpu/drm/i915/i915_drv.h            |  97 +++-
 drivers/gpu/drm/i915/i915_gem.c            | 172 ++++---
 drivers/gpu/drm/i915/i915_gem_context.c    | 220 +++++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  85 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  41 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h        |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  19 +-
 drivers/gpu/drm/i915/i915_irq.c            | 102 ++--
 drivers/gpu/drm/i915/i915_params.c         |   6 +
 drivers/gpu/drm/i915/i915_reg.h            |  11 +
 drivers/gpu/drm/i915/i915_trace.h          |  26 +-
 drivers/gpu/drm/i915/intel_display.c       |  26 +-
 drivers/gpu/drm/i915/intel_drv.h           |   4 +-
 drivers/gpu/drm/i915/intel_lrc.c           | 729 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_overlay.c       |  12 +-
 drivers/gpu/drm/i915/intel_pm.c            |  18 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 792 +++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 196 ++++---
 22 files changed, 2107 insertions(+), 696 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_lrc.c

-- 
1.9.0
