* [PATCH 1/2] HAX enable guc submission for CI
@ 2017-02-14 11:44 Chris Wilson
2017-02-14 11:44 ` [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc Chris Wilson
2017-02-14 14:52 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] HAX enable guc submission for CI Patchwork
From: Chris Wilson @ 2017-02-14 11:44 UTC (permalink / raw)
To: intel-gfx
---
drivers/gpu/drm/i915/i915_params.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 2e9645e6555a..8fa96edddf9f 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -56,8 +56,8 @@ struct i915_params i915 __read_mostly = {
.verbose_state_checks = 1,
.nuclear_pageflip = 0,
.edp_vswing = 0,
- .enable_guc_loading = 0,
- .enable_guc_submission = 0,
+ .enable_guc_loading = 1,
+ .enable_guc_submission = 1,
.guc_log_level = -1,
.enable_dp_mst = true,
.inject_load_failure = 0,
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc
2017-02-14 11:44 [PATCH 1/2] HAX enable guc submission for CI Chris Wilson
@ 2017-02-14 11:44 ` Chris Wilson
2017-02-14 14:03 ` Joonas Lahtinen
2017-02-15 11:56 ` Tvrtko Ursulin
2017-02-14 14:52 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] HAX enable guc submission for CI Patchwork
From: Chris Wilson @ 2017-02-14 11:44 UTC (permalink / raw)
To: intel-gfx
This emulates execlists on top of the GuC in order to defer submission of
requests to the hardware. This deferral allows time for high-priority
requests to gazump their way to the head of the queue; however, it nerfs
the GuC by converting it back into a simple execlist (where the CPU has
to wake up after every request to feed new commands into the GuC).
v2: Drop hack status - though iirc there is still a lockdep inversion
between fence and engine->timeline->lock (which is impossible as the
nesting only occurs on different fences - hopefully just requires some
judicious lockdep annotation)
v3: Apply lockdep nesting to enabling signaling on the request, using
the pattern we already have in __i915_gem_request_submit();
v4: Replaying requests after a hang also now needs the timeline
spinlock, to disable the interrupts at least
v5: Hold wq lock for completeness, and emit a tracepoint for enabling signal
v6: Reorder interrupt checking for a happier gcc.
v7: Only signal the tasklet after a user-interrupt if using guc scheduling
v8: Restore lost update of rq through the i915_guc_irq_handler (Tvrtko)
v9: Avoid re-initialising the engine->irq_tasklet from inside a reset
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
drivers/gpu/drm/i915/i915_guc_submission.c | 113 +++++++++++++++++++++++++++--
drivers/gpu/drm/i915/i915_irq.c | 13 +++-
drivers/gpu/drm/i915/intel_lrc.c | 5 +-
3 files changed, 117 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8ced9e26f075..5860691aaa48 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -25,6 +25,8 @@
#include "i915_drv.h"
#include "intel_uc.h"
+#include <trace/events/dma_fence.h>
+
/**
* DOC: GuC-based command submission
*
@@ -348,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request)
u32 freespace;
int ret;
- spin_lock(&client->wq_lock);
+ spin_lock_irq(&client->wq_lock);
freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size);
freespace -= client->wq_rsvd;
if (likely(freespace >= wqi_size)) {
@@ -358,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request)
client->no_wq_space++;
ret = -EAGAIN;
}
- spin_unlock(&client->wq_lock);
+ spin_unlock_irq(&client->wq_lock);
return ret;
}
@@ -370,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request *request)
GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size);
- spin_lock(&client->wq_lock);
+ spin_lock_irq(&client->wq_lock);
client->wq_rsvd -= wqi_size;
- spin_unlock(&client->wq_lock);
+ spin_unlock_irq(&client->wq_lock);
}
/* Construct a Work Item and append it to the GuC's Work Queue */
@@ -532,10 +534,97 @@ static void __i915_guc_submit(struct drm_i915_gem_request *rq)
static void i915_guc_submit(struct drm_i915_gem_request *rq)
{
- i915_gem_request_submit(rq);
+ __i915_gem_request_submit(rq);
__i915_guc_submit(rq);
}
+static void nested_enable_signaling(struct drm_i915_gem_request *rq)
+{
+ /* If we use dma_fence_enable_sw_signaling() directly, lockdep
+ * detects an ordering issue between the fence lockclass and the
+ * global_timeline. This circular dependency can only occur via 2
+ * different fences (but same fence lockclass), so we use the nesting
+ * annotation here to prevent the warn, equivalent to the nesting
+ * inside i915_gem_request_submit() for when we also enable the
+ * signaler.
+ */
+
+ if (test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+ &rq->fence.flags))
+ return;
+
+ GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags));
+ trace_dma_fence_enable_signal(&rq->fence);
+
+ spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING);
+ intel_engine_enable_signaling(rq);
+ spin_unlock(&rq->lock);
+}
+
+static bool i915_guc_dequeue(struct intel_engine_cs *engine)
+{
+ struct execlist_port *port = engine->execlist_port;
+ struct drm_i915_gem_request *last = port[0].request;
+ unsigned long flags;
+ struct rb_node *rb;
+ bool submit = false;
+
+ spin_lock_irqsave(&engine->timeline->lock, flags);
+ rb = engine->execlist_first;
+ while (rb) {
+ struct drm_i915_gem_request *cursor =
+ rb_entry(rb, typeof(*cursor), priotree.node);
+
+ if (last && cursor->ctx != last->ctx) {
+ if (port != engine->execlist_port)
+ break;
+
+ i915_gem_request_assign(&port->request, last);
+ nested_enable_signaling(last);
+ port++;
+ }
+
+ rb = rb_next(rb);
+ rb_erase(&cursor->priotree.node, &engine->execlist_queue);
+ RB_CLEAR_NODE(&cursor->priotree.node);
+ cursor->priotree.priority = INT_MAX;
+
+ i915_guc_submit(cursor);
+ last = cursor;
+ submit = true;
+ }
+ if (submit) {
+ i915_gem_request_assign(&port->request, last);
+ nested_enable_signaling(last);
+ engine->execlist_first = rb;
+ }
+ spin_unlock_irqrestore(&engine->timeline->lock, flags);
+
+ return submit;
+}
+
+static void i915_guc_irq_handler(unsigned long data)
+{
+ struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
+ struct execlist_port *port = engine->execlist_port;
+ struct drm_i915_gem_request *rq;
+ bool submit;
+
+ do {
+ rq = port[0].request;
+ while (rq && i915_gem_request_completed(rq)) {
+ i915_gem_request_put(rq);
+ port[0].request = port[1].request;
+ port[1].request = NULL;
+ rq = port[0].request;
+ }
+
+ submit = false;
+ if (!port[1].request)
+ submit = i915_guc_dequeue(engine);
+ } while (submit);
+}
+
/*
* Everything below here is concerned with setup & teardown, and is
* therefore not part of the somewhat time-critical batch-submission
@@ -944,15 +1033,25 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
/* Take over from manual control of ELSP (execlists) */
for_each_engine(engine, dev_priv, id) {
struct drm_i915_gem_request *rq;
+ unsigned long flags;
- engine->submit_request = i915_guc_submit;
- engine->schedule = NULL;
+ /* The tasklet was initialised by execlists, and may be in
+ * a state of flux (across a reset) and so we just want to
+ * take over the callback without changing any other state
+ * in the tasklet.
+ */
+ engine->irq_tasklet.func = i915_guc_irq_handler;
/* Replay the current set of previously submitted requests */
+ spin_lock_irqsave(&engine->timeline->lock, flags);
list_for_each_entry(rq, &engine->timeline->requests, link) {
+ spin_lock(&client->wq_lock);
client->wq_rsvd += sizeof(struct guc_wq_item);
+ spin_unlock(&client->wq_lock);
+
__i915_guc_submit(rq);
}
+ spin_unlock_irqrestore(&engine->timeline->lock, flags);
}
return 0;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index cdc7da60d37a..aa886b5fb2cd 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1350,13 +1350,20 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
static __always_inline void
gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
{
- if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
- notify_ring(engine);
+ bool tasklet = false;
if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
- tasklet_hi_schedule(&engine->irq_tasklet);
+ tasklet = true;
}
+
+ if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
+ notify_ring(engine);
+ tasklet |= i915.enable_guc_submission;
+ }
+
+ if (tasklet)
+ tasklet_hi_schedule(&engine->irq_tasklet);
}
static irqreturn_t gen8_gt_irq_ack(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index af717e1356a9..09e57755c325 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1300,7 +1300,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
/* After a GPU reset, we may have requests to replay */
clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
- if (!execlists_elsp_idle(engine)) {
+ if (!i915.enable_guc_submission && !execlists_elsp_idle(engine)) {
DRM_DEBUG_DRIVER("Restarting %s from requests [0x%x, 0x%x]\n",
engine->name,
port_seqno(&engine->execlist_port[0]),
@@ -1385,9 +1385,6 @@ static void reset_common_ring(struct intel_engine_cs *engine,
request->ring->last_retired_head = -1;
intel_ring_update_space(request->ring);
- if (i915.enable_guc_submission)
- return;
-
/* Catch up with any missed context-switch interrupts */
if (request->ctx != port[0].request->ctx) {
i915_gem_request_put(port[0].request);
--
2.11.0
* Re: [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc
2017-02-14 11:44 ` [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc Chris Wilson
@ 2017-02-14 14:03 ` Joonas Lahtinen
2017-02-15 11:56 ` Tvrtko Ursulin
From: Joonas Lahtinen @ 2017-02-14 14:03 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On Tue, 2017-02-14 at 11:44 +0000, Chris Wilson wrote:
I already did, but here goes again:
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
* ✗ Fi.CI.BAT: failure for series starting with [1/2] HAX enable guc submission for CI
2017-02-14 11:44 [PATCH 1/2] HAX enable guc submission for CI Chris Wilson
2017-02-14 11:44 ` [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc Chris Wilson
@ 2017-02-14 14:52 ` Patchwork
From: Patchwork @ 2017-02-14 14:52 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] HAX enable guc submission for CI
URL : https://patchwork.freedesktop.org/series/19620/
State : failure
== Summary ==
Series 19620v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/19620/revisions/1/mbox/
Test gem_ctx_basic:
pass -> DMESG-WARN (fi-bxt-j4205)
Test gem_ctx_create:
Subgroup basic-files:
pass -> DMESG-WARN (fi-bxt-j4205)
Test gem_tiled_pread_basic:
pass -> DMESG-WARN (fi-bxt-j4205)
Test kms_busy:
Subgroup basic-flip-default-a:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-default-b:
pass -> DMESG-WARN (fi-bxt-j4205) fdo#99728
Subgroup basic-flip-default-c:
pass -> DMESG-WARN (fi-bxt-j4205)
Test kms_cursor_legacy:
Subgroup basic-busy-flip-before-cursor-atomic:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-busy-flip-before-cursor-legacy:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-after-cursor-atomic:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-after-cursor-legacy:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-after-cursor-varying-size:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-before-cursor-atomic:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-before-cursor-legacy:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-before-cursor-varying-size:
pass -> DMESG-WARN (fi-bxt-j4205)
Test kms_flip:
Subgroup basic-flip-vs-dpms:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-vs-modeset:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-flip-vs-wf_vblank:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup basic-plain-flip:
pass -> DMESG-WARN (fi-bxt-j4205)
Test kms_frontbuffer_tracking:
Subgroup basic:
pass -> DMESG-WARN (fi-bxt-j4205)
Test kms_pipe_crc_basic:
Subgroup bad-source:
pass -> DMESG-WARN (fi-bxt-j4205)
Subgroup suspend-read-crc-pipe-c:
pass -> FAIL (fi-skl-6700k)
fdo#99728 https://bugs.freedesktop.org/show_bug.cgi?id=99728
fi-bdw-5557u total:252 pass:241 dwarn:0 dfail:0 fail:0 skip:11
fi-bsw-n3050 total:252 pass:213 dwarn:0 dfail:0 fail:0 skip:39
fi-bxt-j4205 total:252 pass:213 dwarn:20 dfail:0 fail:0 skip:19
fi-bxt-t5700 total:83 pass:70 dwarn:0 dfail:0 fail:0 skip:12
fi-byt-j1900 total:252 pass:225 dwarn:0 dfail:0 fail:0 skip:27
fi-byt-n2820 total:252 pass:221 dwarn:0 dfail:0 fail:0 skip:31
fi-hsw-4770 total:252 pass:236 dwarn:0 dfail:0 fail:0 skip:16
fi-hsw-4770r total:252 pass:236 dwarn:0 dfail:0 fail:0 skip:16
fi-ilk-650 total:252 pass:202 dwarn:0 dfail:0 fail:0 skip:50
fi-ivb-3520m total:252 pass:234 dwarn:0 dfail:0 fail:0 skip:18
fi-ivb-3770 total:252 pass:234 dwarn:0 dfail:0 fail:0 skip:18
fi-kbl-7500u total:252 pass:234 dwarn:0 dfail:0 fail:0 skip:18
fi-skl-6260u total:252 pass:242 dwarn:0 dfail:0 fail:0 skip:10
fi-skl-6700hq total:252 pass:235 dwarn:0 dfail:0 fail:0 skip:17
fi-skl-6700k total:252 pass:229 dwarn:4 dfail:0 fail:1 skip:18
fi-skl-6770hq total:252 pass:242 dwarn:0 dfail:0 fail:0 skip:10
fi-snb-2520m total:252 pass:224 dwarn:0 dfail:0 fail:0 skip:28
fi-snb-2600 total:252 pass:223 dwarn:0 dfail:0 fail:0 skip:29
fd9871a3f519674bb9b29b219393a5a3d470a04b drm-tip: 2017y-02m-14d-12h-28m-14s UTC integration manifest
9051d79 drm/i915/scheduler: emulate a scheduler for guc
3539de1 HAX enable guc submission for CI
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3802/
* Re: [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc
2017-02-14 11:44 ` [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc Chris Wilson
2017-02-14 14:03 ` Joonas Lahtinen
@ 2017-02-15 11:56 ` Tvrtko Ursulin
2017-02-15 12:20 ` Chris Wilson
From: Tvrtko Ursulin @ 2017-02-15 11:56 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 14/02/2017 11:44, Chris Wilson wrote:
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index cdc7da60d37a..aa886b5fb2cd 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1350,13 +1350,20 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
> static __always_inline void
> gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
> {
> - if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
> - notify_ring(engine);
> + bool tasklet = false;
>
> if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
> set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> - tasklet_hi_schedule(&engine->irq_tasklet);
> + tasklet = true;
> }
> +
> + if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
> + notify_ring(engine);
> + tasklet |= i915.enable_guc_submission;
Hm.. noticed that BXT was quite unhappy in CI?
I wonder if we need to set ENGINE_IRQ_EXECLISTS here (and clear it in
the irq tasklet as well) to be fully identical between the two backends.
Not saying that was the reason for the CI unhappiness, just an
observation at this stage.
Regards,
Tvrtko
> + }
> +
> + if (tasklet)
> + tasklet_hi_schedule(&engine->irq_tasklet);
> }
* Re: [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc
2017-02-15 11:56 ` Tvrtko Ursulin
@ 2017-02-15 12:20 ` Chris Wilson
2017-02-15 12:33 ` Tvrtko Ursulin
From: Chris Wilson @ 2017-02-15 12:20 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Wed, Feb 15, 2017 at 11:56:28AM +0000, Tvrtko Ursulin wrote:
>
> On 14/02/2017 11:44, Chris Wilson wrote:
> >diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >index cdc7da60d37a..aa886b5fb2cd 100644
> >--- a/drivers/gpu/drm/i915/i915_irq.c
> >+++ b/drivers/gpu/drm/i915/i915_irq.c
> >@@ -1350,13 +1350,20 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
> > static __always_inline void
> > gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
> > {
> >- if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
> >- notify_ring(engine);
> >+ bool tasklet = false;
> >
> > if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
> > set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> >- tasklet_hi_schedule(&engine->irq_tasklet);
> >+ tasklet = true;
> > }
> >+
> >+ if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
> >+ notify_ring(engine);
> >+ tasklet |= i915.enable_guc_submission;
>
> Hm.. noticed that BXT was quite unhappy in CI?
>
> I wonder if we need to set ENGINE_IRQ_EXECLISTS here (and clear it
> in the irq tasklet as well) to be fully identical between the two
> backends.
>
> Not saying that was the reason for the CI unhappiness, just an
> observation at this stage.
I don't think we want to. Note that the bit will always be unset, so
that shouldn't be causing the intel_execlists_idle() timeout. I was
fearing a missed interrupt as we only idle once the request list is
empty, but that means we didn't process the interrupt. Or something
scary like that. (I do recall seeing missed interrupts on my skl with
guc btw, don't know if that is still a feature.)
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc
2017-02-15 12:20 ` Chris Wilson
@ 2017-02-15 12:33 ` Tvrtko Ursulin
2017-03-13 2:28 ` Dong, Chuanxiao
From: Tvrtko Ursulin @ 2017-02-15 12:33 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 15/02/2017 12:20, Chris Wilson wrote:
> On Wed, Feb 15, 2017 at 11:56:28AM +0000, Tvrtko Ursulin wrote:
>>
>> On 14/02/2017 11:44, Chris Wilson wrote:
>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>>> index cdc7da60d37a..aa886b5fb2cd 100644
>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>> @@ -1350,13 +1350,20 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
>>> static __always_inline void
>>> gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>>> {
>>> - if (iir & (GT_RENDER_USER_INTERRUPT << test_shift))
>>> - notify_ring(engine);
>>> + bool tasklet = false;
>>>
>>> if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) {
>>> set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>>> - tasklet_hi_schedule(&engine->irq_tasklet);
>>> + tasklet = true;
>>> }
>>> +
>>> + if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) {
>>> + notify_ring(engine);
>>> + tasklet |= i915.enable_guc_submission;
>>
>> Hm.. noticed that BXT was quite unhappy in CI?
>>
>> I wonder if we need to set ENGINE_IRQ_EXECLISTS here (and clear it
>> in the irq tasklet as well) to be fully identical between the two
>> backends.
>>
>> Not saying that was the reason for the CI unhappiness, just an
>> observation at this stage.
>
> I don't think we want to. Note that the bit will always be unset, so
> that shouldn't be causing the intel_execlists_idle() timeout. I was
Yeah agreed, but when you added the bit test in intel_execlists_idle,
the commit message said it is a sanity check before suspend. So
presumably it would make sense to have the same for the GuC tasklet.
> fearing a missed interrupt as we only idle once the request list is
> empty, but that means we didn't process the interrupt. Or something
> scary like that. (I do recall seeing missed interrupts on my skl with
> guc btw, don't know if that is still a feature.)
No idea and no BXT to test on. But it is something new compared to the
previous CI run. So something changed in the tree in the meantime to
cause it.
Regards,
Tvrtko
* Re: [PATCH 2/2] drm/i915/scheduler: emulate a scheduler for guc
2017-02-15 12:33 ` Tvrtko Ursulin
@ 2017-03-13 2:28 ` Dong, Chuanxiao
From: Dong, Chuanxiao @ 2017-03-13 2:28 UTC (permalink / raw)
To: Tvrtko Ursulin, Chris Wilson, intel-gfx@lists.freedesktop.org
Cc: intel-gvt-dev@lists.freedesktop.org
>
> On 15/02/2017 12:20, Chris Wilson wrote:
> > On Wed, Feb 15, 2017 at 11:56:28AM +0000, Tvrtko Ursulin wrote:
> >>
> >> On 14/02/2017 11:44, Chris Wilson wrote:
> >>
> >> Hm.. noticed that BXT was quite unhappy in CI?
> >>
> >> I wonder if we need to set ENGINE_IRQ_EXECLISTS here (and clear it in
> >> the irq tasklet as well) to be fully identical between the two
> >> backends.
> >>
> >> Not saying that was the reason for the CI unhappiness, just an
> >> observation at this stage.
> >
> > I don't think we want to. Note that the bit will always be unset, so
> > that shouldn't be causing the intel_execlists_idle() timeout. I was
>
> Yeah agreed, but when you added the bit test in intel_execlists_idle, the
> commit message said it is a sanity check before suspend. So presumably it
> would make sense to have the same for the GuC tasklet.
>
> > fearing a missed interrupt as we only idle once the request list is
> > empty, but that means we didn't process the interrupt. Or something
> > scary like that. (I do recall seeing missed interrupts on my skl with
> > guc btw, don't know if that is still a feature.)
>
> No idea and no BXT to test on. But it is something new compared to the
> previous CI run. So something changed in the tree in the meantime to cause
> it.
It seems the community has recently done some bug fixing for the GuC CI testing, and this patch can no longer be applied directly on the latest intel-gfx tree. Is there any plan to re-submit this patch? I am asking because the current i915 host GuC scheduling doesn't give GVT any chance to do its notification the way execlist submission does, but with this implementation, GuC can provide the right point to do the notification for GVT as well.
Thanks
Chuanxiao
>
> Regards,
>
> Tvrtko
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx