From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Cc: stable@vger.kernel.org, Chris Wilson <chris@chris-wilson.co.uk>
Subject: Re: [Intel-gfx] [PATCH 10/39] drm/i915: Cancel outstanding work after disabling heartbeats on an engine
Date: Sat, 29 Aug 2020 12:30:27 +0300 [thread overview]
Message-ID: <87r1rpztn0.fsf@gaia.fi.intel.com> (raw)
In-Reply-To: <20200826132811.17577-10-chris@chris-wilson.co.uk>
Chris Wilson <chris@chris-wilson.co.uk> writes:
> We only allow persistent requests to remain on the GPU past the closure
> of their containing context (and process) so long as they are continuously
> checked for hangs or allow other requests to preempt them, as we need to
> ensure forward progress of the system. If we allow persistent contexts
> to remain on the system after the the hangcheck mechanism is disabled,
> the system may grind to a halt. On disabling the mechanism, we sent a
> pulse along the engine to remove all executing contexts from the engine
> which would check for hung contexts -- but we did not prevent those
> contexts from being resubmitted if they survived the final hangcheck.
>
> Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs")
> Testcase: igt/gem_ctx_persistence/heartbeat-stop
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_engine.h | 9 +++++++++
> drivers/gpu/drm/i915/i915_request.c | 5 +++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 08e2c000dcc3..7c3a1012e702 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -337,4 +337,13 @@ intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
> return intel_engine_has_preemption(engine);
> }
>
> +static inline bool
> +intel_engine_has_heartbeat(const struct intel_engine_cs *engine)
> +{
> + if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
> + return false;
> +
> + return READ_ONCE(engine->props.heartbeat_interval_ms);
> +}
> +
> #endif /* _INTEL_RINGBUFFER_H_ */
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index a931b8b571d1..c187e1ec0278 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -542,8 +542,13 @@ bool __i915_request_submit(struct i915_request *request)
> if (i915_request_completed(request))
> goto xfer;
>
> + if (unlikely(intel_context_is_closed(request->context) &&
> + !intel_engine_has_heartbeat(engine)))
> + intel_context_set_banned(request->context);
> +
> if (unlikely(intel_context_is_banned(request->context)))
> i915_request_set_error_once(request, -EIO);
> +
> if (unlikely(fatal_error(request->fence.error)))
> __i915_request_skip(request);
>
> --
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2020-08-29 9:31 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-26 13:27 [Intel-gfx] [PATCH 01/39] drm/i915/gem: Avoid implicit vmap for highmem on x86-32 Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 02/39] drm/i915/gem: Use set_pte_at() for assigning the vmapped PTE Chris Wilson
2020-08-26 16:36 ` Matthew Auld
2020-08-26 16:55 ` Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 03/39] drm/i915/gem: Prevent using pgprot_writecombine() if PAT is not supported Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 04/39] drm/i915/gt: Clear the buffer pool age before use Chris Wilson
2020-08-26 17:15 ` Matthew Auld
2020-08-26 13:27 ` [Intel-gfx] [PATCH 05/39] drm/i915/gt: Widen CSB pointer to u64 for the parsers Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 06/39] drm/i915/gt: Wait for CSB entries on Tigerlake Chris Wilson
2020-08-28 14:04 ` Mika Kuoppala
2020-08-26 13:27 ` [Intel-gfx] [PATCH 07/39] drm/i915/gt: Apply the CSB w/a for all Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 08/39] drm/i915/gt: Show engine properties in the pretty printer Chris Wilson
2020-08-27 12:25 ` Mika Kuoppala
2020-08-26 13:27 ` [Intel-gfx] [PATCH 09/39] drm/i915/gem: Hold request reference for canceling an active context Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 10/39] drm/i915: Cancel outstanding work after disabling heartbeats on an engine Chris Wilson
2020-08-29 9:30 ` Mika Kuoppala [this message]
2020-08-26 13:27 ` [Intel-gfx] [PATCH 11/39] drm/i915/gt: Always send a pulse down the engine after disabling heartbeat Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 12/39] drm/i915/gem: Always test execution status on closing the context Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 13/39] drm/i915/gt: Signal cancelled requests Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 14/39] drm/i915/selftests: Finish pending mock requests on cancellation Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 15/39] drm/i915/gt: Retire cancelled requests on unload Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 16/39] drm/i915/gt: Remove defunct intel_virtual_engine_get_sibling() Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 17/39] drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 18/39] drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 19/39] drm/i915/gt: Don't cancel the interrupt shadow too early Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 20/39] drm/i915/gt: Free stale request on destroying the virtual engine Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 21/39] drm/i915/gt: Protect context lifetime with RCU Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 22/39] drm/i915/gt: Split the breadcrumb spinlock between global and contexts Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 23/39] drm/i915/gt: Move the breadcrumb to the signaler if completed upon cancel Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 24/39] drm/i915/gt: Decouple completed requests on unwind Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 25/39] drm/i915/gt: Check for a completed last request once Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 26/39] drm/i915/gt: Replace direct submit with direct call to tasklet Chris Wilson
2020-08-26 13:27 ` [Intel-gfx] [PATCH 27/39] drm/i915/gt: ce->inflight updates are now serialised Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 28/39] drm/i915/gt: Use virtual_engine during execlists_dequeue Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 29/39] drm/i915/gt: Decouple inflight virtual engines Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 30/39] drm/i915/gt: Defer schedule_out until after the next dequeue Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 31/39] drm/i915/gt: Remove virtual breadcrumb before transfer Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 32/39] drm/i915/gt: Shrink the critical section for irq signaling Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 33/39] drm/i915/gt: Resubmit the virtual engine on schedule-out Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 34/39] drm/i915/gt: Simplify virtual engine handling for execlists_hold() Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 35/39] drm/i915: Encode fence specific waitqueue behaviour into the wait.flags Chris Wilson
2020-09-02 14:02 ` Thomas Hellström (Intel)
2020-09-03 9:50 ` Thomas Hellström (Intel)
2020-09-03 11:32 ` Chris Wilson
2020-09-03 12:08 ` Thomas Hellström (Intel)
2020-08-26 13:28 ` [Intel-gfx] [PATCH 36/39] drm/i915/selftests: Confirm RING_TIMESTAMP / CTX_TIMESTAMP share a clock Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 37/39] drm/i915/gt: Consolidate the CS timestamp clocks Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 38/39] drm/i915: Break up error capture compression loops with cond_resched() Chris Wilson
2020-08-26 13:28 ` [Intel-gfx] [PATCH 39/39] drm/i915: Reduce GPU error capture mutex hold time Chris Wilson
2020-08-26 13:53 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/39] drm/i915/gem: Avoid implicit vmap for highmem on x86-32 Patchwork
2020-08-26 13:54 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-08-26 14:11 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-08-26 14:20 ` [Intel-gfx] [PATCH 01/39] " Matthew Auld
2020-08-26 20:43 ` Harald Arnesen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87r1rpztn0.fsf@gaia.fi.intel.com \
--to=mika.kuoppala@linux.intel.com \
--cc=chris@chris-wilson.co.uk \
--cc=intel-gfx@lists.freedesktop.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox