From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: John.C.Harrison@Intel.com, Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH v10b 4/6] drm/i915: Interrupt driven fences
Date: Tue, 21 Jun 2016 11:44:53 +0100 [thread overview]
Message-ID: <57691AA5.1040103@linux.intel.com> (raw)
In-Reply-To: <1466161520-36366-1-git-send-email-John.C.Harrison@Intel.com>
On 17/06/16 12:05, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The intended usage model for struct fence is that the signalled status
> should be set on demand rather than polled. That is, there should not
> be a need for a 'signaled' function to be called everytime the status
> is queried. Instead, 'something' should be done to enable a signal
> callback from the hardware which will update the state directly. In
> the case of requests, this is the seqno update interrupt. The idea is
> that this callback will only be enabled on demand when something
> actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback
> scheme. Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler (via a deferred work queue)
> then scans through the 'poke me' list when a new seqno pops out and
> signals any matching fence/request. The fence is then removed from the
> list so the entire request stack does not need to be scanned every
> time. The fence is added to the list before the commands to generate
> the seqno interrupt are added to the ring. Thus the sequence is
> guaranteed to be race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when
> __wait_request() is called). Thus there is still a potential race when
> enabling the interrupt as the request may already have completed.
> However, this is simply solved by calling the interrupt processing
> code immediately after enabling the interrupt and thereby checking for
> already completed requests.
>
> Lastly, the ring clean up code has the possibility to cancel
> outstanding requests (e.g. because TDR has reset the ring). These
> requests will never get signalled and so must be removed from the
> signal list manually. This is done by setting a 'cancelled' flag and
> then calling the regular notify/retire code path rather than
> attempting to duplicate the list manipulatation and clean up code in
> multiple places. This also avoids any race condition where the
> cancellation request might occur after/during the completion interrupt
> actually arriving.
>
> v2: Updated to take advantage of the request unreference no longer
> requiring the mutex lock.
>
> v3: Move the signal list processing around to prevent unsubmitted
> requests being added to the list. This was occurring on Android
> because the native sync implementation calls the
> fence->enable_signalling API immediately on fence creation.
>
> Updated after review comments by Tvrtko Ursulin. Renamed list nodes to
> 'link' instead of 'list'. Added support for returning an error code on
> a cancelled fence. Update list processing to be more efficient/safer
> with respect to spinlocks.
>
> v5: Made i915_gem_request_submit a static as it is only ever called
> from one place.
>
> Fixed up the low latency wait optimisation. The time delay between the
> seqno value being to memory and the drive's ISR running can be
> significant, at least for the wait request micro-benchmark. This can
> be greatly improved by explicitly checking for seqno updates in the
> pre-wait busy poll loop. Also added some documentation comments to the
> busy poll code.
>
> Fixed up support for the faking of lost interrupts
> (test_irq_rings/missed_irq_rings). That is, there is an IGT test that
> tells the driver to loose interrupts deliberately and then check that
> everything still works as expected (albeit much slower).
>
> Updates from review comments: use non IRQ-save spinlocking, early exit
> on WARN and improved comments (Tvrtko Ursulin).
>
> v6: Updated to newer nigthly and resolved conflicts around the
> wait_request busy spin optimisation. Also fixed a race condition
> between this early exit path and the regular completion path.
>
> v7: Updated to newer nightly - lots of ring -> engine renaming plus an
> interface change on get_seqno(). Also added a list_empty() check
> before acquring spinlocks and doing list processing.
>
> v8: Updated to newer nightly - changes to request clean up code mean
> non of the deferred free mess is needed any more.
>
> v9: Moved the request completion processing out of the interrupt
> handler and into a worker thread (Chris Wilson).
>
> v10: Changed to an un-ordered work queue to allow parallel processing
> of different engines. Also set the high priority flag for reduced
> latency. Removed some unnecessary checks for invalid seqno values.
> Improved/added some comments and WARNs. Moved a spinlock release a few
> lines later to make the 'locked' parameter of
> i915_gem_request_enable_interrupt redundant and removed it. Also
> shuffled the function around in the file so as to make it static and
> remove it from the header file. Corrected the use of
> fence_signal_locked() to fence_signal() in the retire code. Dropped
> the irq save part of the spin lock calls in the notify code as this is
> no longer called from the ISR. Changed the call of
> i915_gem_retire_requests_ring() in the reset cleanup code to
> i915_gem_request_notify() instead as the former is just duplicating a
> lot of operations.
> [Review comments from Maarten Lankhorst & Tvrtko Ursulin]
>
> Made the call to _notify() from _retire_requests_ring() conditional on
> interrupts not being enabled as it is only a race condition work
> around for that case. Also re-instated the lazy_coherency flag (but
> now on the _notify() function) to reduce the overhead of
> _retire_requests_ring() calling _notify() lots and lots (even with the
> anti-interrupt check).
>
> Updated for yet more nightly changes (u64 for fence context).
>
> v10b: Re-ordered the fence signal and IRQ release in _notify() to fix a race
> condition when disabling interrupts. Also, moved the wake_up_all()
> call from the IRQ handler to the worker thread to prevent the wake up
> of waiting threads from overtaking the signalling of the request.
I was concerned by the second part of this change which will increase
the wake-up latency for the waiters and has asked John do do some quick
low-level (gem_latency) testing.
Preliminary results were a bit strange with small batches experiencing
the expected slowdown but large one being significantly faster.
Since he is out this week I will try and run some more benchmarks on this.
To re-iterate, concern is moving the wake_up_all(&engine->irq_queue)
from notify_ring (hard irq) to the fence notify worker. This adds one
additional scheduling wakeup latency to the waiters which use the i915 API.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2016-06-21 10:45 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-16 12:54 [PATCH v10 0/6] Convert requests to use struct fence John.C.Harrison
2016-06-16 12:54 ` [PATCH v10 1/6] drm/i915: Add per context timelines for fence objects John.C.Harrison
2016-06-21 12:47 ` Maarten Lankhorst
2016-06-16 12:54 ` [PATCH v10 2/6] drm/i915: Convert requests to use struct fence John.C.Harrison
2016-06-21 12:58 ` Maarten Lankhorst
2016-06-16 12:54 ` [PATCH v10 3/6] drm/i915: Removed now redundant parameter to i915_gem_request_completed() John.C.Harrison
2016-06-16 12:54 ` [PATCH v10 4/6] drm/i915: Interrupt driven fences John.C.Harrison
2016-06-17 11:05 ` [PATCH v10b " John.C.Harrison
2016-06-21 10:44 ` Tvrtko Ursulin [this message]
2016-06-21 16:27 ` Tvrtko Ursulin
2016-06-27 18:28 ` John Harrison
2016-06-30 13:52 ` John.C.Harrison
2016-06-16 12:54 ` [PATCH v10 5/6] drm/i915: Updated request structure tracing John.C.Harrison
2016-06-17 11:06 ` [PATCH v10b " John.C.Harrison
2016-06-16 12:54 ` [PATCH v10 6/6] drm/i915: Cache last IRQ seqno to reduce IRQ overhead John.C.Harrison
2016-06-21 12:29 ` Maarten Lankhorst
2016-06-16 13:15 ` ✗ Ro.CI.BAT: failure for Convert requests to use struct fence (rev7) Patchwork
2016-06-17 11:10 ` John Harrison
2016-06-23 8:53 ` ✗ Ro.CI.BAT: failure for Convert requests to use struct fence (rev9) Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57691AA5.1040103@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=Intel-GFX@Lists.FreeDesktop.Org \
--cc=John.C.Harrison@Intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox