All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Harrison <John.C.Harrison@Intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH 3/4] drm/i915: Interrupt driven fences
Date: Fri, 26 Jun 2015 18:00:03 +0100	[thread overview]
Message-ID: <558D8513.1020409@Intel.com> (raw)
In-Reply-To: <20150626133405.GD7632@nuc-i3427.alporthouse.com>

On 26/06/2015 14:34, Chris Wilson wrote:
> On Fri, Jun 26, 2015 at 01:58:11PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The intended usage model for struct fence is that the signalled status should be
>> set on demand rather than polled. That is, there should not be a need for a
>> 'signaled' function to be called everytime the status is queried. Instead,
>> 'something' should be done to enable a signal callback from the hardware which
>> will update the state directly. In the case of requests, this is the seqno
>> update interrupt. The idea is that this callback will only be enabled on demand
>> when something actually tries to wait on the fence.
>>
>> This change removes the polling test and replaces it with the callback scheme.
>> Each fence is added to a 'please poke me' list at the start of
>> i915_add_request(). The interrupt handler then scans through the 'poke me' list
>> when a new seqno pops out and signals any matching fence/request. The fence is
>> then removed from the list so the entire request stack does not need to be
>> scanned every time. Note that the fence is added to the list before the commands
>> to generate the seqno interrupt are added to the ring. Thus the sequence is
>> guaranteed to be race free if the interrupt is already enabled.
>>
>> One complication here is that the 'poke me' system requires holding a reference
>> count on the request to guarantee that it won't be freed prematurely.
>> Unfortunately, it is unsafe to decrement the reference count from the interrupt
>> handler because if that is the last reference, the clean up code gets run and
>> the clean up code is not IRQ friendly. Hence, the request is added to a 'please
>> clean me' list that gets processed at retire time. Any request in this list
>> simply has its count decremented and is then removed from that list.
>>
>> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
>> called). Thus there is still a potential race when enabling the interrupt as the
>> request may already have completed. However, this is simply solved by calling
>> the interrupt processing code immediately after enabling the interrupt and
>> thereby checking for already completed requests.
>>
>> Lastly, the ring clean up code has the possibility to cancel outstanding
>> requests (e.g. because TDR has reset the ring). These requests will never get
>> signalled and so must be removed from the signal list manually. This is done by
>> setting a 'cancelled' flag and then calling the regular notify/retire code path
>> rather than attempting to duplicate the list manipulatation and clean up code in
>> multiple places. This also avoid any race condition where the cancellation
>> request might occur after/during the completion interrupt actually arriving.
> -nightly nop:
> Time to exec x 1:                15.000µs (ring=render)
> Time to exec x 1:                 2.000µs (ring=blt)
> Time to exec x 131072:            1.827µs (ring=render)
> Time to exec x 131072:            1.555µs (ring=blt)
>
> rq tuning patches nop:
> Time to exec x 1:		 12.200µs (ring=render)
> Time to exec x 1:		  1.600µs (ring=blt)
> Time to exec x 131072:		  1.516µs (ring=render)
> Time to exec x 131072:		  0.812µs (ring=blt)
>
> interrupt driven nop:
> Time to exec x 1:		 19.200µs (ring=render)
> Time to exec x 1:		  5.200µs (ring=blt)
> Time to exec x 131072:		  2.381µs (ring=render)
> Time to exec x 131072:		  2.009µs (ring=blt)
>
> So the basic question that is left unanswered from last time is why
> would we want to slow down __i915_wait_request? And enabling IRQs still
> generates very high system load when processing the 30-40k IRQs per
> second found under some workloads.
> -Chris
>
As previously stated, the scheduler requires enabling interrupts for 
each batch buffer as it needs to know when something more needs sending 
to the hardware. Android requires enabling interrupts for each batch 
buffer as it uses the sync framework to wait on batch buffer completion 
asynchronously to the driver (i.e. without calling __i915_wait_request 
or any other driver code). I presume much of the slow down to 
wait_request itself is because it has to check for missed interrupts. It 
should be possible to optimise that somewhat although it was completely 
unnecessary in the original version as you can't miss interrupts if they 
are always on.

How do you get consistent results from gem_exec_nop? For the x1 case, I 
see random variation from one run to the next of the order of 10us -> 
over 100us. And that is with a straight nightly build.

John.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2015-06-26 17:00 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-26 12:58 [PATCH 0/4] Convert requests to use struct fence John.C.Harrison
2015-06-26 12:58 ` [PATCH 1/4] drm/i915: " John.C.Harrison
2015-06-26 12:58 ` [PATCH 2/4] drm/i915: Removed now redudant parameter to i915_gem_request_completed() John.C.Harrison
2015-06-26 12:58 ` [PATCH 3/4] drm/i915: Interrupt driven fences John.C.Harrison
2015-06-26 13:34   ` Chris Wilson
2015-06-26 17:00     ` John Harrison [this message]
2015-06-26 17:19       ` Chris Wilson
2015-06-26 12:58 ` [PATCH 4/4] drm/i915: Updated request structure tracing John.C.Harrison
2015-06-28 15:07   ` shuang.he

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=558D8513.1020409@Intel.com \
    --to=john.c.harrison@intel.com \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=chris@chris-wilson.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.