public inbox for intel-gfx@lists.freedesktop.org
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] RFC drm/i915: Slaughter the thundering i915_wait_request herd
Date: Fri, 27 Nov 2015 13:53:34 +0000	[thread overview]
Message-ID: <5658605E.1060103@linux.intel.com> (raw)
In-Reply-To: <20151127130112.GC31982@nuc-i3427.alporthouse.com>


On 27/11/15 13:01, Chris Wilson wrote:
> On Fri, Nov 27, 2015 at 12:11:32PM +0000, Tvrtko Ursulin wrote:
>>
>> Hi,
>>
>> On 31/10/15 10:34, Chris Wilson wrote:
>>> One particularly stressful scenario consists of many independent tasks
>>> all competing for GPU time and waiting upon the results (e.g. realtime
>>> transcoding of many, many streams). One bottleneck in particular is that
>>> each client waits on its own results, but every client is woken up after
>>> every batchbuffer - hence the thunder of hooves as then every client must
>>> do its heavyweight dance to read a coherent seqno to see if it is the
>>> lucky one. Alternatively, we can have one worker responsible for waking
>>> after an interrupt, checking the seqno and only waking up the clients
>>> who are complete. The disadvantage is that in the uncontended scenario
>>> (i.e. only one waiter) we incur an extra context switch in the wakeup
>>> path - though that should be mitigated somewhat by the busywait we do
>>> first before sleeping.
>>
>> Could you explain the design in a bit more detail in the commit message?
>
> One worker responsible for waking up after an interrupt and waking
> clients who are complete?

I think more is required. At least explain the request tree, how it is 
built and used, and the flow a bit.

>> And add some more comments in the code, where key things are
>> happening, new struct members, etc?
>
> I did.

Maybe in some patch I don't see? The posted version has zero comments on 
the new members added to intel_engine_cs and incomplete comments for the 
same in drm_i915_gem_request.

And the code throughout has really too few comments relative to its 
complexity.

There is actually only one, about schedule(), and that is not really the 
main point! :) Nothing about the tree handling, state variables like 
irq_first and similar.

>>> +		timer.function = NULL;
>>> +		if (fake_irq || missed_irq(engine)) {
>>> +			setup_timer_on_stack(&timer,
>>> +					     (void (*)(unsigned long))fake_irq,
>>
>> Kaboom when it fires? :)
>
> It was s/fake_irq/wake_up_process/ at some point; obviously I forgot to
> merge that back in (the whole point of all that casting).
>
>>> +					     (unsigned long)current);
>>> +			mod_timer(&timer, jiffies + 1);
>>> +		}
>>
>> I still see no benefit in complicating things with a timer. It is
>> just a needlessly contrived way of doing a schedule_timeout. And I
>> don't see we would need to extend the mechanism for more precision
>> since it only comes into play in such borderline conditions that it
>> doesn't matter.
>
> I like the historical reasoning, having suffered the pain long ago. But
> there is less reason not to use schedule_timeout() unlike the only
> recently introduced io_schedule_timeout(). Fine. Have it your way!

Yay! :)

>>> +
>>> +		/* Unlike the individual clients, we do not want this
>>> +		 * background thread to contribute to the system load,
>>> +		 * i.e. we do not want to use io_schedule() here.
>>> +		 */
>>> +		schedule();
>>
>> I am also thinking about whether busy spin makes sense more in wait
>> request or here. Hm.. With that optimisation which makes only the
>> waiter on the next request in the queue spin it is OK to do it
>> there.
>
> No. I don't think that makes sense. To get here we have demonstrated that
> we are in a long-wait sequence, and the code already has the
> inclination to prevent busy-spins when we are waiting. Now we do introduce
> yet another context switch between the irq and client, but most of the
> latency is introduced by setting up the irq in the first place.

What doesn't make sense? To keep the busy spin, even optimized, in wait 
request after this patch?

>> (So please respin that patch series as well.)
>
> In review there was only a quick rename, but I also want to change it to
> a 10us timeout as that is favourable to more synchronous igt loads.

I thought I found a reversal of logic in "drm/i915: Only spin whilst 
waiting on the current request" which needs fixing?

>> But if that brings an improvement would a short spin be beneficial
>> here as well? Less so because waiters are already sleeping but could
>> a bit I suppose.
>
> Indeed, much less of an improvement due to the context switches.
>
>>> +void intel_engine_add_wakeup(struct drm_i915_gem_request *request)
>>> +{
>>> +	struct intel_engine_cs *engine = i915_gem_request_get_ring(request);
>>> +
>>> +	spin_lock(&engine->irq_lock);
>>> +	if (request->irq_count++ == 0) {
>>> +		struct rb_node **p, *parent;
>>> +		bool first;
>>> +
>>> +		if (RB_EMPTY_ROOT(&engine->irq_requests))
>>> +			schedule_work(&engine->irq_work);
>>
>> Worth thinking about a dedicated, maybe even CPU local work queue
>> for maximum responsiveness?
>
> Considered a kthread per ring. Downsides are that it consumes a slot for
> mostly idle threads, which are what the kworkers are for. (I am not keen
> on the complaints we would get for 6 kirq-i915/* threads.)

Also, I like the worker since, when there is no one waiting, it does not 
get woken up from notify_ring().

I was just wondering whether using the system wq makes it compete with 
other jobs; if so, a dedicated worker wouldn't harm. We could make it 
high priority as well. Not sure it would make sense to make it bound, 
though. The period from sleep to wakeup is probably too long for 
locality effects.

> What I have considered is doing intel_engine_add_wakeup() before the
> spin_request. If we have a spare CPU, that CPU will be enabling the IRQ
> whilst we do the busy spin. The complication is that doing the wakeup,
> schedule and rb insert isn't particularly cheap.

Yes, I thought about it as well; best leave those experiments for later.

Regards,

Tvrtko

P.S. And just realised this work is competing with the scheduler which 
changes all this again.



