To: Chris Wilson, intel-gfx@lists.freedesktop.org
References: <20210106163642.4405-2-chris@chris-wilson.co.uk> <20210107110555.19386-1-chris@chris-wilson.co.uk>
From: Tvrtko Ursulin
Organization: Intel Corporation UK Plc
Message-ID: <985bf8a3-bb4c-5281-f56c-162a4131e5c1@linux.intel.com>
Date: Thu, 7 Jan 2021 12:53:19 +0000
In-Reply-To: <20210107110555.19386-1-chris@chris-wilson.co.uk>
Subject: Re: [Intel-gfx] [PATCH v3] drm/i915/gt: Remove timeslice suppression
List-Id: Intel graphics driver community testing & development

On 07/01/2021 11:05, Chris Wilson wrote:
> In the next^W future patch, we remove the strict priority system and
> continuously re-evaluate the relative priority of tasks. As such we need
> to enable the timeslice whenever there is more than one context in the
> pipeline. This simplifies the decision and removes some of the tweaks to
> suppress timeslicing, allowing us to lift the timeslice enabling to a
> common spot at the end of running the submission tasklet.
>
> One consequence of the suppression is that it was reducing fairness
> between virtual engines on an oversaturated system, undermining the
> principle of timeslicing.
>
> v2: Commentary
> v3: Commentary for the right cancel_timer()
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2802
> Testcase: igt/gem_exec_balancer/fairslice
> Signed-off-by: Chris Wilson
> Cc: Tvrtko Ursulin
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_types.h |  10 -
>  .../drm/i915/gt/intel_execlists_submission.c | 204 +++++++++---------
>  2 files changed, 99 insertions(+), 115 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index 430066e5884c..df62e793e747 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -238,16 +238,6 @@ struct intel_engine_execlists {
>  	 */
>  	unsigned int port_mask;
>
> -	/**
> -	 * @switch_priority_hint: Second context priority.
> -	 *
> -	 * We submit multiple contexts to the HW simultaneously and would
> -	 * like to occasionally switch between them to emulate timeslicing.
> -	 * To know when timeslicing is suitable, we track the priority of
> -	 * the context submitted second.
> -	 */
> -	int switch_priority_hint;
> -
>  	/**
>  	 * @queue_priority_hint: Highest pending priority.
>  	 *
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index a5b442683c18..13be71a81a38 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -1148,25 +1148,6 @@ static void defer_active(struct intel_engine_cs *engine)
>  	defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq)));
>  }
>
> -static bool
> -need_timeslice(const struct intel_engine_cs *engine,
> -	       const struct i915_request *rq)
> -{
> -	int hint;
> -
> -	if (!intel_engine_has_timeslices(engine))
> -		return false;
> -
> -	hint = max(engine->execlists.queue_priority_hint,
> -		   virtual_prio(&engine->execlists));
> -
> -	if (!list_is_last(&rq->sched.link, &engine->active.requests))
> -		hint = max(hint, rq_prio(list_next_entry(rq, sched.link)));
> -
> -	GEM_BUG_ON(hint >= I915_PRIORITY_UNPREEMPTABLE);
> -	return hint >= effective_prio(rq);
> -}
> -
>  static bool
>  timeslice_yield(const struct intel_engine_execlists *el,
>  		const struct i915_request *rq)
> @@ -1186,76 +1167,74 @@ timeslice_yield(const struct intel_engine_execlists *el,
>  	return rq->context->lrc.ccid == READ_ONCE(el->yield);
>  }
>
> -static bool
> -timeslice_expired(const struct intel_engine_execlists *el,
> -		  const struct i915_request *rq)
> +static bool needs_timeslice(const struct intel_engine_cs *engine,
> +			    const struct i915_request *rq)
>  {
> +	if (!intel_engine_has_timeslices(engine))
> +		return false;
> +
> +	/* If not currently active, or about to switch, wait for next event */
> +	if (!rq || __i915_request_is_complete(rq))
> +		return false;
> +
> +	/* We do not need to start the timeslice until after the ACK */
> +	if (READ_ONCE(engine->execlists.pending[0]))
> +		return false;
> +
> +	/* If ELSP[1] is occupied, always check to see if worth slicing */
> +	if (!list_is_last_rcu(&rq->sched.link, &engine->active.requests))
> +		return true;
> +
> +	/* Otherwise, ELSP[0] is by itself, but may be waiting in the queue */
> +	if (!RB_EMPTY_ROOT(&engine->execlists.queue.rb_root))
> +		return true;
> +
> +	return !RB_EMPTY_ROOT(&engine->execlists.virtual.rb_root);
> +}
> +
> +static bool
> +timeslice_expired(struct intel_engine_cs *engine, const struct i915_request *rq)
> +{
> +	const struct intel_engine_execlists *el = &engine->execlists;
> +
> +	if (i915_request_has_nopreempt(rq) && __i915_request_has_started(rq))
> +		return false;
> +
> +	if (!needs_timeslice(engine, rq))
> +		return false;
> +
>  	return timer_expired(&el->timer) || timeslice_yield(el, rq);
>  }
>
> -static int
> -switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq)
> -{
> -	if (list_is_last(&rq->sched.link, &engine->active.requests))
> -		return engine->execlists.queue_priority_hint;
> -
> -	return rq_prio(list_next_entry(rq, sched.link));
> -}
> -
> -static inline unsigned long
> -timeslice(const struct intel_engine_cs *engine)
> +static unsigned long timeslice(const struct intel_engine_cs *engine)
>  {
>  	return READ_ONCE(engine->props.timeslice_duration_ms);
>  }
>
> -static unsigned long active_timeslice(const struct intel_engine_cs *engine)
> -{
> -	const struct intel_engine_execlists *execlists = &engine->execlists;
> -	const struct i915_request *rq = *execlists->active;
> -
> -	if (!rq || __i915_request_is_complete(rq))
> -		return 0;
> -
> -	if (READ_ONCE(execlists->switch_priority_hint) < effective_prio(rq))
> -		return 0;
> -
> -	return timeslice(engine);
> -}
> -
> -static void set_timeslice(struct intel_engine_cs *engine)
> +static void start_timeslice(struct intel_engine_cs *engine)
>  {
> +	struct intel_engine_execlists *el = &engine->execlists;
>  	unsigned long duration;
>
> -	if (!intel_engine_has_timeslices(engine))
> -		return;
> +	/* Disable the timer if there is nothing to switch to */
> +	duration = 0;
> +	if (needs_timeslice(engine, *el->active)) {
> +		/* Avoid continually prolonging an active timeslice */
> +		if (timer_active(&el->timer)) {
> +			/*
> +			 * If we just submitted a new ELSP after an old
> +			 * context, that context may have already consumed
> +			 * its timeslice, so recheck.
> +			 */
> +			if (!timer_pending(&el->timer))
> +				tasklet_hi_schedule(&engine->execlists.tasklet);
> +			return;
> +		}
>
> -	duration = active_timeslice(engine);
> -	ENGINE_TRACE(engine, "bump timeslicing, interval:%lu", duration);
> +		duration = timeslice(engine);
> +	}
>
> -	set_timer_ms(&engine->execlists.timer, duration);
> -}
> -
> -static void start_timeslice(struct intel_engine_cs *engine, int prio)
> -{
> -	struct intel_engine_execlists *execlists = &engine->execlists;
> -	unsigned long duration;
> -
> -	if (!intel_engine_has_timeslices(engine))
> -		return;
> -
> -	WRITE_ONCE(execlists->switch_priority_hint, prio);
> -	if (prio == INT_MIN)
> -		return;
> -
> -	if (timer_pending(&execlists->timer))
> -		return;
> -
> -	duration = timeslice(engine);
> -	ENGINE_TRACE(engine,
> -		     "start timeslicing, prio:%d, interval:%lu",
> -		     prio, duration);
> -
> -	set_timer_ms(&execlists->timer, duration);
> +	set_timer_ms(&el->timer, duration);
>  }
>
>  static void record_preemption(struct intel_engine_execlists *execlists)
> @@ -1368,16 +1347,32 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  		__unwind_incomplete_requests(engine);
>
>  		last = NULL;
> -	} else if (need_timeslice(engine, last) &&
> -		   timeslice_expired(execlists, last)) {
> +	} else if (timeslice_expired(engine, last)) {
>  		ENGINE_TRACE(engine,
> -			     "expired last=%llx:%lld, prio=%d, hint=%d, yield?=%s\n",
> -			     last->fence.context,
> -			     last->fence.seqno,
> -			     last->sched.attr.priority,
> +			     "expired:%s last=%llx:%lld, prio=%d, hint=%d, yield?=%s\n",
> +			     yesno(timer_expired(&execlists->timer)),
> +			     last->fence.context, last->fence.seqno,
> +			     rq_prio(last),
>  			     execlists->queue_priority_hint,
>  			     yesno(timeslice_yield(execlists, last)));
>
> +		/*
> +		 * Consume this timeslice; ensure we start a new one.
> +		 *
> +		 * The timeslice expired, and we will unwind the
> +		 * running contexts and recompute the next ELSP.
> +		 * If that submit will be the same pair of contexts
> +		 * (due to dependency ordering), we will skip the
> +		 * submission. If we don't cancel the timer now,
> +		 * we will see that the timer has expired and
> +		 * reschedule the tasklet; continually until the
> +		 * next context switch or other preemption event.
> +		 *
> +		 * Since we have decided to reschedule based on
> +		 * consumption of this timeslice, if we submit the
> +		 * same context again, grant it a full timeslice.
> +		 */
> +		cancel_timer(&execlists->timer);
>  		ring_set_paused(engine, 1);
>  		defer_active(engine);
>
> @@ -1413,7 +1408,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  			 * of timeslices, our queue might be.
>  			 */
>  			spin_unlock(&engine->active.lock);
> -			start_timeslice(engine, queue_prio(execlists));
>  			return;
>  		}
>  	}
> @@ -1440,7 +1434,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  			if (last && !can_merge_rq(last, rq)) {
>  				spin_unlock(&ve->base.active.lock);
>  				spin_unlock(&engine->active.lock);
> -				start_timeslice(engine, rq_prio(rq));
>  				return; /* leave this for another sibling */
>  			}
>
> @@ -1604,29 +1597,23 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  	execlists->queue_priority_hint = queue_prio(execlists);
>  	spin_unlock(&engine->active.lock);
>
> -	if (submit) {
> -		/*
> -		 * Skip if we ended up with exactly the same set of requests,
> -		 * e.g. trying to timeslice a pair of ordered contexts
> -		 */
> -		if (!memcmp(execlists->active,
> -			    execlists->pending,
> -			    (port - execlists->pending) * sizeof(*port)))
> -			goto skip_submit;
> -
> +	/*
> +	 * We can skip poking the HW if we ended up with exactly the same set
> +	 * of requests as currently running, e.g. trying to timeslice a pair
> +	 * of ordered contexts.
> +	 */
> +	if (submit &&
> +	    memcmp(execlists->active,
> +		   execlists->pending,
> +		   (port - execlists->pending) * sizeof(*port))) {
>  		*port = NULL;
>  		while (port-- != execlists->pending)
>  			execlists_schedule_in(*port, port - execlists->pending);
>
> -		execlists->switch_priority_hint =
> -			switch_prio(engine, *execlists->pending);
> -
>  		WRITE_ONCE(execlists->yield, -1);
>  		set_preempt_timeout(engine, *execlists->active);
>  		execlists_submit_ports(engine);
>  	} else {
> -		start_timeslice(engine, execlists->queue_priority_hint);
> -skip_submit:
>  		ring_set_paused(engine, 0);
>  		while (port-- != execlists->pending)
>  			i915_request_put(*port);
> @@ -1865,6 +1852,8 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
>  	 * we perform the READ_ONCE(*csb_write).
>  	 */
>  	rmb();
> +
> +	*inactive = NULL;
>  	do {
>  		bool promote;
>  		u64 csb;
> @@ -1984,8 +1973,6 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
>  		}
>  	} while (head != tail);
>
> -	set_timeslice(engine);
> -
>  	/*
>  	 * Gen11 has proven to fail wrt global observation point between
>  	 * entry and tail update, failing on the ordering and thus
> @@ -1999,6 +1986,14 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
>  	 */
>  	invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
>
> +	/*
> +	 * We assume that any event reflects a change in context flow
> +	 * and merits a fresh timeslice. We reinstall the timer after
> +	 * inspecting the queue to see if we need to resubmit.
> +	 */
> +	if (*inactive != *execlists->active) /* elide lite-restores */
> +		cancel_timer(&execlists->timer);
> +
>  	return inactive;
>  }
>
> @@ -2410,8 +2405,10 @@ static void execlists_submission_tasklet(unsigned long data)
>  			execlists_reset(engine, msg);
>  	}
>
> -	if (!engine->execlists.pending[0])
> +	if (!engine->execlists.pending[0]) {
>  		execlists_dequeue_irq(engine);
> +		start_timeslice(engine);
> +	}
>
>  	post_process_csb(post, inactive);
>  	rcu_read_unlock();
> @@ -3856,9 +3853,6 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>  		show_request(m, last, "\t\t", 0);
>  	}
>
> -	if (execlists->switch_priority_hint != INT_MIN)
> -		drm_printf(m, "\t\tSwitch priority hint: %d\n",
> -			   READ_ONCE(execlists->switch_priority_hint));
>  	if (execlists->queue_priority_hint != INT_MIN)
>  		drm_printf(m, "\t\tQueue priority hint: %d\n",
>  			   READ_ONCE(execlists->queue_priority_hint));
>

Reviewed-by: Tvrtko Ursulin

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx