From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752705AbbKPNaO (ORCPT ); Mon, 16 Nov 2015 08:30:14 -0500
Received: from mga03.intel.com ([134.134.136.65]:6236 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752194AbbKPNaL (ORCPT ); Mon, 16 Nov 2015 08:30:11 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.20,303,1444719600"; d="scan'208";a="601096364"
Date: Mon, 16 Nov 2015 15:30:06 +0200
From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?=
To: Chris Wilson, Tvrtko Ursulin, Jens Axboe, intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Daniel Vetter, Eero Tamminen, "Rantala, Valtteri", stable@kernel.vger.org
Subject: Re: [Intel-gfx] [PATCH 2/2] drm/i915: Limit the busy wait on requests to 2us not 10ms!
Message-ID: <20151116133006.GM4437@intel.com>
References: <1447594364-4206-1-git-send-email-chris@chris-wilson.co.uk> <1447594364-4206-2-git-send-email-chris@chris-wilson.co.uk> <5649AEED.9090807@linux.intel.com> <20151116111208.GQ569@nuc-i3427.alporthouse.com> <5649C728.5040109@linux.intel.com> <20151116125537.GS569@nuc-i3427.alporthouse.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20151116125537.GS569@nuc-i3427.alporthouse.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 16, 2015 at 12:55:37PM +0000, Chris Wilson wrote:
> On Mon, Nov 16, 2015 at 12:08:08PM +0000, Tvrtko Ursulin wrote:
> >
> > On 16/11/15 11:12, Chris Wilson wrote:
> > >On Mon, Nov 16, 2015 at 10:24:45AM +0000, Tvrtko Ursulin wrote:
> > >>On 15/11/15 13:32, Chris Wilson wrote:
> > >>>+static u64 local_clock_us(unsigned *cpu)
> > >>>+{
> > >>>+	u64 t;
> > >>>+
> > >>>+	*cpu = get_cpu();
> > >>>+	t = local_clock() >> 10;
> > >>
> > >>Needs comment I think to explicitly mention the approximation, or
> > >>maybe drop the _us suffix?
> > >
> > >I did consider _approx_us but thought that was overkill. A comment along
> > >the lines of
> > >/* Approximately convert ns to us - the error is less than the
> > >  * truncation!
> > >  */
> >
> > And the result is not used in subsequent calculations apart from
> > comparing against an approximate timeout?
>
> Exactly, the timeout is fairly arbitrary and defined in the same units.
> That we truncate is a much bigger cause for concern in terms of spinning
> accurately for a definite length of time.
>
> > >>>@@ -1161,7 +1183,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
> > >>>  		if (signal_pending_state(state, current))
> > >>>  			break;
> > >>>
> > >>>-		if (time_after_eq(jiffies, timeout))
> > >>>+		if (busywait_stop(timeout, cpu))
> > >>>  			break;
> > >>>
> > >>>  		cpu_relax_lowlatency();
> > >>>
> > >>
> > >>Otherwise looks good. Not sure what would you convert to 32-bit from
> > >>your follow up reply since you need us resolution?
> > >
> > >s/u64/unsigned long/ s/time_after64/time_after/
> > >
> > >32bits of us resolution gives us 1000s before wraparound between the two
> > >samples. And I hope that a 1000s doesn't pass between loops. Or if it does,
> > >the GPU managed to complete its task.
> >
> > Now I see that you did say low bits.. yes that sounds fine.
> >
> > Btw while you are optimizing things maybe pick up this micro
> > optimization: http://patchwork.freedesktop.org/patch/64339/
> >
> > Not in scope of this thread but under the normal development patch flow.
>
> There's a different series which looks at tackling the scalability issue
> with dozens of concurrent waiters. I have an equivalent patch there and
> one to tidy up the seqno query.
>
> > Btw2, any benchmark result changes with this?
>
> Spinning still gives the dramatic (2x) improvement in the microbenchmarks
> (over pure interrupt driven waits), so that improvement is preserved.

Previously the spinning also increased power consumption without
offering any significant performance difference for some workloads.
IIRC on my BYT the average CPU power consumption was ~100mW higher
(as reported by RAPL) with xonotic the-big-keybench.dem (1920x1200
w/ "High" settings, IIRC) but average fps wasn't improved. Might
be interesting to know how the improved spin code stacks up on
that front.

> There are a couple of interesting swings in the macro tests (compared to
> the previous jiffie patch) just above the noise level which could well be
> a change in the throttling/scheduling. (And those tests are also the
> ones that correspond to the greatest gains (10-40%) using spinning.)
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
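
For illustration only, here is a minimal userspace C sketch of the two ideas
raised in the quoted review: approximating the ns to us conversion with a
right shift by 10, and comparing 32-bit timestamps in a way that survives
wraparound. The names approx_ns_to_us, after32 and spin_expired are
hypothetical and do not come from the patch; the kernel's own time_after()
macro operates on unsigned long and is not used here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Approximate ns -> us by dividing by 1024 instead of 1000.
 * The result reads roughly 2.3% low, which is acceptable when the
 * timeout it is compared against is expressed in the same units.
 * (Hypothetical helper, not part of the patch.)
 */
static uint64_t approx_ns_to_us(uint64_t ns)
{
	return ns >> 10;
}

/*
 * Wrap-safe "a is later than b" for 32-bit counters. The unsigned
 * subtraction wraps modulo 2^32 and the signed cast turns "a little
 * later" into a negative difference, the same idea as time_after().
 * (Hypothetical helper; assumes the usual two's complement conversion.)
 */
static bool after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Hypothetical busy-wait check: has the deadline already passed? */
static bool spin_expired(uint32_t now_us, uint32_t deadline_us)
{
	return after32(now_us, deadline_us);
}
```

With these definitions, approx_ns_to_us(1000000) yields 976 rather than 1000,
and after32(5, 0xFFFFFFF0) is true because the counter has wrapped. The signed
comparison only holds while the two samples are less than 2^31 ticks apart,
which is roughly 2199 s at 1.024 us per tick.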