From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tvrtko Ursulin Subject: Re: [PATCH 2/2] drm/i915: Limit the busy wait on requests to 2us not 10ms! Date: Mon, 16 Nov 2015 10:24:45 +0000 Message-ID: <5649AEED.9090807@linux.intel.com> References: <1447594364-4206-1-git-send-email-chris@chris-wilson.co.uk> <1447594364-4206-2-git-send-email-chris@chris-wilson.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8"; Format="flowed" Content-Transfer-Encoding: base64 Return-path: In-Reply-To: <1447594364-4206-2-git-send-email-chris@chris-wilson.co.uk> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" To: Chris Wilson , Jens Axboe , intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: Daniel Vetter , Eero Tamminen , "Rantala, Valtteri" , stable@kernel.vger.org, dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751824AbbKPKYx (ORCPT ); Mon, 16 Nov 2015 05:24:53 -0500 Received: from mga14.intel.com ([192.55.52.115]:38739 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751380AbbKPKYv (ORCPT ); Mon, 16 Nov 2015 05:24:51 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,302,1444719600"; d="scan'208";a="601012809" Subject: Re: [PATCH 2/2] drm/i915: Limit the busy wait on requests to 2us not 10ms!
To: Chris Wilson , Jens Axboe , intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org References: <1447594364-4206-1-git-send-email-chris@chris-wilson.co.uk> <1447594364-4206-2-git-send-email-chris@chris-wilson.co.uk> Cc: dri-devel@lists.freedesktop.org, Daniel Vetter , Eero Tamminen , "Rantala, Valtteri" , stable@kernel.vger.org From: Tvrtko Ursulin Organization: Intel Corporation UK Plc Message-ID: <5649AEED.9090807@linux.intel.com> Date: Mon, 16 Nov 2015 10:24:45 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <1447594364-4206-2-git-send-email-chris@chris-wilson.co.uk> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 15/11/15 13:32, Chris Wilson wrote:
> When waiting for high frequency requests, the finite amount of time
> required to set up the irq and wait upon it limits the response rate. By
> busywaiting on the request completion for a short while we can service
> the high frequency waits as quick as possible. However, if it is a slow
> request, we want to sleep as quickly as possible. The tradeoff between
> waiting and sleeping is roughly the time it takes to sleep on a request,
> on the order of a microsecond. Based on measurements from big core, I
> have set the limit for busywaiting as 2 microseconds.

Sounds like solid reasoning. Would it also be worth finding the trade-off
limit for small core?

> The code currently uses the jiffie clock, but that is far too coarse (on
> the order of 10 milliseconds) and results in poor interactivity as the
> CPU ends up being hogged by slow requests. To get microsecond resolution
> we need to use a high resolution timer. The cheapest of which is polling
> local_clock(), but that is only valid on the same CPU. If we switch CPUs
> because the task was preempted, we can also use that as an indicator that
> the system is too busy to waste cycles on spinning and we should sleep
> instead.

Hm, wouldn't need_resched cover the CPU switch anyway? Or maybe
need_resched means something other than what I thought, which is "there
are other runnable tasks"?

This would also have an impact on the patch subject line. I thought we
would burn a jiffie of CPU cycles only if there are no other runnable
tasks - so how come there is an impact on interactivity?

Also, again, I think the commit message needs some data on how this was
found and what the impact is.

Btw, as it happens, just last week as I was playing with perf, I did
notice busy spinning is the top cycle waster in some benchmarks. I was
in the process of trying to quantify the difference with it on or off
but did not complete it.

> __i915_spin_request was introduced in
> commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
> Author: Chris Wilson
> Date:   Tue Apr 7 16:20:41 2015 +0100
>
>       drm/i915: Optimistically spin for the request completion
>
> Reported-by: Jens Axboe
> Link: https://lkml.org/lkml/2015/11/12/621
> Cc: Jens Axboe
> Cc; "Rogozhkin, Dmitry V"
> Cc: Daniel Vetter
> Cc: Tvrtko Ursulin
> Cc: Eero Tamminen
> Cc: "Rantala, Valtteri"
> Cc: stable@kernel.vger.org
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 28 +++++++++++++++++++++++++---
>   1 file changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 740530c571d1..2a88158bd1f7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1146,14 +1146,36 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
>   	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
>   }
>
> +static u64 local_clock_us(unsigned *cpu)
> +{
> +	u64 t;
> +
> +	*cpu = get_cpu();
> +	t = local_clock() >> 10;

I think this needs a comment to explicitly mention the approximation, or
maybe drop the _us suffix?

> +	put_cpu();
> +
> +	return t;
> +}
> +
> +static bool busywait_stop(u64 timeout, unsigned cpu)
> +{
> +	unsigned this_cpu;
> +
> +	if (time_after64(local_clock_us(&this_cpu), timeout))
> +		return true;
> +
> +	return this_cpu != cpu;
> +}
> +
>   static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   {
> -	unsigned long timeout;
> +	u64 timeout;
> +	unsigned cpu;
>
>   	if (i915_gem_request_get_ring(req)->irq_refcount)
>   		return -EBUSY;
>
> -	timeout = jiffies + 1;
> +	timeout = local_clock_us(&cpu) + 2;
>   	while (!need_resched()) {
>   		if (i915_gem_request_completed(req, true))
>   			return 0;
> @@ -1161,7 +1183,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   		if (signal_pending_state(state, current))
>   			break;
>
> -		if (time_after_eq(jiffies, timeout))
> +		if (busywait_stop(timeout, cpu))
>   			break;
>
>   		cpu_relax_lowlatency();
>

Otherwise looks good. Not sure what you would convert to 32-bit from
your follow-up reply, since you need us resolution?

Regards,

Tvrtko
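
For illustration only (not part of the patch or the review above): the approximation in question comes from shifting a nanosecond count right by 10, which divides by 1024 rather than 1000. A minimal user-space sketch follows, assuming the clock value is in nanoseconds as local_clock() returns; the helper names ns_to_us_shift and ns_to_us_exact are hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the "local_clock() >> 10" step in the
 * quoted patch: a cheap shift that divides by 1024, not 1000.
 */
static uint64_t ns_to_us_shift(uint64_t ns)
{
	return ns >> 10;
}

/* Hypothetical reference helper: exact integer nanoseconds to microseconds. */
static uint64_t ns_to_us_exact(uint64_t ns)
{
	return ns / 1000;
}

/*
 * Difference between the exact and the shifted result for a given
 * nanosecond count. The shifted value is never larger than the exact
 * one, so the result is non-negative.
 */
static uint64_t ns_to_us_error(uint64_t ns)
{
	return ns_to_us_exact(ns) - ns_to_us_shift(ns);
}
```

Under these assumptions each shifted "microsecond" is 1024 ns, so a timeout of "+ 2" units corresponds to roughly 2.05 us, and over one millisecond the shifted count reads 976 rather than 1000. That is about a 2.3% under-count, which is likely harmless for a spin limit but is the kind of detail a comment or a different function name would make explicit.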