From mboxrd@z Thu Jan 1 00:00:00 1970
From: Francisco Jerez
Subject: Re: [PATCH 2/3] drm/i915: Do not use iowait while waiting for the GPU
Date: Tue, 31 Jul 2018 12:25:48 -0700
Message-ID: <87a7q7cimb.fsf@riseup.net>
References: <20180730152522.31682-1-chris@chris-wilson.co.uk> <20180730152522.31682-3-chris@chris-wilson.co.uk> <87bmana77b.fsf@gaia.fi.intel.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============2107629540=="
Return-path:
Received: from mx1.riseup.net (mx1.riseup.net [198.252.153.129]) by gabe.freedesktop.org (Postfix) with ESMTPS id D64C0898A8 for ; Tue, 31 Jul 2018 19:42:45 +0000 (UTC)
In-Reply-To: <87bmana77b.fsf@gaia.fi.intel.com>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"
To: Mika Kuoppala , Chris Wilson , intel-gfx@lists.freedesktop.org
Cc: Eero Tamminen
List-Id: intel-gfx@lists.freedesktop.org

--===============2107629540==
Content-Type: multipart/signed; boundary="==-=-="; micalg=pgp-sha256; protocol="application/pgp-signature"

--==-=-=
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Mika Kuoppala writes:

> Chris Wilson writes:
>
>> A recent trend for cpufreq is to boost the CPU frequencies for
>> iowaiters, in particularly to benefit high frequency I/O. We do the same
>> and boost the GPU clocks to try and minimise time spent waiting for the
>> GPU. However, as the igfx and CPU share the same TDP, boosting the CPU
>> frequency will result in the GPU being throttled and its frequency being
>> reduced. Thus declaring iowait negatively impacts on GPU throughput.
>>
>> v2: Both sleeps!
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107410
>> References: 52ccc4314293 ("cpufreq: intel_pstate: HWP boost performance on IO wakeup")
>
> The commit above has it's own heuristics on when to actual ramp up,
> inspecting the interval of io waits.
>
> Regardless of that, with shared tdp, the waiter should not stand in a
> way.

I've been running some tests with this series (and your previous ones).
I still see statistically significant regressions in latency-sensitive
benchmarks with this series applied:

 qgears2/render-backend=XRender Extension/test-mode=Text:  XXX ±0.26% x12 -> XXX ±0.36% x15  d=-0.97% ±0.32%  p=0.00%
 lightsmark:                                               XXX ±0.51% x22 -> XXX ±0.49% x20  d=-1.58% ±0.50%  p=0.00%
 gputest/triangle:                                         XXX ±0.67% x10 -> XXX ±1.76% x20  d=-1.73% ±1.47%  p=0.52%
 synmark/OglMultithread:                                   XXX ±0.47% x10 -> XXX ±1.06% x20  d=-3.59% ±0.88%  p=0.00%

Numbers above are from a partial benchmark run on BXT J3455 -- I'm still
waiting to get the results of a full run though.

Worse, in combination with my intel_pstate branch the effect of this
patch is strictly negative.  There are no improvements because the
cpufreq governor is able to figure out by itself that boosting the
frequency of the CPU under GPU-bound conditions cannot possibly help
(The HWP boost logic could be fixed to do the same thing easily which
would allow us to obtain the best of both worlds on big core).  The
reason for the regressions is that IOWAIT is a useful signal for the
cpufreq governor to provide reduced latency in applications that are
unable to parallelize enough work between CPU and the IO device -- The
upstream governor is just using it rather ineffectively.

> And that it fixes a regression:
>

This patch isn't necessary anymore to fix the regression, there is
another change going in that mitigates the problem [1].  Can we please
keep the IO schedule calls here?
(and elsewhere in the atomic commit code)

[1] https://lkml.org/lkml/2018/7/30/880

> Reviewed-by: Mika Kuoppala
>
> On other way around, the atomic commit code on updating
> planes, could potentially benefit of changing to the
> io_schedule_timeout. (and/or adopting c state limits)
>
> -Mika
>
>> Signed-off-by: Chris Wilson
>> Cc: Tvrtko Ursulin
>> Cc: Joonas Lahtinen
>> Cc: Eero Tamminen
>> Cc: Francisco Jerez
>> ---
>> drivers/gpu/drm/i915/i915_request.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>> index f3ff8dbe363d..3e48ea87b324 100644
>> --- a/drivers/gpu/drm/i915/i915_request.c
>> +++ b/drivers/gpu/drm/i915/i915_request.c
>> @@ -1376,7 +1376,7 @@ long i915_request_wait(struct i915_request *rq,
>> goto complete;
>> }
>>
>> - timeout = io_schedule_timeout(timeout);
>> + timeout = schedule_timeout(timeout);
>> } while (1);
>>
>> GEM_BUG_ON(!intel_wait_has_seqno(&wait));
>> @@ -1414,7 +1414,7 @@ long i915_request_wait(struct i915_request *rq,
>> wait.seqno - 1))
>> qos = wait_dma_qos_add();
>>
>> - timeout = io_schedule_timeout(timeout);
>> + timeout = schedule_timeout(timeout);
>>
>> if (intel_wait_complete(&wait) &&
>> intel_wait_check_request(&wait, rq))
>> -- 
>> 2.18.0

--=-=-=--

--==-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iHUEAREIAB0WIQST8OekYz69PM20/4aDmTidfVK/WwUCW2C3vAAKCRCDmTidfVK/
W/JyAP9GPO7y6SpjiAZxhYXlxCKFpWHeBdlBazNhYaeeMhYxdQEAk2avrmzFw9U5
OQv/jisyq1mQhmbsJe3H0CujWZbUIw8=
=lKxy
-----END PGP SIGNATURE-----
--==-=-=--

--===============2107629540==--