From: Francisco Jerez
Subject: Re: [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
Date: Wed, 11 Jan 2017 16:05:00 -0800
Message-ID: <87pojtrvmb.fsf@riseup.net>
References: <20170108073137.18665-1-currojerez@riseup.net> <20170109210756.3593-1-currojerez@riseup.net> <20170111081734.77p2iq6wmt7nekza@phenom.ffwll.local> <877f61er5i.fsf@gaia.fi.intel.com> <20170111122459.GC18077@nuc-i3427.alporthouse.com> <20170111132405.wevk5e5eboghpwfn@phenom.ffwll.local>
In-Reply-To: <20170111132405.wevk5e5eboghpwfn@phenom.ffwll.local>
To: Daniel Vetter, Chris Wilson, Mika Kuoppala Daniel Vetter, Jani Nikula, intel-gfx@lists.freedesktop.org, Eero Tamminen, beignet@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

Daniel Vetter writes:

> On Wed, Jan 11, 2017 at 12:24:59PM +0000, Chris Wilson wrote:
>> On Wed, Jan 11, 2017 at 02:07:37PM +0200, Mika Kuoppala wrote:
>> > Daniel Vetter writes:
>> >
>> > > On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
>> > >> The WaDisableLSQCROPERFforOCL workaround has the side effect of
>> > >> disabling an L3SQ optimization that has huge performance implications
>> > >> and is unlikely to be necessary for the correct functioning of usual
>> > >> graphic workloads.
>> > >> Userspace is free to re-enable the workaround on
>> > >> demand, and is generally in a better position to determine whether the
>> > >> workaround is necessary than the DRM is (e.g. only during the
>> > >> execution of compute kernels that rely on both L3 fences and HDC R/W
>> > >> requests).
>> > >>
>> > >> The same workaround seems to apply to BDW (at least to production
>> > >> stepping G1) and SKL as well (the internal workaround database claims
>> > >> that it does for all steppings, while the BSpec workaround table only
>> > >> mentions pre-production steppings), but the DRM doesn't do anything
>> > >> beyond whitelisting the L3SQCREG4 register so userspace can enable it
>> > >> when it sees fit.  Do the same on KBL platforms.
>> > >>
>> > >> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
>> > >> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
>> > >> This is followed by a regression of 35% and 10% respectively for the
>> > >> same benchmarks and platform caused by my recent patch series
>> > >> switching userspace to use the dataport constant cache instead of the
>> > >> sampler to implement uniform pull constant loads, which caused us to
>> > >> hit more heavily the L3 cache (and on platforms other than KBL had the
>> > >> opposite effect of improving performance of the same two benchmarks).
>> > >> The overall effect on KBL of this change combined with the recent
>> > >> userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
>> > >> was affected by the constant cache changes (though it improved as it
>> > >> did on other platforms rather than regressing), but is not
>> > >> significantly affected by this patch (with statistical significance of
>> > >> 5% and sample size 20).
>> > >>
>> > >> v2: Drop some more code to avoid unused variable warning.
>> > >>
>> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
>> > >> Signed-off-by: Francisco Jerez
>> > >> Cc: Eero Tamminen
>> > >> Cc: Jani Nikula
>> > >> Cc: Mika Kuoppala
>> > >> Cc: beignet@lists.freedesktop.org
>> > >
>> > > Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
>> > > for compute kernels? Are the patches for mesa compute/beignet
>> > > ready&reviewed?
>> >
>> > This is explicit setting on kbl/E0 only. So one could argue
>> > that unless they filter based on PCI-IDs, things would already
>> > blow up across the skl/kbl population, if they forgot
>> > to set it. The whitelisting is in place and looks sane
>> > so this E0 exception is a wart that got in by me reading wa
>> > database slavishly without thinking.
>>
>> Add Fixes then?
>
> Yeah, cc: stable would be good to make sure it shows up in all supported
> kernels, fast. Otherwise we'll get some good wtf bug reports.

Agreed -- It would be nice for this to get to stable kernel branches.

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
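
As a quick sanity check on the combined figures quoted in the commit message: relative changes compound multiplicatively rather than add. A minimal Python sketch, using only the rounded percentages from the message (the helper name `combined_change` is hypothetical and not part of any tool mentioned in this thread):

```python
def combined_change(*relative_changes):
    """Compound a sequence of relative changes (0.60 means +60%)."""
    factor = 1.0
    for change in relative_changes:
        # Each change scales the running result by (1 + change).
        factor *= 1.0 + change
    return factor - 1.0


# Rounded inputs taken from the commit message (assumed, not re-measured):
# gl_manhattan31: +60% from this patch, -35% from the userspace change.
manhattan = combined_change(0.60, -0.35)
# gl_4 (car chase): +14% from this patch, -10% from the userspace change.
car_chase = combined_change(0.14, -0.10)

print(f"gl_manhattan31: {manhattan:+.1%}")
print(f"gl_4:           {car_chase:+.1%}")
```

With the rounded inputs this gives roughly 4% for gl_manhattan31 and 2.6% for gl_4. The small gap to the quoted 4.6% is what one would expect if the original figure was computed from unrounded measurements, though that is an assumption on my part.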