From mboxrd@z Thu Jan 1 00:00:00 1970
From: Francisco Jerez
Subject: Re: [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE
Date: Wed, 28 Jun 2017 14:10:40 -0700
Message-ID: <87fuejrf73.fsf@riseup.net>
References: <20170504150245.wuahcnen5ebbg4js@boom>
 <87wp9wbj4p.fsf@riseup.net>
 <20170504200333.GK24019@nuc-i3427.alporthouse.com>
 <87k25wbanz.fsf@riseup.net>
 <149864516191.8075.9598564471838782681@mail.alporthouse.com>
In-Reply-To: <149864516191.8075.9598564471838782681@mail.alporthouse.com>
To: Chris Wilson
Cc: David Weinehall, intel-gfx@lists.freedesktop.org, stable@vger.kernel.org
List-Id: intel-gfx@lists.freedesktop.org

Chris Wilson writes:

> Quoting Francisco Jerez (2017-05-04 21:59:44)
>> Chris Wilson writes:
>>
>> > On Thu, May 04, 2017 at 10:56:54AM -0700, Francisco Jerez wrote:
>> >> David Weinehall writes:
>> >>
>> >> > On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote:
>> >> >> A good default for garbage entries from the user is to follow the
>> >> >> default setting of the object (i.e. the PTE). Currently they use the
>> >> >> uncached entry, and now the only way to accidentally hit uncached
>> >> >> performance is via explicit use of the uncached MOCS or setting the
>> >> >> object to uncached. Note that these entries are currently undefined in
>> >> >> the ABI and we reserve the right to change them. We originally chose
>> >> >> uncached to eliminate any problem with reducing the caching level in
>> >> >> future, but the object is a much better definition of the minimum
>> >> >> caching level.
>> >> >>
>> >>
>> >> NAK. The reason for the default being UC is that it's the only setting
>> >> that guarantees full forwards compatibility with any other entry that
>> >> might be added in the future. If you default to PTE on (e)LLC and WB on
>> >> L3, userspace will no longer be able to use any newly introduced entry
>> >> with stricter coherency guarantees than that (e.g. any L3-uncached
>> >> entry) in a backwards-compatible way. Attempting to do so may break
>> >> memory coherency assumptions of the application and lead to misrendering
>> >> when run on older kernel versions (which to my judgment is a scarier
>> >> failure mode than reduced performance).
>> >
>> > You can't use a weaker coherency model in mocs than that specified for
>> > the object as you can't control other uses of the object (even just
>> > memory pressure will break your assumptions).
>>
>> Exactly, but you can use a stronger coherency model than the application
>> requested, which is why falling back to UC should generally work for
>> unknown entries but falling back to PTE+WB isn't guaranteed to.
>
> Still wrong. GEM will write into the CPU cache believing the object is
> coherent. The GPU will read from memory bypassing the CPU cache
> following the UC mocs.

I agree that this is a plausible scenario.

> The only safe option is for it to follow PTE.

Except you don't know whether the client reading or writing at the other
end is the CPU, or whether the client at the other end is (set up to be)
LLC-coherent. There's likely no 100% safe option on the LLC side of
things. I could probably be convinced that in a number of scenarios PTE
on LLC has somewhat better chances of success, but on the L3 side of
things this patch enables WB which is AFAIA strictly more weakly
coherent than UC, so it still gets my NAK.

> -Chris