From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Anholt Subject: Re: intel_gpu_top decode.. Date: Wed, 06 Oct 2010 15:27:53 -0700 Message-ID: <877hhvq7me.fsf@pollan.anholt.net> References: <1286387755.11699.8.camel@pcjc2lap> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0556035523==" Return-path: In-Reply-To: <1286387755.11699.8.camel@pcjc2lap> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Peter Clifton , intel-gfx@lists.freedesktop.org List-Id: intel-gfx@lists.freedesktop.org --===============0556035523== Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature" --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On Wed, 06 Oct 2010 18:55:55 +0100, Peter Clifton wrote: > Hi, >=20 > Can anyone point me at what this intel_gpu_top output (below) indicates > regarding what is limiting the frame-rate of my drawing? I'll take a stab at what I can, but honestly I find the status reports of the chip fairly mystic myself. > Primarily I'm throwing a lot of triangles and texture coordinates into a > vertex array, compiling the lot into a display list and benchmarking the > frame-rate I can achieve at given window sizes (not vblank limited). For > the data below, I'm just drawing lines (with two triangles each, then > two more at each end for caps). I have some colour changes, but these > are handled with a flush of my vertex array and a glColor call. Colour > changes should be relatively infrequent though. >=20 > Round line caps are being drawn using two triangles (to make a square), > with texture coordinates spaced between -1 and 1 to span the square. The > round object is drawn with an implicit texture using this shader: >=20 > void main() > { > float sqdist; >=20 > sqdist =3D dot (gl_TexCoord[0].st, gl_TexCoord[0].st); > if (sqdist > 1.0) > discard; >=20 > gl_FragColor =3D gl_Color; > } >=20 > The line geometry is also hitting this same shader, but with texture > coordinates set to 0.0, 0.0 so it is not clipped. >=20 > This is on a GM45. Am I correct in thinking the geometry transfer is the > indicated bottle-neck? (VF CS is vertex fetch command stream, right?) >=20 > From the fact the pixel shader is at 70%, I presume I'm not (yet) > fill-rate limited, but not that far from it either. Generally, a unit appears to also report busy if it's stalled on getting its data downstream. So I read your output as probably VF is 9% busy and 88% waiting for VS, and CL (clipper) is 14% busy, and 68% waiting on windowizer (fragment shader). That's just approximately, since we don't have metrics here for how much is actually stalled vs accomplishing something. Ideally, everyone would report busy all the time getting work done, but another debug tool for Ironlake I'm working on getting released has strongly pointed to "units are either starved or stalled, and rarely doing real work." > I've no idea what the other acronyms are, and the PRM doesn't help > immediately. Is UC0 related to clipping? Can I reduce it? Not sure what that one is. > core clock: 400 Mhz > ring idle: 1%: =E2=96=8C=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20 > ring space: 256/126976 (0%) > task percent busy >=20 > VF CS: 91%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8C= =20=20=20=20 > UC0 CS: 88%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8D=20=20=20= =20=20 > ISC CS: 88%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8D=20=20=20= =20=20 > GS CS: 88%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8D=20=20=20= =20=20 > VS0 CS: 82%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=20=20=20=20=20=20=20=20 > CL CS: 82%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=20=20=20=20=20=20=20=20 > MASM CS CR: 80%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=8F=20=20=20=20=20=20=20=20 > Row 1, EU 3: 78%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=8D=20=20=20=20=20=20=20=20=20 > Row 0, EU 3: 71%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =8C=20=20=20=20=20=20=20=20=20=20=20=20 > Pixel shader: 70%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96= =8F=20=20=20=20=20=20=20=20=20=20=20=20 > Bypass FIFO: 69%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8A=20=20= =20=20=20=20=20=20=20=20=20=20=20 > Windowizer: 68%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8D=20=20= =20=20=20=20=20=20=20=20=20=20=20 > Row 1, EU 2: 63%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8D=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20 > Filtering: 62%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20 > Row 0, EU 2: 60%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8F=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20 > URB CS: 57%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=88=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20 > Setup Engine: 55%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=88=E2=96=8F=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20 > Map filter: 54%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=88=E2=96=8A=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Row 1, EU 1: 50%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2= =96=8F=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Row 0, EU 1: 47%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Texture decompress: 45%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8F=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Sampler cache: 44%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8A=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Texture fetch: 44%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8A=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Row 1, EU 0: 43%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88= =E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8D=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Projection and LOD: 24%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=8A=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Dependent address generator: 22%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=E2=96=88=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Dispatcher: 18%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=88=E2=96=88=E2=96=88=E2=96=8D=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > Message Arbiter row 1: 11%: =E2=96=88=E2=96=88=E2=96=88=E2=96= =88=E2=96=8C=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 > SVDR CS CR: 6%: =E2=96=88=E2=96=88=E2=96=8C=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20 > EM1 CS CR: 5%: =E2=96=88=E2=96=88=E2=96=8F=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20 > SVSM CS CR: 2%: =E2=96=88=20=20=20=20=20 Of this, I'd say that you're spending a surprising amount of time in texture fetch. Finding ways to reduce texture bandwidth may pay off, assuming that (texture fetch / sampler cache) is the percentage of the time you're cache missing. I'm not sure if that's true or not, though. And you said that this data was just for the line drawing, which didn't appear to have any texturing going on at all, so I'm just confused. --=-=-= Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iEYEARECAAYFAkys9+kACgkQHUdvYGzw6vdOCwCfQJi3r6gbnDdKiVs8ISETvI0X PDAAninonuXkOCIRBowt9zxbcSKEY60v =5ovr -----END PGP SIGNATURE----- --=-=-=-- --===============0556035523== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx --===============0556035523==--