From: Peter Clifton <pcjc2@cam.ac.uk>
To: "mesa-dev@lists.freedesktop.org" <mesa-dev@lists.freedesktop.org>,
	"intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>
Subject: Strange performance cliff...
Date: Tue, 12 Oct 2010 18:45:25 +0100
Message-ID: <1286905525.2807.7.camel@pcjc2lap>

Using glxgears as a tool to exercise the GPU with some simple rendering,
I have noted a strange cliff in the intel_gpu_top output when resizing
the glxgears window:

Below a certain size, e.g.:

  -geometry 576x868+0+29


core clock: 400 Mhz
                   render busy:  21%: ████▎                                  render space: 10/126976 (0%)
                bitstream busy:   0%:                                     bitstream space: 0/126976 (0%)

                          task  percent busy
                            CS:  20%: ████                    vert fetch: 35071068 (137619/sec)
               RC Render cache:  17%: ███▌                    prim fetch: 15422476 (60993/sec)
                            VF:  13%: ██▋                  VS invocations: 31634160 (128160/sec)
                            GS:  12%: ██▌                  GS invocations: 11848000 (48000/sec)
                   Windower IZ:  12%: ██▌                       GS prims: 0 (0/sec)
              SF (strip / fan):  12%: ██▌                  CL invocations: 0 (0/sec)
                   Row 1, EU 3:  10%: ██                        CL prims: 28373497 (108993/sec)
                   Row 0, EU 3:  10%: ██                   PS invocations: 7810009 (600886/sec)
                   Bypass FIFO:  10%: ██                   PS depth pass: 17981565671 (73371256/sec)
                  Pixel shader:  10%: ██                   
             Windower / Masker:   9%: █▉                   
                     Filtering:   9%: █▉                   
                   Row 1, EU 2:   9%: █▉                   
                   Row 0, EU 2:   9%: █▉                   
                  Setup Engine:   8%: █▋                   
                          MASF:   8%: █▋                   
                   Row 1, EU 1:   8%: █▋                   
                   Row 0, EU 1:   7%: █▌                   
                   Row 1, EU 0:   7%: █▌                   
                    Map filter:   6%: █▎                   
                           DAP:   6%: █▎                   
            Texture decompress:   6%: █▎                   
                 Sampler cache:   6%: █▎                   
                 Texture fetch:   5%: █                    
                          SVRW:   5%: █                    
                          SVRR:   4%: ▉                    
                           URB:   4%: ▉                    
            Projection and LOD:   3%: ▋                    
   Dependent address generator:   3%: ▋                    
                    Dispatcher:   2%: ▌                    
                  CL (clipper):   2%: ▌                    
                          SVDW:   2%: ▌                    
                           VS0:   2%: ▌                    
                           ISC:   1%: ▎                    
         Message Arbiter row 1:   1%: ▎                    
                          MASM:   1%: ▎                    
      SI (system instruction?):   0%:                      
                            DM:   0%:                      
                            SC:   0%:   


I get the trace above.

When I increase the window height by just three pixels, to:

  -geometry 576x871+0+29



The CS (command streamer) unit jumps to 100% busy, along with the render
busy graph. Does anyone have any ideas why?

core clock: 400 Mhz
                   render busy: 100%: ████████████████████                   render space: 61/126976 (0%)
                bitstream busy:   1%: ▎                                   bitstream space: 0/126976 (0%)

                          task  percent busy
                            CS: 100%: ████████████████████    vert fetch: 15165204 (133386/sec)
               RC Render cache:  18%: ███▋                    prim fetch: 6654764 (59078/sec)
                            VF:  12%: ██▌                  VS invocations: 13559328 (123888/sec)
                   Windower IZ:  12%: ██▌                  GS invocations: 5078400 (46400/sec)
                            GS:  11%: ██▎                       GS prims: 0 (0/sec)
              SF (strip / fan):  11%: ██▎                  CL invocations: 0 (0/sec)
                   Row 1, EU 3:   9%: █▉                        CL prims: 12836185 (105478/sec)
                  Pixel shader:   9%: █▉                   PS invocations: 6019756 (-1805377/sec)
                   Bypass FIFO:   9%: █▉                   PS depth pass: 7529067710 (71131031/sec)
                   Row 0, EU 3:   9%: █▉                   
             Windower / Masker:   9%: █▉                   
                   Row 1, EU 2:   8%: █▋                   
                     Filtering:   8%: █▋                   
                   Row 0, EU 2:   8%: █▋                   
                          MASF:   8%: █▋                   
                  Setup Engine:   7%: █▌                   
                   Row 1, EU 1:   7%: █▌                   
                   Row 0, EU 1:   7%: █▌                   
                           DAP:   6%: █▎                   
                   Row 1, EU 0:   6%: █▎                   
                    Map filter:   6%: █▎                   
            Texture decompress:   5%: █                    
                 Sampler cache:   5%: █                    
                 Texture fetch:   5%: █                    
                          SVRW:   4%: ▉                    
                          SVRR:   4%: ▉                    
                           URB:   3%: ▋                    
            Projection and LOD:   3%: ▋                    
   Dependent address generator:   2%: ▌                    
                    Dispatcher:   2%: ▌                    
                          SVDW:   2%: ▌                    
                  CL (clipper):   2%: ▌                    
                           ISC:   2%: ▌                    
                           VS0:   1%: ▎                    
         Message Arbiter row 1:   1%: ▎                    
                          MASM:   1%: ▎                    
      SI (system instruction?):   0%:                      
                            DM:   0%:                      
                            SC:   0%:          
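
To make the difference between the two traces easier to see, here is a
minimal sketch (not part of intel_gpu_top) that compares a few of the
busy percentages copied from the output above and reports only the
units that moved by more than a chosen threshold. The struct, table and
function names are made up for illustration, and only a subset of the
rows is included.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One "unit: NN%" row from an intel_gpu_top-style trace. */
struct busy_sample {
	const char *unit;
	int percent;
};

/* Values copied by hand from the two traces above (subset of rows). */
static const struct busy_sample small_window[] = {	/* 576x868 */
	{ "CS", 20 }, { "RC Render cache", 17 }, { "VF", 13 },
	{ "GS", 12 }, { "Windower IZ", 12 }, { "Pixel shader", 10 },
};
static const struct busy_sample large_window[] = {	/* 576x871 */
	{ "CS", 100 }, { "RC Render cache", 18 }, { "VF", 12 },
	{ "GS", 11 }, { "Windower IZ", 12 }, { "Pixel shader", 9 },
};

#define N_SAMPLES (sizeof(small_window) / sizeof(small_window[0]))

/* Look up a unit by name; returns -1 if the unit is not in the table. */
static int busy_percent(const struct busy_sample *t, size_t n,
			const char *unit)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(t[i].unit, unit) == 0)
			return t[i].percent;
	return -1;
}

/*
 * Print every unit whose busy figure changed by more than `threshold`
 * percentage points between the two traces; return how many did.
 */
static int report_changes(int threshold)
{
	int changed = 0;

	for (size_t i = 0; i < N_SAMPLES; i++) {
		int before = small_window[i].percent;
		int after = busy_percent(large_window, N_SAMPLES,
					 small_window[i].unit);
		if (after < 0)
			continue;	/* unit missing from second trace */
		if (abs(after - before) > threshold) {
			printf("%-16s %3d%% -> %3d%% (%+d)\n",
			       small_window[i].unit, before, after,
			       after - before);
			changed++;
		}
	}
	return changed;
}
```

Calling report_changes(2) on these tables flags only the CS row; every
other unit listed stays within a point or two of its earlier value.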


NB: I've patched intel_gpu_top to add a little more human readability to
the output. In case I got it wrong, note that these are the changes I
applied:


diff --git a/lib/instdone.c b/lib/instdone.c
index 722fb03..f908a79 100644
--- a/lib/instdone.c
+++ b/lib/instdone.c
@@ -100,7 +100,7 @@ init_g965_instdone1(void)
 static void
 init_g4x_instdone1(void)
 {
-       gen4_instdone1_bit(G4X_BCS_DONE, "BCS");
+       gen4_instdone1_bit(G4X_BCS_DONE, "AVC_FE Command Streamer");
        gen4_instdone1_bit(G4X_CS_DONE, "CS");
        gen4_instdone1_bit(G4X_MASF_DONE, "MASF");
        gen4_instdone1_bit(G4X_SVDW_DONE, "SVDW");
@@ -108,11 +108,11 @@ init_g4x_instdone1(void)
        gen4_instdone1_bit(G4X_SVRW_DONE, "SVRW");
        gen4_instdone1_bit(G4X_SVRR_DONE, "SVRR");
        gen4_instdone1_bit(G4X_ISC_DONE, "ISC");
-       gen4_instdone1_bit(G4X_MT_DONE, "MT");
-       gen4_instdone1_bit(G4X_RC_DONE, "RC");
+       gen4_instdone1_bit(G4X_MT_DONE, "MT Texture cache");
+       gen4_instdone1_bit(G4X_RC_DONE, "RC Render cache");
        gen4_instdone1_bit(G4X_DAP_DONE, "DAP");
        gen4_instdone1_bit(G4X_MAWB_DONE, "MAWB");
-       gen4_instdone1_bit(G4X_MT_IDLE, "MT idle");
+       gen4_instdone1_bit(G4X_MT_IDLE, "MT (texture cache) idle");
        //gen4_instdone1_bit(G4X_GBLT_BUSY, "GBLT");
        gen4_instdone1_bit(G4X_SVSM_DONE, "SVSM");
        gen4_instdone1_bit(G4X_MASM_DONE, "MASM");
@@ -122,13 +122,13 @@ init_g4x_instdone1(void)
        gen4_instdone1_bit(G4X_DM_DONE, "DM");
        gen4_instdone1_bit(G4X_FT_DONE, "FT");
        gen4_instdone1_bit(G4X_DG_DONE, "DG");
-       gen4_instdone1_bit(G4X_SI_DONE, "SI");
+       gen4_instdone1_bit(G4X_SI_DONE, "SI (system instruction?)");
        gen4_instdone1_bit(G4X_SO_DONE, "SO");
        gen4_instdone1_bit(G4X_PL_DONE, "PL");
-       gen4_instdone1_bit(G4X_WIZ_DONE, "WIZ");
+       gen4_instdone1_bit(G4X_WIZ_DONE, "Windower IZ");
        gen4_instdone1_bit(G4X_URB_DONE, "URB");
-       gen4_instdone1_bit(G4X_SF_DONE, "SF");
-       gen4_instdone1_bit(G4X_CL_DONE, "CL");
+       gen4_instdone1_bit(G4X_SF_DONE, "SF (strip / fan)");
+       gen4_instdone1_bit(G4X_CL_DONE, "CL (clipper)");
        gen4_instdone1_bit(G4X_GS_DONE, "GS");
        gen4_instdone1_bit(G4X_VS0_DONE, "VS0");
        gen4_instdone1_bit(G4X_VF_DONE, "VF");
@@ -250,7 +250,7 @@ init_instdone_definitions(uint32_t devid)
                gen4_instdone_bit(I965_ROW_1_EU_3_DONE, "Row 1, EU 3");
                gen4_instdone_bit(I965_SF_DONE, "Strips and Fans");
                gen4_instdone_bit(I965_SE_DONE, "Setup Engine");
-               gen4_instdone_bit(I965_WM_DONE, "Windowizer");
+               gen4_instdone_bit(I965_WM_DONE, "Windower / Masker");
                gen4_instdone_bit(I965_DISPATCHER_DONE, "Dispatcher");
                gen4_instdone_bit(I965_PROJECTION_DONE, "Projection and LOD");
                gen4_instdone_bit(I965_DG_DONE, "Dependent address generator");
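
For anyone checking the relabelling, the following is a rough sketch of
the kind of bit-to-name table those calls build up. It is not the
actual intel-gpu-tools code: the EXAMPLE_* masks, the example_* helpers
and the table layout are all hypothetical, and it assumes a unit is
counted as busy in a sample whenever its DONE bit reads as clear. The
point is only that the string is a display label attached to a mask, so
changing it should not alter what is measured.

```c
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_BITS 32

/* Hypothetical masks; the real G4X_*_DONE values are not reproduced. */
#define EXAMPLE_CS_DONE (1u << 0)
#define EXAMPLE_RC_DONE (1u << 1)
#define EXAMPLE_CL_DONE (1u << 2)

struct example_bit {
	uint32_t mask;
	const char *name;
};

static struct example_bit example_bits[EXAMPLE_MAX_BITS];
static int example_num_bits;

/* Attach a display label to one status bit (same shape as the calls
 * in the diff: a mask plus a human-readable name). */
static void example_add_bit(uint32_t mask, const char *name)
{
	if (example_num_bits < EXAMPLE_MAX_BITS) {
		example_bits[example_num_bits].mask = mask;
		example_bits[example_num_bits].name = name;
		example_num_bits++;
	}
}

static void example_init(void)
{
	example_num_bits = 0;
	example_add_bit(EXAMPLE_CS_DONE, "CS");
	example_add_bit(EXAMPLE_RC_DONE, "RC Render cache");
	example_add_bit(EXAMPLE_CL_DONE, "CL (clipper)");
}

/* Assumption: a clear DONE bit means the unit was busy in that sample. */
static int example_is_busy(uint32_t instdone, uint32_t mask)
{
	return (instdone & mask) == 0;
}

/* Busy percentage of a named unit over n register samples;
 * returns -1 if the name is unknown or n is not positive. */
static int example_busy_percent(const uint32_t *samples, int n,
				const char *name)
{
	if (n <= 0)
		return -1;
	for (int i = 0; i < example_num_bits; i++) {
		if (strcmp(example_bits[i].name, name) != 0)
			continue;
		int busy = 0;
		for (int s = 0; s < n; s++)
			busy += example_is_busy(samples[s],
						example_bits[i].mask);
		return busy * 100 / n;
	}
	return -1;
}
```

In this sketch the percentage depends only on the mask and the sampled
register values; the name is used purely for lookup and display.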


Best wishes,

-- 
Peter Clifton

Electrical Engineering Division,
Engineering Department,
University of Cambridge,
9, JJ Thomson Avenue,
Cambridge
CB3 0FA

Tel: +44 (0)7729 980173 - (No signal in the lab!)
Tel: +44 (0)1223 748328 - (Shared lab phone, ask for me)

