Re: [Qemu-devel] [PATCH for-4.0 00/17] tcg: Move softmmu out-of-line

From: "Emilio G. Cota" <cota@braap.org>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-4.0 00/17] tcg: Move softmmu out-of-line
Date: Fri, 16 Nov 2018 00:10:40 -0500
Message-ID: <20181116051040.GA25165@flamenco>
In-Reply-To: <20181116011338.GB17566@flamenco>

On Thu, Nov 15, 2018 at 20:13:38 -0500, Emilio G. Cota wrote:
> I'll generate now some more perf numbers that we could include in the
> commit logs.

Unfortunately, the SPEC numbers show a net performance decrease:

                     Softmmu speedup for SPEC06int (test set)
   1.1 +-+--+----+----+----+----+----+----+---+----+----+----+----+----+--+-+
       |                                                                    |
       |                                                      aft+++        |
  1.05 +-+........................................................|.......+-+
       |                                               +++        |         |
       |                       +++                      |         |         |
       |   +++                  |                       |         |         |
     1 +-++++++++++++++++****++++++++++++++++++++++++++++++++++++***+++++++-+
       |    |         |  *  * ****                     ****      *|*        |
       |   ***  +++   |  *  * * |*       +++           *| *      *|*        |
  0.95 +-+.*|*..***...|..*..*.*.|*..+++...|............*|.*.+++..*|*..+++.+-+
       |   *|*  *+*  *** *  * * |*   |    |  +++       *| * ***  *|*  ***   |
       |   *+*  * *  *|* *  * *++*   |  ****  |        *| * *+*  *|*  *+*   |
       |   * *  * *  *|* *  * *  * **** * |* ****      *++* * *  *+*  * *   |
   0.9 +-+.*.*..*.*..*+*.*..*.*..*.*.|*.*.|*.*|.*......*..*.*.*..*.*..*.*.+-+
       |   * *  * *  * * *  * *  * *++* *++* *++* +++  *  * * *  * *  * *   |
       |   * *  * *  * * *  * *  * *  * *  * *  *  |   *  * * *  * *  * *   |
  0.85 +-+.*.*..*.*..*.*.*..*.*..*.*..*.*..*.*..*..|...*..*.*.*..*.*..*.*.+-+
       |   * *  * *  * * *  * *  * *  * *  * *  *  |   *  * * *  * *  * *   |
       |   * *  * *  * * *  * *  * *  * *  * *  * **** *  * * *  * *  * *   |
       |   * *  * *  * * *  * *  * *  * *  * *  * *| * *  * * *  * *  * *   |
   0.8 +-+.*.*..*.*..*.*.*..*.*..*.*..*.*..*.*..*.*|.*.*..*.*.*..*.*..*.*.+-+
       |   * *  * *  * * *  * *  * *  * *  * *  * *| * *  * * *  * *  * *   |
       |   * *  * *  * * *  * *  * *  * *  * *  * *++* *  * * *  * *  * *   |
  0.75 +-+-***--***--***-****-****-****-****-****-****-****-***--***--***-+-+
        401.bzi403.g429445.g456.462.libq464.h471.omn4483.xalancbgeomean
  png: https://imgur.com/aO39gyP

It turns out that the additional instructions are the problem,
despite the much lower icache miss rate. For instance, here
are some numbers for h264ref running on a not-so-recent
Xeon E5-2643 (i.e. Sandy Bridge):

- Before:
 1,137,737,512,668      instructions              #    2.02  insns per cycle
   563,574,505,040      cycles
     5,663,616,681      L1-icache-load-misses
     164.091239774 seconds time elapsed

- After:
 1,216,600,582,476      instructions              #    2.06  insns per cycle
   591,888,969,223      cycles
     3,082,426,508      L1-icache-load-misses
     172.232292897 seconds time elapsed
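
To make the trade-off easier to read, the relative changes implied by the
counters above can be worked out with a few lines of Python. This is only a
sketch: the values are copied from the perf output in this message, and the
names (before, after, relative_change) are made up for illustration.

```python
# Counter values copied from the perf output above
# (h264ref on the Xeon E5-2643). Names are illustrative only.
before = {
    "instructions": 1_137_737_512_668,
    "cycles": 563_574_505_040,
    "L1-icache-load-misses": 5_663_616_681,
    "seconds": 164.091239774,
}
after = {
    "instructions": 1_216_600_582_476,
    "cycles": 591_888_969_223,
    "L1-icache-load-misses": 3_082_426_508,
    "seconds": 172.232292897,
}


def relative_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Percentage change per counter, positive meaning "more after the series".
changes = {name: relative_change(before[name], after[name]) for name in before}

for name, pct in changes.items():
    print(f"{name:>24}: {pct:+6.1f}%")
```

By this arithmetic the series executes roughly 7% more instructions and
takes about 5% more cycles and wall-clock time, even though L1 icache
misses drop by about 46%.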

It's possible that newer machines with larger reorder buffers
will be able to take better advantage of the higher instruction
locality, hiding the latency of having to execute more instructions.
I'll test on Skylake tomorrow.

Thanks,

		E.

