From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Richard Henderson <rth@twiddle.net>
Cc: qemu-devel@nongnu.org, pbonzini@redhat.com, vijay.kilari@gmail.com
Subject: Re: [Qemu-devel] [PATCH v3 0/9] Improve buffer_is_zero
Date: Mon, 5 Sep 2016 16:08:06 +0100
Message-ID: <20160905150806.GD22496@work-vm>
In-Reply-To: <1472496380-19706-1-git-send-email-rth@twiddle.net>

* Richard Henderson (rth@twiddle.net) wrote:

Have you considered contributing something similar to glibc?
I filed https://sourceware.org/bugzilla/show_bug.cgi?id=19920 a while back
suggesting that libc provide this, so that programs other than qemu
can use it.

Dave

> Changes from v2 to v3:
> 
>   * Unit testing.  This includes having x86 attempt all versions of
>     the accelerator that will run on the hardware.  Thus an avx2 host
>     will run the basic test 5 times (1.5sec on my laptop).
> 
>   * Drop the ppc and aarch64 specializations.  I have improved the
>     basic integer version to the point that those vectorized versions
>     are not a win.
> 
>     In the case of my aarch64 mustang, the integer version is 4 times
>     faster than the neon version that I delete.  With effort I was
>     able to rewrite the neon version to come to within a factor of 1.1,
>     but it remained slower than the integer.  To be fair, gcc6 makes
>     very good use of ldp, so the integer path is *also* loading 16 bytes
>     per insn.
> 
>     I can forward my standalone aarch64 benchmark if anyone is interested.
> 
>     Note however that at least the avx2 acceleration is still very much
>     a win, being about 3 times faster on my laptop.  Of course, it's
>     handling 4 times as much data per loop as the integer version, so
>     one can still see the overhead caused by using vector insns.
> 
>     For grins I wrote an avx512 version, if someone has a skylake upon
>     which to test and benchmark.  That requires additional configure
>     checks, so I didn't bother to include it here.
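
For readers unfamiliar with the "basic integer version" discussed above,
here is a minimal sketch of a word-at-a-time zero check. It is not the code
from this series: the name buffer_is_zero_sketch, the load_u64 helper, and
the 32-byte stride are all assumptions chosen for illustration. It only shows
the general idea of OR-ing several 64-bit loads together and testing the
result once per iteration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load 8 bytes with no alignment assumption.
 * Compilers typically turn this memcpy into a single load. */
static inline uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Illustrative sketch, not the patch's implementation.
 * Returns true if all len bytes starting at buf are zero. */
static bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    /* Main loop: OR four 64-bit words (32 bytes) and test once. */
    while (len >= 32) {
        uint64_t t = load_u64(p) | load_u64(p + 8)
                   | load_u64(p + 16) | load_u64(p + 24);
        if (t != 0) {
            return false;
        }
        p += 32;
        len -= 32;
    }

    /* Tail: fewer than 32 bytes remain, check them one at a time. */
    while (len > 0) {
        if (*p != 0) {
            return false;
        }
        p++;
        len--;
    }
    return true;
}
```

Combining the loads before the branch keeps the loop to one comparison per
32 bytes, and adjacent 8-byte loads are the kind of pattern a compiler can
pair into wider load instructions where the target has them.
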
> 
> 
> r~
> 
> 
> Richard Henderson (9):
>   cutils: Move buffer_is_zero and subroutines to a new file
>   cutils: Remove SPLAT macro
>   cutils: Export only buffer_is_zero
>   cutils: Rearrange buffer_is_zero acceleration
>   cutils: Add test for buffer_is_zero
>   cutils: Add generic prefetch
>   cutils: Rewrite x86 buffer zero checking
>   cutils: Remove aarch64 buffer zero checking
>   cutils: Remove ppc buffer zero checking
> 
>  configure                 |  21 +--
>  include/qemu/cutils.h     |   3 +-
>  migration/ram.c           |   2 +-
>  migration/rdma.c          |   5 +-
>  tests/Makefile.include    |   3 +
>  tests/test-bufferiszero.c |  78 +++++++++++
>  util/Makefile.objs        |   1 +
>  util/bufferiszero.c       | 332 ++++++++++++++++++++++++++++++++++++++++++++++
>  util/cutils.c             | 244 ----------------------------------
>  9 files changed, 423 insertions(+), 266 deletions(-)
>  create mode 100644 tests/test-bufferiszero.c
>  create mode 100644 util/bufferiszero.c
> 
> -- 
> 2.7.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

