From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Zhao, Zhou" <zhou.zhao@intel.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
	"Xu, Ling1" <ling1.xu@intel.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"Jin, Jun I" <jun.i.jin@intel.com>
Subject: Re: [PATCH 1/1] Add AVX512 support for xbzrle_encode_buffer function
Date: Thu, 21 Jul 2022 17:28:33 +0100
Message-ID: <Ytl+sTUEGqt1axfW@work-vm>
In-Reply-To: <DM6PR11MB28126DAC62A921E5551C1400F5919@DM6PR11MB2812.namprd11.prod.outlook.com>

* Zhao, Zhou (zhou.zhao@intel.com) wrote:
> Hi Daniel,
>   Our code depends on the Intel intrinsics library implementation, and that library depends on macros such as "__AVX512BW__". That macro needs a compile-time check so that the required machine options are enabled. If you use only that utility to do a runtime check, you will hit compile errors. Also, if we want to save CPU time, it is better to check at compile time.

You need to do *both*:

  a) You need to check at compile time to see if you have the
intrinsics.
  b) You need to check at runtime to see if you're running on a suitable
CPU.
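
As a rough sketch of doing both (this is not QEMU code): the names
count_diff_generic, count_diff_avx512, count_diff, self_check and the macro
HAVE_AVX512BW_BUILD are made up for illustration, and the routine only counts
differing bytes as a stand-in for the real encoder. It assumes GCC or Clang
on x86_64, where a per-function target attribute lets the intrinsics compile
without building the whole file with -mavx512bw, and
__builtin_cpu_supports() does the runtime check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* a) Compile-time check: can this toolchain build the intrinsics at all?
 * HAVE_AVX512BW_BUILD is a hypothetical name; a configure test would
 * normally define something like it. */
#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_AVX512BW_BUILD 1
#endif

/* Portable fallback: count the bytes that differ between two buffers. */
static size_t count_diff_generic(const uint8_t *a, const uint8_t *b, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        n += a[i] != b[i];
    }
    return n;
}

#ifdef HAVE_AVX512BW_BUILD
/* Only this function is compiled for AVX512BW; it must not be called
 * unless the runtime check below has passed. */
__attribute__((target("avx512bw")))
static size_t count_diff_avx512(const uint8_t *a, const uint8_t *b, size_t len)
{
    size_t n = 0, i = 0;
    for (; i + 64 <= len; i += 64) {
        __m512i va = _mm512_loadu_si512(a + i);
        __m512i vb = _mm512_loadu_si512(b + i);
        /* One mask bit per byte that differs in this 64-byte chunk. */
        n += (size_t)__builtin_popcountll(_mm512_cmpneq_epu8_mask(va, vb));
    }
    /* Handle the tail that does not fill a whole 64-byte vector. */
    return n + count_diff_generic(a + i, b + i, len - i);
}
#endif

/* b) Runtime check: pick the implementation for the CPU we are running on. */
static size_t count_diff(const uint8_t *a, const uint8_t *b, size_t len)
{
#ifdef HAVE_AVX512BW_BUILD
    if (__builtin_cpu_supports("avx512bw")) {
        return count_diff_avx512(a, b, len);
    }
#endif
    return count_diff_generic(a, b, len);
}

/* Sanity check: the dispatched path must agree with the fallback. */
static void self_check(void)
{
    uint8_t a[200], b[200];
    memset(a, 0, sizeof(a));
    memset(b, 0, sizeof(b));
    b[3] = 1;
    b[70] = 2;
    b[199] = 3;
    assert(count_diff(a, b, sizeof(a)) == count_diff_generic(a, b, sizeof(a)));
}
```

Callers would only ever use count_diff(); on a build host or a runtime host
without AVX512BW it quietly falls back to the generic loop. In practice you
would probably do the runtime check once at startup and cache the result in
a function pointer rather than testing on every call.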

Other things to note (I've not checked the algorithm yet):
  c) The patch needs splitting up into at least 3 patches: the compile
checks, the algorithm, and the tests.
  d) The test includes a benchmark; we don't need to include a benchmark
program in the code, just something to check it works.
  e) The benchmark is a microbenchmark on the routine; what's its
effect on the whole migration - is it significant?
  f) xbzrle isn't actually used that much these days, so I'm not sure
generally it's worth it.

Dave

> -----Original Message-----
> From: Daniel P. Berrangé <berrange@redhat.com> 
> Sent: Thursday, July 21, 2022 11:11 PM
> To: Xu, Ling1 <ling1.xu@intel.com>
> Cc: qemu-devel@nongnu.org; quintela@redhat.com; dgilbert@redhat.com; Zhao, Zhou <zhou.zhao@intel.com>; Jin, Jun I <jun.i.jin@intel.com>
> Subject: Re: [PATCH 1/1] Add AVX512 support for xbzrle_encode_buffer function
> 
> On Thu, Jul 21, 2022 at 06:31:47PM +0800, ling xu wrote:
> > This commit adds AVX512 implementation of xbzrle_encode_buffer 
> > function to accelerate xbzrle encoding speed. Compared with C version 
> > of xbzrle_encode_buffer function,
> > AVX512 version can achieve almost 60%-70% performance improvement on unit test provided by qemu.
> > In addition, we provide one more unit test called 
> > "test_encode_decode_random", in which dirty data are randomly located in 4K page, and this case can achieve almost 140% performance gain.
> > 
> > Signed-off-by: ling xu <ling1.xu@intel.com>
> > Co-authored-by: Zhou Zhao <zhou.zhao@intel.com>
> > Co-authored-by: Jun Jin <jun.i.jin@intel.com>
> > ---
> >  configure                | 434 ++++++++++++++++++++++++++++++++++++++-
> >  migration/ram.c          |   6 +
> >  migration/xbzrle.c       | 177 ++++++++++++++++
> >  migration/xbzrle.h       |   4 +
> >  tests/unit/test-xbzrle.c | 307 +++++++++++++++++++++++++--
> >  5 files changed, 908 insertions(+), 20 deletions(-)
> 
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 01f9cc1d72..3b931c325f 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -747,9 +747,15 @@ static int save_xbzrle_page(RAMState *rs, uint8_t **current_data,
> >      memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
> >  
> >      /* XBZRLE encoding (if there is no overflow) */
> > +    #if defined(__x86_64__) && defined(__AVX512BW__)
> > +    encoded_len = xbzrle_encode_buffer_512(prev_cached_page, XBZRLE.current_buf,
> > +                                       TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
> > +                                       TARGET_PAGE_SIZE);
> > +    #else
> >      encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
> >                                         TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
> >                                         TARGET_PAGE_SIZE);
> > +    #endif
> 
> Shouldn't we be deciding which impl to use with a runtime check of the current CPUID, rather than a compile time check? I'm thinking along the lines of what util/bufferiszero.c does to select different optimized versions based on CPUID. The build host CPU features can't be expected to match the runtime host CPU features.
> 
> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



Thread overview: 8+ messages
2022-07-21 10:31 [PATCH 0/1] This patch provides AVX512 support for xbzrle_encode_buffer function ling xu
2022-07-21 10:31 ` [PATCH 1/1] Add " ling xu
2022-07-21 15:11   ` Daniel P. Berrangé
2022-07-21 16:02     ` Zhao, Zhou
2022-07-21 16:28       ` Dr. David Alan Gilbert [this message]
2022-07-21 16:41       ` Daniel P. Berrangé
2022-07-22  2:23         ` Zhao, Zhou
2022-07-22  8:29           ` Daniel P. Berrangé
