From: Paolo Bonzini <pbonzini@redhat.com>
To: Avi Kivity <avi@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] migration: vectorize is_dup_page
Date: Tue, 20 Dec 2011 16:45:58 +0100
Message-ID: <4EF0ADB6.80006@redhat.com>
In-Reply-To: <4EF0A8AD.2010301@redhat.com>

On 12/20/2011 04:24 PM, Avi Kivity wrote:
> On 12/06/2011 07:25 PM, Paolo Bonzini wrote:
>> is_dup_page is already proceeding in 32-bit chunks.  Changing it to 16
>> bytes using Altivec or SSE is easy, and provides a noticeable improvement.
>> Pierre Riteau measured 30->25 seconds on a 16GB guest, I measured 4.6->3.9
>> seconds on a 6GB guest (best of three times for me; dunno for Pierre).
>> Both of them are approximately a 15% improvement.
>>
>> I tried playing with non-temporal prefetches, but I did not get any
>> improvement (though I did get fewer cache misses, so the patch was doing
>> its job).
>
> It's worthwhile anyway IMO.

The problem is that if the page is not a duplicate (the common case),
you'll take all the cache misses anyway when you send it over the
socket.  So what I did was add a single 4k buffer (shared by all pages)
and make is_dup_page copy the page into it.  Because the prefetches are
non-temporal, you only use 4k of cache.  But the code is more complex
and less reusable, it incurs an extra copy, and it cannot leave
is_dup_page early.
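
For illustration only, here is a rough sketch of that bounce-buffer idea. It is not the code I experimented with: the names (sketch_bounce, is_dup_page_copy), the 64-byte cache line size, and the use of GCC/Clang's __builtin_prefetch with locality 0 as the "non-temporal" hint are all assumptions. Whether the hint really limits cache pollution depends on the CPU.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size */
#define SKETCH_LINE_SIZE 64     /* assumed cache line size */

/* One bounce buffer shared by all pages (hypothetical name). */
static uint8_t sketch_bounce[SKETCH_PAGE_SIZE];

/*
 * Copy the page into the shared buffer while checking whether every
 * byte equals the first one.  The loop cannot exit early: the whole
 * page must be copied even after a mismatch has been seen.
 */
static int is_dup_page_copy(const uint8_t *page)
{
    const uint8_t first = page[0];
    uint8_t diff = 0;

    for (size_t off = 0; off < SKETCH_PAGE_SIZE; off += SKETCH_LINE_SIZE) {
        /* Hint the next line with "no temporal locality" (third arg 0). */
        if (off + SKETCH_LINE_SIZE < SKETCH_PAGE_SIZE) {
            __builtin_prefetch(page + off + SKETCH_LINE_SIZE, 0, 0);
        }
        memcpy(sketch_bounce + off, page + off, SKETCH_LINE_SIZE);
        for (size_t i = 0; i < SKETCH_LINE_SIZE; i++) {
            diff |= (uint8_t)(sketch_bounce[off + i] ^ first);
        }
    }
    return diff == 0;
}
```

In this sketch the caller would send sketch_bounce over the socket instead of the original page when the function returns 0, which is where the extra copy and the loss of the early exit come from.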

>> +static int is_dup_page(uint8_t *page)
>>   {
>> -    uint32_t val = ch << 24 | ch << 16 | ch << 8 | ch;
>> -    uint32_t *array = (uint32_t *)page;
>> +    VECTYPE *p = (VECTYPE *)page;
>> +    VECTYPE val = SPLAT(p);
>>
>
> I think you can drop the SPLAT and just compare against zero.  Full page
> repeats of anything but zero are unlikely, so we can simplify the code a
> bit here.  If we do go with non-temporal loads, it saves an additional miss.

Yeah, with non-temporal loads that would make sense.
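
To make the simplification concrete, a zero-only check could look roughly like the sketch below. This is not the patch itself: is_zero_page and SKETCH_PAGE_SIZE are hypothetical names, the SSE2 path uses plain (temporal) unaligned loads rather than non-temporal ones, and the fallback for non-SSE2 builds is only there so the sketch compiles everywhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size */

#if defined(__SSE2__)
#include <emmintrin.h>

/* Return 1 if the whole page is zero, checking 16 bytes at a time. */
static int is_zero_page(const uint8_t *page)
{
    const __m128i zero = _mm_setzero_si128();

    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(page + i));
        /* All 16 byte lanes equal to zero gives a mask of 0xFFFF. */
        if (_mm_movemask_epi8(_mm_cmpeq_epi8(v, zero)) != 0xFFFF) {
            return 0;   /* early exit on the first non-zero chunk */
        }
    }
    return 1;
}
#else
/* Portable fallback: 8 bytes at a time via memcpy to avoid aliasing issues. */
static int is_zero_page(const uint8_t *page)
{
    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i += sizeof(uint64_t)) {
        uint64_t v;
        memcpy(&v, page + i, sizeof(v));
        if (v != 0) {
            return 0;
        }
    }
    return 1;
}
#endif
```

Compared with the SPLAT version, there is no initial load of the first element to build the comparison value, which is the "additional miss" being saved.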

Paolo
