From: Laszlo Ersek <lersek@redhat.com>
To: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>,
	Amos Kong <akong@redhat.com>,
	qemu-trivial@nongnu.org
Cc: yamahata@private.email.ne.jp, Paolo Bonzini <pbonzini@redhat.com>,
	mjt@tls.msk.ru, qemu-devel@nongnu.org,
	"Michael R. Hines" <mrhines@us.ibm.com>
Subject: Re: [Qemu-trivial] [PATCH] arch_init.c: remove duplicate function
Date: Tue, 15 Apr 2014 11:00:40 +0200
Message-ID: <534CF538.9000100@redhat.com>
In-Reply-To: <534C7584.2090301@linux.vnet.ibm.com>

On 04/15/14 01:55, Michael R. Hines wrote:
> On 04/14/2014 05:19 PM, Laszlo Ersek wrote:
>> On 04/14/14 04:27, Amos Kong wrote:
>>> We already have a function buffer_is_zero() in util/cutils.c
>>>
>>> Signed-off-by: Amos Kong <akong@redhat.com>
>>> ---
>>>   arch_init.c | 9 ++-------
>>>   1 file changed, 2 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch_init.c b/arch_init.c
>>> index 60c975d..342e5dc 100644
>>> --- a/arch_init.c
>>> +++ b/arch_init.c
>>> @@ -152,11 +152,6 @@ int qemu_read_default_config_files(bool userconfig)
>>>       return 0;
>>>   }
>>> -static inline bool is_zero_range(uint8_t *p, uint64_t size)
>>> -{
>>> -    return buffer_find_nonzero_offset(p, size) == size;
>>> -}
>>> -
>>>   /* struct contains XBZRLE cache and a static page
>>>      used by the compression */
>>>   static struct {
>>> @@ -587,7 +582,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>>>                           acct_info.dup_pages++;
>>>                       }
>>>                   }
>>> -            } else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
>>> +            } else if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
>>>                   acct_info.dup_pages++;
>>>                   bytes_sent = save_block_hdr(f, block, offset, cont,
>>>                                               RAM_SAVE_FLAG_COMPRESS);
>>> @@ -983,7 +978,7 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>>>    */
>>>   void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
>>>   {
>>> -    if (ch != 0 || !is_zero_range(host, size)) {
>>> +    if (ch != 0 || !buffer_is_zero(host, size)) {
>>>           memset(host, ch, size);
>>>       }
>>>   }
>>>
>> This seems to be correct, functionally -- apparently buffer_is_zero()
>> has laxer size requirements than buffer_find_nonzero_offset(). However,
>> I think the latter might be faster.
>>
>> For ram_save_block() I guess the difference is negligible. But
>> ram_handle_compressed() is also called from "migration-rdma.c", where I
>> can't even guess if a little bit of slowdown would count.
>>
>> I'm OK with the patch if Michael (CC'd) is.
>>
>> Thanks
>> Laszlo
>>
> 
> Thanks for the CC.
> 
> Actually, it looks like buffer_is_zero() is calling
> buffer_find_nonzero_offset() as a "first try" anyway

I have no idea how I managed to miss that.

> - which is the same thing RDMA is doing. So, all calls to
> ram_handle_compressed() should hit the branch target there in
> buffer_is_zero() for can_use_buffer_find_nonzero_offset() and
> automatically drop into the vectorized-optimization to search for
> zeros, so there shouldn't be any change in performance. The same
> should apply for TCP migration as well - it's working on
> page-granularity, which is always aligned to 32 or 64 bits.
> 
> Paolo? I see that some of the block-migration code and the qemu-img
> code is also calling buffer_is_zero() - are you guys depending on the
> performance of any buffer_is_zero() calls to use the vector-optimized
> version like migration does?

The patch doesn't change buffer_is_zero() internally, so existing callers
of buffer_is_zero() are unaffected. It only turns two indirect callers of
buffer_find_nonzero_offset() into two "differently indirect" callers of
the same function (which I now see thanks to your explanation):

Before:
ram_save_block        -> is_zero_range -> buffer_find_nonzero_offset
ram_handle_compressed -> is_zero_range -> buffer_find_nonzero_offset

After:
ram_save_block        -> buffer_is_zero -> buffer_find_nonzero_offset
ram_handle_compressed -> buffer_is_zero -> buffer_find_nonzero_offset
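
To make the "first try" dispatch concrete, here is a minimal sketch of
the shape being discussed. It is not the code from util/cutils.c: every
sketch_* name and the SKETCH_CHUNK constant are hypothetical stand-ins,
the real precondition check may differ, and a plain word-at-a-time loop
stands in for the vectorized scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stride standing in for the vectorized chunk size. */
#define SKETCH_CHUNK 64

/* Assumed precondition for the fast scan: the length is a whole number
 * of chunks and the pointer is word-aligned. */
static bool sketch_can_use_fast_scan(const void *buf, size_t len)
{
    return len % SKETCH_CHUNK == 0 &&
           (uintptr_t)buf % sizeof(unsigned long) == 0;
}

/* Returns the offset of the first word that holds a non-zero byte, or
 * len if the whole buffer is zero. Callers must satisfy the
 * precondition above. */
static size_t sketch_find_nonzero_offset(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i;

    assert(sketch_can_use_fast_scan(buf, len));
    for (i = 0; i < len; i += sizeof(unsigned long)) {
        unsigned long w;

        memcpy(&w, p + i, sizeof(w));
        if (w != 0) {
            return i;
        }
    }
    return len;
}

/* Laxer entry point: tries the fast scan first and falls back to a
 * byte loop for buffers that do not meet the precondition. */
static bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i;

    if (sketch_can_use_fast_scan(buf, len)) {
        return sketch_find_nonzero_offset(buf, len) == len;
    }
    for (i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Shape of the wrapper the patch removes: always takes the fast scan. */
static bool sketch_is_zero_range(const void *buf, size_t len)
{
    return sketch_find_nonzero_offset(buf, len) == len;
}
```

For an aligned, page-sized buffer both sketch_is_zero_range() and
sketch_buffer_is_zero() end up in sketch_find_nonzero_offset(), so in
this sketch the only extra cost of the laxer entry point is the
precondition check.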

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thanks!
Laszlo



Thread overview: 6+ messages
2014-04-14  2:27 [Qemu-trivial] [PATCH] arch_init.c: remove duplicate function Amos Kong
2014-04-14  9:19 ` Laszlo Ersek
2014-04-14 23:55   ` Michael R. Hines
2014-04-15  9:00     ` Laszlo Ersek [this message]
2014-04-15 16:03       ` [Qemu-trivial] [Qemu-devel] " 陈梁
2014-04-27  8:48     ` [Qemu-trivial] " Paolo Bonzini
