From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5487C445.9030506@huawei.com>
Date: Wed, 10 Dec 2014 11:55:49 +0800
From: ChenLiang
MIME-Version: 1.0
References: <1416830152-524-1-git-send-email-arei.gonglei@huawei.com> <1416830152-524-5-git-send-email-arei.gonglei@huawei.com> <20141210031810.GC27208@grmbl.mre>
In-Reply-To: <20141210031810.GC27208@grmbl.mre>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH RESEND for 2.3 4/6] xbzrle: check 8 bytes at a time after an concurrency scene
To: Amit Shah
Cc: weidong.huang@huawei.com, quintela@redhat.com, qemu-devel@nongnu.org, dgilbert@redhat.com, arei.gonglei@huawei.com, pbonzini@redhat.com, peter.huangpeng@huawei.com

On 2014/12/10 11:18, Amit Shah wrote:
> On (Mon) 24 Nov 2014 [19:55:50], arei.gonglei@huawei.com wrote:
>> From: ChenLiang
>>
>> The logic of the old code is correct, but comparing byte by byte
>> wastes time when the page is being modified concurrently.
>>
>> Signed-off-by: ChenLiang
>> Signed-off-by: Gonglei
>> ---
>>  xbzrle.c | 28 ++++++++++++++++++----------
>>  1 file changed, 18 insertions(+), 10 deletions(-)
>>
>> diff --git a/xbzrle.c b/xbzrle.c
>> index d27a140..0477367 100644
>> --- a/xbzrle.c
>> +++ b/xbzrle.c
>> @@ -50,16 +50,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>>
>>          /* word at a time for speed */
>>          if (!res) {
>> -            while (i < slen &&
>> -                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>> -                i += sizeof(long);
>> -                zrun_len += sizeof(long);
>> -            }
>> -
>> -            /* go over the rest */
>> -            while (i < slen && old_buf[i] == new_buf[i]) {
>> -                zrun_len++;
>> -                i++;
>> +            while (i < slen) {
>> +                if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>> +                    i += sizeof(long);
>> +                    zrun_len += sizeof(long);
>> +                } else {
>> +                    /* go over the rest */
>> +                    for (j = 0; j < sizeof(long); j++) {
>> +                        if (old_buf[i] == new_buf[i]) {
>> +                            i++;
>> +                            zrun_len++;
>
> I don't see how this is different from the code it's replacing. The
> check and increments are all the same. Difficult to see why there'll
> be a speed benefit. Can you please explain? Do you have any
> performance numbers for before/after?
>
> Thanks,
>
> Amit

Hi Amit,

+                    for (j = 0; j < sizeof(long); j++) {
+                        if (old_buf[i] == new_buf[i]) {
+                            i++;
+                            zrun_len++;
+                        } else {
+                            break;
+                        }
+                    }
+                    if (j != sizeof(long)) {
+                        break;
+                    }

The *j != sizeof(long)* branch is not taken when a concurrent write made
the long compare fail but every byte then compares equal on re-read. In
that case the outer loop goes back to the fast
"(*(long *)(old_buf + i)) == (*(long *)(new_buf + i))" comparison,
whereas the old code, once its long-compare loop exited, would keep
comparing byte by byte with "old_buf[i] == new_buf[i]" for the rest of
the buffer. To be honest, this scene is rare.

Best regards,
ChenLiang