Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Avi Kivity <avi@redhat.com>
Cc: qemu-devel@nongnu.org, Rusty Russell <rusty@rustcorp.com.au>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself
Date: Sun, 23 May 2010 18:51:32 +0300
Message-ID: <20100523155132.GA14733@redhat.com>
In-Reply-To: <4BF94CAD.5010504@redhat.com>

On Sun, May 23, 2010 at 06:41:33PM +0300, Avi Kivity wrote:
> On 05/23/2010 06:31 PM, Michael S. Tsirkin wrote:
>> On Thu, May 20, 2010 at 02:38:16PM +0930, Rusty Russell wrote:
>>    
>>> On Thu, 20 May 2010 02:31:50 pm Rusty Russell wrote:
>>>      
>>>> On Wed, 19 May 2010 05:36:42 pm Avi Kivity wrote:
>>>>        
>>>>>> Note that this is a exclusive->shared->exclusive bounce only, too.
>>>>>>
>>>>>>            
>>>>> A bounce is a bounce.
>>>>>          
>>>> I tried to measure this to show that you were wrong, but I was only able
>>>> to show that you're right.  How annoying.  Test code below.
>>>>        
>>> This time for sure!
>>>      
>>
>> What do you see?
>> On my laptop:
>> 	[mst@tuck testring]$ ./rusty1 share 0 1
>> 	CPU 1: share cacheline: 2820410 usec
>> 	CPU 0: share cacheline: 2823441 usec
>> 	[mst@tuck testring]$ ./rusty1 unshare 0 1
>> 	CPU 0: unshare cacheline: 2783014 usec
>> 	CPU 1: unshare cacheline: 2782951 usec
>> 	[mst@tuck testring]$ ./rusty1 lockshare 0 1
>> 	CPU 1: lockshare cacheline: 1888495 usec
>> 	CPU 0: lockshare cacheline: 1888544 usec
>> 	[mst@tuck testring]$ ./rusty1 lockunshare 0 1
>> 	CPU 0: lockunshare cacheline: 1889854 usec
>> 	CPU 1: lockunshare cacheline: 1889804 usec
>>    
>
> Ugh, can the timing be normalized per operation?  This is unreadable.
>
>> So locked version seems to be faster than unlocked,
>> and share/unshare not to matter?
>>    
>
> May be due to the processor using the LOCK operation as a hint to  
> reserve the cacheline for a bit.

Maybe we should use atomics on index then?

>> same on a workstation:
>> [root@qus19 ~]# ./rusty1 unshare 0 1
>> CPU 0: unshare cacheline: 6037002 usec
>> CPU 1: unshare cacheline: 6036977 usec
>> [root@qus19 ~]# ./rusty1 lockunshare 0 1
>> CPU 1: lockunshare cacheline: 5734362 usec
>> CPU 0: lockunshare cacheline: 5734389 usec
>> [root@qus19 ~]# ./rusty1 lockshare 0 1
>> CPU 1: lockshare cacheline: 5733537 usec
>> CPU 0: lockshare cacheline: 5733564 usec
>>
>> using another pair of CPUs gives a more drastic
>> results:
>>
>> [root@qus19 ~]# ./rusty1 lockshare 0 2
>> CPU 2: lockshare cacheline: 4226990 usec
>> CPU 0: lockshare cacheline: 4227038 usec
>> [root@qus19 ~]# ./rusty1 lockunshare 0 2
>> CPU 0: lockunshare cacheline: 4226707 usec
>> CPU 2: lockunshare cacheline: 4226662 usec
>> [root@qus19 ~]# ./rusty1 unshare 0 2
>> CPU 0: unshare cacheline: 14815048 usec
>> CPU 2: unshare cacheline: 14815006 usec
>>
>>    
>
> That's expected.  Hyperthread will be fastest (shared L1), shared L2/L3  
> will be slower, cross-socket will suck.

OK, after adding mb in the code (the patch will be sent separately),
the test works on my workstation. Locked is still fastest;
unshared sometimes wins and sometimes loses against shared.

[root@qus19 ~]# ./cachebounce share 0 1
CPU 0: share cacheline: 6638521 usec
CPU 1: share cacheline: 6638478 usec
[root@qus19 ~]# ./cachebounce unshare 0 1
CPU 0: unshare cacheline: 6037415 usec
CPU 1: unshare cacheline: 6037374 usec
[root@qus19 ~]# ./cachebounce lockshare 0 1
CPU 0: lockshare cacheline: 5734017 usec
CPU 1: lockshare cacheline: 5733978 usec
[root@qus19 ~]# ./cachebounce lockunshare 0 1
CPU 1: lockunshare cacheline: 5733260 usec
CPU 0: lockunshare cacheline: 5733307 usec
[root@qus19 ~]# ./cachebounce share 0 2
CPU 0: share cacheline: 14529198 usec
CPU 2: share cacheline: 14529156 usec
[root@qus19 ~]# ./cachebounce unshare 0 2
CPU 2: unshare cacheline: 14815328 usec
CPU 0: unshare cacheline: 14815374 usec
[root@qus19 ~]# ./cachebounce lockshare 0 2
CPU 0: lockshare cacheline: 4226878 usec
CPU 2: lockshare cacheline: 4226842 usec
[root@qus19 ~]# ./cachebounce locknushare 0 2
cachebounce: Usage: cachebounce share|unshare|lockshare|lockunshare <cpu0> <cpu1>
[root@qus19 ~]# ./cachebounce lockunshare 0 2
CPU 0: lockunshare cacheline: 4227432 usec
CPU 2: lockunshare cacheline: 4227375 usec
