linux-kernel.vger.kernel.org archive mirror
From: Avi Kivity <avi@qumranet.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	John Stoffel <john@stoffel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/7] KVM: userspace interface
Date: Sun, 22 Oct 2006 10:10:09 +0200
Message-ID: <453B2761.5020103@qumranet.com>
In-Reply-To: <4538EC4A.1000600@us.ibm.com>

Anthony Liguori wrote:
>>>
>>> You miss my point I think.  Using ioctls *requires* a thread 
>>> per-vcpu in userspace.  This is unnecessary since you could simply 
>>> provide a char-device based read/write interface.  You could then 
>>> multiplex events and poll.
>>>
>>
>> Yes, ioctl()s require userspace threads, but that's okay, because 
>> they're free for us, since we need a kernel thread for each vcpu.
>>
>> On the other hand, a single device model thread polling the vcpus is 
>> guaranteed to be on the wrong physical cpu for half of the time 
>> (assuming 2 cpus and 2 vcpus), requiring IPIs and suspending a vcpu 
>> in order to run.
>
> And your previously proposed solution of having one big lock would do 
> the same thing except require additional round trips to the kernel :-)

No.  When there is no contention, locks stay in userspace.  And if there 
is contention, we make the locks finer-grained.
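
A minimal sketch of why an uncontended lock costs no kernel round trip, 
assuming a Linux futex-style lock.  The names ulock, ulock_acquire and 
ulock_release are hypothetical and are not part of the patch set; the 
point is only that the fast path is a single atomic operation, and the 
futex system call is reached only under contention.

```c
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock. state: 0 = free, 1 = held, 2 = held, maybe waiters. */
typedef struct { atomic_int state; } ulock;

static long futex(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Returns the number of system calls made; 0 on the uncontended path. */
int ulock_acquire(ulock *l)
{
    int syscalls = 0;
    int expected = 0;

    /* Fast path: free -> held, entirely in userspace. */
    if (atomic_compare_exchange_strong(&l->state, &expected, 1))
        return 0;

    /* Slow path: mark the lock contended and sleep in the kernel. */
    while (atomic_exchange(&l->state, 2) != 0) {
        futex(&l->state, FUTEX_WAIT, 2);
        syscalls++;
    }
    return syscalls;
}

/* Returns the number of system calls made; 0 if nobody was waiting. */
int ulock_release(ulock *l)
{
    if (atomic_exchange(&l->state, 0) == 2) {
        futex(&l->state, FUTEX_WAKE, 1);
        return 1;
    }
    return 0;
}
```

A single thread taking and dropping the lock never enters the kernel; 
both functions return 0 in that case.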

>
> Moreover, you could get clever and use mmap() to expose a ring queue 
> if you're really concerned about SMP.
>
> Really though, it comes down to one simple thing: blocking ioctl()s 
> are a real ugly interface.
>

I don't think they can be termed "blocking".

Most (all?) blocking calls offload work to some other device, like a 
disk or a network card, and sleep if that device has to do any 
processing.  They follow the same basic procedure:

- if data (or bufferspace) is available, read (or write) it
- otherwise, sleep

But in this case the "other device" is the processor, so that model 
doesn't fit very well, as it *forces* a context switch.

Moreover, we need to both read and write.  An ioctl() does both in a 
single call, whereas read()/write() require two system calls.
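
A rough illustration of the difference, using made-up names: struct 
vcpu_xfer, DEMO_VCPU_RUN and both functions are hypothetical and do not 
come from the patch.  The only real API used is ioctl(), read(), write() 
and the _IOWR() macro, which encodes that the argument travels in both 
directions.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical exchange record: userspace fills `request`,
 * the driver fills `exit_reason`. */
struct vcpu_xfer {
    uint32_t request;
    uint32_t exit_reason;
};

/* Hypothetical request number; _IOWR marks data flowing both ways. */
#define DEMO_VCPU_RUN _IOWR('k', 2, struct vcpu_xfer)

/* One system call: argument in, result out. */
int run_with_ioctl(int fd, struct vcpu_xfer *x)
{
    return ioctl(fd, DEMO_VCPU_RUN, x);
}

/* The same exchange over read()/write(): two system calls. */
int run_with_read_write(int fd, struct vcpu_xfer *x)
{
    if (write(fd, &x->request, sizeof x->request)
            != (ssize_t)sizeof x->request)
        return -1;
    if (read(fd, &x->exit_reason, sizeof x->exit_reason)
            != (ssize_t)sizeof x->exit_reason)
        return -1;
    return 0;
}
```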

>>> If for nothing else, you have to be able to run timers in userspace 
>>> and interrupt the kernel execution (to signal DMA completion for 
>>> instance).  Even in the UP case, this gets ugly quickly.
>>>
>>
>> The timers aren't pretty (we use signals), yes.  But avoiding the 
>> extra thread is critical for performance IMO.
>
> We've had a lot of problems in QEMU with timers and kqemu.  Forcing 
> the guest to return to userspace to allow periodic timers to run 
> (which may simply be the VGA refresh which the guest doesn't care 
> about) is at best a hack.

You can also have an additional thread do the periodic stuff.

>   Being able to poll an FD would make this so much nicer...
>
> I've posted some patches on qemu-devel attempting to deal with these 
> issues (look for threads on optimizing char device performance).  None 
> of them are very pretty.
>

Xen is different since you already have a context switch by going to 
domain 0.

-- 
error compiling committee.c: too many arguments to function


