[Qemu-devel] Re: [RFC] Moving the kvm ioapic, pic, and pit back to userspace


From: Avi Kivity <avi@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: qemu-devel <qemu-devel@nongnu.org>, KVM list <kvm@vger.kernel.org>
Subject: [Qemu-devel] Re: [RFC] Moving the kvm ioapic, pic, and pit back to userspace
Date: Tue, 08 Jun 2010 08:48:13 +0300
Message-ID: <4C0DD99D.4000603@redhat.com>
In-Reply-To: <4C0D717D.8010104@codemonkey.ws>

On 06/08/2010 01:23 AM, Anthony Liguori wrote:
>>> A better example would be a generic counter kernel mechanism.  I can 
>>> envision such a device as doing nothing more than providing a 
>>> read-only view of a counter with a userspace configurable divider 
>>> and width.  Any write to the counter or read of any other byte 
>>> outside the counter register would result in a trap to userspace.
>>
>> What about latches?  byte access to word registers?  There will be as 
>> many special cases as there are timers.
>>
>> If the kernel supported a bytecode/jit facility I'd happily use that 
>> to download portions of the device model into the kernel.
>>
>>>
>>> That should allow both the PIT and the HPET to be accelerated with 
>>> minimal effort in the kernel.
>>
>> IMO it's probably more effort than porting HPET to the kernel.  Try 
>> outlining an interface that supports PIT, HPET, RTC, and ACPI PMTIMER.
>
>
> I was referring specifically to time sources, not time events.
>
> An accelerated counter for HPET is pretty trivial.  It's a 32-bit 
> register that's actually a nanosecond value in qemu.  We need to be 
> able to set an offset from the host wall clock time, a means to stop 
> it, and a means to start it.
>
> The PIT is latched so the kernel needs to know enough about how to 
> decode the PIT state to understand the latching.  There's very little 
> state associated with latching though so I don't think this is a huge 
> problem.  It's a fixed value write to a fixed register followed by a 
> read to a fixed register.  The act of latching doesn't affect the 
> state beyond the fact that you need to save the latched value in the 
> event that you have a live migration before reading the latched value.
>
> The PMTIMER is also pretty straightforward.  It's a variable port 
> address (that's fixed during execution).
>
> Even if we require three separate interfaces, the interfaces are so 
> simple that it seems like an obvious win.

So a non-generic interface - 4x the interfaces (including RTC).

Those counters raise interrupts when they expire, and set various status 
bits in their hardware.  So we need 4x of:

   set counter value, frequency, and reload interval
   raise alarm to userspace on expiration
   set counter memory/ioport location and availability
   read counter value

and we haven't solved interrupt coalescing.
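
To make the size of that list concrete, here is a rough sketch of one such counter as a plain C model. It is not an existing or proposed KVM interface: struct vcounter and every vcounter_* name are made up for illustration, the counter is assumed to count down the way the PIT does, and "raise alarm to userspace" is reduced to an expiration count that userspace would poll. Status bits and interrupt coalescing are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-timer state; one instance per PIT/HPET/RTC/PMTIMER. */
struct vcounter {
    uint64_t start_ns;  /* host time at which the counter was loaded */
    uint64_t initial;   /* value loaded at start_ns */
    uint64_t freq_hz;   /* tick frequency */
    uint64_t reload;    /* reload interval in ticks, 0 means one-shot */
    uint64_t base;      /* guest memory/ioport location */
    bool     mapped;    /* whether the location is currently decoded */
};

/* Ticks elapsed since load; split the division to avoid 64-bit overflow. */
static uint64_t vcounter_elapsed(const struct vcounter *c, uint64_t now_ns)
{
    uint64_t delta = now_ns - c->start_ns;

    return (delta / NSEC_PER_SEC) * c->freq_hz +
           (delta % NSEC_PER_SEC) * c->freq_hz / NSEC_PER_SEC;
}

/* "set counter value, frequency, and reload interval" */
void vcounter_load(struct vcounter *c, uint64_t now_ns, uint64_t value,
                   uint64_t freq_hz, uint64_t reload)
{
    assert(freq_hz > 0);
    c->start_ns = now_ns;
    c->initial = value;
    c->freq_hz = freq_hz;
    c->reload = reload;
}

/* "set counter memory/ioport location and availability" */
void vcounter_map(struct vcounter *c, uint64_t base, bool mapped)
{
    c->base = base;
    c->mapped = mapped;
}

/* "read counter value" (down-counter, reloading on expiry) */
uint64_t vcounter_read(const struct vcounter *c, uint64_t now_ns)
{
    uint64_t e = vcounter_elapsed(c, now_ns);

    if (e < c->initial)
        return c->initial - e;
    if (c->reload == 0)
        return 0;
    return c->reload - (e - c->initial) % c->reload;
}

/* "raise alarm to userspace on expiration", modelled as a count to poll. */
uint64_t vcounter_expirations(const struct vcounter *c, uint64_t now_ns)
{
    uint64_t e = vcounter_elapsed(c, now_ns);

    if (e < c->initial)
        return 0;
    return 1 + (c->reload ? (e - c->initial) / c->reload : 0);
}
```

Even this toy version carries six fields and four entry points, and each real timer would need its own variant for latches, status bits and register widths.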

>
>>>
>>>> 5. Risk
>>>>
>>>> We may find out after all this is implemented that performance is 
>>>> not acceptable and all the work will have to be dropped.
>>>
>>> That's another advantage to a straight port to userspace.  We can 
>>> collect performance data with only a modest amount of engineering 
>>> effort.
>>
>> Port what exactly?  We have a userspace irqchip implementation.  What 
>> we don't have is just the ioapic/pic/pit in userspace, and the only 
>> way to try it out is to implement the whole thing.
>
> If you take the kernel code and do a pretty straight port: switching 
> kernel functions to libc functions and maintaining all the existing 
> locking via pthreads, you could then implement a very simple MMIO/PIO 
> dispatch mechanism in the kvm code that shortcutted those devices 
> before we ever hit the qemu_mutex and the traditional qemu code 
> paths.  It should be a relatively easy conversion and it gives a 
> proper vehicle for experimentation.

Those devices don't exist independently of the rest of the devices.  If 
they need to post interrupts, they will need the traditional qemu code 
paths.

(I'm trying to view the move from the POV of the kernel first, assuming 
userspace is as efficient as possible; so I'm not arguing qemu 
inefficiencies should prevent us from doing it.  But they do add 
considerably to the amount of work involved.)

>
> In fact, you could pretty quickly determine viability by porting the 
> PIT to userspace and implementing a vpit interface in the kernel that 
> allowed the channel 0 counters to be latched and read within 
> lightweight exits.


Just looking at it shows the interface is incredibly messy.  You have to 
maintain the control word in the kernel (since it tells you which 
counter to read or write), so now you need a userspace interface to read 
and write the control word.  With the current interface, you have the 
entire thing in a black box that you don't need to worry about (except 
for the speaker port...).
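
As a rough illustration of why the control word has to live wherever the counter reads are serviced, here is a small C model of the read path. The bit layout is the standard 8254 one (bits 7-6 select the counter, bits 5-4 the access mode, with 00 meaning "latch"). Everything else is an assumption made for the sketch: the pit_channel struct and function names are hypothetical, the live count is passed in by the caller, and the read-back command, counter modes and BCD are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Access-mode field (bits 5-4) of the 8254 control word. */
enum pit_access {
    PIT_LATCH_CMD = 0,  /* counter latch command, not a real access mode */
    PIT_LSB       = 1,  /* low byte only */
    PIT_MSB       = 2,  /* high byte only */
    PIT_LSB_MSB   = 3,  /* low byte, then high byte */
};

/* Hypothetical per-channel state that the reader of port 0x40-0x42 needs. */
struct pit_channel {
    uint8_t  access;    /* access mode from the last control word */
    bool     latched;   /* a latched count is waiting to be read */
    uint16_t latch;     /* the latched count */
    bool     msb_next;  /* next LSB/MSB-mode read returns the high byte */
};

/* Guest write to the control port (0x43). */
void pit_write_control(struct pit_channel ch[3], uint8_t val,
                       const uint16_t count_now[3])
{
    unsigned sel = val >> 6;
    unsigned rw = (val >> 4) & 3;

    assert(ch != NULL && count_now != NULL);
    if (sel == 3)
        return;  /* read-back command: left out of this sketch */

    if (rw == PIT_LATCH_CMD) {
        /* A second latch before the first is read is ignored. */
        if (!ch[sel].latched) {
            ch[sel].latch = count_now[sel];
            ch[sel].latched = true;
        }
    } else {
        ch[sel].access = (uint8_t)rw;
        ch[sel].msb_next = false;
    }
}

/* Guest read from a channel's data port: one byte of a 16-bit count. */
uint8_t pit_read_data(struct pit_channel *c, uint16_t count_now)
{
    uint16_t v = c->latched ? c->latch : count_now;

    switch (c->access) {
    case PIT_LSB:
        c->latched = false;
        return (uint8_t)(v & 0xff);
    case PIT_MSB:
        c->latched = false;
        return (uint8_t)(v >> 8);
    default:  /* PIT_LSB_MSB; an unprogrammed channel also lands here */
        if (!c->msb_next) {
            c->msb_next = true;
            return (uint8_t)(v & 0xff);
        }
        c->msb_next = false;
        c->latched = false;
        return (uint8_t)(v >> 8);
    }
}
```

Every field of pit_channel is written by the control port and consumed by the data port, so splitting the two between kernel and userspace means exporting all of it, and saving it across live migration.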


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
