From: Anthony Liguori <anthony@codemonkey.ws>
To: Gleb Natapov <gleb@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>, Avi Kivity <avi@redhat.com>,
KVM list <kvm@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] Next gen kvm api
Date: Sun, 05 Feb 2012 10:36:06 -0600
Message-ID: <4F2EAFF6.7030006@codemonkey.ws>
In-Reply-To: <20120205095153.GA29265@redhat.com>
On 02/05/2012 03:51 AM, Gleb Natapov wrote:
> On Sun, Feb 05, 2012 at 11:44:43AM +0200, Avi Kivity wrote:
>> On 02/05/2012 11:37 AM, Gleb Natapov wrote:
>>> On Thu, Feb 02, 2012 at 06:09:54PM +0200, Avi Kivity wrote:
>>>> Device model
>>>> ------------
>>>> Currently kvm virtualizes or emulates a set of x86 cores, with or
>>>> without local APICs, a 24-input IOAPIC, a PIC, a PIT, and a number of
>>>> PCI devices assigned from the host. The API allows emulating the local
>>>> APICs in userspace.
>>>>
>>>> The new API will do away with the IOAPIC/PIC/PIT emulation and defer
>>>> them to userspace. Note: this may cause a regression for older guests
>>>> that don't support MSI or kvmclock. Device assignment will be done
>>>> using VFIO, that is, without direct kvm involvement.
>>>>
>>> So are we officially saying that KVM is only for modern guest
>>> virtualization?
>>
>> No, but older guests may have reduced performance in some workloads
>> (e.g. RHEL4 gettimeofday() intensive workloads).
>>
> Reduced performance is what I mean. Obviously old guests will continue working.
An interesting solution to this problem would be an in-kernel device VM.
Most of the time, the hot register is just one register within a more complex
device. The reads are often side-effect free and trivially computed from some
device state + host time.
If userspace had a way to upload bytecode to the kernel that was executed for a
PIO operation, it could either pass the operation to userspace or handle it
within the kernel when possible, without taking a heavyweight exit.
If the bytecode can access variables in a shared memory area, it could be pretty
efficient to work with.
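
As a rough illustration of the idea (a toy model, not code from this thread
and not an existing KVM interface), the sketch below interprets a small
straight-line bytecode program for a PIO read. The opcode set, the
`run_pio_read` name, the slot layout of the shared area, and the PIT-like
counter formula are all assumptions made up for the example. The program
either computes the register value from shared state plus host time, or asks
for a fallback to userspace.

```python
from enum import Enum, auto


class Op(Enum):
    """Hypothetical opcodes for a tiny stack-based device VM."""
    PUSH_CONST = auto()   # push an immediate value
    PUSH_SHARED = auto()  # push shared[arg] from the shared memory area
    PUSH_TIME = auto()    # push the host time passed in by the caller
    SUB = auto()          # a - b
    DIV = auto()          # a // b
    AND = auto()          # a & b
    RET = auto()          # handled in place: top of stack is the read result
    EXIT = auto()         # give up and pass the operation to userspace


HANDLED = "handled"
EXIT_TO_USERSPACE = "exit_to_userspace"


def run_pio_read(program, shared, host_time_ns):
    """Run a straight-line program; return (status, value).

    There are no jumps, so execution is bounded by the program length.
    Any malformed program or division by zero falls back to userspace.
    """
    stack = []
    try:
        for op, arg in program:
            if op is Op.PUSH_CONST:
                stack.append(arg)
            elif op is Op.PUSH_SHARED:
                stack.append(shared[arg])
            elif op is Op.PUSH_TIME:
                stack.append(host_time_ns)
            elif op in (Op.SUB, Op.DIV, Op.AND):
                b = stack.pop()
                a = stack.pop()
                if op is Op.SUB:
                    stack.append(a - b)
                elif op is Op.DIV:
                    stack.append(a // b)
                else:
                    stack.append(a & b)
            elif op is Op.RET:
                return HANDLED, stack.pop()
            elif op is Op.EXIT:
                return EXIT_TO_USERSPACE, None
    except (IndexError, ZeroDivisionError):
        return EXIT_TO_USERSPACE, None
    # Falling off the end without RET is treated as "not handled".
    return EXIT_TO_USERSPACE, None


# Assumed slot layout of the shared area for a down-counting timer register.
INITIAL, START_NS, PERIOD_NS = 0, 1, 2

# value = (initial - (now - start_ns) // period_ns) & 0xFFFF
COUNTER_READ = [
    (Op.PUSH_SHARED, INITIAL),
    (Op.PUSH_TIME, None),
    (Op.PUSH_SHARED, START_NS),
    (Op.SUB, None),
    (Op.PUSH_SHARED, PERIOD_NS),
    (Op.DIV, None),
    (Op.SUB, None),
    (Op.PUSH_CONST, 0xFFFF),
    (Op.AND, None),
    (Op.RET, None),
]
```

With `shared = [1000, 5000, 100]` and a host time of `5750`, calling
`run_pio_read(COUNTER_READ, shared, 5750)` returns `(HANDLED, 993)`. Userspace
only has to keep the three shared slots up to date.
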
This means that the kernel never has to deal with specific in-kernel devices but
that userspace can accelerate as many of its devices as it sees fit.
This could replace ioeventfd as a mechanism (which would allow clearing the
notify flag before writing to an eventfd).
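
One possible reading of the parenthetical, sketched under loud assumptions:
an uploaded handler for a guest "kick" write could clear a notification flag
in the shared area and only then signal the eventfd, so a second kick does not
generate another wakeup. `FakeEventFd`, `NOTIFY_ENABLED`, and
`handle_notify_write` are hypothetical names, and the eventfd is modelled as a
plain counter rather than a real file descriptor.

```python
class FakeEventFd:
    """Stand-in for an eventfd: each signal increments a counter."""

    def __init__(self):
        self.count = 0

    def signal(self):
        self.count += 1


NOTIFY_ENABLED = 0  # assumed slot index of the flag in the shared area


def handle_notify_write(shared, eventfd):
    """Handle a guest write to the notify port without a userspace exit.

    Returns True if the eventfd was signalled, False if the write was
    absorbed because notifications were already disabled.
    """
    if not shared[NOTIFY_ENABLED]:
        return False
    # Clear the flag first so later kicks are absorbed until the
    # consumer re-enables notifications.
    shared[NOTIFY_ENABLED] = 0
    eventfd.signal()
    return True
```

In this model the consumer would set `shared[NOTIFY_ENABLED]` back to 1 once
it has drained its queue.
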
We could potentially just use BPF for this.
Regards,
Anthony Liguori