* [PATCH 0/5] kvm: Batch writes to MMIO
@ 2008-05-23 8:50 Laurent Vivier
2008-05-26 12:00 ` Avi Kivity
2008-05-26 14:20 ` Avi Kivity
0 siblings, 2 replies; 12+ messages in thread
From: Laurent Vivier @ 2008-05-23 8:50 UTC (permalink / raw)
To: kvm; +Cc: Laurent Vivier
When kernel has to send MMIO writes to userspace, it stores them
in memory until it has to pass the hand to userspace for another
reason. This avoids to have too many context switches on operations
that can wait.
These patches introduce an ioctl() to define MMIO allowed to be delayed.
WITHOUT WITH
PATCH PATCH
iperf (e1000) 169 MB/s 185,5 MB/s +9,7%
host_state_reload (626594) (391825) -37%
[9,7% is a more realistic value than my previous benchmark]
boot XP
host_state_reload 764677 516059 -32%
VGA text scroll
host_state_reload 13280568 (6:15) 3608362 (4:42) -73% (-25%)
This is the kernel part of the MMIO batching functionality.
[PATCH 1/5] kvm_io_device: extend in_range() to manage len and write attribute
Modify member in_range() of structure kvm_io_device to pass length
and the type of the I/O (write or read).
[PATCH 2/5] Add delayed MMIO support (common part)
This patch adds all needed structures to batch MMIOs.
Until an architecture uses it, it is not compiled.
[PATCH 3/5] Add delayed MMIO support (x86 part)
This patch enables MMIO batching for x86 architecture.
[PATCH 4/5] Add delayed MMIO support (powerpc part)
This patch enables MMIO batching for powerpc architecture.
WARNING: this has not been tested.
[PATCH 5/5] Add delayed MMIO support (ia64 part)
This patch enables MMIO batching for ia64 architecture.
WARNING: this has not been tested.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/5] kvm: Batch writes to MMIO
2008-05-23 8:50 Laurent Vivier
@ 2008-05-26 12:00 ` Avi Kivity
2008-05-26 12:10 ` Laurent Vivier
2008-05-26 14:20 ` Avi Kivity
1 sibling, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2008-05-26 12:00 UTC (permalink / raw)
To: Laurent Vivier; +Cc: kvm
Laurent Vivier wrote:
> When kernel has to send MMIO writes to userspace, it stores them
> in memory until it has to pass the hand to userspace for another
> reason. This avoids to have too many context switches on operations
> that can wait.
>
>
Looks good. The only issue I see is inconsistent naming. Sometimes you
use delayed_mmio, sometimes batched_mmio, and there's a way-too-generic
kvm_batch. Please stick to a single name throughout the code.
> VGA text scroll
> host_state_reload 13280568 (6:15) 3608362 (4:42) -73% (-25%)
This is probably better addressed by allowing the guest to write
directly to VRAM, like in the graphic modes.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/5] kvm: Batch writes to MMIO
2008-05-26 12:00 ` Avi Kivity
@ 2008-05-26 12:10 ` Laurent Vivier
2008-05-26 13:43 ` Avi Kivity
0 siblings, 1 reply; 12+ messages in thread
From: Laurent Vivier @ 2008-05-26 12:10 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm
Le lundi 26 mai 2008 à 15:00 +0300, Avi Kivity a écrit :
> Laurent Vivier wrote:
> > When kernel has to send MMIO writes to userspace, it stores them
> > in memory until it has to pass the hand to userspace for another
> > reason. This avoids to have too many context switches on operations
> > that can wait.
> >
> >
>
> Looks good. The only issue I see is inconsistent naming. Sometimes you
> use delayed_mmio, sometimes batched_mmio, and there's a way-too-generic
> kvm_batch. Please stick to a single name throughout the code.
Which is your favorite ?
> > VGA text scroll
> > host_state_reload 13280568 (6:15) 3608362 (4:42) -73% (-25%)
>
> This is probably better addressed by allowing the guest to write
> directly to VRAM, like in the graphic modes.
Should I modify something in the patch ?
Thank you for your comments.
Regards,
Laurent
--
------------- Laurent.Vivier@bull.net ---------------
"The best way to predict the future is to invent it."
- Alan Kay
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/5] kvm: Batch writes to MMIO
2008-05-26 12:10 ` Laurent Vivier
@ 2008-05-26 13:43 ` Avi Kivity
0 siblings, 0 replies; 12+ messages in thread
From: Avi Kivity @ 2008-05-26 13:43 UTC (permalink / raw)
To: Laurent Vivier; +Cc: kvm
Laurent Vivier wrote:
> Le lundi 26 mai 2008 à 15:00 +0300, Avi Kivity a écrit :
>
>> Laurent Vivier wrote:
>>
>>> When kernel has to send MMIO writes to userspace, it stores them
>>> in memory until it has to pass the hand to userspace for another
>>> reason. This avoids to have too many context switches on operations
>>> that can wait.
>>>
>>>
>>>
>> Looks good. The only issue I see is inconsistent naming. Sometimes you
>> use delayed_mmio, sometimes batched_mmio, and there's a way-too-generic
>> kvm_batch. Please stick to a single name throughout the code.
>>
>
> Which is your favorite ?
>
>
kvm_coalesced_mmio? :)
>>> VGA text scroll
>>> host_state_reload 13280568 (6:15) 3608362 (4:42) -73% (-25%)
>>>
>> This is probably better addressed by allowing the guest to write
>> directly to VRAM, like in the graphic modes.
>>
>
> Should I modify something in the patch ?
>
No, but when we direct-map text-mode vga, this benchmark will be obsolete.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/5] kvm: Batch writes to MMIO
2008-05-23 8:50 Laurent Vivier
2008-05-26 12:00 ` Avi Kivity
@ 2008-05-26 14:20 ` Avi Kivity
1 sibling, 0 replies; 12+ messages in thread
From: Avi Kivity @ 2008-05-26 14:20 UTC (permalink / raw)
To: Laurent Vivier; +Cc: kvm, Hollis Blanchard, Zhang, Xiantao
Laurent Vivier wrote:
> When kernel has to send MMIO writes to userspace, it stores them
> in memory until it has to pass the hand to userspace for another
> reason. This avoids to have too many context switches on operations
> that can wait.
>
>
Hollis/Xiantao, can you review this as well? Maybe even a test compile
and run?
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 0/5] kvm: Batch writes to MMIO
@ 2008-05-30 14:05 Laurent Vivier
2008-05-30 14:05 ` [PATCH 1/5] kvm_io_device: extend in_range() to manage len and write attribute Laurent Vivier
2008-06-04 15:13 ` [PATCH 0/5] kvm: Batch writes to MMIO Avi Kivity
0 siblings, 2 replies; 12+ messages in thread
From: Laurent Vivier @ 2008-05-30 14:05 UTC (permalink / raw)
To: kvm; +Cc: Laurent Vivier
When kernel has to send MMIO writes to userspace, it stores them
in memory until it has to pass the hand to userspace for another
reason. This avoids to have too many context switches on operations
that can wait.
These patches introduce an ioctl() to define MMIO allowed to be coalesced.
This is the kernel part of the coalesced MMIO functionality.
[PATCH 1/5] kvm_io_device: extend in_range() to manage len and write attribute
Modify member in_range() of structure kvm_io_device to pass length
and the type of the I/O (write or read).
[PATCH 2/5] Add coalesced MMIO support (common part)
This patch adds all needed structures to coalesce MMIOs.
[PATCH 3/5] Add coalesced MMIO support (x86 part)
This patch enables coalesced MMIO for x86 architecture.
[PATCH 4/5] Add coalesced MMIO support (powerpc part)
This patch enables coalesced MMIO for powerpc architecture.
WARNING: this has not been tested.
[PATCH 5/5] Add coalesced MMIO support (ia64 part)
This patch enables coalesced MMIO for ia64 architecture.
WARNING: this has not been tested.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/5] kvm_io_device: extend in_range() to manage len and write attribute
2008-05-30 14:05 [PATCH 0/5] kvm: Batch writes to MMIO Laurent Vivier
@ 2008-05-30 14:05 ` Laurent Vivier
2008-05-30 14:05 ` [PATCH 2/5] Add coalesced MMIO support (common part) Laurent Vivier
2008-06-04 15:13 ` [PATCH 0/5] kvm: Batch writes to MMIO Avi Kivity
1 sibling, 1 reply; 12+ messages in thread
From: Laurent Vivier @ 2008-05-30 14:05 UTC (permalink / raw)
To: kvm; +Cc: Laurent Vivier
Modify member in_range() of structure kvm_io_device to pass length and the type
of the I/O (write or read).
This modification allows to use kvm_io_device with coalesced MMIO.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
arch/ia64/kvm/kvm-ia64.c | 6 +++---
arch/x86/kvm/i8254.c | 6 ++++--
arch/x86/kvm/i8259.c | 3 ++-
arch/x86/kvm/lapic.c | 3 ++-
arch/x86/kvm/x86.c | 28 +++++++++++++++++-----------
include/linux/kvm_host.h | 3 ++-
virt/kvm/ioapic.c | 3 ++-
virt/kvm/iodev.h | 8 +++++---
virt/kvm/kvm_main.c | 5 +++--
9 files changed, 40 insertions(+), 25 deletions(-)
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index adb74f7..b59231b 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -195,11 +195,11 @@ int kvm_dev_ioctl_check_extension(long ext)
}
static struct kvm_io_device *vcpu_find_mmio_dev(struct kvm_vcpu *vcpu,
- gpa_t addr)
+ gpa_t addr, int len, int is_write)
{
struct kvm_io_device *dev;
- dev = kvm_io_bus_find_dev(&vcpu->kvm->mmio_bus, addr);
+ dev = kvm_io_bus_find_dev(&vcpu->kvm->mmio_bus, addr, len, is_write);
return dev;
}
@@ -231,7 +231,7 @@ static int handle_mmio(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
kvm_run->exit_reason = KVM_EXIT_MMIO;
return 0;
mmio:
- mmio_dev = vcpu_find_mmio_dev(vcpu, p->addr);
+ mmio_dev = vcpu_find_mmio_dev(vcpu, p->addr, p->size, !p->dir);
if (mmio_dev) {
if (!p->dir)
kvm_iodevice_write(mmio_dev, p->addr, p->size,
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 6d6dc6c..1558034 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -445,7 +445,8 @@ static void pit_ioport_read(struct kvm_io_device *this,
mutex_unlock(&pit_state->lock);
}
-static int pit_in_range(struct kvm_io_device *this, gpa_t addr)
+static int pit_in_range(struct kvm_io_device *this, gpa_t addr,
+ int len, int is_write)
{
return ((addr >= KVM_PIT_BASE_ADDRESS) &&
(addr < KVM_PIT_BASE_ADDRESS + KVM_PIT_MEM_LENGTH));
@@ -486,7 +487,8 @@ static void speaker_ioport_read(struct kvm_io_device *this,
mutex_unlock(&pit_state->lock);
}
-static int speaker_in_range(struct kvm_io_device *this, gpa_t addr)
+static int speaker_in_range(struct kvm_io_device *this, gpa_t addr,
+ int len, int is_write)
{
return (addr == KVM_SPEAKER_BASE_ADDRESS);
}
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index ab29cf2..5857f59 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -346,7 +346,8 @@ static u32 elcr_ioport_read(void *opaque, u32 addr1)
return s->elcr;
}
-static int picdev_in_range(struct kvm_io_device *this, gpa_t addr)
+static int picdev_in_range(struct kvm_io_device *this, gpa_t addr,
+ int len, int is_write)
{
switch (addr) {
case 0x20:
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8fcd84e..7d0670e 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -785,7 +785,8 @@ static void apic_mmio_write(struct kvm_io_device *this,
}
-static int apic_mmio_range(struct kvm_io_device *this, gpa_t addr)
+static int apic_mmio_range(struct kvm_io_device *this, gpa_t addr,
+ int len, int size)
{
struct kvm_lapic *apic = (struct kvm_lapic *)this->private;
int ret = 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2751e98..f829319 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1748,13 +1748,14 @@ static void kvm_init_msr_list(void)
* Only apic need an MMIO device hook, so shortcut now..
*/
static struct kvm_io_device *vcpu_find_pervcpu_dev(struct kvm_vcpu *vcpu,
- gpa_t addr)
+ gpa_t addr, int len,
+ int is_write)
{
struct kvm_io_device *dev;
if (vcpu->arch.apic) {
dev = &vcpu->arch.apic->dev;
- if (dev->in_range(dev, addr))
+ if (dev->in_range(dev, addr, len, is_write))
return dev;
}
return NULL;
@@ -1762,13 +1763,15 @@ static struct kvm_io_device *vcpu_find_pervcpu_dev(struct kvm_vcpu *vcpu,
static struct kvm_io_device *vcpu_find_mmio_dev(struct kvm_vcpu *vcpu,
- gpa_t addr)
+ gpa_t addr, int len,
+ int is_write)
{
struct kvm_io_device *dev;
- dev = vcpu_find_pervcpu_dev(vcpu, addr);
+ dev = vcpu_find_pervcpu_dev(vcpu, addr, len, is_write);
if (dev == NULL)
- dev = kvm_io_bus_find_dev(&vcpu->kvm->mmio_bus, addr);
+ dev = kvm_io_bus_find_dev(&vcpu->kvm->mmio_bus, addr, len,
+ is_write);
return dev;
}
@@ -1836,7 +1839,7 @@ mmio:
* Is this MMIO handled locally?
*/
mutex_lock(&vcpu->kvm->lock);
- mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
+ mmio_dev = vcpu_find_mmio_dev(vcpu, gpa, bytes, 0);
if (mmio_dev) {
kvm_iodevice_read(mmio_dev, gpa, bytes, val);
mutex_unlock(&vcpu->kvm->lock);
@@ -1891,7 +1894,7 @@ mmio:
* Is this MMIO handled locally?
*/
mutex_lock(&vcpu->kvm->lock);
- mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
+ mmio_dev = vcpu_find_mmio_dev(vcpu, gpa, bytes, 1);
if (mmio_dev) {
kvm_iodevice_write(mmio_dev, gpa, bytes, val);
mutex_unlock(&vcpu->kvm->lock);
@@ -2268,9 +2271,10 @@ static void pio_string_write(struct kvm_io_device *pio_dev,
}
static struct kvm_io_device *vcpu_find_pio_dev(struct kvm_vcpu *vcpu,
- gpa_t addr)
+ gpa_t addr, int len,
+ int is_write)
{
- return kvm_io_bus_find_dev(&vcpu->kvm->pio_bus, addr);
+ return kvm_io_bus_find_dev(&vcpu->kvm->pio_bus, addr, len, is_write);
}
int kvm_emulate_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
@@ -2302,7 +2306,7 @@ int kvm_emulate_pio(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
kvm_x86_ops->skip_emulated_instruction(vcpu);
- pio_dev = vcpu_find_pio_dev(vcpu, port);
+ pio_dev = vcpu_find_pio_dev(vcpu, port, size, !in);
if (pio_dev) {
kernel_pio(pio_dev, vcpu, vcpu->arch.pio_data);
complete_pio(vcpu);
@@ -2384,7 +2388,9 @@ int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
}
}
- pio_dev = vcpu_find_pio_dev(vcpu, port);
+ pio_dev = vcpu_find_pio_dev(vcpu, port,
+ vcpu->arch.pio.cur_count,
+ !vcpu->arch.pio.in);
if (!vcpu->arch.pio.in) {
/* string PIO write */
ret = pio_copy_data(vcpu);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d622309..57b376b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -51,7 +51,8 @@ struct kvm_io_bus {
void kvm_io_bus_init(struct kvm_io_bus *bus);
void kvm_io_bus_destroy(struct kvm_io_bus *bus);
-struct kvm_io_device *kvm_io_bus_find_dev(struct kvm_io_bus *bus, gpa_t addr);
+struct kvm_io_device *kvm_io_bus_find_dev(struct kvm_io_bus *bus,
+ gpa_t addr, int len, int is_write);
void kvm_io_bus_register_dev(struct kvm_io_bus *bus,
struct kvm_io_device *dev);
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 99a1736..763b114 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -315,7 +315,8 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector)
ioapic_deliver(ioapic, gsi);
}
-static int ioapic_in_range(struct kvm_io_device *this, gpa_t addr)
+static int ioapic_in_range(struct kvm_io_device *this, gpa_t addr,
+ int len, int is_write)
{
struct kvm_ioapic *ioapic = (struct kvm_ioapic *)this->private;
diff --git a/virt/kvm/iodev.h b/virt/kvm/iodev.h
index c14e642..55e8846 100644
--- a/virt/kvm/iodev.h
+++ b/virt/kvm/iodev.h
@@ -27,7 +27,8 @@ struct kvm_io_device {
gpa_t addr,
int len,
const void *val);
- int (*in_range)(struct kvm_io_device *this, gpa_t addr);
+ int (*in_range)(struct kvm_io_device *this, gpa_t addr, int len,
+ int is_write);
void (*destructor)(struct kvm_io_device *this);
void *private;
@@ -49,9 +50,10 @@ static inline void kvm_iodevice_write(struct kvm_io_device *dev,
dev->write(dev, addr, len, val);
}
-static inline int kvm_iodevice_inrange(struct kvm_io_device *dev, gpa_t addr)
+static inline int kvm_iodevice_inrange(struct kvm_io_device *dev,
+ gpa_t addr, int len, int is_write)
{
- return dev->in_range(dev, addr);
+ return dev->in_range(dev, addr, len, is_write);
}
static inline void kvm_iodevice_destructor(struct kvm_io_device *dev)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e8f9fda..d602700 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1350,14 +1350,15 @@ void kvm_io_bus_destroy(struct kvm_io_bus *bus)
}
}
-struct kvm_io_device *kvm_io_bus_find_dev(struct kvm_io_bus *bus, gpa_t addr)
+struct kvm_io_device *kvm_io_bus_find_dev(struct kvm_io_bus *bus,
+ gpa_t addr, int len, int is_write)
{
int i;
for (i = 0; i < bus->dev_count; i++) {
struct kvm_io_device *pos = bus->devs[i];
- if (pos->in_range(pos, addr))
+ if (pos->in_range(pos, addr, len, is_write))
return pos;
}
--
1.5.2.4
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/5] Add coalesced MMIO support (common part)
2008-05-30 14:05 ` [PATCH 1/5] kvm_io_device: extend in_range() to manage len and write attribute Laurent Vivier
@ 2008-05-30 14:05 ` Laurent Vivier
2008-05-30 14:05 ` [PATCH 3/5] Add coalesced MMIO support (x86 part) Laurent Vivier
0 siblings, 1 reply; 12+ messages in thread
From: Laurent Vivier @ 2008-05-30 14:05 UTC (permalink / raw)
To: kvm; +Cc: Laurent Vivier
This patch adds all needed structures to coalesce MMIOs.
Until an architecture uses it, it is not compiled.
Coalesced MMIO introduces two ioctl() to define where are the MMIO zones that
can be coalesced:
- KVM_REGISTER_COALESCED_MMIO registers a coalesced MMIO zone.
It requests one parameter (struct kvm_coalesced_mmio_zone) which defines
a memory area where MMIOs can be coalesced until the next switch to
user space. The maximum number of MMIO zones is KVM_COALESCED_MMIO_ZONE_MAX.
- KVM_UNREGISTER_COALESCED_MMIO cancels all registered zones inside
the given bounds (bounds are also given by struct kvm_coalesced_mmio_zone).
The userspace client can check kernel coalesced MMIO availability by asking
ioctl(KVM_CHECK_EXTENSION) for the KVM_CAP_COALESCED_MMIO capability.
The ioctl() call to KVM_CAP_COALESCED_MMIO will return 0 if not supported,
or the page offset where will be stored the ring buffer.
The page offset depends on the architecture.
After an ioctl(KVM_RUN), the first page of the KVM memory mapped points to
a kvm_run structure. The offset given by KVM_CAP_COALESCED_MMIO is
an offset to the coalesced MMIO ring expressed in PAGE_SIZE relatively
to the address of the start of th kvm_run structure. The MMIO ring buffer
is defined by the structure kvm_coalesced_mmio_ring.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
include/linux/kvm.h | 29 ++++++++
include/linux/kvm_host.h | 4 +
virt/kvm/coalesced_mmio.c | 156 +++++++++++++++++++++++++++++++++++++++++++++
virt/kvm/coalesced_mmio.h | 23 +++++++
virt/kvm/kvm_main.c | 57 ++++++++++++++++
5 files changed, 269 insertions(+), 0 deletions(-)
create mode 100644 virt/kvm/coalesced_mmio.c
create mode 100644 virt/kvm/coalesced_mmio.h
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index a281afe..1c908ac 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -173,6 +173,30 @@ struct kvm_run {
};
};
+/* for KVM_REGISTER_COALESCED_MMIO / KVM_UNREGISTER_COALESCED_MMIO */
+
+struct kvm_coalesced_mmio_zone {
+ __u64 addr;
+ __u32 size;
+ __u32 pad;
+};
+
+struct kvm_coalesced_mmio {
+ __u64 phys_addr;
+ __u32 len;
+ __u32 pad;
+ __u8 data[8];
+};
+
+struct kvm_coalesced_mmio_ring {
+ __u32 first, last;
+ struct kvm_coalesced_mmio coalesced_mmio[0];
+};
+
+#define KVM_COALESCED_MMIO_MAX \
+ ((PAGE_SIZE - sizeof(struct kvm_coalesced_mmio_ring)) / \
+ sizeof(struct kvm_coalesced_mmio))
+
/* for KVM_TRANSLATE */
struct kvm_translation {
/* in */
@@ -346,6 +370,7 @@ struct kvm_trace_rec {
#define KVM_CAP_NOP_IO_DELAY 12
#define KVM_CAP_PV_MMU 13
#define KVM_CAP_MP_STATE 14
+#define KVM_CAP_COALESCED_MMIO 15
/*
* ioctls for VM fds
@@ -371,6 +396,10 @@ struct kvm_trace_rec {
#define KVM_CREATE_PIT _IO(KVMIO, 0x64)
#define KVM_GET_PIT _IOWR(KVMIO, 0x65, struct kvm_pit_state)
#define KVM_SET_PIT _IOR(KVMIO, 0x66, struct kvm_pit_state)
+#define KVM_REGISTER_COALESCED_MMIO \
+ _IOW(KVMIO, 0x67, struct kvm_coalesced_mmio_zone)
+#define KVM_UNREGISTER_COALESCED_MMIO \
+ _IOW(KVMIO, 0x68, struct kvm_coalesced_mmio_zone)
/*
* ioctls for vcpu fds
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 57b376b..5424ecc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -116,6 +116,10 @@ struct kvm {
struct kvm_vm_stat stat;
struct kvm_arch arch;
atomic_t users_count;
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+ struct kvm_coalesced_mmio_dev *coalesced_mmio_dev;
+ struct kvm_coalesced_mmio_ring *coalesced_mmio_ring;
+#endif
};
/* The guest did something we don't support. */
diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
new file mode 100644
index 0000000..5ae620d
--- /dev/null
+++ b/virt/kvm/coalesced_mmio.c
@@ -0,0 +1,156 @@
+/*
+ * KVM coalesced MMIO
+ *
+ * Copyright (c) 2008 Bull S.A.S.
+ *
+ * Author: Laurent Vivier <Laurent.Vivier@bull.net>
+ *
+ */
+
+#include "iodev.h"
+
+#include <linux/kvm_host.h>
+#include <linux/kvm.h>
+
+#include "coalesced_mmio.h"
+
+static int coalesced_mmio_in_range(struct kvm_io_device *this,
+ gpa_t addr, int len, int is_write)
+{
+ struct kvm_coalesced_mmio_dev *dev =
+ (struct kvm_coalesced_mmio_dev*)this->private;
+ struct kvm_coalesced_mmio_zone *zone;
+ int next;
+ int i;
+
+ if (!is_write)
+ return 0;
+
+ /* kvm->lock is taken by the caller and must be not released before
+ * dev.read/write
+ */
+
+ /* Are we able to batch it ? */
+
+ /* last is the first free entry
+ * check if we don't meet the first used entry
+ * there is always one unused entry in the buffer
+ */
+
+ next = (dev->kvm->coalesced_mmio_ring->last + 1) %
+ KVM_COALESCED_MMIO_MAX;
+ if (next == dev->kvm->coalesced_mmio_ring->first) {
+ /* full */
+ return 0;
+ }
+
+ /* is it in a batchable area ? */
+
+ for (i = 0; i < dev->nb_zones; i++) {
+ zone = &dev->zone[i];
+
+ /* (addr,len) is fully included in
+ * (zone->addr, zone->size)
+ */
+
+ if (zone->addr <= addr &&
+ addr + len <= zone->addr + zone->size)
+ return 1;
+ }
+ return 0;
+}
+
+static void coalesced_mmio_write(struct kvm_io_device *this,
+ gpa_t addr, int len, const void *val)
+{
+ struct kvm_coalesced_mmio_dev *dev =
+ (struct kvm_coalesced_mmio_dev*)this->private;
+ struct kvm_coalesced_mmio_ring *ring = dev->kvm->coalesced_mmio_ring;
+
+ /* kvm->lock must be taken by caller before call to in_range()*/
+
+ /* copy data in first free entry of the ring */
+
+ ring->coalesced_mmio[ring->last].phys_addr = addr;
+ ring->coalesced_mmio[ring->last].len = len;
+ memcpy(ring->coalesced_mmio[ring->last].data, val, len);
+ smp_wmb();
+ ring->last = (ring->last + 1) % KVM_COALESCED_MMIO_MAX;
+}
+
+static void coalesced_mmio_destructor(struct kvm_io_device *this)
+{
+ kfree(this);
+}
+
+int kvm_coalesced_mmio_init(struct kvm *kvm)
+{
+ struct kvm_coalesced_mmio_dev *dev;
+
+ dev = kzalloc(sizeof(struct kvm_coalesced_mmio_dev), GFP_KERNEL);
+ if (!dev)
+ return -ENOMEM;
+ dev->dev.write = coalesced_mmio_write;
+ dev->dev.in_range = coalesced_mmio_in_range;
+ dev->dev.destructor = coalesced_mmio_destructor;
+ dev->dev.private = dev;
+ dev->kvm = kvm;
+ kvm->coalesced_mmio_dev = dev;
+ kvm_io_bus_register_dev(&kvm->mmio_bus, &dev->dev);
+
+ return 0;
+}
+
+int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
+ struct kvm_coalesced_mmio_zone *zone)
+{
+ struct kvm_coalesced_mmio_dev *dev = kvm->coalesced_mmio_dev;
+
+ if (dev == NULL)
+ return -EINVAL;
+
+ mutex_lock(&kvm->lock);
+ if (dev->nb_zones >= KVM_COALESCED_MMIO_ZONE_MAX) {
+ mutex_unlock(&kvm->lock);
+ return -ENOBUFS;
+ }
+
+ dev->zone[dev->nb_zones] = *zone;
+ dev->nb_zones++;
+
+ mutex_unlock(&kvm->lock);
+ return 0;
+}
+
+int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
+ struct kvm_coalesced_mmio_zone *zone)
+{
+ int i;
+ struct kvm_coalesced_mmio_dev *dev = kvm->coalesced_mmio_dev;
+ struct kvm_coalesced_mmio_zone *z;
+
+ if (dev == NULL)
+ return -EINVAL;
+
+ mutex_lock(&kvm->lock);
+
+ i = dev->nb_zones;
+ while(i) {
+ z = &dev->zone[i - 1];
+
+ /* unregister all zones
+ * included in (zone->addr, zone->size)
+ */
+
+ if (zone->addr <= z->addr &&
+ z->addr + z->size <= zone->addr + zone->size) {
+ dev->nb_zones--;
+ *z = dev->zone[dev->nb_zones];
+ }
+ i--;
+ }
+
+ mutex_unlock(&kvm->lock);
+
+ return 0;
+}
diff --git a/virt/kvm/coalesced_mmio.h b/virt/kvm/coalesced_mmio.h
new file mode 100644
index 0000000..5ac0ec6
--- /dev/null
+++ b/virt/kvm/coalesced_mmio.h
@@ -0,0 +1,23 @@
+/*
+ * KVM coalesced MMIO
+ *
+ * Copyright (c) 2008 Bull S.A.S.
+ *
+ * Author: Laurent Vivier <Laurent.Vivier@bull.net>
+ *
+ */
+
+#define KVM_COALESCED_MMIO_ZONE_MAX 100
+
+struct kvm_coalesced_mmio_dev {
+ struct kvm_io_device dev;
+ struct kvm *kvm;
+ int nb_zones;
+ struct kvm_coalesced_mmio_zone zone[KVM_COALESCED_MMIO_ZONE_MAX];
+};
+
+int kvm_coalesced_mmio_init(struct kvm *kvm);
+int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
+ struct kvm_coalesced_mmio_zone *zone);
+int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
+ struct kvm_coalesced_mmio_zone *zone);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d602700..4f38f5c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -47,6 +47,10 @@
#include <asm/uaccess.h>
#include <asm/pgtable.h>
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+#include "coalesced_mmio.h"
+#endif
+
MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");
@@ -185,10 +189,23 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_uninit);
static struct kvm *kvm_create_vm(void)
{
struct kvm *kvm = kvm_arch_create_vm();
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+ struct page *page;
+#endif
if (IS_ERR(kvm))
goto out;
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+ page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+ if (!page) {
+ kfree(kvm);
+ return ERR_PTR(-ENOMEM);
+ }
+ kvm->coalesced_mmio_ring =
+ (struct kvm_coalesced_mmio_ring *)page_address(page);
+#endif
+
kvm->mm = current->mm;
atomic_inc(&kvm->mm->mm_count);
spin_lock_init(&kvm->mmu_lock);
@@ -200,6 +217,9 @@ static struct kvm *kvm_create_vm(void)
spin_lock(&kvm_lock);
list_add(&kvm->vm_list, &vm_list);
spin_unlock(&kvm_lock);
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+ kvm_coalesced_mmio_init(kvm);
+#endif
out:
return kvm;
}
@@ -243,6 +263,10 @@ static void kvm_destroy_vm(struct kvm *kvm)
kvm_io_bus_destroy(&kvm->pio_bus);
kvm_io_bus_destroy(&kvm->mmio_bus);
kvm_arch_destroy_vm(kvm);
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+ if (kvm->coalesced_mmio_ring != NULL)
+ free_page((unsigned long)kvm->coalesced_mmio_ring);
+#endif
mmdrop(mm);
}
@@ -826,6 +850,10 @@ static int kvm_vcpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
else if (vmf->pgoff == KVM_PIO_PAGE_OFFSET)
page = virt_to_page(vcpu->arch.pio_data);
#endif
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+ else if (vmf->pgoff == KVM_COALESCED_MMIO_PAGE_OFFSET)
+ page = virt_to_page(vcpu->kvm->coalesced_mmio_ring);
+#endif
else
return VM_FAULT_SIGBUS;
get_page(page);
@@ -1148,6 +1176,32 @@ static long kvm_vm_ioctl(struct file *filp,
goto out;
break;
}
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+ case KVM_REGISTER_COALESCED_MMIO: {
+ struct kvm_coalesced_mmio_zone zone;
+ r = -EFAULT;
+ if (copy_from_user(&zone, argp, sizeof zone))
+ goto out;
+ r = -ENXIO;
+ r = kvm_vm_ioctl_register_coalesced_mmio(kvm, &zone);
+ if (r)
+ goto out;
+ r = 0;
+ break;
+ }
+ case KVM_UNREGISTER_COALESCED_MMIO: {
+ struct kvm_coalesced_mmio_zone zone;
+ r = -EFAULT;
+ if (copy_from_user(&zone, argp, sizeof zone))
+ goto out;
+ r = -ENXIO;
+ r = kvm_vm_ioctl_unregister_coalesced_mmio(kvm, &zone);
+ if (r)
+ goto out;
+ r = 0;
+ break;
+ }
+#endif
default:
r = kvm_arch_vm_ioctl(filp, ioctl, arg);
}
@@ -1232,6 +1286,9 @@ static long kvm_dev_ioctl(struct file *filp,
#ifdef CONFIG_X86
r += PAGE_SIZE; /* pio data page */
#endif
+#ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
+ r += PAGE_SIZE; /* coalesced mmio ring page */
+#endif
break;
case KVM_TRACE_ENABLE:
case KVM_TRACE_PAUSE:
--
1.5.2.4
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/5] Add coalesced MMIO support (x86 part)
2008-05-30 14:05 ` [PATCH 2/5] Add coalesced MMIO support (common part) Laurent Vivier
@ 2008-05-30 14:05 ` Laurent Vivier
2008-05-30 14:05 ` [PATCH 4/5] Add coalesced MMIO support (powerpc part) Laurent Vivier
0 siblings, 1 reply; 12+ messages in thread
From: Laurent Vivier @ 2008-05-30 14:05 UTC (permalink / raw)
To: kvm; +Cc: Laurent Vivier
This patch enables coalesced MMIO for x86 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
arch/x86/kvm/Makefile | 3 ++-
arch/x86/kvm/x86.c | 3 +++
include/asm-x86/kvm_host.h | 1 +
3 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index c97d35c..d0e940b 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -2,7 +2,8 @@
# Makefile for Kernel-based Virtual Machine module
#
-common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o)
+common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
+ coalesced_mmio.o)
ifeq ($(CONFIG_KVM_TRACE),y)
common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f829319..c7c41b7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -836,6 +836,9 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_MP_STATE:
r = 1;
break;
+ case KVM_CAP_COALESCED_MMIO:
+ r = KVM_COALESCED_MMIO_PAGE_OFFSET;
+ break;
case KVM_CAP_VAPIC:
r = !kvm_x86_ops->cpu_has_accelerated_tpr();
break;
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 8322fc1..f414858 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -26,6 +26,7 @@
#define KVM_PRIVATE_MEM_SLOTS 4
#define KVM_PIO_PAGE_OFFSET 1
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 2
#define CR3_PAE_RESERVED_BITS ((X86_CR3_PWT | X86_CR3_PCD) - 1)
#define CR3_NONPAE_RESERVED_BITS ((PAGE_SIZE-1) & ~(X86_CR3_PWT | X86_CR3_PCD))
--
1.5.2.4
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/5] Add coalesced MMIO support (powerpc part)
2008-05-30 14:05 ` [PATCH 3/5] Add coalesced MMIO support (x86 part) Laurent Vivier
@ 2008-05-30 14:05 ` Laurent Vivier
2008-05-30 14:05 ` [PATCH 5/5] Add coalesced MMIO support (ia64 part) Laurent Vivier
0 siblings, 1 reply; 12+ messages in thread
From: Laurent Vivier @ 2008-05-30 14:05 UTC (permalink / raw)
To: kvm; +Cc: Laurent Vivier
This patch enables coalesced MMIO for powerpc architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
WARNING: this has not been tested.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
arch/powerpc/kvm/Makefile | 2 +-
arch/powerpc/kvm/powerpc.c | 3 +++
include/asm-powerpc/kvm_host.h | 2 ++
3 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index d0d358d..04e3449 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -4,7 +4,7 @@
EXTRA_CFLAGS += -Ivirt/kvm -Iarch/powerpc/kvm
-common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o)
+common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
kvm-objs := $(common-objs) powerpc.o emulate.o booke_guest.o
obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 0513b35..b850d24 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -145,6 +145,9 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_USER_MEMORY:
r = 1;
break;
+ case KVM_CAP_COALESCED_MMIO:
+ r = KVM_COALESCED_MMIO_PAGE_OFFSET;
+ break;
default:
r = 0;
break;
diff --git a/include/asm-powerpc/kvm_host.h b/include/asm-powerpc/kvm_host.h
index 81a69d7..2655e2a 100644
--- a/include/asm-powerpc/kvm_host.h
+++ b/include/asm-powerpc/kvm_host.h
@@ -31,6 +31,8 @@
/* memory slots that does not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 4
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+
/* We don't currently support large pages. */
#define KVM_PAGES_PER_HPAGE (1<<31)
--
1.5.2.4
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 5/5] Add coalesced MMIO support (ia64 part)
2008-05-30 14:05 ` [PATCH 4/5] Add coalesced MMIO support (powerpc part) Laurent Vivier
@ 2008-05-30 14:05 ` Laurent Vivier
0 siblings, 0 replies; 12+ messages in thread
From: Laurent Vivier @ 2008-05-30 14:05 UTC (permalink / raw)
To: kvm; +Cc: Laurent Vivier
This patch enables coalesced MMIO for ia64 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
WARNING: this has not been tested.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
arch/ia64/kvm/Makefile | 3 ++-
arch/ia64/kvm/kvm-ia64.c | 3 +++
include/asm-ia64/kvm_host.h | 1 +
3 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/arch/ia64/kvm/Makefile b/arch/ia64/kvm/Makefile
index 112791d..bf22fb9 100644
--- a/arch/ia64/kvm/Makefile
+++ b/arch/ia64/kvm/Makefile
@@ -43,7 +43,8 @@ $(obj)/$(offsets-file): arch/ia64/kvm/asm-offsets.s
EXTRA_CFLAGS += -Ivirt/kvm -Iarch/ia64/kvm/
EXTRA_AFLAGS += -Ivirt/kvm -Iarch/ia64/kvm/
-common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o)
+common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
+ coalesced_mmio.o)
kvm-objs := $(common-objs) kvm-ia64.o kvm_fw.o
obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index b59231b..4815c31 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -187,6 +187,9 @@ int kvm_dev_ioctl_check_extension(long ext)
r = 1;
break;
+ case KVM_CAP_COALESCED_MMIO::
+ r = KVM_COALESCED_MMIO_PAGE_OFFSET;
+ break;
default:
r = 0;
}
diff --git a/include/asm-ia64/kvm_host.h b/include/asm-ia64/kvm_host.h
index 5c958b0..1efe513 100644
--- a/include/asm-ia64/kvm_host.h
+++ b/include/asm-ia64/kvm_host.h
@@ -38,6 +38,7 @@
/* memory slots that does not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 4
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
/* define exit reasons from vmm to kvm*/
#define EXIT_REASON_VM_PANIC 0
--
1.5.2.4
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/5] kvm: Batch writes to MMIO
2008-05-30 14:05 [PATCH 0/5] kvm: Batch writes to MMIO Laurent Vivier
2008-05-30 14:05 ` [PATCH 1/5] kvm_io_device: extend in_range() to manage len and write attribute Laurent Vivier
@ 2008-06-04 15:13 ` Avi Kivity
1 sibling, 0 replies; 12+ messages in thread
From: Avi Kivity @ 2008-06-04 15:13 UTC (permalink / raw)
To: Laurent Vivier; +Cc: kvm
Laurent Vivier wrote:
> When kernel has to send MMIO writes to userspace, it stores them
> in memory until it has to pass the hand to userspace for another
> reason. This avoids to have too many context switches on operations
> that can wait.
>
> These patches introduce an ioctl() to define MMIO allowed to be coalesced.
>
> This is the kernel part of the coalesced MMIO functionality.
>
>
Applied all, thanks.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2008-06-04 15:13 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-30 14:05 [PATCH 0/5] kvm: Batch writes to MMIO Laurent Vivier
2008-05-30 14:05 ` [PATCH 1/5] kvm_io_device: extend in_range() to manage len and write attribute Laurent Vivier
2008-05-30 14:05 ` [PATCH 2/5] Add coalesced MMIO support (common part) Laurent Vivier
2008-05-30 14:05 ` [PATCH 3/5] Add coalesced MMIO support (x86 part) Laurent Vivier
2008-05-30 14:05 ` [PATCH 4/5] Add coalesced MMIO support (powerpc part) Laurent Vivier
2008-05-30 14:05 ` [PATCH 5/5] Add coalesced MMIO support (ia64 part) Laurent Vivier
2008-06-04 15:13 ` [PATCH 0/5] kvm: Batch writes to MMIO Avi Kivity
-- strict thread matches above, loose matches on Subject: below --
2008-05-23 8:50 Laurent Vivier
2008-05-26 12:00 ` Avi Kivity
2008-05-26 12:10 ` Laurent Vivier
2008-05-26 13:43 ` Avi Kivity
2008-05-26 14:20 ` Avi Kivity
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox