* [Qemu-devel] [PATCH 0/3] Object-based physical memory management
@ 2009-05-24 9:29 Avi Kivity
2009-05-24 9:29 ` [Qemu-devel] [PATCH 1/3] Add PhysicalMemoryRegion type Avi Kivity
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Avi Kivity @ 2009-05-24 9:29 UTC (permalink / raw)
To: qemu-devel
Currently, qemu manages physical memory at a page granularity (and with
subpage, at a byte granularity). This is expensive in memory and runtime
(requiring 8 bytes per page, and likely a cache miss on access), and is
difficult to manage: devices must remember their ranges so they can free
them, and the memory system has no insight into how memory is laid out.
This patchset introduces object-based memory management. The first patch
adds objects and functions for memory regions; the second patch wraps this
for PCI; and the third converts RTL8139 as an example.
Avi Kivity (3):
Add PhysicalMemoryRegion type
Add PCI memory region registration
Convert RTL8139 to use PCI memory regitration facility
cpu-all.h | 13 +++++++++
exec.c | 42 +++++++++++++++++++++++++++++++
hw/pci.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
hw/pci.h | 26 +++++++++++++++++++
hw/rtl8139.c | 15 +++-------
5 files changed, 163 insertions(+), 11 deletions(-)
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 1/3] Add PhysicalMemoryRegion type
2009-05-24 9:29 [Qemu-devel] [PATCH 0/3] Object-based physical memory management Avi Kivity
@ 2009-05-24 9:29 ` Avi Kivity
2009-05-27 14:16 ` Christoph Hellwig
2009-05-24 9:29 ` [Qemu-devel] [PATCH 2/3] Add PCI memory region registration Avi Kivity
` (2 subsequent siblings)
3 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-24 9:29 UTC (permalink / raw)
To: qemu-devel
A PhysicalMemoryRegion represents a region in the physical address space.
Cuurently supported operations include creation and destruction.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
cpu-all.h | 13 +++++++++++++
exec.c | 42 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 55 insertions(+), 0 deletions(-)
diff --git a/cpu-all.h b/cpu-all.h
index 0df54b6..dc9c8b7 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -893,6 +893,19 @@ extern ram_addr_t last_ram_offset;
typedef void CPUWriteMemoryFunc(void *opaque, target_phys_addr_t addr, uint32_t value);
typedef uint32_t CPUReadMemoryFunc(void *opaque, target_phys_addr_t addr);
+typedef struct PhysicalMemoryRegion PhysicalMemoryRegion;
+
+PhysicalMemoryRegion *physical_memory_region_register(
+ target_phys_addr_t start_addr,
+ target_phys_addr_t size,
+ ram_addr_t ram_addr);
+PhysicalMemoryRegion *physical_memory_region_register_offset(
+ target_phys_addr_t start_addr,
+ target_phys_addr_t size,
+ ram_addr_t ram_addr,
+ ram_addr_t region_offset);
+void physical_memory_region_unregister(PhysicalMemoryRegion *pmr);
+
void cpu_register_physical_memory_offset(target_phys_addr_t start_addr,
ram_addr_t size,
ram_addr_t phys_offset,
diff --git a/exec.c b/exec.c
index c5c9280..03c03dc 100644
--- a/exec.c
+++ b/exec.c
@@ -2418,6 +2418,48 @@ void cpu_register_physical_memory_offset(target_phys_addr_t start_addr,
}
}
+struct PhysicalMemoryRegion {
+ target_phys_addr_t start_addr;
+ target_phys_addr_t size;
+ ram_addr_t ram_addr;
+ ram_addr_t region_offset;
+};
+
+PhysicalMemoryRegion *physical_memory_region_register(
+ target_phys_addr_t start_addr,
+ target_phys_addr_t size,
+ ram_addr_t ram_addr)
+{
+ return physical_memory_region_register_offset(start_addr, size, ram_addr, 0);
+}
+
+PhysicalMemoryRegion *physical_memory_region_register_offset(
+ target_phys_addr_t start_addr,
+ target_phys_addr_t size,
+ ram_addr_t ram_addr,
+ ram_addr_t region_offset)
+{
+ PhysicalMemoryRegion *pmr = (PhysicalMemoryRegion *)qemu_malloc(sizeof(*pmr));
+
+ pmr->start_addr = start_addr;
+ pmr->size = size;
+ pmr->ram_addr = ram_addr;
+ pmr->region_offset = region_offset;
+
+ cpu_register_physical_memory_offset(start_addr, size, ram_addr, region_offset);
+
+ return pmr;
+}
+
+void physical_memory_region_unregister(PhysicalMemoryRegion *pmr)
+{
+ if (kvm_enabled()) {
+ kvm_uncoalesce_mmio_region(pmr->start_addr, pmr->size);
+ }
+ cpu_register_physical_memory(pmr->start_addr, pmr->size, IO_MEM_UNASSIGNED);
+ qemu_free(pmr);
+}
+
/* XXX: temporary until new memory mapping API */
ram_addr_t cpu_get_physical_page_desc(target_phys_addr_t addr)
{
--
1.6.0.6
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-24 9:29 [Qemu-devel] [PATCH 0/3] Object-based physical memory management Avi Kivity
2009-05-24 9:29 ` [Qemu-devel] [PATCH 1/3] Add PhysicalMemoryRegion type Avi Kivity
@ 2009-05-24 9:29 ` Avi Kivity
2009-05-27 14:08 ` Anthony Liguori
2009-05-24 9:29 ` [Qemu-devel] [PATCH 3/3] Convert RTL8139 to use PCI memory regitration facility Avi Kivity
2009-05-27 14:08 ` [Qemu-devel] [PATCH 0/3] Object-based physical memory management Anthony Liguori
3 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-24 9:29 UTC (permalink / raw)
To: qemu-devel
A registered PCI memory region will be automatically mapped and unmapped
as the associated BAR is manipulated.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
hw/pci.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
hw/pci.h | 26 ++++++++++++++++++++
2 files changed, 103 insertions(+), 1 deletions(-)
diff --git a/hw/pci.c b/hw/pci.c
index ee7d403..4c5323a 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -286,6 +286,9 @@ static void pci_unregister_io_regions(PCIDevice *pci_dev)
for(i = 0; i < PCI_NUM_REGIONS; i++) {
r = &pci_dev->io_regions[i];
+ while (!LIST_EMPTY(&r->components)) {
+ pci_unregister_physical_memory(pci_dev, i, LIST_FIRST(&r->components));
+ }
if (!r->size || r->addr == -1)
continue;
if (r->type == PCI_ADDRESS_SPACE_IO) {
@@ -343,11 +346,77 @@ void pci_register_io_region(PCIDevice *pci_dev, int region_num,
addr = 0x10 + region_num * 4;
}
*(uint32_t *)(pci_dev->config + addr) = cpu_to_le32(type);
+ LIST_INIT(&r->components);
+}
+
+PCIIORegionComponent *pci_register_physical_memory(PCIDevice *pci_dev,
+ int region_num,
+ target_phys_addr_t size,
+ ram_addr_t ram_addr)
+{
+ return pci_register_physical_memory_offset(pci_dev, region_num, 0, size,
+ ram_addr, 0);
+}
+
+static void pci_region_instantiate(PCIIORegionComponent *pci_mem,
+ uint32_t old_addr, uint32_t new_addr)
+{
+ if (new_addr == old_addr) {
+ return;
+ }
+
+ if (old_addr != -1) {
+ physical_memory_region_unregister(pci_mem->pmr);
+ pci_mem->pmr = NULL;
+ }
+
+ if (new_addr != -1) {
+ pci_mem->pmr = physical_memory_region_register_offset(
+ new_addr + pci_mem->offset,
+ pci_mem->size,
+ pci_mem->ram_addr,
+ pci_mem->region_offset);
+ }
+}
+
+PCIIORegionComponent *pci_register_physical_memory_offset(
+ PCIDevice *pci_dev,
+ int region_num,
+ target_phys_addr_t offset,
+ target_phys_addr_t size,
+ ram_addr_t ram_addr,
+ ram_addr_t region_offset)
+{
+ PCIIORegionComponent *pci_mem = (PCIIORegionComponent *)qemu_malloc(sizeof(*pci_mem));
+ PCIIORegion *r = &pci_dev->io_regions[region_num];
+
+ pci_mem->offset = offset;
+ pci_mem->size = size;
+ pci_mem->ram_addr = ram_addr;
+ pci_mem->region_offset = region_offset;
+ pci_mem->pmr = NULL;
+ LIST_INSERT_HEAD(&r->components, pci_mem, link);
+
+ pci_region_instantiate(pci_mem, -1, r->addr);
+
+ return pci_mem;
+}
+
+void pci_unregister_physical_memory(PCIDevice *pci_dev,
+ int region_num,
+ PCIIORegionComponent *pci_mem)
+{
+ PCIIORegion *r = &pci_dev->io_regions[region_num];
+
+ pci_region_instantiate(pci_mem, r->addr, -1);
+ LIST_REMOVE(pci_mem, link);
+ qemu_free(pci_mem);
}
static void pci_update_mappings(PCIDevice *d)
{
PCIIORegion *r;
+ PCIIORegionComponent *pci_mem;
int cmd, i;
uint32_t last_addr, new_addr, config_ofs;
@@ -396,6 +465,11 @@ static void pci_update_mappings(PCIDevice *d)
new_addr = -1;
}
}
+
+ LIST_FOREACH(pci_mem, &r->components, link) {
+ pci_region_instantiate(pci_mem, r->addr, new_addr);
+ }
+
/* now do the real mapping */
if (new_addr != r->addr) {
if (r->addr != -1) {
@@ -418,7 +492,9 @@ static void pci_update_mappings(PCIDevice *d)
}
r->addr = new_addr;
if (r->addr != -1) {
- r->map_func(d, i, r->addr, r->size, r->type);
+ if (r->map_func) {
+ r->map_func(d, i, r->addr, r->size, r->type);
+ }
}
}
}
diff --git a/hw/pci.h b/hw/pci.h
index ff858a1..b27e8fd 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -2,6 +2,7 @@
#define QEMU_PCI_H
#include "qemu-common.h"
+#include "sys-queue.h"
/* PCI includes legacy ISA access. */
#include "isa.h"
@@ -80,11 +81,21 @@ typedef int PCIUnregisterFunc(PCIDevice *pci_dev);
#define PCI_ADDRESS_SPACE_IO 0x01
#define PCI_ADDRESS_SPACE_MEM_PREFETCH 0x08
+typedef struct PCIIORegionComponent {
+ PhysicalMemoryRegion *pmr;
+ target_phys_addr_t offset;
+ target_phys_addr_t size;
+ ram_addr_t ram_addr;
+ ram_addr_t region_offset;
+ LIST_ENTRY(PCIIORegionComponent) link;
+} PCIIORegionComponent;
+
typedef struct PCIIORegion {
uint32_t addr; /* current PCI mapping address. -1 means not mapped */
uint32_t size;
uint8_t type;
PCIMapIORegionFunc *map_func;
+ LIST_HEAD(pci_region_list, PCIIORegionComponent) components;
} PCIIORegion;
#define PCI_ROM_SLOT 6
@@ -171,6 +182,21 @@ void pci_register_io_region(PCIDevice *pci_dev, int region_num,
uint32_t size, int type,
PCIMapIORegionFunc *map_func);
+PCIIORegionComponent *pci_register_physical_memory(PCIDevice *pci_dev,
+ int region_num,
+ target_phys_addr_t size,
+ ram_addr_t ram_addr);
+PCIIORegionComponent *pci_register_physical_memory_offset(
+ PCIDevice *pci_dev,
+ int region_num,
+ target_phys_addr_t offset,
+ target_phys_addr_t size,
+ ram_addr_t ram_addr,
+ ram_addr_t region_offset);
+void pci_unregister_physical_memory(PCIDevice *pci_dev,
+ int region_num,
+ PCIIORegionComponent *pci_mem);
+
uint32_t pci_default_read_config(PCIDevice *d,
uint32_t address, int len);
void pci_default_write_config(PCIDevice *d,
--
1.6.0.6
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 3/3] Convert RTL8139 to use PCI memory regitration facility
2009-05-24 9:29 [Qemu-devel] [PATCH 0/3] Object-based physical memory management Avi Kivity
2009-05-24 9:29 ` [Qemu-devel] [PATCH 1/3] Add PhysicalMemoryRegion type Avi Kivity
2009-05-24 9:29 ` [Qemu-devel] [PATCH 2/3] Add PCI memory region registration Avi Kivity
@ 2009-05-24 9:29 ` Avi Kivity
2009-05-27 14:08 ` [Qemu-devel] [PATCH 0/3] Object-based physical memory management Anthony Liguori
3 siblings, 0 replies; 18+ messages in thread
From: Avi Kivity @ 2009-05-24 9:29 UTC (permalink / raw)
To: qemu-devel
Signed-off-by: Avi Kivity <avi@redhat.com>
---
hw/rtl8139.c | 15 +++++----------
1 files changed, 5 insertions(+), 10 deletions(-)
diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index c0be1de..f2bc072 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -467,6 +467,7 @@ typedef struct RTL8139State {
VLANClientState *vc;
uint8_t macaddr[6];
int rtl8139_mmio_io_addr;
+ PCIIORegionComponent *pci_mem;
/* C ring mode */
uint32_t currTxDesc;
@@ -3324,15 +3325,6 @@ typedef struct PCIRTL8139State {
RTL8139State rtl8139;
} PCIRTL8139State;
-static void rtl8139_mmio_map(PCIDevice *pci_dev, int region_num,
- uint32_t addr, uint32_t size, int type)
-{
- PCIRTL8139State *d = (PCIRTL8139State *)pci_dev;
- RTL8139State *s = &d->rtl8139;
-
- cpu_register_physical_memory(addr + 0, 0x100, s->rtl8139_mmio_io_addr);
-}
-
static void rtl8139_ioport_map(PCIDevice *pci_dev, int region_num,
uint32_t addr, uint32_t size, int type)
{
@@ -3476,7 +3468,10 @@ PCIDevice *pci_rtl8139_init(PCIBus *bus, NICInfo *nd, int devfn)
PCI_ADDRESS_SPACE_IO, rtl8139_ioport_map);
pci_register_io_region(&d->dev, 1, 0x100,
- PCI_ADDRESS_SPACE_MEM, rtl8139_mmio_map);
+ PCI_ADDRESS_SPACE_MEM, NULL);
+
+ s->pci_mem = pci_register_physical_memory(&d->dev, 1, 0x100,
+ s->rtl8139_mmio_io_addr);
s->pci_dev = (PCIDevice *)d;
memcpy(s->macaddr, nd->macaddr, 6);
--
1.6.0.6
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-24 9:29 ` [Qemu-devel] [PATCH 2/3] Add PCI memory region registration Avi Kivity
@ 2009-05-27 14:08 ` Anthony Liguori
2009-05-27 15:02 ` Avi Kivity
0 siblings, 1 reply; 18+ messages in thread
From: Anthony Liguori @ 2009-05-27 14:08 UTC (permalink / raw)
To: Avi Kivity; +Cc: qemu-devel
Avi Kivity wrote:
> A registered PCI memory region will be automatically mapped and unmapped
> as the associated BAR is manipulated.
>
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
> hw/pci.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> hw/pci.h | 26 ++++++++++++++++++++
> 2 files changed, 103 insertions(+), 1 deletions(-)
>
> diff --git a/hw/pci.h b/hw/pci.h
> index ff858a1..b27e8fd 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -2,6 +2,7 @@
> #define QEMU_PCI_H
>
> #include "qemu-common.h"
> +#include "sys-queue.h"
>
> /* PCI includes legacy ISA access. */
> #include "isa.h"
> @@ -80,11 +81,21 @@ typedef int PCIUnregisterFunc(PCIDevice *pci_dev);
> #define PCI_ADDRESS_SPACE_IO 0x01
> #define PCI_ADDRESS_SPACE_MEM_PREFETCH 0x08
>
> +typedef struct PCIIORegionComponent {
> + PhysicalMemoryRegion *pmr;
> + target_phys_addr_t offset;
> + target_phys_addr_t size;
> + ram_addr_t ram_addr;
> + ram_addr_t region_offset;
> + LIST_ENTRY(PCIIORegionComponent) link;
> +} PCIIORegionComponent;
> +
> typedef struct PCIIORegion {
> uint32_t addr; /* current PCI mapping address. -1 means not mapped */
> uint32_t size;
> uint8_t type;
> PCIMapIORegionFunc *map_func;
> + LIST_HEAD(pci_region_list, PCIIORegionComponent) components;
> } PCIIORegion;
>
> #define PCI_ROM_SLOT 6
> @@ -171,6 +182,21 @@ void pci_register_io_region(PCIDevice *pci_dev, int region_num,
> uint32_t size, int type,
> PCIMapIORegionFunc *map_func);
>
> +PCIIORegionComponent *pci_register_physical_memory(PCIDevice *pci_dev,
> + int region_num,
> + target_phys_addr_t size,
> + ram_addr_t ram_addr);
>
PCI devices should not care about physical memory. They should only
care about IO regions. It ought to look something like:
enum {
PCI_REGION_IO,
PCI_REGION_MEM,
};
typedef struct PCIIOFunction
{
uint32_t (*read)(void *opaque, uint64_t offset, int size);
void (*write)(void *opaque, uint64_t offset, int size, uint64_t value);
} PCIIOFunction;
pci_register_io_region(PCIDevice *pci_dev,
int region_num,
int region_type,
int order,
PCIIOFunction *io_func,
void *opaque);
The current PCI map API goes away. Devices don't have to know anything
about that. Special handling for PIO/MMIO for PCI devices also go
away. Devices can register PCI_REGION_IO regions and they'll be
automatically mapped to MMIO for non-x86 architectures. This means a
lot of current PCI devices can be built once instead of being built for
each architecture.
If you want, we can setup a temporary branch in qemu to stage the
conversion. I'm willing to help do the conversion too.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 0/3] Object-based physical memory management
2009-05-24 9:29 [Qemu-devel] [PATCH 0/3] Object-based physical memory management Avi Kivity
` (2 preceding siblings ...)
2009-05-24 9:29 ` [Qemu-devel] [PATCH 3/3] Convert RTL8139 to use PCI memory regitration facility Avi Kivity
@ 2009-05-27 14:08 ` Anthony Liguori
2009-05-27 14:54 ` Avi Kivity
3 siblings, 1 reply; 18+ messages in thread
From: Anthony Liguori @ 2009-05-27 14:08 UTC (permalink / raw)
To: Avi Kivity; +Cc: qemu-devel
Avi Kivity wrote:
> Currently, qemu manages physical memory at a page granularity (and with
> subpage, at a byte granularity). This is expensive in memory and runtime
> (requiring 8 bytes per page, and likely a cache miss on access), and is
> difficult to manage: devices must remember their ranges so they can free
> them, and the memory system has no insight into how memory is laid out.
>
> This patchset introduces object-based memory management. The first patch
> adds objects and functions for memory regions; the second patch wraps this
> for PCI; and the third converts RTL8139 as an example.
>
I like the basic notion here. Instead of introducing the API and
converting one user, I'd rather do a flag day and convert everything.
Once we agree on the API of course.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/3] Add PhysicalMemoryRegion type
2009-05-24 9:29 ` [Qemu-devel] [PATCH 1/3] Add PhysicalMemoryRegion type Avi Kivity
@ 2009-05-27 14:16 ` Christoph Hellwig
2009-05-27 15:07 ` Avi Kivity
0 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2009-05-27 14:16 UTC (permalink / raw)
To: Avi Kivity; +Cc: qemu-devel
On Sun, May 24, 2009 at 12:29:33PM +0300, Avi Kivity wrote:
> +struct PhysicalMemoryRegion {
> + target_phys_addr_t start_addr;
> + target_phys_addr_t size;
> + ram_addr_t ram_addr;
> + ram_addr_t region_offset;
> +};
> +
> +PhysicalMemoryRegion *physical_memory_region_register(
> + target_phys_addr_t start_addr,
> + target_phys_addr_t size,
> + ram_addr_t ram_addr)
> +{
> + return physical_memory_region_register_offset(start_addr, size, ram_addr, 0);
> +}
Wouldn't it be much easier if the client allocated the
PhysicalMemoryRegion and just passed a pointer to it into
physical_memory_region_register? That would also take care of the
physical_memory_region_register vs
physical_memory_region_register_offset split as the non-offset case
would just not fill out the region_offset field and be done with it.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 0/3] Object-based physical memory management
2009-05-27 14:08 ` [Qemu-devel] [PATCH 0/3] Object-based physical memory management Anthony Liguori
@ 2009-05-27 14:54 ` Avi Kivity
0 siblings, 0 replies; 18+ messages in thread
From: Avi Kivity @ 2009-05-27 14:54 UTC (permalink / raw)
To: Anthony Liguori; +Cc: qemu-devel
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Currently, qemu manages physical memory at a page granularity (and with
>> subpage, at a byte granularity). This is expensive in memory and
>> runtime
>> (requiring 8 bytes per page, and likely a cache miss on access), and is
>> difficult to manage: devices must remember their ranges so they can free
>> them, and the memory system has no insight into how memory is laid out.
>>
>> This patchset introduces object-based memory management. The first
>> patch
>> adds objects and functions for memory regions; the second patch wraps
>> this
>> for PCI; and the third converts RTL8139 as an example.
>>
>
> I like the basic notion here. Instead of introducing the API and
> converting one user, I'd rather do a flag day and convert everything.
I agree, just getting this out to collect feedback (and to undust this
month-old branch).
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-27 14:08 ` Anthony Liguori
@ 2009-05-27 15:02 ` Avi Kivity
2009-05-27 15:11 ` Anthony Liguori
0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-27 15:02 UTC (permalink / raw)
To: Anthony Liguori; +Cc: qemu-devel
Anthony Liguori wrote:
>>
>> +PCIIORegionComponent *pci_register_physical_memory(PCIDevice *pci_dev,
>> + int region_num,
>> +
>> target_phys_addr_t size,
>> + ram_addr_t
>> ram_addr);
>>
>
> PCI devices should not care about physical memory. They should only
> care about IO regions. It ought to look something like:
>
> enum {
> PCI_REGION_IO,
> PCI_REGION_MEM,
> };
>
> typedef struct PCIIOFunction
> {
> uint32_t (*read)(void *opaque, uint64_t offset, int size);
> void (*write)(void *opaque, uint64_t offset, int size, uint64_t value);
> } PCIIOFunction;
That doesn't work for direct-mapped memory like the cirrus framebuffer
or host devices.
I also don't think we need to add another layer of indirection here (for
traditional callback based regions), where the physical memory layer
calls one set of callbacks which calls the PCIIOFunction callbacks.
>
> pci_register_io_region(PCIDevice *pci_dev,
> int region_num,
> int region_type,
> int order,
> PCIIOFunction *io_func,
> void *opaque);
>
> The current PCI map API goes away.
At the end of the conversion, yes.
> Devices don't have to know anything about that. Special handling for
> PIO/MMIO for PCI devices also go away. Devices can register
> PCI_REGION_IO regions and they'll be automatically mapped to MMIO for
> non-x86 architectures. This means a lot of current PCI devices can be
> built once instead of being built for each architecture.
That too can be handled at the physical memory layer by allocating a
dedicated ram address with callbacks to do the translation. That fixes
non-pci too (ia64 does it in qemu-kvm.git).
> If you want, we can setup a temporary branch in qemu to stage the
> conversion. I'm willing to help do the conversion too.
It's now memory-tracking in qemu-kvm.git (a rebasable branch, so don't
branch off it). Patches welcome.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 1/3] Add PhysicalMemoryRegion type
2009-05-27 14:16 ` Christoph Hellwig
@ 2009-05-27 15:07 ` Avi Kivity
0 siblings, 0 replies; 18+ messages in thread
From: Avi Kivity @ 2009-05-27 15:07 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: qemu-devel
Christoph Hellwig wrote:
> On Sun, May 24, 2009 at 12:29:33PM +0300, Avi Kivity wrote:
>
>> +struct PhysicalMemoryRegion {
>> + target_phys_addr_t start_addr;
>> + target_phys_addr_t size;
>> + ram_addr_t ram_addr;
>> + ram_addr_t region_offset;
>> +};
>> +
>> +PhysicalMemoryRegion *physical_memory_region_register(
>> + target_phys_addr_t start_addr,
>> + target_phys_addr_t size,
>> + ram_addr_t ram_addr)
>> +{
>> + return physical_memory_region_register_offset(start_addr, size, ram_addr, 0);
>> +}
>>
>
> Wouldn't it be much easier if the client allocated the
> PhysicalMemoryRegion and just passed a pointer to it into
> physical_memory_region_register? That would also take care of the
> physical_memory_region_register vs
> physical_memory_region_register_offset split as the non-offset case
> would just not fill out the region_offset field and be done with it.
>
Good idea. I'll make is so.
It's a little different from current qemu style, but I don't this that
should be a problem.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-27 15:02 ` Avi Kivity
@ 2009-05-27 15:11 ` Anthony Liguori
2009-05-27 15:24 ` Avi Kivity
0 siblings, 1 reply; 18+ messages in thread
From: Anthony Liguori @ 2009-05-27 15:11 UTC (permalink / raw)
To: Avi Kivity; +Cc: qemu-devel
Avi Kivity wrote:
> Anthony Liguori wrote:
>>>
>>> +PCIIORegionComponent *pci_register_physical_memory(PCIDevice *pci_dev,
>>> + int region_num,
>>> +
>>> target_phys_addr_t size,
>>> + ram_addr_t
>>> ram_addr);
>>>
>>
>> PCI devices should not care about physical memory. They should only
>> care about IO regions. It ought to look something like:
>>
>> enum {
>> PCI_REGION_IO,
>> PCI_REGION_MEM,
>> };
>>
>> typedef struct PCIIOFunction
>> {
>> uint32_t (*read)(void *opaque, uint64_t offset, int size);
>> void (*write)(void *opaque, uint64_t offset, int size, uint64_t
>> value);
>> } PCIIOFunction;
>
> That doesn't work for direct-mapped memory like the cirrus framebuffer
> or host devices.
That's because it's an internal performance hack. We should just avoid
the PCI routines for that device, if we can, although that suggests we
need a map hook which is ugly. Clever ideas are welcome.
The above API is needed to sanely deal with PCI busses that do either
IOMMU translation or endianness conversion.
Whether we use the existing IO memory functions verses a new set is not
important to me. What I care most about is that we have a region vs.
memory API, we don't have knowledge of where the region is mapped (so we
can get rid of map), and that we provide a read/write indirection layer
that is hookable by the PCI bus.
Direct mapping is a very special case that can be handled differently.
We shouldn't design the API around it.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-27 15:11 ` Anthony Liguori
@ 2009-05-27 15:24 ` Avi Kivity
2009-05-27 22:33 ` Anthony Liguori
0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-27 15:24 UTC (permalink / raw)
To: Anthony Liguori; +Cc: qemu-devel
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>>>
>>>> +PCIIORegionComponent *pci_register_physical_memory(PCIDevice
>>>> *pci_dev,
>>>> + int region_num,
>>>> +
>>>> target_phys_addr_t size,
>>>> + ram_addr_t
>>>> ram_addr);
>>>>
>>>
>>> PCI devices should not care about physical memory. They should only
>>> care about IO regions. It ought to look something like:
>>>
>>> enum {
>>> PCI_REGION_IO,
>>> PCI_REGION_MEM,
>>> };
>>>
>>> typedef struct PCIIOFunction
>>> {
>>> uint32_t (*read)(void *opaque, uint64_t offset, int size);
>>> void (*write)(void *opaque, uint64_t offset, int size,
>>> uint64_t value);
>>> } PCIIOFunction;
>>
>> That doesn't work for direct-mapped memory like the cirrus
>> framebuffer or host devices.
>
> That's because it's an internal performance hack. We should just
> avoid the PCI routines for that device, if we can, although that
> suggests we need a map hook which is ugly. Clever ideas are welcome.
My original proposal. Note it uses ram addresses, not cpu physical
addresses.
>
> The above API is needed to sanely deal with PCI busses that do either
> IOMMU translation or endianness conversion.
IOMMUs or endian conversions can be handled by allocating a new ram
address whose callbacks to the required translation, then perform the
operation on the original ram address. If we teach
cpu_physical_memory_map() about them, then we can even do dma through an
iommu.
>
> Whether we use the existing IO memory functions verses a new set is
> not important to me. What I care most about is that we have a region
> vs. memory API, we don't have knowledge of where the region is mapped
> (so we can get rid of map),
My proposal provides this. For example BAR reprogramming will move the
region (whether it is callback based or direct map) without the device
knowing about it.
> and that we provide a read/write indirection layer that is hookable by
> the PCI bus.
This is also covered, though of course the code is missing.
> Direct mapping is a very special case that can be handled
> differently. We shouldn't design the API around it.
It happens to be a very important case. I don't want to neglect it or
have very different APIs for direct mapped and callback based.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-27 15:24 ` Avi Kivity
@ 2009-05-27 22:33 ` Anthony Liguori
2009-05-31 11:02 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Anthony Liguori @ 2009-05-27 22:33 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, qemu-devel
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>
>> That's because it's an internal performance hack. We should just
>> avoid the PCI routines for that device, if we can, although that
>> suggests we need a map hook which is ugly. Clever ideas are welcome.
>
> My original proposal. Note it uses ram addresses, not cpu physical
> addresses.
I've thought about it, and I think what I find confusing about your API
is that pci_register_physical_memory includes the phrase "physical
memory". A PIO IO region on x86 is definitely not physical memory though.
It overloads the term physical memory and still requires separate APIs
for IO regions and MEM regions. I know you mentioned that ram_addr_t
could be overloaded to also support IO regions but IMHO that's rather
confusing. If the new code looked like:
s->rtl8139_mmio_io_addr =
cpu_register_io_memory(0, rtl8139_mmio_read, rtl8139_mmio_write, s);
s->rtl8139_io_io_addr =
cpu_register_io_memory(0, rtl8139_ioport_read, rtl8139_ioport_write, s);
pci_register_io_region(&d->dev, 0, 0x100,
PCI_ADDRESS_SPACE_IO, s->rtl8139_io_io_addr);
pci_register_io_region(&d->dev, 1, 0x100,
PCI_ADDRESS_SPACE_MEM, s->rtl8139_mmio_addr);
I think it would be more understandable. However, I think that the
normal case is exactly this work flow so I think it makes sense to
collapse it into two function calls. So it could look like:
pci_register_io_region(&d->dev, 0, 0x100,
PCI_ADDRESS_SPACE_IO, rtl8139_ioport_read,
rtl8139_ioport_write, s);
pci_register_io_region(&d->dev, 1, 0x100,
PCI_ADDRESS_SPACE_MEM, rtl8139_mmio_read,
rtl8139_mmio_write, s);
Moreover, you could probably drop the opaque parameter and and just use
d->dev. I hope it's possible to get from one to the other.
You could still have a two step process for where it's absolutely
required (like VGA optimization).
I think it's worth looking at changing the signatures of the mem
read/write functions. Introducing a size parameter would greatly
simplify adding 64-bit IO support, for instance.
I would argue that ram_addr_t is the wrong thing to overload for PIO but
as long as it's not exposed in the common API, it doesn't matter that
much to me.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-27 22:33 ` Anthony Liguori
@ 2009-05-31 11:02 ` Christoph Hellwig
2009-05-31 11:50 ` Avi Kivity
2009-05-31 11:47 ` Avi Kivity
2009-05-31 13:03 ` Avi Kivity
2 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2009-05-31 11:02 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Anthony Liguori, Avi Kivity, qemu-devel
On Wed, May 27, 2009 at 05:33:37PM -0500, Anthony Liguori wrote:
> pci_register_io_region(&d->dev, 0, 0x100,
> PCI_ADDRESS_SPACE_IO, rtl8139_ioport_read,
> rtl8139_ioport_write, s);
>
> pci_register_io_region(&d->dev, 1, 0x100,
> PCI_ADDRESS_SPACE_MEM, rtl8139_mmio_read,
> rtl8139_mmio_write, s);
>
> Moreover, you could probably drop the opaque parameter and and just use
> d->dev. I hope it's possible to get from one to the other.
I also think this is a much more natural API, and would simplify drivers
a lot. As for the naming the PCI spec uses the term bar for these
PIO/MMIO regions, so maybe call it pci_register_bar or pci_setup_bar?
> I think it's worth looking at changing the signatures of the mem
> read/write functions. Introducing a size parameter would greatly
> simplify adding 64-bit IO support, for instance.
Yeah.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-27 22:33 ` Anthony Liguori
2009-05-31 11:02 ` Christoph Hellwig
@ 2009-05-31 11:47 ` Avi Kivity
2009-05-31 13:52 ` Anthony Liguori
2009-05-31 13:03 ` Avi Kivity
2 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-05-31 11:47 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Anthony Liguori, qemu-devel
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>> Avi Kivity wrote:
>>>
>>> That's because it's an internal performance hack. We should just
>>> avoid the PCI routines for that device, if we can, although that
>>> suggests we need a map hook which is ugly. Clever ideas are welcome.
>>
>> My original proposal. Note it uses ram addresses, not cpu physical
>> addresses.
>
> I've thought about it, and I think what I find confusing about your
> API is that pci_register_physical_memory includes the phrase "physical
> memory". A PIO IO region on x86 is definitely not physical memory
> though.
Currently the API does not support PIO.
I wanted to make it an io_region as well, but pci_register_io_region()
is taken. We could rename it to pci_register_bar(), but that will cause
too much churn IMO.
> It overloads the term physical memory and still requires separate APIs
> for IO regions and MEM regions. I know you mentioned that ram_addr_t
> could be overloaded to also support IO regions but IMHO that's rather
> confusing. If the new code looked like:
>
> s->rtl8139_mmio_io_addr =
> cpu_register_io_memory(0, rtl8139_mmio_read, rtl8139_mmio_write, s);
>
> s->rtl8139_io_io_addr =
> cpu_register_io_memory(0, rtl8139_ioport_read,
> rtl8139_ioport_write, s);
>
> pci_register_io_region(&d->dev, 0, 0x100,
> PCI_ADDRESS_SPACE_IO, s->rtl8139_io_io_addr);
>
> pci_register_io_region(&d->dev, 1, 0x100,
> PCI_ADDRESS_SPACE_MEM, s->rtl8139_mmio_addr);
>
> I think it would be more understandable. However, I think that the
> normal case is exactly this work flow so I think it makes sense to
> collapse it into two function calls. So it could look like:
>
> pci_register_io_region(&d->dev, 0, 0x100,
> PCI_ADDRESS_SPACE_IO, rtl8139_ioport_read,
> rtl8139_ioport_write, s);
>
> pci_register_io_region(&d->dev, 1, 0x100,
> PCI_ADDRESS_SPACE_MEM, rtl8139_mmio_read,
> rtl8139_mmio_write, s);
That neglects the case of direct mapping. We could add it as a helper
though (which would also need to unregister the ram_addr when the
io_region is unregistered).
>
> Moreover, you could probably drop the opaque parameter and and just
> use d->dev. I hope it's possible to get from one to the other.
Yes.
>
> You could still have a two step process for where it's absolutely
> required (like VGA optimization).
>
> I think it's worth looking at changing the signatures of the mem
> read/write functions. Introducing a size parameter would greatly
> simplify adding 64-bit IO support, for instance.
Well, let's separate those unrelated changes, otherwise nothing will get
done.
> I would argue that ram_addr_t is the wrong thing to overload for PIO
> but as long as it's not exposed in the common API, it doesn't matter
> that much to me.
Currently ram_addr_t is really a 64-bit encoding for a QEMUIORegion
object, plus support for address arithmetic. IMO we should drop it and
move towards explicit object representation, and drop the page
descriptor array.
One of the things that prevents this is that the page descriptor array
is the only place holding the physical -> ioregion mapping. Once we
move this information to other objects, we can start on that. Hence
this patchset.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-31 11:02 ` Christoph Hellwig
@ 2009-05-31 11:50 ` Avi Kivity
0 siblings, 0 replies; 18+ messages in thread
From: Avi Kivity @ 2009-05-31 11:50 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Anthony Liguori, qemu-devel
Christoph Hellwig wrote:
> On Wed, May 27, 2009 at 05:33:37PM -0500, Anthony Liguori wrote:
>
>> pci_register_io_region(&d->dev, 0, 0x100,
>> PCI_ADDRESS_SPACE_IO, rtl8139_ioport_read,
>> rtl8139_ioport_write, s);
>>
>> pci_register_io_region(&d->dev, 1, 0x100,
>> PCI_ADDRESS_SPACE_MEM, rtl8139_mmio_read,
>> rtl8139_mmio_write, s);
>>
>> Moreover, you could probably drop the opaque parameter and and just use
>> d->dev. I hope it's possible to get from one to the other.
>>
>
> I also think this is a much more natural API, and would simplify drivers
> a lot. As for the naming the PCI spec uses the term bar for these
> PIO/MMIO regions, so maybe call it pci_register_bar or pci_setup_bar?
>
Note that this sets up an io region within a BAR, not a complete BAR.
Currently PCI IO is a three stage setup process:
- set up BAR
- set up ram_addr
- set up mapping between BAR and ram_addr
We'll need to keep this flexibility. I agree that helpers that collapse
all three changes will help the majority of devices.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-27 22:33 ` Anthony Liguori
2009-05-31 11:02 ` Christoph Hellwig
2009-05-31 11:47 ` Avi Kivity
@ 2009-05-31 13:03 ` Avi Kivity
2 siblings, 0 replies; 18+ messages in thread
From: Avi Kivity @ 2009-05-31 13:03 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Anthony Liguori, qemu-devel
Anthony Liguori wrote:
> I think it's worth looking at changing the signatures of the mem
> read/write functions. Introducing a size parameter would greatly
> simplify adding 64-bit IO support, for instance.
I looked at this. If the caller supplies a sparse callback array this
triggers the subwidth code. I don't think I understand this enough to
touch it.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 2/3] Add PCI memory region registration
2009-05-31 11:47 ` Avi Kivity
@ 2009-05-31 13:52 ` Anthony Liguori
0 siblings, 0 replies; 18+ messages in thread
From: Anthony Liguori @ 2009-05-31 13:52 UTC (permalink / raw)
To: Avi Kivity; +Cc: Anthony Liguori, qemu-devel
Avi Kivity wrote:
>
> I wanted to make it an io_region as well, but pci_register_io_region()
> is taken. We could rename it to pci_register_bar(), but that will
> cause too much churn IMO.
bar is okay by me.
>> It overloads the term physical memory and still requires separate
>> APIs for IO regions and MEM regions. I know you mentioned that
>> ram_addr_t could be overloaded to also support IO regions but IMHO
>> that's rather confusing. If the new code looked like:
>>
>> s->rtl8139_mmio_io_addr =
>> cpu_register_io_memory(0, rtl8139_mmio_read, rtl8139_mmio_write, s);
>>
>> s->rtl8139_io_io_addr =
>> cpu_register_io_memory(0, rtl8139_ioport_read,
>> rtl8139_ioport_write, s);
>>
>> pci_register_io_region(&d->dev, 0, 0x100,
>> PCI_ADDRESS_SPACE_IO, s->rtl8139_io_io_addr);
>>
>> pci_register_io_region(&d->dev, 1, 0x100,
>> PCI_ADDRESS_SPACE_MEM, s->rtl8139_mmio_addr);
>>
>> I think it would be more understandable. However, I think that the
>> normal case is exactly this work flow so I think it makes sense to
>> collapse it into two function calls. So it could look like:
>>
>> pci_register_io_region(&d->dev, 0, 0x100,
>> PCI_ADDRESS_SPACE_IO, rtl8139_ioport_read,
>> rtl8139_ioport_write, s);
>>
>> pci_register_io_region(&d->dev, 1, 0x100,
>> PCI_ADDRESS_SPACE_MEM, rtl8139_mmio_read,
>> rtl8139_mmio_write, s);
>
> That neglects the case of direct mapping. We could add it as a helper
> though (which would also need to unregister the ram_addr when the
> io_region is unregistered).
I was thinking that the first API (which is very similar to your
original API) would be the implementation with the second API being
helpers that everyone actually used.
>> You could still have a two step process for where it's absolutely
>> required (like VGA optimization).
>>
>> I think it's worth looking at changing the signatures of the mem
>> read/write functions. Introducing a size parameter would greatly
>> simplify adding 64-bit IO support, for instance.
>
> Well, let's separate those unrelated changes, otherwise nothing will
> get done.
As long as we agree on the final API, we can take incremental steps there.
>> I would argue that ram_addr_t is the wrong thing to overload for PIO
>> but as long as it's not exposed in the common API, it doesn't matter
>> that much to me.
>
> Currently ram_addr_t is really a 64-bit encoding for a QEMUIORegion
> object, plus support for address arithmetic. IMO we should drop it
> and move towards explicit object representation, and drop the page
> descriptor array.
>
> One of the things that prevents this is that the page descriptor array
> is the only place holding the physical -> ioregion mapping. Once we
> move this information to other objects, we can start on that. Hence
> this patchset.
So let's pick a better name then physical_memory and I think we've got a
good starting point.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2009-05-31 13:52 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-05-24 9:29 [Qemu-devel] [PATCH 0/3] Object-based physical memory management Avi Kivity
2009-05-24 9:29 ` [Qemu-devel] [PATCH 1/3] Add PhysicalMemoryRegion type Avi Kivity
2009-05-27 14:16 ` Christoph Hellwig
2009-05-27 15:07 ` Avi Kivity
2009-05-24 9:29 ` [Qemu-devel] [PATCH 2/3] Add PCI memory region registration Avi Kivity
2009-05-27 14:08 ` Anthony Liguori
2009-05-27 15:02 ` Avi Kivity
2009-05-27 15:11 ` Anthony Liguori
2009-05-27 15:24 ` Avi Kivity
2009-05-27 22:33 ` Anthony Liguori
2009-05-31 11:02 ` Christoph Hellwig
2009-05-31 11:50 ` Avi Kivity
2009-05-31 11:47 ` Avi Kivity
2009-05-31 13:52 ` Anthony Liguori
2009-05-31 13:03 ` Avi Kivity
2009-05-24 9:29 ` [Qemu-devel] [PATCH 3/3] Convert RTL8139 to use PCI memory regitration facility Avi Kivity
2009-05-27 14:08 ` [Qemu-devel] [PATCH 0/3] Object-based physical memory management Anthony Liguori
2009-05-27 14:54 ` Avi Kivity
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).