From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:55534)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1QMiMN-0000hR-0O
	for qemu-devel@nongnu.org; Wed, 18 May 2011 11:08:28 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1QMiML-0003mW-Oh
	for qemu-devel@nongnu.org; Wed, 18 May 2011 11:08:26 -0400
Received: from mail-yx0-f173.google.com ([209.85.213.173]:48163)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1QMiML-0003mP-IA
	for qemu-devel@nongnu.org; Wed, 18 May 2011 11:08:25 -0400
Received: by yxk8 with SMTP id 8so657281yxk.4
	for <qemu-devel@nongnu.org>; Wed, 18 May 2011 08:08:24 -0700 (PDT)
Message-ID: <4DD3E0E5.2030700@codemonkey.ws>
Date: Wed, 18 May 2011 10:08:21 -0500
From: Anthony Liguori <anthony@codemonkey.ws>
MIME-Version: 1.0
References: <4DD3C5B9.1080908@redhat.com>
In-Reply-To: <4DD3C5B9.1080908@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC] Memory API
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Avi Kivity <avi@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>

On 05/18/2011 08:12 AM, Avi Kivity wrote:
> The current memory APIs (cpu_register_io_memory,
> cpu_register_physical_memory) suffer from a number of drawbacks:
>
> - lack of centralized bookkeeping
> - a cpu_register_physical_memory() that overlaps an existing region will
> overwrite the preexisting region; but a following
> cpu_unregister_physical_memory() will not restore things to status quo
> ante.
> - coalescing and dirty logging are all cleared on unregistering, so the
> client has to re-issue them when re-registering
> - lots of opaques
> - no nesting
> - if a region has multiple subregions that need different handling
> (different callbacks, RAM interspersed with MMIO) then client code has
> to deal with that manually
> - we can't interpose code between a device and global memory handling
>
> To fix that, I propose an new API to replace the existing one:
>
>
> #ifndef MEMORY_H
> #define MEMORY_H
>
> typedef struct MemoryRegionOps MemoryRegionOps;
> typedef struct MemoryRegion MemoryRegion;
>
> typedef uint32_t (*MemoryReadFunc)(MemoryRegion *mr, target_phys_addr_t
> addr);
> typedef void (*MemoryWriteFunc)(MemoryRegion *mr, target_phys_addr_t addr,
> uint32_t data);


The API should be 64-bit and needs to have a size associated with it.

> struct MemoryRegionOps {
> MemoryReadFunc readb, readw, readl;
> MemoryWriteFunc writeb, writew, writel;
> };

I'm not a fan of having per-access type function pointers.

> struct MemoryRegion {
> const MemoryRegionOps *ops;
> target_phys_addr_t size;
> target_phys_addr_t addr;
> };

Should include a flags argument for future expansion.

>
> void memory_region_init(MemoryRegion *mr,
> target_phys_addr_t size);

What is this used for?  It's not clear to me.

> void memory_region_init_io(MemoryRegion *mr,
> const MemoryRegionOps *ops,
> target_phys_addr_t size);

How does one test a MemoryRegion to determine it's type?

> void memory_region_init_ram(MemoryRegion *mr,
> target_phys_addr_t size);
> void memory_region_init_ram_ptr(MemoryRegion *mr,
> target_phys_addr_t size,
> void *ptr);
> void memory_region_destroy(MemoryRegion *mr);
> void memory_region_set_offset(MemoryRegion *mr, target_phys_addr_t offset);

What's "offset" mean?

> void memory_region_set_log(MemoryRegion *mr, bool log);
> void memory_region_clear_coalescing(MemoryRegion *mr);
> void memory_region_add_coalescing(MemoryRegion *mr,
> target_phys_addr_t offset,
> target_phys_addr_t size);

I don't think it's worth while to try to fit coalescing into this API. 
It's a KVM specific hack.  I think it's fine to be a hacked on API.

>
> void memory_region_add_subregion(MemoryRegion *mr,
> target_phys_addr_t offset,
> MemoryRegion *subregion);
> void memory_region_del_subregion(MemoryRegion *mr,
> target_phys_addr_t offset,
> MemoryRegion *subregion);
>
> void cpu_register_memory_region(MemoryRegion *mr, target_phys_addr_t addr);
> void cpu_unregister_memory_region(MemoryRegion *mr);
>
> #endif
>
> The API is nested: you can define, say, a PCI BAR containing RAM and
> MMIO, and give it to the PCI subsystem. PCI can then enable/disable the
> BAR and move it to different addresses without calling any callbacks;
> the client code can enable or disable logging or coalescing without
> caring if the BAR is mapped or not. For example:
>
> MemoryRegion mr, mr_mmio, mr_ram;
>
> memory_region_init(&mr);
> memory_region_init_io(&mr_mmio, &mmio_ops, 0x1000);
> memory_region_init_ram(&mr_ram, 0x100000);
> memory_region_add_subregion(&mr, 0, &mr_ram);
> memory_region_add_subregion(&mr, 0x10000, &mr_io);
> memory_region_add_coalescing(&mr_ram, 0, 0x100000);
> pci_register_bar(&pci_dev, 0, &mr);
>
> at this point the PCI subsystem knows everything about the BAR and can
> enable or disable it, or move it around, without further help from the
> device code. On the other hand, the device code can change logging or
> coalescing, or even change the structure of the region, without caring
> about whether the region is currently registered or not.
>
> If we can agree on the API, then I think the way forward is to implement
> it in terms of the old API, change over all devices, then fold the old
> API into the new one.

I think it pretty much makes sense to me.  It might be worth while to 
allow memory region definitions to be static.  For instance:

MemoryRegion e1000_bar = { ... };

Regards,

Anthony Liguori