From: Avi Kivity <avi@redhat.com>
To: qemu-devel@nongnu.org
Cc: kvm@vger.kernel.org
Subject: [Qemu-devel] [PATCH 13/23] memory: document the memory API
Date: Mon, 25 Jul 2011 17:02:54 +0300
Message-ID: <1311602584-23409-14-git-send-email-avi@redhat.com>
In-Reply-To: <1311602584-23409-1-git-send-email-avi@redhat.com>

Signed-off-by: Avi Kivity <avi@redhat.com>
---
 docs/memory.txt |  172 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 172 insertions(+), 0 deletions(-)
 create mode 100644 docs/memory.txt

diff --git a/docs/memory.txt b/docs/memory.txt
new file mode 100644
index 0000000..4460c06
--- /dev/null
+++ b/docs/memory.txt
@@ -0,0 +1,172 @@
+The memory API
+==============
+
+The memory API models the memory and I/O buses and controllers of a QEMU
+machine.  It attempts to allow modelling of:
+
+ - ordinary RAM
+ - memory-mapped I/O (MMIO)
+ - memory controllers that can dynamically reroute physical memory regions
+  to different destinations
+
+The memory model provides support for
+
+ - tracking RAM changes by the guest
+ - setting up coalesced memory for kvm
+ - setting up ioeventfd regions for kvm
+
+Memory is modelled as a tree (really an acyclic graph) of MemoryRegion objects.
+The root of the tree is memory as seen from the CPU's viewpoint (the system
+bus).  Nodes in the tree represent other buses, memory controllers, and
+memory regions that have been rerouted.  Leaves are RAM and MMIO regions.
+
+Types of regions
+----------------
+
+There are four types of memory regions (all represented by a single C type
+MemoryRegion):
+
+- RAM: a RAM region is simply a range of host memory that can be made available
+  to the guest.
+
+- MMIO: a range of guest memory that is implemented by host callbacks;
+  each read or write causes a callback to be called on the host.
+
+- container: a container simply includes other memory regions, each at
+  a different offset.  Containers are useful for grouping several regions
+  into one unit.  For example, a PCI BAR may be composed of a RAM region
+  and an MMIO region.
+
+  A container's subregions are usually non-overlapping.  In some cases it is
+  useful to have overlapping regions; for example a memory controller that
+  can overlay a subregion of RAM with MMIO or ROM, or a PCI controller
+  that does not prevent cards from claiming overlapping BARs.
+
+- alias: a subsection of another region.  Aliases allow a region to be
+  split apart into discontiguous regions.  Examples of uses are memory banks
+  used when the guest address space is smaller than the amount of RAM
+  addressed, or a memory controller that splits main memory to expose a "PCI
+  hole".  Aliases may point to any type of region, including other aliases,
+  but an alias may not point back to itself, directly or indirectly.
+
+
+Region names
+------------
+
+Regions are assigned names by the constructor.  For most regions these are
+only used for debugging purposes, but RAM regions also use the name to identify
+live migration sections.  This means that RAM region names need to have ABI
+stability.
+
+Region lifecycle
+----------------
+
+A region is created by one of the constructor functions (memory_region_init*())
+and destroyed by the destructor (memory_region_destroy()).  In between,
+a region can be added to an address space by using memory_region_add_subregion()
+and removed using memory_region_del_subregion().  Region attributes may be
+changed at any point; they take effect once the region becomes exposed to the
+guest.
+
+Overlapping regions and priority
+--------------------------------
+Usually, regions may not overlap each other; a memory address decodes into
+exactly one target.  In some cases it is useful to allow regions to overlap,
+and sometimes to control which of the overlapping regions is visible to the
+guest.  This is done with memory_region_add_subregion_overlap(), which
+allows the region to overlap any other region in the same container, and
+specifies a priority that allows the core to decide which of two regions at
+the same address is visible (highest priority wins).
+
+Visibility
+----------
+The memory core uses the following rules to select a memory region when the
+guest accesses an address:
+
+- all direct subregions of the root region are matched against the address, in
+  descending priority order
+  - if the address lies outside the region offset/size, the subregion is
+    discarded
+  - if the subregion is a leaf (RAM or MMIO), the search terminates
+  - if the subregion is a container, the same algorithm is used within the
+    subregion (after the address is adjusted by the subregion offset)
+  - if the subregion is an alias, the search continues at the alias target
+    (after the address is adjusted by the subregion offset and alias offset)
+
+Example memory map
+------------------
+
+system_memory: container@0-2^48-1
+ |
+ +---- lomem: alias@0-0xdfffffff ---> #ram (0-0xdfffffff)
+ |
+ +---- himem: alias@0x100000000-0x11fffffff ---> #ram (0xe0000000-0xffffffff)
+ |
+ +---- vga-window: alias@0xa0000-0xbffff ---> #pci (0xa0000-0xbffff)
+ |      (prio 1)
+ |
+ +---- pci-hole: alias@0xe0000000-0xffffffff ---> #pci (0xe0000000-0xffffffff)
+
+pci (0-2^32-1)
+ |
+ +--- vga-area: container@0xa0000-0xbffff
+ |      |
+ |      +--- alias@0x00000-0x7fff  ---> #vram (0x010000-0x017fff)
+ |      |
+ |      +--- alias@0x08000-0xffff  ---> #vram (0x020000-0x027fff)
+ |
+ +---- vram: ram@0xe1000000-0xe1ffffff
+ |
+ +---- vga-mmio: mmio@0xe2000000-0xe200ffff
+
+ram: ram@0x00000000-0xffffffff
+
+This is a (simplified) PC memory map.  The 4GB RAM block is mapped into the
+system address space via two aliases: "lomem" is a 1:1 mapping of the first
+3.5GB; "himem" maps the last 0.5GB at address 4GB.  This leaves 0.5GB for the
+so-called PCI hole, which allows a 32-bit PCI bus to exist in a system with
+4GB of memory.
+
+The memory controller diverts addresses in the range 640K-768K to the PCI
+address space.  This is modelled using the "vga-window" alias, mapped at a
+higher priority so it obscures the RAM at the same addresses.  The vga window
+can be removed by programming the memory controller; this is modelled by
+removing the alias and exposing the RAM underneath.
+
+The pci address space is not a direct child of the system address space, since
+we only want parts of it to be visible (we accomplish this using aliases).
+It has two subregions: vga-area models the legacy vga window and is occupied
+by two 32K memory banks pointing at two sections of the framebuffer.
+In addition the vram is mapped as a BAR at address 0xe1000000, and an additional
+BAR containing MMIO registers is mapped after it.
+
+Note that if the guest maps a BAR outside the PCI hole, it would not be
+visible as the pci-hole alias clips it to a 0.5GB range.
+
+Attributes
+----------
+
+Various region attributes (read-only, dirty logging, coalesced mmio, ioeventfd)
+can be changed during the region lifecycle.  They take effect once the region
+is made visible (which can be immediately, later, or never).
+
+MMIO Operations
+---------------
+
+MMIO regions are provided with ->read() and ->write() callbacks; in addition
+various constraints can be supplied to control how these callbacks are called:
+
+ - .valid.min_access_size, .valid.max_access_size define the access sizes
+   (in bytes) which the device accepts; accesses outside this range will
+   have device and bus specific behaviour (ignored, or machine check)
+ - .valid.aligned specifies that the device only accepts naturally aligned
+   accesses.  Unaligned accesses invoke device and bus specific behaviour.
+ - .impl.min_access_size, .impl.max_access_size define the access sizes
+   (in bytes) supported by the *implementation*; other access sizes will be
+   emulated using the ones available.  For example a 4-byte write will be
+   emulated using four 1-byte writes, if .impl.max_access_size = 1.
+ - .impl.valid specifies that the *implementation* only supports aligned
+   accesses; unaligned accesses will be emulated by two aligned accesses.
+ - .old_portio and .old_mmio can be used to ease porting from code using
+   cpu_register_io_memory() and register_ioport().  They should not be used
+   in new code.
-- 
1.7.5.3
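The lookup rules in the "Visibility" section, applied to the example memory
map, can be sketched in plain C.  This is only an illustration under stated
assumptions, not code from the series: struct region, resolve() and lookup()
are hypothetical stand-ins rather than the MemoryRegion API, and "bank0" and
"bank1" are made-up names for the two unnamed vga-area aliases.  The text does
not say what happens when a higher-priority container or alias covers an
address but contains nothing there; the sketch assumes the search then falls
through to the next sibling.  The vga-window range is taken as
0xa0000-0xbffff, matching the 640K-768K range given in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MemoryRegion; none of this is QEMU code. */
enum kind { LEAF, CONTAINER, ALIAS };

struct region {
    const char *name;
    enum kind kind;
    uint64_t offset;                   /* position inside the parent container */
    uint64_t size;
    int priority;                      /* only compared between siblings */
    const struct region *target;       /* ALIAS: region being aliased */
    uint64_t alias_offset;             /* ALIAS: start offset inside target */
    const struct region *const *subs;  /* CONTAINER: children */
    size_t nsubs;
};

/* Resolve addr (relative to the start of r) to a leaf region, or NULL. */
static const struct region *resolve(const struct region *r, uint64_t addr)
{
    assert(r != NULL);
    if (addr >= r->size) {
        return NULL;                   /* outside the region: discarded */
    }
    if (r->kind == LEAF) {
        return r;                      /* RAM or MMIO: the search terminates */
    }
    if (r->kind == ALIAS) {
        return resolve(r->target, addr + r->alias_offset);
    }
    /* Container: keep the hit from the highest-priority child that yields
     * a leaf.  Falling through past an empty child is an assumption. */
    const struct region *best = NULL;
    int best_prio = 0;
    for (size_t i = 0; i < r->nsubs; i++) {
        const struct region *c = r->subs[i];
        if (addr < c->offset) {
            continue;
        }
        const struct region *hit = resolve(c, addr - c->offset);
        if (hit && (!best || c->priority > best_prio)) {
            best = hit;
            best_prio = c->priority;
        }
    }
    return best;
}

/* The example memory map, transcribed from the diagram. */
static const struct region ram = {
    .name = "ram", .kind = LEAF, .size = 0x100000000ULL };
static const struct region vram = {
    .name = "vram", .kind = LEAF, .offset = 0xe1000000, .size = 0x1000000 };
static const struct region vga_mmio = {
    .name = "vga-mmio", .kind = LEAF, .offset = 0xe2000000, .size = 0x10000 };
static const struct region bank0 = {
    .name = "bank0", .kind = ALIAS, .offset = 0x0000, .size = 0x8000,
    .target = &vram, .alias_offset = 0x10000 };
static const struct region bank1 = {
    .name = "bank1", .kind = ALIAS, .offset = 0x8000, .size = 0x8000,
    .target = &vram, .alias_offset = 0x20000 };
static const struct region *const vga_subs[] = { &bank0, &bank1 };
static const struct region vga_area = {
    .name = "vga-area", .kind = CONTAINER, .offset = 0xa0000, .size = 0x20000,
    .subs = vga_subs, .nsubs = 2 };
static const struct region *const pci_subs[] = { &vga_area, &vram, &vga_mmio };
static const struct region pci = {
    .name = "pci", .kind = CONTAINER, .size = 0x100000000ULL,
    .subs = pci_subs, .nsubs = 3 };
static const struct region lomem = {
    .name = "lomem", .kind = ALIAS, .size = 0xe0000000, .target = &ram };
static const struct region himem = {
    .name = "himem", .kind = ALIAS, .offset = 0x100000000ULL,
    .size = 0x20000000, .target = &ram, .alias_offset = 0xe0000000 };
static const struct region vga_window = {
    .name = "vga-window", .kind = ALIAS, .offset = 0xa0000, .size = 0x20000,
    .priority = 1, .target = &pci, .alias_offset = 0xa0000 };
static const struct region pci_hole = {
    .name = "pci-hole", .kind = ALIAS, .offset = 0xe0000000,
    .size = 0x20000000, .target = &pci, .alias_offset = 0xe0000000 };
static const struct region *const sys_subs[] = {
    &lomem, &himem, &vga_window, &pci_hole };
static const struct region system_memory = {
    .name = "system_memory", .kind = CONTAINER, .size = 1ULL << 48,
    .subs = sys_subs, .nsubs = 4 };

/* Look up a guest physical address starting from the root region. */
const struct region *lookup(uint64_t addr)
{
    return resolve(&system_memory, addr);
}
```

With this map, lookup(0xa0000) lands in vram through vga-window and the first
vga-area bank, because vga-window has priority 1 and obscures lomem.
lookup(0x1000) and lookup(0x100000000) both land in ram, through lomem and
himem respectively, and lookup(0xe0000000) finds nothing because no BAR is
mapped at the start of the PCI hole.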
