From: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
To: qemu-devel@nongnu.org, kvm@vger.kernel.org, seabios@seabios.org
Cc: avi@redhat.com, anthony@codemonkey.ws, gleb@redhat.com,
imammedo@redhat.com, kevin@koconnor.net, wency@cn.fujitsu.com,
Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Subject: [RFC PATCH v2 09/21] pc: Add dimm paravirt SRAT info
Date: Wed, 11 Jul 2012 12:31:54 +0200
Message-ID: <1342002726-18258-10-git-send-email-vasilis.liaskovitis@profitbricks.com>
In-Reply-To: <1342002726-18258-1-git-send-email-vasilis.liaskovitis@profitbricks.com>
The numa_fw_cfg paravirt interface is extended to include SRAT information for
all hotpluggable dimms. Each hotpluggable memory slot is described by three
64-bit words: start address, size and node proximity. The new info is appended
after the existing NUMA info, so that the current fw_cfg layout does not break.
SeaBIOS uses this information to build hotplug memory device objects at runtime.
nb_numa_nodes is now set to 1 by default (instead of 0), so that SRAT info is
always passed to SeaBIOS.
v1->v2:
Dimm SRAT info (#dimms) is appended at end of existing numa fw_cfg in order not
to break existing layout
Documentation of the new fwcfg layout is included in docs/specs/fwcfg.txt
Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
---
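For reviewers who want to check the index arithmetic, here is a minimal sketch
of the word offsets implied by the layout above. The helper names are
hypothetical and not part of this patch; only the layout (one count word,
max_cpus proximity words, nb_numa_nodes memory words, one dimm count word, then
three words per dimm) is taken from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not part of the patch. All indices count 64-bit
 * words from the start of the FW_CFG_NUMA blob; word 0 is nb_numa_nodes. */

/* Word holding the node proximity of vCPU `cpu`. */
static inline size_t numa_cfg_cpu_idx(size_t cpu)
{
    return 1 + cpu;
}

/* Word holding the memory size of NUMA node `node`. */
static inline size_t numa_cfg_node_idx(size_t max_cpus, size_t node)
{
    return 1 + max_cpus + node;
}

/* Word holding the number of hotplug dimms. */
static inline size_t numa_cfg_nb_dimms_idx(size_t max_cpus, size_t nb_nodes)
{
    return 1 + max_cpus + nb_nodes;
}

/* Word holding one field of dimm `dimm`: 0 = start, 1 = size, 2 = proximity. */
static inline size_t numa_cfg_dimm_idx(size_t max_cpus, size_t nb_nodes,
                                       size_t dimm, size_t field)
{
    return 2 + max_cpus + nb_nodes + 3 * dimm + field;
}

/* Total number of 64-bit words in the blob. */
static inline size_t numa_cfg_words(size_t max_cpus, size_t nb_nodes,
                                    size_t nb_dimms)
{
    return 2 + max_cpus + nb_nodes + 3 * nb_dimms;
}
```

With max_cpus = 4, nb_numa_nodes = 2 and nb_hp_dimms = 2, the blob is 14 words
long, the dimm count sits in word 7, and the proximity of the second dimm is
the last word (13).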
docs/specs/fwcfg.txt | 28 ++++++++++++++++++++++++++
hw/pc.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++-
vl.c | 2 +-
3 files changed, 80 insertions(+), 3 deletions(-)
create mode 100644 docs/specs/fwcfg.txt
diff --git a/docs/specs/fwcfg.txt b/docs/specs/fwcfg.txt
new file mode 100644
index 0000000..e6fcd8f
--- /dev/null
+++ b/docs/specs/fwcfg.txt
@@ -0,0 +1,28 @@
+QEMU<->BIOS Paravirt Documentation
+--------------------------------------
+
+This document describes paravirt data structures passed from QEMU to BIOS.
+
+fw_cfg SRAT paravirt info
+--------------------
+The SRAT info passed from QEMU to BIOS has the following layout:
+
+-----------------------------------------------------------------------------------------------
+#nodes | cpu0_pxm | cpu1_pxm | ... | cpulast_pxm | node0_mem | node1_mem | ... | nodelast_mem
+
+-----------------------------------------------------------------------------------------------
+#dimms | dimm0_start | dimm0_sz | dimm0_pxm | ... | dimmlast_start | dimmlast_sz | dimmlast_pxm
+
+Entry 0 contains the number of numa nodes (nb_numa_nodes).
+
+Entries 1..max_cpus: The next max_cpus entries describe node proximity for each
+one of the vCPUs in the system.
+
+Entries max_cpus+1..max_cpus+nb_numa_nodes: The next nb_numa_nodes entries
+describe the memory size for each one of the NUMA nodes in the system.
+
+Entry max_cpus+nb_numa_nodes+1 contains the number of memory dimms (nb_hp_dimms).
+
+The last 3 * nb_hp_dimms entries are organized in triplets: Each triplet contains
+the starting physical address, size (in bytes), and node proximity for the
+respective dimm.
diff --git a/hw/pc.c b/hw/pc.c
index ef9901a..cf651d0 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -598,12 +598,15 @@ int e820_add_entry(uint64_t address, uint64_t length, uint32_t type)
     return index;
 }
+static void setup_hp_dimms(uint64_t *fw_cfg_slots);
+
 static void *bochs_bios_init(void)
 {
     void *fw_cfg;
     uint8_t *smbios_table;
     size_t smbios_len;
     uint64_t *numa_fw_cfg;
+    uint64_t *hp_dimms_fw_cfg;
     int i, j;
     register_ioport_write(0x400, 1, 2, bochs_bios_write, NULL);
@@ -638,8 +641,10 @@ static void *bochs_bios_init(void)
     /* allocate memory for the NUMA channel: one (64bit) word for the number
      * of nodes, one word for each VCPU->node and one word for each node to
      * hold the amount of memory.
+     * Finally one word for the number of hotplug memory slots and three words
+     * for each hotplug memory slot (start address, size and node proximity).
      */
-    numa_fw_cfg = g_malloc0((1 + max_cpus + nb_numa_nodes) * 8);
+    numa_fw_cfg = g_malloc0((2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms) * 8);
     numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes);
     for (i = 0; i < max_cpus; i++) {
         for (j = 0; j < nb_numa_nodes; j++) {
@@ -652,8 +657,15 @@ static void *bochs_bios_init(void)
     for (i = 0; i < nb_numa_nodes; i++) {
         numa_fw_cfg[max_cpus + 1 + i] = cpu_to_le64(node_mem[i]);
     }
+
+    numa_fw_cfg[1 + max_cpus + nb_numa_nodes] = cpu_to_le64(nb_hp_dimms);
+
+    hp_dimms_fw_cfg = numa_fw_cfg + 2 + max_cpus + nb_numa_nodes;
+    if (nb_hp_dimms)
+        setup_hp_dimms(hp_dimms_fw_cfg);
+
     fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, (uint8_t *)numa_fw_cfg,
-                     (1 + max_cpus + nb_numa_nodes) * 8);
+                     (2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms) * 8);
     return fw_cfg;
 }
@@ -1223,3 +1235,40 @@ target_phys_addr_t pc_set_hp_memory_offset(uint64_t size)
     return ret;
 }
+
+static void setup_hp_dimms(uint64_t *fw_cfg_slots)
+{
+    int i = 0;
+    Error *err = NULL;
+    DeviceState *dev;
+    DimmState *slot;
+    const char *type;
+    BusChild *kid;
+    BusState *bus = sysbus_get_default();
+
+    QTAILQ_FOREACH(kid, &bus->children, sibling) {
+        dev = kid->child;
+        type = object_property_get_str(OBJECT(dev), "type", &err);
+        if (err) {
+            error_free(err);
+            fprintf(stderr, "error getting device type\n");
+            exit(1);
+        }
+
+        if (!strcmp(type, "dimm")) {
+            if (!dev->id) {
+                fprintf(stderr, "error getting dimm device id\n");
+                exit(1);
+            }
+            slot = DIMM(dev);
+            /* determine starting physical address for this memory slot */
+            assert(slot->start);
+            fw_cfg_slots[3 * slot->idx] = cpu_to_le64(slot->start);
+            fw_cfg_slots[3 * slot->idx + 1] = cpu_to_le64(slot->size);
+            fw_cfg_slots[3 * slot->idx + 2] = cpu_to_le64(slot->node);
+            i++;
+        }
+    }
+    assert(i == nb_hp_dimms);
+}
+
diff --git a/vl.c b/vl.c
index 0ff8818..37c9798 100644
--- a/vl.c
+++ b/vl.c
@@ -2335,7 +2335,7 @@ int main(int argc, char **argv, char **envp)
         node_cpumask[i] = 0;
     }
-    nb_numa_nodes = 0;
+    nb_numa_nodes = 1;
     nb_nics = 0;
     autostart= 1;
--
1.7.9
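
As an illustration of how a consumer of the extended blob could walk it, here
is a rough sketch of a bounds-checked decoder. It is not SeaBIOS code: the
struct and function names are hypothetical, and max_cpus is assumed to be known
to the reader by other means. Only the little-endian 64-bit word layout comes
from the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one dimm triplet. */
struct dimm_info {
    uint64_t start; /* starting physical address */
    uint64_t size;  /* size in bytes */
    uint64_t pxm;   /* node proximity */
};

/* Read one little-endian 64-bit word, independent of host byte order. */
static inline uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Write one little-endian 64-bit word (only used to build example blobs). */
static inline void write_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Decode the dimm triplets from a blob of `len` bytes.
 * Returns the number of dimms stored in `out`, or -1 if the blob is too
 * short for its own counts or `out_cap` is too small. */
static inline long decode_dimms(const uint8_t *blob, size_t len,
                                size_t max_cpus,
                                struct dimm_info *out, size_t out_cap)
{
    size_t words = len / 8;
    if (words < 1 || max_cpus > words) {
        return -1;
    }
    uint64_t nb_nodes = read_le64(blob);
    if (nb_nodes > words) {
        return -1;
    }
    /* Word index of the dimm count: 1 + max_cpus + nb_numa_nodes. */
    size_t cnt_idx = 1 + max_cpus + (size_t)nb_nodes;
    if (cnt_idx >= words) {
        return -1;
    }
    uint64_t nb_dimms = read_le64(blob + 8 * cnt_idx);
    if (nb_dimms > out_cap || nb_dimms > (words - cnt_idx - 1) / 3) {
        return -1;
    }
    for (size_t i = 0; i < (size_t)nb_dimms; i++) {
        const uint8_t *t = blob + 8 * (cnt_idx + 1 + 3 * i);
        out[i].start = read_le64(t);
        out[i].size = read_le64(t + 8);
        out[i].pxm = read_le64(t + 16);
    }
    return (long)nb_dimms;
}
```

The bounds checks matter because both counts come from the blob itself; a
truncated blob is rejected instead of being read past its end.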