* [Qemu-devel] [RFC PATCH v2] spapr: Ensure all LMBs are represented in ibm, dynamic-memory
@ 2016-06-06 11:37 Bharata B Rao
2016-06-06 14:14 ` Nathan Fontenot
2016-06-07 0:46 ` Michael Roth
0 siblings, 2 replies; 6+ messages in thread
From: Bharata B Rao @ 2016-06-06 11:37 UTC (permalink / raw)
To: qemu-devel; +Cc: david, mdroth, nfont, aik, qemu-ppc, Bharata B Rao
Memory hotplug can fail for some combinations of RAM and maxmem when
DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
on the maximum addressable memory returned by the guest, and this value is
currently calculated incorrectly by the guest kernel routine
memory_hotplug_max(). While there is an attempt to fix the guest kernel,
this patch works around the problem within QEMU itself.
The memory_hotplug_max() routine in the guest kernel arrives at the max
addressable memory by multiplying the lmb-size with the lmb-count obtained
from the ibm,dynamic-memory property. There are two assumptions here:
- All LMBs are part of ibm,dynamic-memory: This is not true for PowerKVM,
where only hot-pluggable LMBs are present in this property.
- The memory area comprising RAM and the hotplug region is contiguous: This
needn't always be true for PowerKVM, as there can be a gap between
boot-time RAM and the hotplug region.
To work around this guest kernel bug, ensure that ibm,dynamic-memory
has information about all the LMBs (RMA, boot-time LMBs, future
hot-pluggable LMBs, and dummy LMBs to cover the gap between RAM and the
hotplug region).
The RMA is represented separately by the memory@0 node. Hence, mark the RMA
LMBs, and also the LMBs for the gap between RAM and the hotplug region, as
reserved so that these LMBs are not double-counted by the guest.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
Changes in v2:
- Dropped the patch that removed the alignment gap between RAM and the
hotplug region; instead, populated ibm,dynamic-memory with LMBs marked
as RESERVED for that gap. We create DRC objects for these LMBs
as it simplifies the logic of populating ibm,dynamic-memory. There can
be at most 4 such DRC objects for the gap area (1 GB max), so this should
be fine.
v1: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg00627.html
hw/ppc/spapr.c | 51 ++++++++++++++++++++++++++++++++++++--------------
include/hw/ppc/spapr.h | 5 +++--
2 files changed, 40 insertions(+), 16 deletions(-)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0636642..0f4f7a3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -762,18 +762,14 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
int ret, i, offset;
uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
uint32_t prop_lmb_size[] = {0, cpu_to_be32(lmb_size)};
- uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
+ uint32_t hotplug_lmb_start = spapr->hotplug_memory.base / lmb_size;
+ uint32_t nr_lmbs = (spapr->hotplug_memory.base +
+ memory_region_size(&spapr->hotplug_memory.mr)) /
+ lmb_size;
uint32_t *int_buf, *cur_index, buf_len;
int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
/*
- * Don't create the node if there are no DR LMBs.
- */
- if (!nr_lmbs) {
- return 0;
- }
-
- /*
* Allocate enough buffer size to fit in ibm,dynamic-memory
* or ibm,associativity-lookup-arrays
*/
@@ -805,11 +801,18 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
for (i = 0; i < nr_lmbs; i++) {
sPAPRDRConnector *drc;
sPAPRDRConnectorClass *drck;
- uint64_t addr = i * lmb_size + spapr->hotplug_memory.base;;
+ uint64_t addr;
uint32_t *dynamic_memory = cur_index;
+ if (i < hotplug_lmb_start) {
+ addr = i * lmb_size;
+ } else {
+ addr = (i - hotplug_lmb_start) * lmb_size +
+ spapr->hotplug_memory.base;
+ }
+
drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
- addr/lmb_size);
+ addr / lmb_size);
g_assert(drc);
drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -820,7 +823,14 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
dynamic_memory[4] = cpu_to_be32(numa_get_node(addr, NULL));
if (addr < machine->ram_size ||
memory_region_present(get_system_memory(), addr)) {
- dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
+ if (addr < spapr->rma_size) {
+ dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
+ } else {
+ dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
+ }
+ } else if (addr >= machine->ram_size &&
+ addr < spapr->hotplug_memory.base) {
+ dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
} else {
dynamic_memory[5] = cpu_to_be32(0);
}
@@ -1652,16 +1662,29 @@ static void spapr_drc_reset(void *opaque)
static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
{
- MachineState *machine = MACHINE(spapr);
uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
- uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
+ uint32_t hotplug_lmb_start = spapr->hotplug_memory.base / lmb_size;
+ uint32_t nr_lmbs = (spapr->hotplug_memory.base +
+ memory_region_size(&spapr->hotplug_memory.mr)) /
+ lmb_size;
int i;
for (i = 0; i < nr_lmbs; i++) {
sPAPRDRConnector *drc;
uint64_t addr;
- addr = i * lmb_size + spapr->hotplug_memory.base;
+ /*
+ * Create DRC objects for entire memory range including RMA, boot-time
+ * memory and hotplug memory and for the gap b/n RAM and hotplug
+ * memory region.
+ */
+ if (i < hotplug_lmb_start) {
+ addr = i * lmb_size;
+ } else {
+ addr = (i - hotplug_lmb_start) * lmb_size +
+ spapr->hotplug_memory.base;
+ }
+
drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
addr/lmb_size);
qemu_register_reset(spapr_drc_reset, drc);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 971df3d..bb265a2 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -620,9 +620,10 @@ int spapr_rng_populate_dt(void *fdt);
#define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
/*
- * This flag value defines the LMB as assigned in ibm,dynamic-memory
- * property under ibm,dynamic-reconfiguration-memory node.
+ * Defines for flag value in ibm,dynamic-memory property under
+ * ibm,dynamic-reconfiguration-memory node.
*/
#define SPAPR_LMB_FLAGS_ASSIGNED 0x00000008
+#define SPAPR_LMB_FLAGS_RESERVED 0x00000080
#endif /* !defined (__HW_SPAPR_H__) */
--
2.1.0
* Re: [Qemu-devel] [RFC PATCH v2] spapr: Ensure all LMBs are represented in ibm, dynamic-memory
2016-06-06 11:37 [Qemu-devel] [RFC PATCH v2] spapr: Ensure all LMBs are represented in ibm, dynamic-memory Bharata B Rao
@ 2016-06-06 14:14 ` Nathan Fontenot
2016-06-06 14:47 ` Bharata B Rao
2016-06-07 0:46 ` Michael Roth
1 sibling, 1 reply; 6+ messages in thread
From: Nathan Fontenot @ 2016-06-06 14:14 UTC (permalink / raw)
To: Bharata B Rao, qemu-devel; +Cc: david, mdroth, aik, qemu-ppc
On 06/06/2016 06:37 AM, Bharata B Rao wrote:
> Memory hotplug can fail for some combinations of RAM and maxmem when
> DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
> on maximum addressable memory returned by guest and this value is currently
> being calculated wrongly by the guest kernel routine memory_hotplug_max().
> While there is an attempt to fix the guest kernel, this patch works
> around the problem within QEMU itself.
>
> memory_hotplug_max() routine in the guest kernel arrives at max
> addressable memory by multiplying lmb-size with the lmb-count obtained
> from ibm,dynamic-memory property. There are two assumptions here:
>
> - All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
> where only hot-pluggable LMBs are present in this property.
> - The memory area comprising of RAM and hotplug region is contiguous: This
> needn't be true always for PowerKVM as there can be gap between
> boot time RAM and hotplug region.
>
> To work around this guest kernel bug, ensure that ibm,dynamic-memory
> has information about all the LMBs (RMA, boot-time LMBs, future
> hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
> hotpluggable region).
>
> RMA is represented separately by memory@0 node. Hence mark RMA LMBs
> and also the LMBs for the gap b/n RAM and hotpluggable region as
> reserved so that these LMBs are not recounted/counted by guest.
What does qemu do if a guest tries to add or remove a reserved LMB?
Asking because the current guest code (drmgr and kernel) does not
take the reserved flag into consideration when searching for LMBs to
add/remove. This seems like something that should be fixed on the
guest side.
-Nathan
>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
> Changes in v2:
>
> - Dropped the patch that removed alignment gap b/n RAM and hotplug
> region, but instead populated ibm,dynamic-memory with LMBs represented
> as RESERVED for that gap. We create DRC objects for these LMBs
> as it simplifies the logic of populating ibm,dynamic-memory. There can
> at max be 4 such DRC objects for the gap area (1GB max) and hence it should
> be fine.
>
> v1: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg00627.html
>
> hw/ppc/spapr.c | 51 ++++++++++++++++++++++++++++++++++++--------------
> include/hw/ppc/spapr.h | 5 +++--
> 2 files changed, 40 insertions(+), 16 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 0636642..0f4f7a3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -762,18 +762,14 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> int ret, i, offset;
> uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> uint32_t prop_lmb_size[] = {0, cpu_to_be32(lmb_size)};
> - uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
> + uint32_t hotplug_lmb_start = spapr->hotplug_memory.base / lmb_size;
> + uint32_t nr_lmbs = (spapr->hotplug_memory.base +
> + memory_region_size(&spapr->hotplug_memory.mr)) /
> + lmb_size;
> uint32_t *int_buf, *cur_index, buf_len;
> int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
>
> /*
> - * Don't create the node if there are no DR LMBs.
> - */
> - if (!nr_lmbs) {
> - return 0;
> - }
> -
> - /*
> * Allocate enough buffer size to fit in ibm,dynamic-memory
> * or ibm,associativity-lookup-arrays
> */
> @@ -805,11 +801,18 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> for (i = 0; i < nr_lmbs; i++) {
> sPAPRDRConnector *drc;
> sPAPRDRConnectorClass *drck;
> - uint64_t addr = i * lmb_size + spapr->hotplug_memory.base;;
> + uint64_t addr;
> uint32_t *dynamic_memory = cur_index;
>
> + if (i < hotplug_lmb_start) {
> + addr = i * lmb_size;
> + } else {
> + addr = (i - hotplug_lmb_start) * lmb_size +
> + spapr->hotplug_memory.base;
> + }
> +
> drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> - addr/lmb_size);
> + addr / lmb_size);
> g_assert(drc);
> drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>
> @@ -820,7 +823,14 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> dynamic_memory[4] = cpu_to_be32(numa_get_node(addr, NULL));
> if (addr < machine->ram_size ||
> memory_region_present(get_system_memory(), addr)) {
> - dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
> + if (addr < spapr->rma_size) {
> + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
> + } else {
> + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
> + }
> + } else if (addr >= machine->ram_size &&
> + addr < spapr->hotplug_memory.base) {
> + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
> } else {
> dynamic_memory[5] = cpu_to_be32(0);
> }
> @@ -1652,16 +1662,29 @@ static void spapr_drc_reset(void *opaque)
>
> static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
> {
> - MachineState *machine = MACHINE(spapr);
> uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> - uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
> + uint32_t hotplug_lmb_start = spapr->hotplug_memory.base / lmb_size;
> + uint32_t nr_lmbs = (spapr->hotplug_memory.base +
> + memory_region_size(&spapr->hotplug_memory.mr)) /
> + lmb_size;
> int i;
>
> for (i = 0; i < nr_lmbs; i++) {
> sPAPRDRConnector *drc;
> uint64_t addr;
>
> - addr = i * lmb_size + spapr->hotplug_memory.base;
> + /*
> + * Create DRC objects for entire memory range including RMA, boot-time
> + * memory and hotplug memory and for the gap b/n RAM and hotplug
> + * memory region.
> + */
> + if (i < hotplug_lmb_start) {
> + addr = i * lmb_size;
> + } else {
> + addr = (i - hotplug_lmb_start) * lmb_size +
> + spapr->hotplug_memory.base;
> + }
> +
> drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
> addr/lmb_size);
> qemu_register_reset(spapr_drc_reset, drc);
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 971df3d..bb265a2 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -620,9 +620,10 @@ int spapr_rng_populate_dt(void *fdt);
> #define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
>
> /*
> - * This flag value defines the LMB as assigned in ibm,dynamic-memory
> - * property under ibm,dynamic-reconfiguration-memory node.
> + * Defines for flag value in ibm,dynamic-memory property under
> + * ibm,dynamic-reconfiguration-memory node.
> */
> #define SPAPR_LMB_FLAGS_ASSIGNED 0x00000008
> +#define SPAPR_LMB_FLAGS_RESERVED 0x00000080
>
> #endif /* !defined (__HW_SPAPR_H__) */
>
* Re: [Qemu-devel] [RFC PATCH v2] spapr: Ensure all LMBs are represented in ibm, dynamic-memory
2016-06-06 14:14 ` Nathan Fontenot
@ 2016-06-06 14:47 ` Bharata B Rao
2016-06-06 16:02 ` Nathan Fontenot
0 siblings, 1 reply; 6+ messages in thread
From: Bharata B Rao @ 2016-06-06 14:47 UTC (permalink / raw)
To: Nathan Fontenot; +Cc: qemu-devel, david, mdroth, aik, qemu-ppc
On Mon, Jun 06, 2016 at 09:14:48AM -0500, Nathan Fontenot wrote:
> On 06/06/2016 06:37 AM, Bharata B Rao wrote:
> > Memory hotplug can fail for some combinations of RAM and maxmem when
> > DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
> > on maximum addressable memory returned by guest and this value is currently
> > being calculated wrongly by the guest kernel routine memory_hotplug_max().
> > While there is an attempt to fix the guest kernel, this patch works
> > around the problem within QEMU itself.
> >
> > memory_hotplug_max() routine in the guest kernel arrives at max
> > addressable memory by multiplying lmb-size with the lmb-count obtained
> > from ibm,dynamic-memory property. There are two assumptions here:
> >
> > - All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
> > where only hot-pluggable LMBs are present in this property.
> > - The memory area comprising of RAM and hotplug region is contiguous: This
> > needn't be true always for PowerKVM as there can be gap between
> > boot time RAM and hotplug region.
> >
> > To work around this guest kernel bug, ensure that ibm,dynamic-memory
> > has information about all the LMBs (RMA, boot-time LMBs, future
> > hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
> > hotpluggable region).
> >
> > RMA is represented separately by memory@0 node. Hence mark RMA LMBs
> > and also the LMBs for the gap b/n RAM and hotpluggable region as
> > reserved so that these LMBs are not recounted/counted by guest.
>
> What does qemu do if a guest tries to add or remove a reserved LMB?
Currently in this approach, LMBs belonging to two regions are marked as
reserved:
- RMA region
- Gap between end of RAM and beginning of hotplug region
Any hotplug attempts on the above regions will be refused by QEMU, as they
don't fall within the hotplug memory region.
>
> Asking because the current guest code (drmgr and kernel) does not
> take the reserved flag into consideration when searching for lmbs to
> add/remove. This seems like something I should be fixed on the guest
> side.
Oh ok, but as I said earlier, QEMU won't send hotplug requests for such
LMBs.
However, I am seeing that when I mark RMA LMBs as reserved in
ibm,dynamic-memory and create a separate memory@0 node to represent the RMA,
the guest just ignores those LMBs and doesn't do double detection of RMA
memory.
The same is true for the reserved LMBs that I put in ibm,dynamic-memory
to cover the gap between RAM and the hotplug region; the guest isn't
considering those either.
Do you still see any problems?
Regards,
Bharata.
* Re: [Qemu-devel] [RFC PATCH v2] spapr: Ensure all LMBs are represented in ibm, dynamic-memory
2016-06-06 14:47 ` Bharata B Rao
@ 2016-06-06 16:02 ` Nathan Fontenot
0 siblings, 0 replies; 6+ messages in thread
From: Nathan Fontenot @ 2016-06-06 16:02 UTC (permalink / raw)
To: bharata; +Cc: qemu-devel, david, mdroth, aik, qemu-ppc
On 06/06/2016 09:47 AM, Bharata B Rao wrote:
> On Mon, Jun 06, 2016 at 09:14:48AM -0500, Nathan Fontenot wrote:
>> On 06/06/2016 06:37 AM, Bharata B Rao wrote:
>>> Memory hotplug can fail for some combinations of RAM and maxmem when
>>> DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
>>> on maximum addressable memory returned by guest and this value is currently
>>> being calculated wrongly by the guest kernel routine memory_hotplug_max().
>>> While there is an attempt to fix the guest kernel, this patch works
>>> around the problem within QEMU itself.
>>>
>>> memory_hotplug_max() routine in the guest kernel arrives at max
>>> addressable memory by multiplying lmb-size with the lmb-count obtained
>>> from ibm,dynamic-memory property. There are two assumptions here:
>>>
>>> - All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
>>> where only hot-pluggable LMBs are present in this property.
>>> - The memory area comprising of RAM and hotplug region is contiguous: This
>>> needn't be true always for PowerKVM as there can be gap between
>>> boot time RAM and hotplug region.
>>>
>>> To work around this guest kernel bug, ensure that ibm,dynamic-memory
>>> has information about all the LMBs (RMA, boot-time LMBs, future
>>> hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
>>> hotpluggable region).
>>>
>>> RMA is represented separately by memory@0 node. Hence mark RMA LMBs
>>> and also the LMBs for the gap b/n RAM and hotpluggable region as
>>> reserved so that these LMBs are not recounted/counted by guest.
>>
>> What does qemu do if a guest tries to add or remove a reserved LMB?
>
> Currently in this approach, LMBs belonging two regions are marked as
> reserved:
>
> - RMA region
> - Gap b/n end of RAM and beginning of hotplug region
>
> Any hotplug attempts to above regions will be refused by QEMU as they
> don't fall under the hotplug memory region.
That's good. I wanted to make sure that this wouldn't break anything
with the current memory hotplug code paths.
>
>>
>> Asking because the current guest code (drmgr and kernel) does not
>> take the reserved flag into consideration when searching for lmbs to
>> add/remove. This seems like something I should be fixed on the guest
>> side.
>
> Oh ok, but as I said earlier QEMU won't send hotplug request for such
> LMBs.
Yes, but that doesn't stop a user from running the drmgr command and
making a request.
>
> However, I am seeing that when I mark RMA LMBs as reserved in
> ibm,dynamic-memory and create separate memory@0 to represent RMA, guest is
> just ignoring those LMBs and not doing double detection of RMA memory.
>
> Same is true for the reserved LMBs that I put in ibm,dyanamic-memory
> to cover the gap b/n RAM and hotplug region. Guest isn't not
> considering this.
>
> Do you still see any problems ?
This should be fine. The boot-up code ignores any LMB that has the
'reserved' flag set.
-Nathan
>
> Regards,
> Bharata.
>
* Re: [Qemu-devel] [RFC PATCH v2] spapr: Ensure all LMBs are represented in ibm, dynamic-memory
2016-06-06 11:37 [Qemu-devel] [RFC PATCH v2] spapr: Ensure all LMBs are represented in ibm, dynamic-memory Bharata B Rao
2016-06-06 14:14 ` Nathan Fontenot
@ 2016-06-07 0:46 ` Michael Roth
2016-06-07 2:16 ` David Gibson
1 sibling, 1 reply; 6+ messages in thread
From: Michael Roth @ 2016-06-07 0:46 UTC (permalink / raw)
To: Bharata B Rao, qemu-devel; +Cc: david, nfont, aik, qemu-ppc
Quoting Bharata B Rao (2016-06-06 06:37:49)
> Memory hotplug can fail for some combinations of RAM and maxmem when
> DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
> on maximum addressable memory returned by guest and this value is currently
> being calculated wrongly by the guest kernel routine memory_hotplug_max().
> While there is an attempt to fix the guest kernel, this patch works
> around the problem within QEMU itself.
>
> memory_hotplug_max() routine in the guest kernel arrives at max
> addressable memory by multiplying lmb-size with the lmb-count obtained
> from ibm,dynamic-memory property. There are two assumptions here:
>
> - All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
> where only hot-pluggable LMBs are present in this property.
> - The memory area comprising of RAM and hotplug region is contiguous: This
> needn't be true always for PowerKVM as there can be gap between
> boot time RAM and hotplug region.
>
> To work around this guest kernel bug, ensure that ibm,dynamic-memory
> has information about all the LMBs (RMA, boot-time LMBs, future
> hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
> hotpluggable region).
>
> RMA is represented separately by memory@0 node. Hence mark RMA LMBs
> and also the LMBs for the gap b/n RAM and hotpluggable region as
> reserved so that these LMBs are not recounted/counted by guest.
>
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
> Changes in v2:
>
> - Dropped the patch that removed alignment gap b/n RAM and hotplug
> region, but instead populated ibm,dynamic-memory with LMBs represented
> as RESERVED for that gap. We create DRC objects for these LMBs
> as it simplifies the logic of populating ibm,dynamic-memory. There can
> at max be 4 such DRC objects for the gap area (1GB max) and hence it should
> be fine.
>
> v1: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg00627.html
>
> hw/ppc/spapr.c | 51 ++++++++++++++++++++++++++++++++++++--------------
> include/hw/ppc/spapr.h | 5 +++--
> 2 files changed, 40 insertions(+), 16 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 0636642..0f4f7a3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -762,18 +762,14 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> int ret, i, offset;
> uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> uint32_t prop_lmb_size[] = {0, cpu_to_be32(lmb_size)};
> - uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
> + uint32_t hotplug_lmb_start = spapr->hotplug_memory.base / lmb_size;
> + uint32_t nr_lmbs = (spapr->hotplug_memory.base +
> + memory_region_size(&spapr->hotplug_memory.mr)) /
> + lmb_size;
> uint32_t *int_buf, *cur_index, buf_len;
> int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
>
> /*
> - * Don't create the node if there are no DR LMBs.
> - */
> - if (!nr_lmbs) {
> - return 0;
> - }
> -
> - /*
> * Allocate enough buffer size to fit in ibm,dynamic-memory
> * or ibm,associativity-lookup-arrays
> */
> @@ -805,11 +801,18 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> for (i = 0; i < nr_lmbs; i++) {
> sPAPRDRConnector *drc;
> sPAPRDRConnectorClass *drck;
> - uint64_t addr = i * lmb_size + spapr->hotplug_memory.base;;
> + uint64_t addr;
> uint32_t *dynamic_memory = cur_index;
>
> + if (i < hotplug_lmb_start) {
> + addr = i * lmb_size;
> + } else {
> + addr = (i - hotplug_lmb_start) * lmb_size +
> + spapr->hotplug_memory.base;
> + }
I think both cases reduce to addr = i * lmb_size
> +
> drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> - addr/lmb_size);
> + addr / lmb_size);
> g_assert(drc);
> drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>
> @@ -820,7 +823,14 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> dynamic_memory[4] = cpu_to_be32(numa_get_node(addr, NULL));
> if (addr < machine->ram_size ||
> memory_region_present(get_system_memory(), addr)) {
> - dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
> + if (addr < spapr->rma_size) {
> + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
> + } else {
> + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
> + }
> + } else if (addr >= machine->ram_size &&
> + addr < spapr->hotplug_memory.base) {
> + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
> } else {
> dynamic_memory[5] = cpu_to_be32(0);
> }
> @@ -1652,16 +1662,29 @@ static void spapr_drc_reset(void *opaque)
>
> static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
> {
> - MachineState *machine = MACHINE(spapr);
> uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> - uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
> + uint32_t hotplug_lmb_start = spapr->hotplug_memory.base / lmb_size;
> + uint32_t nr_lmbs = (spapr->hotplug_memory.base +
> + memory_region_size(&spapr->hotplug_memory.mr)) /
> + lmb_size;
> int i;
>
> for (i = 0; i < nr_lmbs; i++) {
> sPAPRDRConnector *drc;
> uint64_t addr;
>
> - addr = i * lmb_size + spapr->hotplug_memory.base;
> + /*
> + * Create DRC objects for entire memory range including RMA, boot-time
> + * memory and hotplug memory and for the gap b/n RAM and hotplug
> + * memory region.
> + */
> + if (i < hotplug_lmb_start) {
> + addr = i * lmb_size;
> + } else {
> + addr = (i - hotplug_lmb_start) * lmb_size +
> + spapr->hotplug_memory.base;
> + }
Here too
> +
> drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
> addr/lmb_size);
And since there are no gaps, addr / lmb_size can now just be i, which I
guess means you don't need to work with addr at all here anymore...
> qemu_register_reset(spapr_drc_reset, drc);
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 971df3d..bb265a2 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -620,9 +620,10 @@ int spapr_rng_populate_dt(void *fdt);
> #define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
>
> /*
> - * This flag value defines the LMB as assigned in ibm,dynamic-memory
> - * property under ibm,dynamic-reconfiguration-memory node.
> + * Defines for flag value in ibm,dynamic-memory property under
> + * ibm,dynamic-reconfiguration-memory node.
> */
> #define SPAPR_LMB_FLAGS_ASSIGNED 0x00000008
> +#define SPAPR_LMB_FLAGS_RESERVED 0x00000080
>
> #endif /* !defined (__HW_SPAPR_H__) */
> --
> 2.1.0
>
* Re: [Qemu-devel] [RFC PATCH v2] spapr: Ensure all LMBs are represented in ibm, dynamic-memory
2016-06-07 0:46 ` Michael Roth
@ 2016-06-07 2:16 ` David Gibson
0 siblings, 0 replies; 6+ messages in thread
From: David Gibson @ 2016-06-07 2:16 UTC (permalink / raw)
To: Michael Roth; +Cc: Bharata B Rao, qemu-devel, nfont, aik, qemu-ppc
On Mon, Jun 06, 2016 at 07:46:53PM -0500, Michael Roth wrote:
> Quoting Bharata B Rao (2016-06-06 06:37:49)
> > Memory hotplug can fail for some combinations of RAM and maxmem when
> > DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
> > on maximum addressable memory returned by guest and this value is currently
> > being calculated wrongly by the guest kernel routine memory_hotplug_max().
> > While there is an attempt to fix the guest kernel, this patch works
> > around the problem within QEMU itself.
> >
> > memory_hotplug_max() routine in the guest kernel arrives at max
> > addressable memory by multiplying lmb-size with the lmb-count obtained
> > from ibm,dynamic-memory property. There are two assumptions here:
> >
> > - All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
> > where only hot-pluggable LMBs are present in this property.
> > - The memory area comprising of RAM and hotplug region is contiguous: This
> > needn't be true always for PowerKVM as there can be gap between
> > boot time RAM and hotplug region.
> >
> > To work around this guest kernel bug, ensure that ibm,dynamic-memory
> > has information about all the LMBs (RMA, boot-time LMBs, future
> > hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
> > hotpluggable region).
> >
> > RMA is represented separately by memory@0 node. Hence mark RMA LMBs
> > and also the LMBs for the gap b/n RAM and hotpluggable region as
> > reserved so that these LMBs are not recounted/counted by guest.
> >
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > ---
> > Changes in v2:
> >
> > - Dropped the patch that removed alignment gap b/n RAM and hotplug
> > region, but instead populated ibm,dynamic-memory with LMBs represented
> > as RESERVED for that gap. We create DRC objects for these LMBs
> > as it simplifies the logic of populating ibm,dynamic-memory. There can
> > at max be 4 such DRC objects for the gap area (1GB max) and hence it should
> > be fine.
> >
> > v1: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg00627.html
> >
> > hw/ppc/spapr.c | 51 ++++++++++++++++++++++++++++++++++++--------------
> > include/hw/ppc/spapr.h | 5 +++--
> > 2 files changed, 40 insertions(+), 16 deletions(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 0636642..0f4f7a3 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -762,18 +762,14 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> > int ret, i, offset;
> > uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> > uint32_t prop_lmb_size[] = {0, cpu_to_be32(lmb_size)};
> > - uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
> > + uint32_t hotplug_lmb_start = spapr->hotplug_memory.base / lmb_size;
> > + uint32_t nr_lmbs = (spapr->hotplug_memory.base +
> > + memory_region_size(&spapr->hotplug_memory.mr)) /
> > + lmb_size;
> > uint32_t *int_buf, *cur_index, buf_len;
> > int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
> >
> > /*
> > - * Don't create the node if there are no DR LMBs.
> > - */
> > - if (!nr_lmbs) {
> > - return 0;
> > - }
> > -
> > - /*
> > * Allocate enough buffer size to fit in ibm,dynamic-memory
> > * or ibm,associativity-lookup-arrays
> > */
> > @@ -805,11 +801,18 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> > for (i = 0; i < nr_lmbs; i++) {
> > sPAPRDRConnector *drc;
> > sPAPRDRConnectorClass *drck;
> > - uint64_t addr = i * lmb_size + spapr->hotplug_memory.base;;
> > + uint64_t addr;
> > uint32_t *dynamic_memory = cur_index;
> >
> > + if (i < hotplug_lmb_start) {
> > + addr = i * lmb_size;
> > + } else {
> > + addr = (i - hotplug_lmb_start) * lmb_size +
> > + spapr->hotplug_memory.base;
> > + }
>
> I think both cases reduce to addr = i * lmb_size
Yes, I think so too.
> > +
> > drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> > - addr/lmb_size);
> > + addr / lmb_size);
> > g_assert(drc);
> > drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> >
> > @@ -820,7 +823,14 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
> > dynamic_memory[4] = cpu_to_be32(numa_get_node(addr, NULL));
> > if (addr < machine->ram_size ||
> > memory_region_present(get_system_memory(), addr)) {
> > - dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
> > + if (addr < spapr->rma_size) {
> > + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
> > + } else {
> > + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
> > + }
> > + } else if (addr >= machine->ram_size &&
> > + addr < spapr->hotplug_memory.base) {
> > + dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
> > } else {
> > dynamic_memory[5] = cpu_to_be32(0);
> > }
> > @@ -1652,16 +1662,29 @@ static void spapr_drc_reset(void *opaque)
> >
> > static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
> > {
> > - MachineState *machine = MACHINE(spapr);
> > uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> > - uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
> > + uint32_t hotplug_lmb_start = spapr->hotplug_memory.base / lmb_size;
> > + uint32_t nr_lmbs = (spapr->hotplug_memory.base +
> > + memory_region_size(&spapr->hotplug_memory.mr)) /
> > + lmb_size;
> > int i;
> >
> > for (i = 0; i < nr_lmbs; i++) {
> > sPAPRDRConnector *drc;
> > uint64_t addr;
> >
> > - addr = i * lmb_size + spapr->hotplug_memory.base;
> > + /*
> > + * Create DRC objects for entire memory range including RMA, boot-time
> > + * memory and hotplug memory and for the gap b/n RAM and hotplug
> > + * memory region.
> > + */
> > + if (i < hotplug_lmb_start) {
> > + addr = i * lmb_size;
> > + } else {
> > + addr = (i - hotplug_lmb_start) * lmb_size +
> > + spapr->hotplug_memory.base;
> > + }
>
> Here too
But also, I'm not sure why you're changing the logic here anyway. For
constructing the DT property you want to insert the "dummy" LMBs, but
for actually constructing the DRCs you still want to start at the hotplug
memory base, don't you?
> > +
> > drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
> > addr/lmb_size);
>
> And since there are no gaps, addr / lmb_size can now just be i, which I
> guess means you don't need work with addr at all here anymore...
>
> > qemu_register_reset(spapr_drc_reset, drc);
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 971df3d..bb265a2 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -620,9 +620,10 @@ int spapr_rng_populate_dt(void *fdt);
> > #define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
> >
> > /*
> > - * This flag value defines the LMB as assigned in ibm,dynamic-memory
> > - * property under ibm,dynamic-reconfiguration-memory node.
> > + * Defines for flag value in ibm,dynamic-memory property under
> > + * ibm,dynamic-reconfiguration-memory node.
> > */
> > #define SPAPR_LMB_FLAGS_ASSIGNED 0x00000008
> > +#define SPAPR_LMB_FLAGS_RESERVED 0x00000080
> >
> > #endif /* !defined (__HW_SPAPR_H__) */
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson