From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51684)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1b8jEM-00089T-79 for qemu-devel@nongnu.org; Fri, 03 Jun 2016 03:09:19 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1b8jEH-00085w-Te
 for qemu-devel@nongnu.org; Fri, 03 Jun 2016 03:09:17 -0400
Date: Fri, 3 Jun 2016 17:10:48 +1000
From: David Gibson
Message-ID: <20160603071048.GS1087@voom.fritz.box>
References: <1464932984-26623-1-git-send-email-bharata@linux.vnet.ibm.com>
 <201606030550.u535jnDh039622@mx0a-001b2d01.pphosted.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="xe2geHXJg22At20M"
Content-Disposition: inline
In-Reply-To: <201606030550.u535jnDh039622@mx0a-001b2d01.pphosted.com>
Subject: Re: [Qemu-devel] [RFC PATCH v1 3/3] spapr: spapr: Work around the memory hotplug failure with DDW
To: Bharata B Rao
Cc: qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com, nfont@linux.vnet.ibm.com, aik@au1.ibm.com, qemu-ppc@nongnu.org

--xe2geHXJg22At20M
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Jun 03, 2016 at 11:19:44AM +0530, Bharata B Rao wrote:
> Memory hotplug can fail for some combinations of RAM and maxmem when
> DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
> on maximum addressable memory returned by guest and this value is currently
> being calculated wrongly by the guest kernel routine memory_hotplug_max().
> While there is an attempt to fix the guest kernel, this patch works
> around the problem within QEMU itself.
>
> memory_hotplug_max() routine in the guest kernel arrives at max
> addressable memory by multiplying lmb-size with the lmb-count obtained
> from ibm,dynamic-memory property.
> There are two assumptions here:
>
> - All LMBs are part of ibm,dynamic-memory: This is not true for PowerKVM
>   where only hot-pluggable LMBs are present in this property.
> - The memory area comprising RAM and hotplug region is contiguous: This
>   needn't always be true for PowerKVM as there can be a gap between
>   boot time RAM and hotplug region.
>
> This work around involves having all the LMBs (RMA, rest of the boot time
> LMBs and hot-pluggable LMBs) as part of ibm,dynamic-memory so that
> guest kernel's calculation of max addressable memory comes out correct
> resulting in correct DDW value which prevents memory hotplug failures.
> memory@0 is created for RMA, but RMA LMBs are also represented as
> "reserved" LMBs in ibm,dynamic-memory. Parts of this are essentially a
> revert of e8f986fc57a664a74b9f685b466506366a15201b.
>
> In addition to this, the alignment of hotplug memory region is reduced from
> current 1G to 256M (LMB size in PowerKVM) so that we don't end up with any
> gaps between boot time RAM and hotplug region.

Hmm.. could we work around the problem without altering the memory
alignment by inserting extra dummy reserved LMBs covering the gap?
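
A rough standalone sketch of that idea, for illustration only. This is
not QEMU code: struct mem_layout and the sketch_* helpers are made-up
names, and only the two flag values and the 256M LMB size are taken from
the patch. It assumes every size and the hotplug base are LMB-aligned,
and it ignores DIMMs that are already plugged (those would need the
ASSIGNED flag).

```c
#include <assert.h>
#include <stdint.h>

#define LMB_SIZE            (256ULL << 20)  /* 256M, LMB size from the patch */
#define LMB_FLAGS_ASSIGNED  0x00000008u     /* value from the patch */
#define LMB_FLAGS_RESERVED  0x00000080u     /* value from the patch */

/* Hypothetical, simplified stand-in for the relevant machine state. */
struct mem_layout {
    uint64_t rma_size;      /* RMA, starts at address 0 */
    uint64_t ram_size;      /* boot time RAM, starts at address 0 */
    uint64_t hotplug_base;  /* may sit above ram_size (e.g. 1G aligned) */
    uint64_t maxram_size;   /* maxmem */
};

/*
 * Number of ibm,dynamic-memory entries when the gap between boot time
 * RAM and the hotplug region is covered by dummy LMBs: everything from
 * address 0 up to the end of the hotplug region gets an entry.
 */
static inline uint32_t sketch_nr_lmbs(const struct mem_layout *m)
{
    uint64_t hotplug_size = m->maxram_size - m->ram_size;

    assert(m->hotplug_base >= m->ram_size);
    assert(m->hotplug_base % LMB_SIZE == 0);
    return (uint32_t)((m->hotplug_base + hotplug_size) / LMB_SIZE);
}

/* With the gap covered, LMB i simply lives at i * LMB_SIZE. */
static inline uint64_t sketch_lmb_addr(uint32_t i)
{
    return (uint64_t)i * LMB_SIZE;
}

/* Flags for LMB i, ignoring already-plugged DIMMs. */
static inline uint32_t sketch_lmb_flags(const struct mem_layout *m, uint32_t i)
{
    uint64_t addr = sketch_lmb_addr(i);

    if (addr < m->rma_size) {
        return LMB_FLAGS_RESERVED;  /* RMA, also described by memory@0 */
    }
    if (addr < m->ram_size) {
        return LMB_FLAGS_ASSIGNED;  /* rest of boot time RAM */
    }
    if (addr < m->hotplug_base) {
        return LMB_FLAGS_RESERVED;  /* dummy LMB covering the gap */
    }
    return 0;                       /* hot-pluggable, not yet assigned */
}
```

For example, with a 512M RMA, 4352M of boot time RAM, the hotplug region
based at 5G and maxmem of 8G, this yields 17 boot time entries, 3 dummy
reserved entries for the gap and 15 hot-pluggable entries. That is 35
entries in total, so lmb-size times lmb-count comes to 8.75G, which is
the end of the hotplug region.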
>
> Signed-off-by: Bharata B Rao
> ---
>  hw/ppc/spapr.c         | 59 +++++++++++++++++++++++++++++++++++---------------
>  include/hw/ppc/spapr.h |  5 +++--
>  2 files changed, 45 insertions(+), 19 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 623c35f..3dfbc37 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -569,7 +569,6 @@ static int spapr_populate_memory(sPAPRMachineState *spapr, void *fdt)
>          }
>          if (!mem_start) {
>              /* ppc_spapr_init() checks for rma_size <= node0_size already */
> -            spapr_populate_memory_node(fdt, i, 0, spapr->rma_size);
>              mem_start += spapr->rma_size;
>              node_size -= spapr->rma_size;
>          }
> @@ -762,18 +761,13 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
>      int ret, i, offset;
>      uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
>      uint32_t prop_lmb_size[] = {0, cpu_to_be32(lmb_size)};
> -    uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
> +    uint32_t nr_rma_lmbs = spapr->rma_size / lmb_size;
> +    uint32_t nr_lmbs = machine->maxram_size / lmb_size;
> +    uint32_t nr_assigned_lmbs = machine->ram_size / lmb_size;
>      uint32_t *int_buf, *cur_index, buf_len;
>      int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
>
>      /*
> -     * Don't create the node if there are no DR LMBs.
> -     */
> -    if (!nr_lmbs) {
> -        return 0;
> -    }
> -
> -    /*
>       * Allocate enough buffer size to fit in ibm,dynamic-memory
>       * or ibm,associativity-lookup-arrays
>       */
> @@ -805,9 +799,15 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
>      for (i = 0; i < nr_lmbs; i++) {
>          sPAPRDRConnector *drc;
>          sPAPRDRConnectorClass *drck;
> -        uint64_t addr = i * lmb_size + spapr->hotplug_memory.base;;
> +        uint64_t addr;
>          uint32_t *dynamic_memory = cur_index;
>
> +        if (i < nr_assigned_lmbs) {
> +            addr = i * lmb_size;
> +        } else {
> +            addr = (i - nr_assigned_lmbs) * lmb_size +
> +                   spapr->hotplug_memory.base;
> +        }
>          drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>                                         addr/lmb_size);
>          g_assert(drc);
> @@ -820,7 +820,11 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
>          dynamic_memory[4] = cpu_to_be32(numa_get_node(addr, NULL));
>          if (addr < machine->ram_size ||
>              memory_region_present(get_system_memory(), addr)) {
> -            dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
> +            if (i < nr_rma_lmbs) {
> +                dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED);
> +            } else {
> +                dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_ASSIGNED);
> +            }
>          } else {
>              dynamic_memory[5] = cpu_to_be32(0);
>          }
> @@ -882,6 +886,8 @@ int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
>      /* Generate ibm,dynamic-reconfiguration-memory node if required */
>      if (memory_update && smc->dr_lmb_enabled) {
>          _FDT((spapr_populate_drconf_memory(spapr, fdt)));
> +    } else {
> +        _FDT((spapr_populate_memory(spapr, fdt)));
>      }
>
>      /* Pack resulting tree */
> @@ -919,10 +925,23 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
>      /* open out the base tree into a temp buffer for the final tweaks */
>      _FDT((fdt_open_into(spapr->fdt_skel, fdt, FDT_MAX_SIZE)));
>
> -    ret = spapr_populate_memory(spapr, fdt);
> -    if (ret < 0) {
> -        fprintf(stderr, "couldn't setup memory nodes in fdt\n");
> -        exit(1);
> +    /*
> +     * Add memory@0 node to represent RMA. Rest of the memory is either
> +     * represented by memory nodes or ibm,dynamic-reconfiguration-memory
> +     * node later during ibm,client-architecture-support call.
> +     *
> +     * If NUMA is configured, ensure that memory@0 ends up in the
> +     * first memory-less node.
> +     */
> +    if (nb_numa_nodes) {
> +        for (i = 0; i < nb_numa_nodes; ++i) {
> +            if (numa_info[i].node_mem) {
> +                spapr_populate_memory_node(fdt, i, 0, spapr->rma_size);
> +                break;
> +            }
> +        }
> +    } else {
> +        spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
>      }
>
>      ret = spapr_populate_vdevice(spapr->vio_bus, fdt);
> @@ -1654,14 +1673,20 @@ static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
>  {
>      MachineState *machine = MACHINE(spapr);
>      uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE;
> -    uint32_t nr_lmbs = (machine->maxram_size - machine->ram_size)/lmb_size;
> +    uint32_t nr_lmbs = machine->maxram_size / lmb_size;
> +    uint32_t nr_assigned_lmbs = machine->ram_size / lmb_size;
>      int i;
>
>      for (i = 0; i < nr_lmbs; i++) {
>          sPAPRDRConnector *drc;
>          uint64_t addr;
>
> -        addr = i * lmb_size + spapr->hotplug_memory.base;
> +        if (i < nr_assigned_lmbs) {
> +            addr = i * lmb_size;
> +        } else {
> +            addr = (i - nr_assigned_lmbs) * lmb_size +
> +                   spapr->hotplug_memory.base;
> +        }
>          drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
>                                       addr/lmb_size);
>          qemu_register_reset(spapr_drc_reset, drc);
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index b2aeb15..e5ef979 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -619,9 +619,10 @@ int spapr_rng_populate_dt(void *fdt);
>  #define SPAPR_DR_LMB_LIST_ENTRY_SIZE 6
>
>  /*
> - * This flag value defines the LMB as assigned in ibm,dynamic-memory
> - * property under ibm,dynamic-reconfiguration-memory node.
> + * Defines for flag value in ibm,dynamic-memory property under
> + * ibm,dynamic-reconfiguration-memory node.
>   */
>  #define SPAPR_LMB_FLAGS_ASSIGNED 0x00000008
> +#define SPAPR_LMB_FLAGS_RESERVED 0x00000080
>
>  #endif /* !defined (__HW_SPAPR_H__) */

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

--xe2geHXJg22At20M
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJXUS14AAoJEGw4ysog2bOSSEgP/jAnNct+hqxx/fX9ZbW725t+
oeJDeNisFbky9UqS4+kq/HRXASXdA4OaPIVGhdeiLygLiyOXkwlYaIYIVu3ADNY0
VTrdth7Xnpc7iQjO0Lf+RqaukWeWiYM0sW/VCfB3t4XY54QyvRGo1DcHAfsJ5nfO
/0vsKVIJIJjag4+DcxdKa9mA1/n5VM7QUSl20osqsD4crt5/sKyyP75wRQqWKX09
mKaF2qfI4lSQGAzlBcreEVmevG748y7kh2kw5lA3++uCf9hkzBNRbI77s/Ebump6
kH6bRjKhjSFVK/KeUOD4VtCuxNTkNA4zZPn48GT0/4BcJJQGftBqgwncvrhdtyVz
QSJk+9uGBg07zEXwXra1knZ7rDlUZztv0tj/3clOLDaQnqLqG4Fv7VpkZXTfE4kx
5gYs5EIBvGSQPFcHdsLURlL0cg/e0FwLqTUkw+wtU3scEvysOTQQ7L2zCLFEaEDf
STeLct1bkSMYjYERsMeZx11uI3h4xvTv2hxoSnBjXFbSo+NGzH1MjcifTXtfcUFW
IkwYsXvRq5ilYO8CbaqiHROzz4StolH2salpdzrXld0b8ACtMmzZ2q6LWLc9E4Sj
KfazUUtGimQAGQr4aQp413z+tOODDplWu01qYkcWIPhRyb2JQGQkZJpF6nKiKwZB
6MNTaL1DeGBcP0s2igbH
=N2XI
-----END PGP SIGNATURE-----

--xe2geHXJg22At20M--