From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54739)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1aw6BR-0007DI-0a for qemu-devel@nongnu.org; Fri, 29 Apr 2016 07:02:12 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1aw6BE-00037z-4k
 for qemu-devel@nongnu.org; Fri, 29 Apr 2016 07:01:59 -0400
References: <1458016736-10544-1-git-send-email-bharata@linux.vnet.ibm.com>
 <1458016736-10544-3-git-send-email-bharata@linux.vnet.ibm.com>
 <20160316013605.GC9032@voom>
 <20160316044154.GD13176@in.ibm.com>
 <20160425112050.545f4ff3@nial.brq.redhat.com>
 <20160426050923.GB9793@in.ibm.com>
 <20160426095236.23e51b77@nial.brq.redhat.com>
 <20160426210337.11723.83860@loki>
 <20160429032403.GA3010@littlecatz.fritz.box>
 <57230311.1060200@redhat.com>
 <20160429065939.GB12284@in.ibm.com>
 <572319AB.5000004@redhat.com>
 <20160429103019.48948336@nial.brq.redhat.com>
From: Thomas Huth
Message-ID: <57233F09.9090804@redhat.com>
Date: Fri, 29 Apr 2016 13:01:29 +0200
MIME-Version: 1.0
In-Reply-To: <20160429103019.48948336@nial.brq.redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [RFC PATCH v2 2/2] spapr: Memory hot-unplug support
To: Igor Mammedov
Cc: bharata@linux.vnet.ibm.com, David Gibson, Michael Roth,
 qemu-devel@nongnu.org, qemu-ppc@nongnu.org, nfont@linux.vnet.ibm.com

On 29.04.2016 10:30, Igor Mammedov wrote:
> On Fri, 29 Apr 2016 10:22:03 +0200
> Thomas Huth wrote:
>
>> On 29.04.2016 08:59, Bharata B Rao wrote:
>>> On Fri, Apr 29, 2016 at 08:45:37AM +0200, Thomas Huth wrote:
>>>> On 29.04.2016 05:24, David Gibson wrote:
>>>>> On Tue, Apr 26, 2016 at 04:03:37PM -0500, Michael Roth wrote:
>>>> ...
>>>>>> In the case of pseries, the DIMM abstraction isn't really exposed to
>>>>>> the guest, but rather the memory blocks we use to make the backing
>>>>>> memdev memory available to the guest. During unplug, the guest
>>>>>> completely releases these blocks back to QEMU, and if it can only
>>>>>> release a subset of what's requested it does not attempt to recover.
>>>>>> We can potentially change that behavior on the guest side, since
>>>>>> partially-freed DIMMs aren't currently useful on the host-side...
>>>>>>
>>>>>> But, in the case of pseries, I wonder if it makes sense to maybe go
>>>>>> ahead and MADV_DONTNEED the ranges backing these released blocks so the
>>>>>> host can at least partially reclaim the memory from a partially
>>>>>> unplugged DIMM?
>>>>>
>>>>> Urgh.. I can see the benefit, but I'm a bit uneasy about making the
>>>>> DIMM semantics different in this way on Power.
>>>>>
>>>>> I'm wondering whether shoehorning the PAPR DR memory mechanism into
>>>>> the qemu DIMM model was a good idea after all.
>>>>
>>>> Ignorant question (sorry, I really don't have much experience yet here):
>>>> Could we maybe align the size of the LMBs with the size of the DIMMs?
>>>> E.g. make the LMBs bigger or the DIMMs smaller, so that they match?
>>>
>>> Should work, but the question is what should be the right size so that
>>> we have good granularity of hotplug but also not run out of mem slots
>>> thereby limiting us on the maxmem. I remember you changed the memslots
>>> to 512 in KVM, but we are yet to move up from 32 in QEMU for sPAPR though.
>>
>> Half of the slots should be "reserved" for PCI and other stuff, so we
>> could use 256 for memory - that way we would also be on the same level as
>> x86 which also uses 256 memslots here, as far as I know.
>>
>> Anyway, couldn't we simply calculate the SPAPR_MEMORY_BLOCK_SIZE
>> dynamically, according to the maxmem and slot values that the user
>> specified?
>> So that SPAPR_MEMORY_BLOCK_SIZE would simply match the DIMM
>> size? ... or is there some constraint that I've missed so that
>> SPAPR_MEMORY_BLOCK_SIZE has to be a compile-time #defined value?
> If you do that then possible DIMM size should be decided at startup
> and fixed. If DIMM of wrong size is plugged in, machine should fail
> hotplug request.
> Question is how mgmt will know fixed DIMM size that sPAPR just calculated?

Ok, sorry, I somehow had the bad idea in mind that all DIMMs for
hot-plugging should have the same size. That's of course not the case if
we model something similar to DIMM plugging on real hardware. So please
never mind, it was just a wrong assumption on my side.

OTOH, it may also not make sense to always keep the LMB size at such a
small, fixed value. Imagine the user specifies slots=32 and maxmem=32G ...
maybe we should then disallow plugging DIMMs that are smaller than 1G, so
we could use an LMB size of 1G in this case? (plugging DIMMs of different
sizes > 1G would then still be allowed, too, of course)

 Thomas
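
To make the slots=32 / maxmem=32G idea a bit more concrete, here is a rough
sketch in C of how such a calculation could look. This is not QEMU code:
lmb_size_for(), dimm_size_ok() and pow2_floor() are hypothetical helper names,
and the 256 MiB floor, the rounding down to a power of two and the "DIMM must
be a whole multiple of the LMB size" rule are assumptions made only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024)
#define GiB (1024ULL * MiB)

/* Assumption: never pick a block size below 256 MiB. */
#define LMB_SIZE_FLOOR (256 * MiB)

/* Round v down to the nearest power of two (v must be non-zero). */
static uint64_t pow2_floor(uint64_t v)
{
    uint64_t p = 1;

    while (p <= v / 2) {
        p *= 2;
    }
    return p;
}

/*
 * Hypothetical helper: derive an LMB size from the user's maxmem and
 * slots values, so that one slot filled with the smallest allowed DIMM
 * corresponds to exactly one LMB.
 */
static uint64_t lmb_size_for(uint64_t maxmem, unsigned int slots)
{
    uint64_t per_slot;

    assert(slots > 0);
    per_slot = maxmem / slots;
    if (per_slot < LMB_SIZE_FLOOR) {
        return LMB_SIZE_FLOOR;
    }
    return pow2_floor(per_slot);
}

/*
 * Hypothetical hotplug check: reject DIMMs smaller than one LMB, and
 * (assumption) DIMMs that are not a whole number of LMBs.
 */
static bool dimm_size_ok(uint64_t dimm_size, uint64_t lmb_size)
{
    return dimm_size >= lmb_size && dimm_size % lmb_size == 0;
}
```

With slots=32 and maxmem=32G this gives an LMB size of 1G, so a 512M DIMM
would be refused while a 2G DIMM would still be accepted. Igor's question
remains open in this sketch, though: the management layer would still need
some way to learn the calculated size.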