From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57161)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1avJMy-0007BL-Pp for qemu-devel@nongnu.org;
	Wed, 27 Apr 2016 02:54:46 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1avJMu-0005F0-NJ for qemu-devel@nongnu.org;
	Wed, 27 Apr 2016 02:54:44 -0400
References: <1458016736-10544-1-git-send-email-bharata@linux.vnet.ibm.com>
	<1458016736-10544-3-git-send-email-bharata@linux.vnet.ibm.com>
	<20160316013605.GC9032@voom>
	<20160316044154.GD13176@in.ibm.com>
	<20160425112050.545f4ff3@nial.brq.redhat.com>
	<20160426050923.GB9793@in.ibm.com>
	<20160426095236.23e51b77@nial.brq.redhat.com>
	<20160426210337.11723.83860@loki>
From: Thomas Huth
Message-ID: <5720622A.2040407@redhat.com>
Date: Wed, 27 Apr 2016 08:54:34 +0200
MIME-Version: 1.0
In-Reply-To: <20160426210337.11723.83860@loki>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC PATCH v2 2/2] spapr: Memory hot-unplug support
To: Michael Roth, Igor Mammedov, Bharata B Rao
Cc: nfont@linux.vnet.ibm.com, qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	David Gibson

On 26.04.2016 23:03, Michael Roth wrote:
> Quoting Igor Mammedov (2016-04-26 02:52:36)
>> On Tue, 26 Apr 2016 10:39:23 +0530
>> Bharata B Rao wrote:
>>
>>> On Mon, Apr 25, 2016 at 11:20:50AM +0200, Igor Mammedov wrote:
>>>> On Wed, 16 Mar 2016 10:11:54 +0530
>>>> Bharata B Rao wrote:
>>>>
>>>>> On Wed, Mar 16, 2016 at 12:36:05PM +1100, David Gibson wrote:
>>>>>> On Tue, Mar 15, 2016 at 10:08:56AM +0530, Bharata B Rao wrote:
>>>>>>> Add support to hot remove pc-dimm memory devices.
>>>>>>>
>>>>>>> Signed-off-by: Bharata B Rao
>>>>>>
>>>>>> Reviewed-by: David Gibson
>>>>>>
>>>>>> Looks correct, but again, needs to wait on the PAPR change.
>>>> [...]
>>>>>
>>>>> While we are here, I would also like to get some opinion on the real
>>>>> need for memory unplug. Is there anything that memory unplug gives us
>>>>> which memory ballooning (shrinking mem via ballooning) can't give?
>>>> Sure, ballooning can complement memory hotplug, but turning it on would
>>>> effectively reduce hotplug to ballooning, as it would enable overcommit
>>>> capability instead of the hard partitioning pc-dimms provide. So one
>>>> could just use ballooning only and not bother with hotplug at all.
>>>>
>>>> On the other hand, memory hotplug/unplug (at least on x86) tries
>>>> to model real hardware, thus removing the need for a paravirt ballooning
>>>> solution in favor of native guest support.
>>>
>>> Thanks for your views.
>>>
>>>>
>>>> PS:
>>>> Guest-wise, hot-unplug is currently not well supported in Linux,
>>>> i.e. it's not guaranteed that the guest will honor an unplug request,
>>>> as it may pin the dimm by using it as non-migratable memory. So
>>>> there is something to work on on the guest side to make unplug more
>>>> reliable/guaranteed.
>>>
>>> In the above scenario where the guest doesn't allow removal of certain
>>> parts of DIMM memory, what is the expected behaviour as far as the QEMU
>>> DIMM device is concerned? I seem to be running into this situation
>>> very often with PowerPC mem unplug, where I am left with a DIMM device
>>> that has only some memory blocks released. In this situation, I would like
>>> to block further unplug requests on the same device, but QEMU seems
>>> to allow more such unplug requests to come in via the monitor. So
>>> qdev won't help me here? Should I detect such a condition from the
>>> machine unplug() handler and take the required action?
>> I think offlining is a guest's task, along with recovering from an
>> inability to offline (i.e. offline all + eject or restore original state).
>> QEMU does its job by notifying the guest which dimm it wants to remove
>> and removes it when the guest asks it to (at least in the x86 world).
>
> In the case of pseries, the DIMM abstraction isn't really exposed to
> the guest, but rather the memory blocks we use to make the backing
> memdev memory available to the guest. During unplug, the guest
> completely releases these blocks back to QEMU, and if it can only
> release a subset of what's requested it does not attempt to recover.
> We can potentially change that behavior on the guest side, since
> partially-freed DIMMs aren't currently useful on the host-side...
>
> But, in the case of pseries, I wonder if it makes sense to maybe go
> ahead and MADV_DONTNEED the ranges backing these released blocks so the
> host can at least partially reclaim the memory from a partially
> unplugged DIMM?

Sounds like this could be a good compromise.

 Thomas
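
For illustration only, the MADV_DONTNEED idea quoted above can be sketched on a plain anonymous mapping. This is not QEMU code: release_block() and demo_partial_release() are hypothetical names, the block size of 16 pages is arbitrary, and the sketch assumes Linux with a private anonymous mapping, where a range dropped with MADV_DONTNEED reads back as zero-filled pages. A real memory backend (file-backed or hugetlbfs, for instance) may behave differently.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not a QEMU function): advise the kernel that one
 * block inside a mapping is no longer needed, so its pages can be
 * reclaimed. Returns 0 on success, -1 on failure (errno set by madvise). */
static int release_block(void *base, size_t block_size, size_t index)
{
    uint8_t *start = (uint8_t *)base + index * block_size;
    return madvise(start, block_size, MADV_DONTNEED);
}

/* Map nblocks blocks of anonymous memory (standing in for a DIMM's
 * backing memory), dirty all of it, release only the block `victim`,
 * then check that the released block reads back as zeros while a
 * neighbouring block still holds its data.
 * Returns 1 if both checks hold, 0 if not, -1 on a setup error. */
static int demo_partial_release(size_t nblocks, size_t victim)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || nblocks < 2 || victim >= nblocks)
        return -1;

    size_t block_size = (size_t)page * 16;  /* arbitrary, page-aligned */
    size_t len = nblocks * block_size;

    uint8_t *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memset(mem, 0xAB, len);                 /* touch every page */

    int ok = (release_block(mem, block_size, victim) == 0);
    size_t keep = (victim + 1) % nblocks;   /* a block we did not release */

    for (size_t i = 0; ok && i < block_size; i++) {
        if (mem[victim * block_size + i] != 0)
            ok = 0;                         /* released block must be zero */
        if (mem[keep * block_size + i] != 0xAB)
            ok = 0;                         /* other blocks stay intact */
    }

    munmap(mem, len);
    return ok;
}
```

Calling demo_partial_release(4, 1) releases the second of four blocks and leaves the rest of the mapping in place, which is the "partially unplugged DIMM" case: the mapping stays valid, but the host can take back the pages behind the released block.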