From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54320)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Uz4C1-000435-NN for qemu-devel@nongnu.org;
 Tue, 16 Jul 2013 08:17:26 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1Uz4Bx-0008FT-DF for qemu-devel@nongnu.org;
 Tue, 16 Jul 2013 08:17:21 -0400
Received: from mx1.redhat.com ([209.132.183.28]:37595)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1Uz4Bw-0008FB-VM for qemu-devel@nongnu.org;
 Tue, 16 Jul 2013 08:17:17 -0400
Message-ID: <51E539BC.3060203@redhat.com>
Date: Tue, 16 Jul 2013 14:17:00 +0200
From: Paolo Bonzini
MIME-Version: 1.0
References: <6b5ff346b23fba9a8707507fda7f9b71719a55be.1372234719.git.hutao@cn.fujitsu.com>
 <51CAB866.2080507@redhat.com> <51CBC8B3.8070708@cn.fujitsu.com>
 <51CBE1DD.9020301@redhat.com>
 <20130715170551.GC11958@dhcp-192-168-178-175.profitbricks.localdomain>
 <51E42D06.9080004@redhat.com>
 <20130716012744.GE10917@G08FNSTD100614.fnst.cn.fujitsu.com>
 <51E4E604.6060608@redhat.com>
 <20130716121901.157d7e85@nial.usersys.redhat.com>
 <51E52112.3010803@redhat.com>
 <20130716140006.25b5175c@nial.usersys.redhat.com>
In-Reply-To: <20130716140006.25b5175c@nial.usersys.redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v5 05/14] vl: handle "-device dimm"
To: Igor Mammedov
Cc: Eduardo Habkost , Hu Tao , qemu-devel@nongnu.org,
 Vasilis Liaskovitis , Bandan Das , gaowanlong@cn.fujitsu.com

Il 16/07/2013 14:00, Igor Mammedov ha scritto:
>>> we can leave -numa for initial memory mapping and manage the mapping
>>> of hotpluggable regions with -device dimm,node=X,size=Y.
>>>
>>> In that case command line -device dimm will provide a fully initialized
>>> dimm device usable at startup (but hot-unpluggable) and
>>> (monitor) device_add dimm,,node=X,size=Y
>>> would serve hot-plug case.
>>>
>>> That way arbitrary sized dimm could be hot-plugged without specifying them
>>> at startup, like it's done on bare-metal.
>>
>> But the memory ranges need to be specified at startup in the ACPI
>> tables, and that's what "-numa mem" is for.
> not really, there is a caveat with windows, which needs a hotpluggable SRAT
> entry that tells it max possible limit (otherwise windows sees new dimm
> device but refuses to use it saying "server is not configured for hotplug"
> or something like this), but as long as such entry exists, windows happily
> uses dynamic _CRS() and _PXM() if they are below that limit (even if a new
> range is not in any range defined by SRAT).
>
> And ACPI spec doesn't say that SRAT MUST be populated with hotplug ranges.

Right, what is required in ACPI is only pre-allocation of slots.

> It's kind of simpler for bare-metal, where they might do it due to limited
> supported DIMM capacity by reserving static entries with max supported ranges
> per DIMM and know in advance DIMM count for platform. But actual _CRS() is
> dynamic anyway, since plugged in DIMM could have a smaller capacity than
> supported max for slot.
>
> To summarize ACPI + windows limitations:
> - ACPI needs to pre-allocate memory devices, i.e. number of possible
>   increments OSPM could utilize. It might be possible to overcome the
>   limitation by using Load() or LoadTable() at runtime, but I haven't
>   tried it.
> - Windows needs to know max supported limit, a fake entry in SRAT from
>   RamSize to max_mem works nicely there (tested with ws2008r2DC and
>   ws2012DC).
>
> That's why I was proposing to extend "-m" option for "slots" number (i.e. nr
> of memory devices) and 'max_mem' to make Windows happy and cap mgmt tools
> from going over initially configured limit.

As far as memory hot-plug is concerned, the "-numa mem" proposal is
exactly the same thing that you are proposing, minus the ability to
specify many slots in one go.

The same "-numa mem" can be used for host node binding as well, but
that's not really relevant for memory hot-plug. In case you want
multiple hotpluggable ranges, bound to different host nodes, it doesn't
matter if the SRAT will have one or many fake entries.

> then -device dimm could be used for hotpluggable mem available at startup
> and device_add for adding more dimms with user defined sizes to desired nodes
> at runtime.

Yes, no disagreement on this.

> Works nice without any need for 'populated=xxx' and predefined ranges.
>
> PS:
> I'll be able to post more or less usable RFC that does it on top of mst's
> ACPI tables in QEMU by the end of this week.

Good!

Paolo

>>
>>> In addition command line -device would be used in migration case to describe
>>> already hot-plugged dimms on target.
>>
>> Yep.
>>
>> Paolo
>
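
For illustration only, here is a minimal Python sketch of the layout
discussed in this thread: static SRAT entries for the initial RAM, one
fake hotpluggable entry covering RamSize..max_mem, and a check that a
hot-plugged dimm fits below that limit and within the pre-allocated
slot count. It is not QEMU code. The names SratEntry, build_srat and
can_hotplug are hypothetical, assigning the fake entry to node 0 is an
assumption, and the model ignores the PCI hole and alignment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SratEntry:
    base: int           # guest physical start address, in bytes
    length: int         # size of the range, in bytes
    node: int           # proximity domain (NUMA node)
    hotpluggable: bool  # True for the fake "max limit" entry


def build_srat(node_sizes: List[int], max_mem: int) -> List[SratEntry]:
    """Hypothetical helper: one static entry per node for initial RAM,
    plus one fake hotpluggable entry from RamSize up to max_mem."""
    entries = []
    base = 0
    for node, size in enumerate(node_sizes):
        entries.append(SratEntry(base, size, node, False))
        base += size
    ram_size = base
    if max_mem > ram_size:
        # Assumption: the fake entry is attached to node 0.
        entries.append(SratEntry(ram_size, max_mem - ram_size, 0, True))
    return entries


def can_hotplug(entries: List[SratEntry], base: int, size: int,
                used_slots: int, slots: int) -> bool:
    """A new dimm is accepted if a pre-allocated slot is free and its
    range lies above the initial RAM and below the advertised limit."""
    ram_size = sum(e.length for e in entries if not e.hotpluggable)
    limit = max(e.base + e.length for e in entries)
    return used_slots < slots and base >= ram_size and base + size <= limit
```

With two 2 GiB nodes and max_mem of 16 GiB, build_srat() yields two
static entries plus one hotpluggable entry from 4 GiB to 16 GiB, and
can_hotplug() rejects a dimm that would end above 16 GiB or that would
need more slots than were pre-allocated.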