From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Vrabel
Subject: Re: [PATCHv1] xen/balloon: disable memory hotplug in PV guests
Date: Wed, 18 Mar 2015 10:36:42 +0000
Message-ID: <5509553A.7030707@citrix.com>
References: <1425910200-17541-1-git-send-email-david.vrabel@citrix.com>
 <54FED833.1080609@citrix.com> <54FEF308.3090702@oracle.com>
 <5500544B.4020302@citrix.com> <550061F0.9030206@oracle.com>
 <55066B88.9040300@suse.com> <20150316100344.GV27971@olila.local.net-space.pl>
 <5506B115.8090804@suse.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1YYBLD-0001TT-5Q for xen-devel@lists.xenproject.org;
 Wed, 18 Mar 2015 10:36:47 +0000
In-Reply-To: <5506B115.8090804@suse.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Juergen Gross , Daniel Kiper
Cc: xen-devel@lists.xenproject.org, Boris Ostrovsky , David Vrabel
List-Id: xen-devel@lists.xenproject.org

On 16/03/15 10:31, Juergen Gross wrote:
> On 03/16/2015 11:03 AM, Daniel Kiper wrote:
>> On Mon, Mar 16, 2015 at 06:35:04AM +0100, Juergen Gross wrote:
>>> On 03/11/2015 04:40 PM, Boris Ostrovsky wrote:
>>>> On 03/11/2015 10:42 AM, David Vrabel wrote:
>>>>> On 10/03/15 13:35, Boris Ostrovsky wrote:
>>>>>> On 03/10/2015 07:40 AM, David Vrabel wrote:
>>>>>>> On 09/03/15 14:10, David Vrabel wrote:
>>>>>>>> Memory hotplug doesn't work with PV guests because:
>>>>>>>>
>>>>>>>> a) The p2m cannot be expanded to cover the new sections.
>>>>>>> Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
>>>>>>> linear virtual mapped sparse p2m list).
>>>>>>>
>>>>>>> This one would be non-trivial to fix. We'd need a sparse set of
>>>>>>> vm_area's for the p2m or similar.
>>>>>>>
>>>>>>>> b) add_memory() builds page tables for the new sections which means
>>>>>>>> the new pages must have valid p2m entries (or a BUG occurs).
>>>>>>> After some more testing this appears to be broken by:
>>>>>>>
>>>>>>> 25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions above
>>>>>>> the end of RAM as 1:1) included in 3.16.
>>>>>>>
>>>>>>> This one can be trivially fixed by setting the new sections in
>>>>>>> the p2m to INVALID_P2M_ENTRY before calling add_memory().
>>>>>> Have you tried 3.17? As I said yesterday, it worked for me (with 4.4
>>>>>> Xen).
>>>>> No. But there are three bugs that prevent it from working in 3.16+ so
>>>>> I'm really not sure how you had it working in a 3.17 PV guest.
>>>>
>>>> This is what I have:
>>>>
>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 cat /mnt/lab/bootstrap-x86_64/test_small.xm
>>>> extra="console=hvc0 debug earlyprintk=xen "
>>>> kernel="/mnt/lab/bootstrap-x86_64/vmlinuz"
>>>> ramdisk="/mnt/lab/bootstrap-x86_64/initramfs.cpio.gz"
>>>> memory=1024
>>>> maxmem = 4096
>>>> vcpus=1
>>>> maxvcpus=3
>>>> name="bootstrap-x86_64"
>>>> on_crash="preserve"
>>>> vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
>>>> vnc=1
>>>> vnclisten="0.0.0.0"
>>>> disk=['phy:/dev/guests/bootstrap-x86_64,xvda,w']
>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl create /mnt/lab/bootstrap-x86_64/test_small.xm
>>>> Parsing config from /mnt/lab/bootstrap-x86_64/test_small.xm
>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep bootstrap-x86_64
>>>> bootstrap-x86_64 2 1024 1 -b---- 5.4
>>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops uname -r
>>>> 3.17.0upstream
>>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops dmesg|grep paravirtualized
>>>> [ 0.000000] Booting paravirtualized kernel on Xen
>>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal /proc/meminfo
>>>> MemTotal: 968036 kB
>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl mem-set bootstrap-x86_64 2048
>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep bootstrap-x86_64
>>>> bootstrap-x86_64 2 2048 1 -b---- 5.7
>>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal /proc/meminfo
>>>> MemTotal: 2016612 kB
>>>> [build@build-mk2 linux-boris]$
>>>>
>>>>>
>>>>> Regardless, it definitely doesn't work now because of the linear p2m
>>>>> change. What do you want to do about this?
>>>>
>>>> Since backing out p2m changes is not an option I guess your patch is
>>>> the only short-term alternative.
>>>>
>>>> But this still looks like a regression so perhaps Juergen can take a
>>>> look to see how it can be fixed.
>>>
>>> Hmm, the p2m list is allocated for the maximum memory size of the domain
>>> which is obtained from the hypervisor. In case of Dom0 it is read via
>>> XENMEM_maximum_reservation, for a domU it is based on the E820 memory
>>> map read via XENMEM_memory_map.
>>>
>>> I just tested it with a 4.0-rc1 domU kernel with 512MB initial memory
>>> and 4GB of maxmem. The E820 map looked like this:
>>>
>>> [ 0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
>>> [ 0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
>>> [ 0.000000] Xen: [mem 0x0000000000100000-0x00000000ffffffff] usable
>>>
>>> So the complete 4GB were included, as they should be. The resulting p2m
>>> list is allocated in the needed size:
>>>
>>> [ 0.000000] p2m virtual area at ffffc90000000000, size is 800000
>>>
>>> So what is your problem here? Can you post the E820 map and the p2m map
>>> info for your failing domain, please?
>>
>> If you use memory hotplug then maxmem is not a limit from the guest
>> kernel's point of view (the host still must allow that operation but
>> that is a separate, unrelated issue). The problem is that the p2m must
>> be dynamically expandable to support it. The earlier implementation
>> supported that and memory hotplug worked without any issue.
>
> Okay, now I get it.
>
> The problem with the earlier p2m implementation was that it was
> expandable to support only up to 512GB of RAM. So we need some way to
> tell the kernel how much virtual memory it should reserve for the p2m
> list if memory hotplug is enabled. We could:
>
> a) use a configurable maximum (e.g. for 512GB RAM as today)

I would set the p2m virtual area to cover up to 512 GB (needs 1 GB of
virt space) for a 64-bit guest and up to 64 GB (needs 64 MB of virt
space) for a 32-bit guest.

This fixes the regression with minimal complexity and reasonable
overheads.

David
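
For reference, the virt space figures above can be checked with a few lines
of arithmetic. This is only a sketch and not kernel code: it assumes 4 KiB
pages and one p2m entry per page frame, sized as the guest's unsigned long
(8 bytes for 64-bit, 4 bytes for 32-bit). The name p2m_virt_bytes() is
made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB x86 pages

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def p2m_virt_bytes(max_ram_bytes, guest_bits):
    """Virtual address space needed for a linear p2m list (hypothetical helper).

    Assumes one entry per page frame, each entry the size of the guest's
    unsigned long: 8 bytes for a 64-bit guest, 4 bytes for a 32-bit guest.
    """
    if guest_bits not in (32, 64):
        raise ValueError("guest_bits must be 32 or 64")
    entry_size = guest_bits // 8
    nr_pages = max_ram_bytes // PAGE_SIZE
    return nr_pages * entry_size


if __name__ == "__main__":
    # 4 GB maxmem, 64-bit guest: matches the "size is 800000" line quoted above.
    print(hex(p2m_virt_bytes(4 * GIB, 64)))             # 0x800000
    # Proposed limits: 512 GB for 64-bit guests, 64 GB for 32-bit guests.
    print(p2m_virt_bytes(512 * GIB, 64) // GIB, "GiB")  # 1 GiB
    print(p2m_virt_bytes(64 * GIB, 32) // MIB, "MiB")   # 64 MiB
```

Under these assumptions the cost is 1/512 of the covered RAM for a 64-bit
guest and 1/1024 for a 32-bit guest, and it is only virtual address space
that is reserved up front.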