From: Juergen Gross <jgross@suse.com>
To: Daniel Kiper <daniel.kiper@oracle.com>
Cc: xen-devel@lists.xenproject.org,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	David Vrabel <david.vrabel@citrix.com>
Subject: Re: [PATCHv1] xen/balloon: disable memory hotplug in PV guests
Date: Mon, 16 Mar 2015 11:31:49 +0100
Message-ID: <5506B115.8090804@suse.com>
In-Reply-To: <20150316100344.GV27971@olila.local.net-space.pl>

On 03/16/2015 11:03 AM, Daniel Kiper wrote:
> On Mon, Mar 16, 2015 at 06:35:04AM +0100, Juergen Gross wrote:
>> On 03/11/2015 04:40 PM, Boris Ostrovsky wrote:
>>> On 03/11/2015 10:42 AM, David Vrabel wrote:
>>>> On 10/03/15 13:35, Boris Ostrovsky wrote:
>>>>> On 03/10/2015 07:40 AM, David Vrabel wrote:
>>>>>> On 09/03/15 14:10, David Vrabel wrote:
>>>>>>> Memory hotplug doesn't work with PV guests because:
>>>>>>>
>>>>>>>     a) The p2m cannot be expanded to cover the new sections.
>>>>>> Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
>>>>>> linear virtual mapped sparse p2m list).
>>>>>>
>>>>>> This one would be non-trivial to fix.  We'd need a sparse set of
>>>>>> vm_area's for the p2m or similar.
>>>>>>
>>>>>>>     b) add_memory() builds page tables for the new sections which means
>>>>>>>        the new pages must have valid p2m entries (or a BUG occurs).
>>>>>> After some more testing this appears to be broken by:
>>>>>>
>>>>>> 25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions above the
>>>>>> end of RAM as 1:1) included in 3.16.
>>>>>>
>>>>>> This one can be trivially fixed by setting the new sections in the p2m
>>>>>> to INVALID_P2M_ENTRY before calling add_memory().
>>>>> Have you tried 3.17? As I said yesterday, it worked for me (with 4.4
>>>>> Xen).
>>>> No.  But there are three bugs that prevent it from working in 3.16+ so
>>>> I'm really not sure how you had it working in a 3.17 PV guest.
>>>
>>> This is what I have:
>>>
>>> [build@build-mk2 linux-boris]$ ssh root@tst008 cat /mnt/lab/bootstrap-x86_64/test_small.xm
>>> extra="console=hvc0 debug earlyprintk=xen "
>>> kernel="/mnt/lab/bootstrap-x86_64/vmlinuz"
>>> ramdisk="/mnt/lab/bootstrap-x86_64/initramfs.cpio.gz"
>>> memory=1024
>>> maxmem = 4096
>>> vcpus=1
>>> maxvcpus=3
>>> name="bootstrap-x86_64"
>>> on_crash="preserve"
>>> vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
>>> vnc=1
>>> vnclisten="0.0.0.0"
>>> disk=['phy:/dev/guests/bootstrap-x86_64,xvda,w']
>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl create /mnt/lab/bootstrap-x86_64/test_small.xm
>>> Parsing config from /mnt/lab/bootstrap-x86_64/test_small.xm
>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep bootstrap-x86_64
>>> bootstrap-x86_64                             2  1024     1 -b----       5.4
>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops uname -r
>>> 3.17.0upstream
>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops dmesg|grep paravirtualized
>>> [    0.000000] Booting paravirtualized kernel on Xen
>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal /proc/meminfo
>>> MemTotal:         968036 kB
>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl mem-set bootstrap-x86_64 2048
>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep bootstrap-x86_64
>>> bootstrap-x86_64                             2  2048     1 -b----       5.7
>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal /proc/meminfo
>>> MemTotal:        2016612 kB
>>> [build@build-mk2 linux-boris]$
>>>
>>>
>>>>
>>>> Regardless, it definitely doesn't work now because of the linear p2m
>>>> change.  What do you want to do about this?
>>>
>>> Since backing out p2m changes is not an option I guess your patch is the
>>> only short-term alternative.
>>>
>>> But this still looks like a regression so perhaps Juergen can take a
>>> look to see how it can be fixed.
>>
>> Hmm, the p2m list is allocated for the maximum memory size of the domain
>> which is obtained from the hypervisor. In case of Dom0 it is read via
>> XENMEM_maximum_reservation, for a domU it is based on the E820 memory
>> map read via XENMEM_memory_map.
>>
>> I just tested it with a 4.0-rc1 domU kernel with 512MB initial memory
>> and 4GB of maxmem. The E820 map looked like this:
>>
>> [    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
>> [    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
>> [    0.000000] Xen: [mem 0x0000000000100000-0x00000000ffffffff] usable
>>
>> So the complete 4GB were included, as they should be. The resulting p2m
>> list is allocated in the needed size:
>>
>> [    0.000000] p2m virtual area at ffffc90000000000, size is 800000
>>
>> So what is your problem here? Can you post the E820 map and the p2m map
>> info for your failing domain, please?
>
> If you use memory hotplug then maxmem is not a limit from the guest kernel's
> point of view (the host must still allow that operation, but that is a
> separate, unrelated issue). The problem is that the p2m must be dynamically
> expandable to support it. The earlier implementation supported that and memory
> hotplug worked without any issue.

Okay, now I get it.

The problem with the earlier p2m implementation was that it could be
expanded to cover only up to 512GB of RAM. So we need some way to
tell the kernel how much virtual memory it should reserve for the p2m
list if memory hotplug is enabled. We could:

a) use a configurable maximum (e.g. for 512GB RAM as today)

b) use the maximum amount of RAM that the machine the domain is started
    on can ever have (what about migration then?)

c) use a kernel parameter specifying the maximum memory size to support

d) a combination of some of the above possibilities

Any thoughts? I think I'd prefer b)+c).
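
To get a feeling for how much virtual address space each option would
reserve, here is a rough sketch of the arithmetic. It assumes 4 KiB pages
and one 8-byte entry per pfn in the linear p2m list; both values are
assumptions, though they agree with the "size is 800000" line quoted above
for a 4GB E820 map. The function names and the b)+c) policy (a kernel
parameter, when given, overrides the host maximum) are hypothetical and
only illustrate one possible combination.

```python
PAGE_SIZE = 4096       # assumption: 4 KiB pages (x86)
P2M_ENTRY_SIZE = 8     # assumption: one unsigned long per pfn on 64-bit
GiB = 1 << 30


def p2m_list_bytes(max_mem_bytes):
    """Virtual space needed for a linear p2m list covering max_mem_bytes."""
    max_pfn = -(-max_mem_bytes // PAGE_SIZE)  # round up to whole pages
    return max_pfn * P2M_ENTRY_SIZE


def p2m_limit_bytes(host_max_ram, cmdline_max=None):
    """Hypothetical b)+c) policy: a kernel parameter overrides the host maximum."""
    return cmdline_max if cmdline_max is not None else host_max_ram


if __name__ == "__main__":
    # 4GB as in the E820 map quoted above
    print(hex(p2m_list_bytes(4 * GiB)))        # 0x800000
    # 512GB, the limit of the earlier p2m implementation
    print(p2m_list_bytes(512 * GiB) // GiB)    # 1
```

Under these assumptions even a 512GB limit costs only 1GB of virtual
address space, so the choice between the options is mostly about where the
limit comes from rather than about the reservation size.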


Juergen
