From: Juergen Gross <jgross@suse.com>
To: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: Increasing domain memory beyond initial maxmem
Date: Thu, 31 Mar 2022 14:22:03 +0200
Message-ID: <362b6115-e296-e01e-520f-31a0826426eb@suse.com>
In-Reply-To: <YkWYGFJ/Cl+B2C37@mail-itl>

On 31.03.22 14:01, Marek Marczykowski-Górecki wrote:
> On Thu, Mar 31, 2022 at 08:41:19AM +0200, Juergen Gross wrote:
>> On 31.03.22 05:51, Marek Marczykowski-Górecki wrote:
>>> Hi,
>>>
>>> I'm trying to make use of CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y to increase
>>> domain memory beyond initial maxmem, but I hit a few issues.
>>>
>>> A little context: domains in Qubes OS start with rather little memory
>>> (400MB by default) but maxmem set higher (4GB by default). Then, there is
>>> the qmemman daemon, which adjusts balloon targets for domains, based on (among
>>> other things) demand reported by the domains themselves. There is also a
>>> little swap, to mitigate qmemman latency (a few hundred ms at worst).
>>> This initial memory < maxmem in case of PVH / HVM makes use of PoD,
>>> which I'm trying to get rid of. But also, IIUC Linux will waste some
>>> memory for bookkeeping based on maxmem, not actually usable memory.
>>>
>>> First issue: after using `xl mem-max`, `xl mem-set` still refuses to
>>> increase memory more than initial maxmem. That's because xl mem-max does
>>> not update 'memory/static-max' xenstore node. This one is easy to work
>>> around.
>>>
>>> Then, the actual hotplug fails on the domU side with:
>>>
>>> [ 50.004734] xen-balloon: vmemmap alloc failure: order:9, mode:0x4cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL), nodemask=(null),cpuset=/,mems_allowed=0
>>> [ 50.004774] CPU: 1 PID: 34 Comm: xen-balloon Not tainted 5.16.15-1.37.fc32.qubes.x86_64 #1
>>> [ 50.004792] Call Trace:
>>> [ 50.004799] <TASK>
>>> [ 50.004808] dump_stack_lvl+0x48/0x5e
>>> [ 50.004821] warn_alloc+0x162/0x190
>>> [ 50.004832] ? __alloc_pages+0x1fa/0x230
>>> [ 50.004842] vmemmap_alloc_block+0x11c/0x1c5
>>> [ 50.004856] vmemmap_populate_hugepages+0x185/0x519
>>> [ 50.004868] vmemmap_populate+0x9e/0x16c
>>> [ 50.004878] __populate_section_memmap+0x6a/0xb1
>>> [ 50.004890] section_activate+0x20a/0x278
>>> [ 50.004901] sparse_add_section+0x70/0x160
>>> [ 50.004911] __add_pages+0xc3/0x150
>>> [ 50.004921] add_pages+0x12/0x60
>>> [ 50.004931] add_memory_resource+0x12b/0x320
>>> [ 50.004943] reserve_additional_memory+0x10c/0x150
>>> [ 50.004958] balloon_thread+0x206/0x360
>>> [ 50.004968] ? do_wait_intr_irq+0xa0/0xa0
>>> [ 50.004978] ? decrease_reservation.constprop.0+0x2e0/0x2e0
>>> [ 50.004991] kthread+0x16b/0x190
>>> [ 50.005001] ? set_kthread_struct+0x40/0x40
>>> [ 50.005011] ret_from_fork+0x22/0x30
>>> [ 50.005022] </TASK>
>>>
>>> Full dmesg: https://gist.github.com/marmarek/72dd1f9dbdd63cfe479c94a3f4392b45
>>>
>>> After the above, `free` reports correct size (1GB in this case), but
>>> that memory seems to be unusable really. "used" is kept low, and soon
>>> OOM-killer kicks in.
>>>
>>> I know the initial 400MB is not much for a full Linux, with X11 etc. But
>>> I wouldn't expect it to fail this way when _adding_ memory.
>>>
>>> I've tried also with initial 800MB. In this case, I do not get "alloc
>>> failure" any more, but monitoring `free`, the extra memory still doesn't
>>> seem to be used.
>>>
>>> Any ideas?
>>>
>>
>> I can't reproduce that.
>>
>> I started a guest with 8GB of memory, in the guest I'm seeing:
>>
>> # uname -a
>> Linux linux-d1cy 5.17.0-rc5-default+ #406 SMP PREEMPT Mon Feb 21 09:31:12 CET 2022 x86_64 x86_64 x86_64 GNU/Linux
>> # free
>>               total        used        free      shared  buff/cache   available
>> Mem:        8178260       71628     8023300        8560       83332     8010196
>> Swap:       2097132           0     2097132
>>
>> Then I'm raising the memory for the guest in dom0:
>>
>> # xl list
>> Name        ID    Mem VCPUs  State   Time(s)
>> Domain-0     0   2634     8  r-----   1016.5
>> Xenstore     1     31     1  -b----      0.9
>> sle15sp1     3   8190     6  -b----    184.6
>> # xl mem-max 3 10000
>> # xenstore-write /local/domain/3/memory/static-max 10240000
>> # xl mem-set 3 10000
>> # xl list
>> Name        ID    Mem VCPUs  State   Time(s)
>> Domain-0     0   2634     8  r-----   1018.5
>> Xenstore     1     31     1  -b----      1.0
>> sle15sp1     3  10000     6  -b----    186.7
>>
>> In the guest I get now:
>>
>> # free
>>               total        used        free      shared  buff/cache   available
>> Mem:       10031700      110904     9734172        8560      186624     9814344
>> Swap:       2097132           0     2097132
>>
>> And after using lots of memory via a ramdisk:
>>
>> # free
>>               total        used        free      shared  buff/cache   available
>> Mem:       10031700      116660     1663840     7181776     8251200     2635372
>> Swap:       2097132           0     2097132
>>
>> You can see buff/cache is now larger than the initial total memory
>> and free is lower than the added memory amount.
>
> Hmm, I have a different behavior:
>
> I'm starting with 800M
>
> # uname -a
> Linux personal 5.16.15-1.37.fc32.qubes.x86_64 #1 SMP PREEMPT Tue Mar 22 12:59:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
> # free -m
>               total        used        free      shared  buff/cache   available
> Mem:            740         209         278           2         252         415
> Swap:          1023           0        1023
>
> Then raising to ~2GB:
>
> [root@dom0 ~]# xl list
> Name        ID   Mem VCPUs  State    Time(s)
> Domain-0     0  4082     6  r-----  184271.3
> (...)
> personal    21   800     2  -b----       4.8
> [root@dom0 ~]# xl mem-max personal 2048
> [root@dom0 ~]# xenstore-write /local/domain/$(xl domid personal)/memory/static-max $((2048*1024))
> [root@dom0 ~]# xl mem-set personal 2000
> [root@dom0 ~]# xenstore-ls -fp /local/domain/$(xl domid personal)/memory
> /local/domain/21/memory/static-max = "2097152" (n0,r21)
> /local/domain/21/memory/target = "2048001" (n0,r21)
> /local/domain/21/memory/videoram = "-1" (n0,r21)
>
> And then observe inside domU:
> [root@personal ~]# free -m
>               total        used        free      shared  buff/cache   available
> Mem:           1940         235        1452           2         252        1585
> Swap:          1023           0        1023
>
> So far so good. But when trying to actually use it, it doesn't work:
>
> [root@personal ~]# free -m
>               total        used        free      shared  buff/cache   available
> Mem:           1940         196        1240         454         503        1206
> Swap:          1023         472         551
>
> As you can see, all the new memory is still in "free", and swap is used
> instead.

Hmm, weird.

Maybe some kernel config differences, or other udev rules (memory onlining
is done via udev in my guest)?

I'm seeing:

# zgrep MEMORY_HOTPLUG /proc/config.gz
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG=y
# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y
CONFIG_XEN_MEMORY_HOTPLUG_LIMIT=512
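
With CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE not set, hot-added memory blocks stay offline until something onlines them, so it may be worth checking the block states in the guest. A minimal sketch, assuming the usual sysfs layout under /sys/devices/system/memory; the function name count_memory_blocks and its optional directory argument are hypothetical and only there for illustration:

```sh
#!/bin/sh
# Count memory blocks by state. The directory argument is an assumption of
# this sketch; it defaults to the standard sysfs location.
count_memory_blocks() {
    memdir="${1:-/sys/devices/system/memory}"
    online=0
    offline=0
    for state_file in "$memdir"/memory*/state; do
        # Skip the unexpanded glob or unreadable entries.
        [ -r "$state_file" ] || continue
        case "$(cat "$state_file")" in
            online)  online=$((online + 1)) ;;
            offline) offline=$((offline + 1)) ;;
        esac
    done
    echo "online=$online offline=$offline"
}
```

Calling count_memory_blocks without arguments in the guest prints the two counters. A non-zero offline count after `xl mem-set` would suggest the new blocks were added but never onlined. The default onlining policy can also be read from /sys/devices/system/memory/auto_online_blocks.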

The relevant udev rule seems to be:

SUBSYSTEM=="memory", ACTION=="add", PROGRAM=="/bin/sh -c '/usr/bin/systemd-detect-virt || :'", RESULT!="zvm", ATTR{state}=="offline", \
  ATTR{state}="online"
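
In a guest without such a rule, its effect can be approximated by hand by writing "online" to every block that is still offline. A minimal sketch, assuming the usual sysfs layout; online_offline_blocks and its directory argument are hypothetical names, and whether this changes anything in the Qubes guest has not been verified here:

```sh
#!/bin/sh
# Online every memory block whose state is "offline". The directory argument
# is an assumption of this sketch; it defaults to the standard sysfs location.
online_offline_blocks() {
    memdir="${1:-/sys/devices/system/memory}"
    for state_file in "$memdir"/memory*/state; do
        # Skip the unexpanded glob or entries we cannot write to.
        [ -w "$state_file" ] || continue
        if [ "$(cat "$state_file")" = "offline" ]; then
            # The write can be rejected by the kernel; report only successes.
            echo online > "$state_file" && echo "onlined ${state_file%/state}"
        fi
    done
}
```

This needs to run as root inside the guest. It prints one line per block that was switched to online.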

What type of guest are you using? Mine was a PVH guest.

> There is also /proc/meminfo (state before filling ramdisk), if that
> would give some hints:
> [root@personal ~]# cat /proc/meminfo
...

No, I don't think this is helping. At least not me.


Juergen