From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: Juergen Gross <jgross@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: Increasing domain memory beyond initial maxmem
Date: Thu, 31 Mar 2022 14:01:28 +0200
Message-ID: <YkWYGFJ/Cl+B2C37@mail-itl>
In-Reply-To: <2684376b-3ae6-a2b7-581f-2bd38ab6056b@suse.com>


On Thu, Mar 31, 2022 at 08:41:19AM +0200, Juergen Gross wrote:
> On 31.03.22 05:51, Marek Marczykowski-Górecki wrote:
> > Hi,
> > 
> > I'm trying to make use of CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y to increase
> > domain memory beyond initial maxmem, but I hit a few issues.
> > 
> > A little context: domains in Qubes OS start with rather little memory
> > (400MB by default) but maxmem set higher (4GB by default). Then, there is
> > the qmemman daemon, which adjusts balloon targets for domains, based on (among
> > other things) demand reported by the domains themselves. There is also a
> > little swap, to mitigate qmemman latency (a few hundred ms at worst).
> > This initial memory < maxmem in case of PVH / HVM makes use of PoD
> > which I'm trying to get rid of. But also, IIUC Linux will waste some
> > memory for bookkeeping based on maxmem, not actually usable memory.
> > 
> > First issue: after using `xl mem-max`, `xl mem-set` still refuses to
> > increase memory beyond the initial maxmem. That's because xl mem-max does
> > not update the 'memory/static-max' xenstore node. This one is easy to work
> > around.
> > 
> > Then, the actual hotplug fails on the domU side with:
> > 
> > [   50.004734] xen-balloon: vmemmap alloc failure: order:9, mode:0x4cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL), nodemask=(null),cpuset=/,mems_allowed=0
> > [   50.004774] CPU: 1 PID: 34 Comm: xen-balloon Not tainted 5.16.15-1.37.fc32.qubes.x86_64 #1
> > [   50.004792] Call Trace:
> > [   50.004799]  <TASK>
> > [   50.004808]  dump_stack_lvl+0x48/0x5e
> > [   50.004821]  warn_alloc+0x162/0x190
> > [   50.004832]  ? __alloc_pages+0x1fa/0x230
> > [   50.004842]  vmemmap_alloc_block+0x11c/0x1c5
> > [   50.004856]  vmemmap_populate_hugepages+0x185/0x519
> > [   50.004868]  vmemmap_populate+0x9e/0x16c
> > [   50.004878]  __populate_section_memmap+0x6a/0xb1
> > [   50.004890]  section_activate+0x20a/0x278
> > [   50.004901]  sparse_add_section+0x70/0x160
> > [   50.004911]  __add_pages+0xc3/0x150
> > [   50.004921]  add_pages+0x12/0x60
> > [   50.004931]  add_memory_resource+0x12b/0x320
> > [   50.004943]  reserve_additional_memory+0x10c/0x150
> > [   50.004958]  balloon_thread+0x206/0x360
> > [   50.004968]  ? do_wait_intr_irq+0xa0/0xa0
> > [   50.004978]  ? decrease_reservation.constprop.0+0x2e0/0x2e0
> > [   50.004991]  kthread+0x16b/0x190
> > [   50.005001]  ? set_kthread_struct+0x40/0x40
> > [   50.005011]  ret_from_fork+0x22/0x30
> > [   50.005022]  </TASK>
> > 
> > Full dmesg: https://gist.github.com/marmarek/72dd1f9dbdd63cfe479c94a3f4392b45
> > 
> > After the above, `free` reports correct size (1GB in this case), but
> > that memory seems to be unusable really. "used" is kept low, and soon
> > OOM-killer kicks in.
> > 
> > I know the initial 400MB is not much for a full Linux, with X11 etc. But
> > I wouldn't expect it to fail this way when _adding_ memory.
> > 
> > I've tried also with initial 800MB. In this case, I do not get "alloc
> > failure" any more, but monitoring `free`, the extra memory still doesn't
> > seem to be used.
> > 
> > Any ideas?
> > 
> 
> I can't reproduce that.
> 
> I started a guest with 8GB of memory, in the guest I'm seeing:
> 
> # uname -a
> Linux linux-d1cy 5.17.0-rc5-default+ #406 SMP PREEMPT Mon Feb 21 09:31:12 CET 2022 x86_64 x86_64 x86_64 GNU/Linux
> # free
>         total     used      free   shared  buff/cache   available
> Mem:  8178260    71628   8023300     8560       83332     8010196
> Swap: 2097132        0   2097132
> 
> Then I'm raising the memory for the guest in dom0:
> 
> # xl list
> Name                ID   Mem VCPUs      State   Time(s)
> Domain-0             0  2634     8     r-----    1016.5
> Xenstore             1    31     1     -b----       0.9
> sle15sp1             3  8190     6     -b----     184.6
> # xl mem-max 3 10000
> # xenstore-write /local/domain/3/memory/static-max 10240000
> # xl mem-set 3 10000
> # xl list
> Name                ID   Mem VCPUs      State   Time(s)
> Domain-0             0  2634     8     r-----    1018.5
> Xenstore             1    31     1     -b----       1.0
> sle15sp1             3 10000     6     -b----     186.7
> 
> In the guest I get now:
> 
> # free
>         total     used     free   shared  buff/cache   available
> Mem: 10031700   110904  9734172     8560      186624     9814344
> Swap: 2097132        0  2097132
> 
> And after using lots of memory via a ramdisk:
> 
> # free
>         total     used     free   shared  buff/cache   available
> Mem: 10031700   116660  1663840  7181776     8251200     2635372
> Swap: 2097132        0  2097132
> 
> You can see buff/cache is now larger than the initial total memory
> and free is lower than the added memory amount.

Hmm, I see different behavior:

I'm starting with 800MB:

# uname -a
Linux personal 5.16.15-1.37.fc32.qubes.x86_64 #1 SMP PREEMPT Tue Mar 22 12:59:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# free -m
              total        used        free      shared  buff/cache   available
Mem:            740         209         278           2         252         415
Swap:          1023           0        1023

Then raising to ~2GB:

[root@dom0 ~]# xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  4082     6     r-----  184271.3
(...)
personal                                    21   800     2     -b----       4.8
[root@dom0 ~]# xl mem-max personal 2048
[root@dom0 ~]# xenstore-write /local/domain/$(xl domid personal)/memory/static-max $((2048*1024))
[root@dom0 ~]# xl mem-set personal 2000
[root@dom0 ~]# xenstore-ls -fp /local/domain/$(xl domid personal)/memory
/local/domain/21/memory/static-max = "2097152"   (n0,r21)
/local/domain/21/memory/target = "2048001"   (n0,r21)
/local/domain/21/memory/videoram = "-1"   (n0,r21)

And then I observe inside domU:
[root@personal ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           1940         235        1452           2         252        1585
Swap:          1023           0        1023

So far so good. But when I try to actually use it, it doesn't work:

[root@personal ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           1940         196        1240         454         503        1206
Swap:          1023         472         551

As you can see, all the new memory is still in "free", and swap is used
instead.


Here is also /proc/meminfo (state before filling the ramdisk), in case
it gives some hints:
[root@personal ~]# cat /proc/meminfo
MemTotal:        1986800 kB
MemFree:         1487116 kB
MemAvailable:    1624060 kB
Buffers:           26236 kB
Cached:           207268 kB
SwapCached:            0 kB
Active:            74828 kB
Inactive:         258724 kB
Active(anon):       1008 kB
Inactive(anon):   101668 kB
Active(file):      73820 kB
Inactive(file):   157056 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       1048572 kB
SwapFree:        1048572 kB
Dirty:               216 kB
Writeback:             0 kB
AnonPages:        100184 kB
Mapped:           117472 kB
Shmem:              2628 kB
KReclaimable:      24960 kB
Slab:              52136 kB
SReclaimable:      24960 kB
SUnreclaim:        27176 kB
KernelStack:        3120 kB
PageTables:         4364 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     2041972 kB
Committed_AS:     825816 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       10064 kB
VmallocChunk:          0 kB
Percpu:             1240 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:       79872 kB
DirectMap2M:     1132544 kB
DirectMap1G:     1048576 kB


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

