From: "Michael S. Tsirkin" <mst@redhat.com>
To: Tyler Sanderson <tysand@google.com>
Cc: virtualization@lists.linux-foundation.org
Subject: Re: VIRTIO_BALLOON_F_FREE_PAGE_HINT
Date: Thu, 3 Oct 2019 14:31:44 -0400
Message-ID: <20191003142854-mutt-send-email-mst@kernel.org>
In-Reply-To: <CAJuQAmpQV26kb9vTyoW-Q7PsD0SOfX+otkiQZAks1L6k7rgdig@mail.gmail.com>
On Thu, Oct 03, 2019 at 11:27:46AM -0700, Tyler Sanderson wrote:
> Sorry for the slow reply, I did some verification on my end. See responses
> inline.
>
> On Mon, Sep 16, 2019 at 12:26 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 16.09.19 03:41, Wei Wang wrote:
> > On 09/14/2019 02:36 AM, Tyler Sanderson wrote:
> >> Hello, I'm curious about the intent of VIRTIO_BALLOON_F_FREE_PAGE_HINT
> >> (commit <https://github.com/torvalds/linux/commit/86a559787e6f5cf662c081363f64a20cad654195#diff-fd202acf694d9eba19c8c64da3e480c9>).
> >>
> >>
> >> My understanding is that this mechanism works similarly to the
> >> existing inflate/deflate queues. Pages are allocated by the guest and
> >> then reported on VQ_FREE_PAGE.
> >>
> >> Question: Is there a limit to how many pages will be allocated? What
> >> controls the amount of memory pressure applied?
> >
> > No control for the limit currently. The implementation reports all the
> > guest free pages to host.
> > The main usage for this feature so far is to have guest skip sending
> > those guest free pages
> > (the more, the better) during live migration.
>
> How does this differ from the regular inflate/deflate queue?
> Also, couldn't you simply skip sending pages that do not have host pages
> backing them (assuming pages added to the balloon are unbacked to reclaim the
> memory)?
Yes, but putting most guest memory into the balloon would
slow the guest down significantly.
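
To illustrate the hinting use case mentioned above (skipping guest free
pages during live migration), here is a minimal toy sketch. It is not code
from QEMU or the kernel: pages are plain integers and the function name
pages_to_send is hypothetical. It only shows that hinted pages drop out of
the set a migration pass has to transfer.

```python
# Toy model only: pages are integers and all names here are hypothetical.

def pages_to_send(dirty_pages, free_hints):
    """Return the pages a migration pass would still need to transfer.

    dirty_pages: iterable of guest page frame numbers marked for sending
    free_hints:  page frame numbers the guest reported as free
    """
    hinted = set(free_hints)
    # A hinted page holds no data the destination needs, so skip it.
    return [pfn for pfn in dirty_pages if pfn not in hinted]


guest_pages = range(8)      # 8 guest pages, all marked for sending at start
free_hints = [2, 3, 6]      # guest reports these as free
print(pages_to_send(guest_pages, free_hints))   # [0, 1, 4, 5, 7]
```

The more pages the guest can hint, the shorter the list becomes, which is
the "the more, the better" point quoted above.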
>
> >
> >
> >>
> >> In my experience with virtio balloon there are problems with the
> >> mechanisms that are supposed to deflate the balloon in response to
> >> memory pressure (e.g. OOM notifier).
> >
> > What problem did you see? We've also changed balloon to use memory shrinker,
> > did you see the problem with shrinker as well?
>
> Yes, I've observed problems both before and after the shrinker change (although
> different problems).
> Before the shrinker change, the overcommit accounting feature gets in the way
> and prevents allocations, even when the balloon could be deflated. The OOM
> notifier is never invoked so the balloon driver's hook into the OOM notifier is
> useless.
> After the shrinker change the overcommit accounting problem is fixed, but I
> have still found that forcibly deflating the balloon under memory pressure is
> slow enough that random allocations can still fail (is there a timeout for
> allocations?).
> For example, I've seen:
> tysand@vm ~ $ fallocate -l 5G d/foo // d is a tmpfs mount; this command forces the balloon to deflate
> tysand@vm ~ $ grep Mem /proc/meminfo
> MemTotal: 8172852 kB
> MemFree: 138932 kB
> MemAvailable: 83428 kB
> tysand@vm ~ $ grep Mem /proc/meminfo
> free(): invalid pointer
> -bash: wait_for: No record of process 5415
> free(): invalid pointer
>
> Or similarly, I've seen SSH terminate with:
> tysand@vm:~$ grep Mem /proc/meminfo
> *** stack smashing detected ***: <unknown> terminated
>
> Presumably the stack smashing and "free(): invalid pointer" are caused by
> malloc returning null in those programs and the programs not handling it
> correctly.
>
> Notably I don't see the fallocate command fail. Usually only other processes.
>
>
> >
> >>
> >> It seems an ideal balloon interface would allow the guest to round
> >> robin through free guest physical pages, allowing the host to unback
> >> them, but never having more than a few pages allocated to the balloon
> >> at any one time. For example:
> >> 1. Guest allocates 1 page and notifies balloon device of this page's
> >> address.
> >> 2. Host unbacks the received page.
> >> 3. Guest frees the page.
> >> 4. Repeat at #1, but ensure that different pages are allocated each time.
> >
> > Probably you need a mechanism to "ensure" different pages to be allocated.
> > The current implementation (having balloon hold the allocated pages) could
> > be thought of as one mechanism (it is simple).
> >
> >>
> >> This way the "balloon size" is never more than a few pages and does
> >> not create memory pressure. However the difficulty is in ensuring each
> >> set of sent pages is disjoint from previously sent pages. Is there a
> >> mechanism to round-robin allocations through all of guest physical
> >> memory? Does VIRTIO_BALLOON_F_FREE_PAGE_HINT enable this?
>
> There are use cases where you really want memory pressure (page cache is
> the prime example). Anyhow, I think the use case you want the
> "round-robin allocations" for is better tackled by "free page reporting"
> (used to be called "free page hinting") currently discussed on various
> lists.
>
> "allowing the host to unback them, but never having more than a few
> pages allocated to the balloon at any one time." is similar to what
> "free page reporting" does. We decided to only report bigger pages
> (avoid splitting up THP in the hypervisor, overhead) and only
> temporarily pull out a fixed amount of pages (16) from the page
> allocator to avoid false-OOM. Guaranteeing forward progress (similar to
> what you describe) is one important key concept.
>
>
> I'm really excited to see this being pursued! It looks like things are actively
> moving forward.
>
>
>
> --
>
> Thanks,
>
> David / dhildenb
>
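
The round-robin scheme quoted above (report a few pages, let the host
unback them, return them, never pick the same page twice) can be sketched
as a small simulation. This is a toy model under stated assumptions, not
the kernel's free page reporting code: report_free_pages and the unback
callback are hypothetical names, pages are plain integers, and the batch
size of 16 is the figure quoted above.

```python
from collections import deque

BATCH = 16  # pages held out of the allocator at once (figure quoted above)


def report_free_pages(free_pages, unback, batch=BATCH):
    """Report every page in free_pages once, holding at most `batch` at a time.

    free_pages: iterable of page frame numbers that are currently free
    unback:     callback standing in for the host dropping backing memory
    Returns the peak number of pages held out of the allocator.
    """
    # Pages not yet shown to the host. Popping a page from this queue
    # models flagging it as "reported", so it is never picked again.
    unreported = deque(free_pages)
    peak = 0
    while unreported:
        # Step 1: pull a bounded batch of not-yet-reported pages.
        held = [unreported.popleft()
                for _ in range(min(batch, len(unreported)))]
        peak = max(peak, len(held))
        # Step 2: the host drops the backing memory for this batch.
        unback(held)
        # Step 3: the batch goes back to the allocator when `held` is dropped.
    return peak


unbacked = []
peak = report_free_pages(range(40), unbacked.extend)
print(peak, len(unbacked), len(set(unbacked)))   # 16 40 40
```

Every pass removes pages from the unreported queue, so the loop always
makes forward progress, and the amount of memory withheld from the guest
never exceeds one batch.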