All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: David Stevens <stevensd@chromium.org>
Cc: virtualization@lists.linux-foundation.org
Subject: Re: [RFC PATCH] virtio_balloon: add param to skip adjusting pages
Date: Thu, 18 Nov 2021 01:25:09 -0500	[thread overview]
Message-ID: <20211118012455-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CAD=HUj6t6jqzQAP++wRgqmteLHkOimE6YzSygq4Z6Qg0dBmcPA@mail.gmail.com>

On Thu, Nov 18, 2021 at 10:25:45AM +0900, David Stevens wrote:
> On Wed, Nov 17, 2021 at 10:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Nov 17, 2021 at 07:06:34PM +0900, David Stevens wrote:
> > > From: David Stevens <stevensd@chromium.org>
> > >
> > > Add a module parameters to virtio_balloon to allow specifying whether or
> > > not the driver should call adjust_managed_page_count. If the parameter
> > > is set, it overrides the default behavior inferred from the deflate on
> > > OOM flag. This allows the balloon to operate without changing the amount
> > > of memory visible to userspace via /proc/meminfo or sysinfo, even on a
> > > system that cannot set the default on OOM flag.
> > >
> > > The motivation for this patch is to allow userspace to more accurately
> > > take advantage of virtio_balloon's cooperative memory control on a
> > > system without the ability to use deflate on OOM. As it stands,
> > > userspace has no way to know how much memory may be available on such a
> > > system, which makes tasks such as sizing caches impossible.
> > >
> > > When deflate on OOM is not enabled, the current behavior of the
> > > virtio_balloon more or less resembles hotplugging individual pages, at
> > > least from an accounting perspective. This is basically hardcoding the
> > > requirement that totalram_pages must be available to the guest
> > > immediately, regardless of what the host does. While that is a valid
> > > policy, on Linux (which supports memory overcommit) with virtio_balloon
> > > (which is designed to facilitate overcommit in the host), it is not the
> > > only possible policy.
> > >
> > > The param added by this patch allows the guest to operate under the
> > > assumption that pages in the virtio_balloon will generally be made
> > > available when needed. This assumption may not always hold, but when it
> > > is violated, the guest will just fall back to the normal mechanisms for
> > > dealing with overcommitted memory.
> > >
> > > Signed-off-by: David Stevens <stevensd@chromium.org>
> > > ---
> > >
> > > Another option to achieve similar behavior would be to add a new feature
> > > flag that would be used in conjunction with DEFLATE_ON_OOM to determine
> > > whether or not adjust_managed_page_count should be called. However, I
> > > feel that this sort of policy decision is a little outside the scope of
> > > what the virtio_balloon API deals with.
> > >
> > > ---
> > >  drivers/virtio/virtio_balloon.c | 23 ++++++++++++++++++-----
> > >  1 file changed, 18 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > > index c22ff0117b46..17dac286899c 100644
> > > --- a/drivers/virtio/virtio_balloon.c
> > > +++ b/drivers/virtio/virtio_balloon.c
> > > @@ -133,6 +133,21 @@ static const struct virtio_device_id id_table[] = {
> > >       { 0 },
> > >  };
> > >
> > > +static char *adjust_pages = "";
> > > +module_param(adjust_pages, charp, 0);
> > > +MODULE_PARM_DESC(adjust_pages, "Whether or not pages in the balloon should be removed from the managed page count");
> > > +
> > > +static bool should_adjust_pages(struct virtio_balloon *vb)
> > > +{
> > > +     if (!strcmp(adjust_pages, "always"))
> > > +             return true;
> > > +     else if (!strcmp(adjust_pages, "never"))
> > > +             return false;
> > > +
> > > +     return !virtio_has_feature(vb->vdev,
> > > +                                VIRTIO_BALLOON_F_DEFLATE_ON_OOM);
> > > +}
> > > +
> > >  static u32 page_to_balloon_pfn(struct page *page)
> > >  {
> > >       unsigned long pfn = page_to_pfn(page);
> > > @@ -243,8 +258,7 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> > >
> > >               set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> > >               vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
> > > -             if (!virtio_has_feature(vb->vdev,
> > > -                                     VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> > > +             if (should_adjust_pages(vb))
> > >                       adjust_managed_page_count(page, -1);
> > >               vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE;
> > >       }
> > > @@ -264,8 +278,7 @@ static void release_pages_balloon(struct virtio_balloon *vb,
> > >       struct page *page, *next;
> > >
> > >       list_for_each_entry_safe(page, next, pages, lru) {
> > > -             if (!virtio_has_feature(vb->vdev,
> > > -                                     VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> > > +             if (should_adjust_pages(vb))
> > >                       adjust_managed_page_count(page, 1);
> > >               list_del(&page->lru);
> > >               put_page(page); /* balloon reference */
> > > @@ -775,7 +788,7 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
> > >         * managed page count when inflating, we have to fixup the count of
> > >         * both involved zones.
> > >         */
> > > -     if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM) &&
> > > +     if (should_adjust_pages(vb) &&
> > >           page_zone(page) != page_zone(newpage)) {
> > >               adjust_managed_page_count(page, 1);
> > >               adjust_managed_page_count(newpage, -1);
> >
> > A problem here is that the host also cares: IIUC
> > with VIRTIO_BALLOON_F_STATS_VQ we are sending the info
> > about page counts to host, changing what the stats
> > mean.
> >
> > So I suspect we need a device feature for this at least
> > to control this aspect.
> >
> 
> The only stat this would affect is VIRTIO_BALLOON_S_MEMTOT, I think.
> If that's not supposed to include memory held by the balloon, then we
> can just subtract the balloon's size from i.totalram in
> update_balloon_stats if should_adjust_pages() is true but
> VIRTIO_BALLOON_F_DEFLATE_ON_OOM is not set. That should preserve the
> current behavior.

OK, that sounds reasonable.

> >
> > > --
> > > 2.34.0.rc2.393.gf8c9666880-goog
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

      parent reply	other threads:[~2021-11-18  6:25 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20211117100634.3012869-1-stevensd@google.com>
2021-11-17 13:32 ` [RFC PATCH] virtio_balloon: add param to skip adjusting pages Michael S. Tsirkin
2021-11-17 13:36   ` Michael S. Tsirkin
     [not found]   ` <CAD=HUj6t6jqzQAP++wRgqmteLHkOimE6YzSygq4Z6Qg0dBmcPA@mail.gmail.com>
2021-11-18  6:25     ` Michael S. Tsirkin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211118012455-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=stevensd@chromium.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.