From: "Michael S. Tsirkin" <mst@redhat.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: mhocko@kernel.org, wei.w.wang@intel.com,
virtualization@lists.linux-foundation.org, linux-mm@kvack.org,
rmaksudova@parallels.com, den@openvz.org
Subject: Re: [PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()
Date: Thu, 19 Oct 2017 16:00:16 +0300 [thread overview]
Message-ID: <20171019155312-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <201710192052.JCE26064.OFtOLSFJVFQOMH@I-love.SAKURA.ne.jp>
On Thu, Oct 19, 2017 at 08:52:20PM +0900, Tetsuo Handa wrote:
> Michael S. Tsirkin wrote:
> > On Wed, Oct 18, 2017 at 07:59:23PM +0900, Tetsuo Handa wrote:
> > > Do you see anything wrong with the patch I used for emulating
> > > VIRTIO_BALLOON_F_DEFLATE_ON_OOM path (shown below) ?
> > >
> > > ----------------------------------------
> > > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > > index f0b3a0b..a679ac2 100644
> > > --- a/drivers/virtio/virtio_balloon.c
> > > +++ b/drivers/virtio/virtio_balloon.c
> > > @@ -164,7 +164,7 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> > > }
> > > set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> > > vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
> > > - if (!virtio_has_feature(vb->vdev,
> > > + if (virtio_has_feature(vb->vdev,
> > > VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> > > adjust_managed_page_count(page, -1);
> > > }
> > > @@ -184,7 +184,7 @@ static void release_pages_balloon(struct virtio_balloon *vb,
> > > struct page *page, *next;
> > >
> > > list_for_each_entry_safe(page, next, pages, lru) {
> > > - if (!virtio_has_feature(vb->vdev,
> > > + if (virtio_has_feature(vb->vdev,
> > > VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> > > adjust_managed_page_count(page, 1);
> > > list_del(&page->lru);
> > > @@ -363,7 +363,7 @@ static int virtballoon_oom_notify(struct notifier_block *self,
> > > unsigned num_freed_pages;
> > >
> > > vb = container_of(self, struct virtio_balloon, nb);
> > > - if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> > > + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> > > return NOTIFY_OK;
> > >
> > > freed = parm;
> > > ----------------------------------------
> >
> > Looks right but it's probably easier to configure qemu to set that
> > feature bit. Basically you just add deflate-on-oom=on to the
> > balloon device.
>
> I'm using CentOS 7, where qemu does not recognize the deflate-on-oom option. ;-)
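
For reference, on a qemu new enough to know the property, enabling the
feature bit from the host side looks roughly like this (a sketch only; the
exact device name and the rest of the command line depend on your setup):

```shell
# Turn on VIRTIO_BALLOON_F_DEFLATE_ON_OOM from the host side.
# Sketch: device name and remaining guest options depend on the setup.
qemu-system-x86_64 \
    -m 2048 \
    -device virtio-balloon-pci,deflate-on-oom=on \
    ...
```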
>
> > OK. Or if you use my patch, you can just set a flag and go
> > if (vb->oom)
> > msleep(1000);
> > at beginning of fill_balloon.
>
> I don't think it is good manners to sleep for long there: according to
> include/linux/workqueue.h, work items on system_freezable_wq are expected
> to finish quickly. I think that using delayed_work is better.
Well, it's already using msleep; I'm fine with reworking it all to use
delayed_work. That's unrelated to the OOM issues, though.
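
A delayed_work variant might look roughly like this (a sketch only: it
assumes vb->update_balloon_size_work is converted from struct work_struct
to struct delayed_work, and the one-second retry delay is illustrative,
not from any posted patch):

```c
/*
 * Sketch: instead of msleep()ing inside the handler, a partial or
 * failed inflate/deflate is simply retried later, so the work item
 * itself returns quickly as system_freezable_wq expects.
 */
static void update_balloon_size_func(struct work_struct *work)
{
	struct virtio_balloon *vb = container_of(work, struct virtio_balloon,
						 update_balloon_size_work.work);
	s64 diff = towards_target(vb);

	if (diff > 0)
		diff -= fill_balloon(vb, diff);
	else if (diff < 0)
		diff += leak_balloon(vb, -diff);
	update_balloon_size(vb);

	/* Could not reach the target (e.g. after an OOM deflate):
	 * requeue ourselves instead of blocking the workqueue. */
	if (diff)
		queue_delayed_work(system_freezable_wq,
				   &vb->update_balloon_size_work, HZ);
}
```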
>
> > > While responsiveness was better than it is now, inflating again spoiled
> > > the effort. Retrying to inflate until allocation fails is already too
> > > painful.
> > >
> > > Michael S. Tsirkin wrote:
> > > > I think that's the case. Question is, when can we inflate again?
> > >
> > > I think it is when the host explicitly asks again, since the
> > > VIRTIO_BALLOON_F_DEFLATE_ON_OOM path does not schedule a later inflation.
> >
> > Problem is host has no idea when it's safe.
> > If we expect host to ask again after X seconds we
> > might just as well do it in the guest.
>
> To me, fill_balloon() with VIRTIO_BALLOON_F_DEFLATE_ON_OOM sounds like
> doing
>
> echo 3 > /proc/sys/vm/drop_caches
>
> where nobody knows whether it will impact the system.
> Thus, I don't think it is a problem. It is up to the administrator
> who enters that command.
Right now existing hypervisors do not send that interrupt.
If you suggest a new feature where hypervisors send an interrupt,
that might work but will need a new feature bit. Please send an email
to virtio-dev@lists.oasis-open.org (subscriber only, sorry about that)
so the bit can be reserved.
--
MST
Thread overview: 14+ messages
[not found] <1507632457-4611-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
2017-10-10 11:47 ` [PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify() Michal Hocko
2017-10-12 2:36 ` Wei Wang
2017-10-13 11:28 ` Tetsuo Handa
2017-10-13 13:19 ` Michael S. Tsirkin
2017-10-13 13:23 ` Michael S. Tsirkin
2017-10-13 16:41 ` Tetsuo Handa
2017-10-15 0:22 ` Michael S. Tsirkin
2017-10-15 5:38 ` Tetsuo Handa
2017-10-16 10:58 ` Tetsuo Handa
2017-10-16 17:01 ` Michael S. Tsirkin
2017-10-18 10:59 ` Tetsuo Handa
2017-10-18 17:16 ` Michael S. Tsirkin
2017-10-19 11:52 ` Tetsuo Handa
2017-10-19 13:00 ` Michael S. Tsirkin [this message]