From: "Michael S. Tsirkin" <mst@redhat.com>
To: teawater <teawaterz@linux.alibaba.com>
Cc: Hui Zhu <teawater@gmail.com>,
jasowang@redhat.com, akpm@linux-foundation.org,
pagupta@redhat.com, mojha@codeaurora.org, david@redhat.com,
namit@vmware.com, virtualization@lists.linux-foundation.org,
linux-kernel@vger.kernel.org, qemu-devel@nongnu.org
Subject: Re: [RFC for QEMU] virtio-balloon: Add option thp-order to set VIRTIO_BALLOON_F_THP_ORDER
Date: Thu, 26 Mar 2020 03:07:45 -0400
Message-ID: <20200326030636-mutt-send-email-mst@kernel.org>
In-Reply-To: <C9436807-D9CA-49FD-AEE3-3B7CE4BBB711@linux.alibaba.com>
On Tue, Mar 17, 2020 at 06:13:32PM +0800, teawater wrote:
>
>
> > On Mar 12, 2020, at 16:25, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Mar 12, 2020 at 03:49:55PM +0800, Hui Zhu wrote:
> >> If the guest kernel's memory is heavily fragmented, using virtio_balloon
> >> splits the THPs of QEMU when it calls madvise(MADV_DONTNEED) to release
> >> the balloon pages.
> >> Setting the thp-order option to on enables the VIRTIO_BALLOON_F_THP_ORDER
> >> feature flag, which sets the balloon page size to the THP size to avoid
> >> the THP split issue.
> >>
> >> Signed-off-by: Hui Zhu <teawaterz@linux.alibaba.com>
> >
> > What's wrong with just using the PartiallyBalloonedPage machinery
> > instead? That would make it guest transparent.
>
> In balloon_inflate_page:
> rb_page_size = qemu_ram_pagesize(rb);
>
> if (rb_page_size == BALLOON_PAGE_SIZE) {
> /* Easy case */
>
> It seems that PartiallyBalloonedPage is only used when rb_page_size is greater than BALLOON_PAGE_SIZE.
> Do you mean I should modify the working mechanism of balloon_inflate_page function?
>
> Thanks,
> Hui
Yes, we can tweak it to unconditionally combine pages into
a huge page.
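
To make the idea concrete, here is a rough sketch of what "combining"
could look like: remember which 4 KiB subpages of one huge page the guest
has handed over, and only report the huge page as releasable once all of
them have arrived. This is not the QEMU code. The struct, the function
name track_inflate() and the fixed 2 MiB huge page size are assumptions
made for illustration, loosely modeled on the PartiallyBalloonedPage idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SUBPAGE_SIZE 4096u                  /* balloon page size: 4 KiB */
#define HUGE_ORDER   9u                     /* assumption: 2 MiB huge page */
#define SUBPAGES     (1u << HUGE_ORDER)     /* 512 subpages per huge page */
#define HUGE_SIZE    ((uint64_t)SUBPAGE_SIZE << HUGE_ORDER)

/* Hypothetical tracker for a single partially ballooned huge page. */
struct partial_huge_page {
    bool     active;                /* is a huge page currently tracked? */
    uint64_t base;                  /* huge-page-aligned offset */
    uint8_t  bitmap[SUBPAGES / 8];  /* one bit per 4 KiB subpage */
    unsigned count;                 /* number of bits set in bitmap */
};

/*
 * Record that the guest inflated the 4 KiB page at 'offset'.
 * Returns true and stores the huge page base in *release_base once every
 * subpage of that huge page has been seen; the caller would then discard
 * the whole range with a single call instead of 512 small ones.
 */
static bool track_inflate(struct partial_huge_page *p, uint64_t offset,
                          uint64_t *release_base)
{
    uint64_t base = offset & ~(HUGE_SIZE - 1);
    unsigned idx = (unsigned)((offset - base) / SUBPAGE_SIZE);

    assert(offset % SUBPAGE_SIZE == 0);

    if (!p->active || p->base != base) {
        /* The guest moved on to another huge page: drop the old state. */
        memset(p, 0, sizeof(*p));
        p->active = true;
        p->base = base;
    }

    /* Count each subpage once, even if the guest reports it twice. */
    if (!(p->bitmap[idx / 8] & (1u << (idx % 8)))) {
        p->bitmap[idx / 8] |= (uint8_t)(1u << (idx % 8));
        p->count++;
    }

    if (p->count == SUBPAGES) {
        *release_base = base;
        p->active = false;
        return true;
    }
    return false;
}
```

Tracking only one huge page at a time keeps the state small, at the cost
of forgetting partial progress whenever the guest jumps to a different
huge page. Whether that trade-off is acceptable for fragmented guests is
exactly the open question here.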
> >
> >> ---
> >> hw/virtio/virtio-balloon.c | 67 ++++++++++++++++---------
> >> include/standard-headers/linux/virtio_balloon.h | 4 ++
> >> 2 files changed, 47 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >> index a4729f7..cfe86b0 100644
> >> --- a/hw/virtio/virtio-balloon.c
> >> +++ b/hw/virtio/virtio-balloon.c
> >> @@ -340,37 +340,49 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> >> while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
> >> unsigned int p = virtio_ldl_p(vdev, &pfn);
> >> hwaddr pa;
> >> + size_t handle_size = BALLOON_PAGE_SIZE;
> >>
> >> pa = (hwaddr) p << VIRTIO_BALLOON_PFN_SHIFT;
> >> offset += 4;
> >>
> >> - section = memory_region_find(get_system_memory(), pa,
> >> - BALLOON_PAGE_SIZE);
> >> - if (!section.mr) {
> >> - trace_virtio_balloon_bad_addr(pa);
> >> - continue;
> >> - }
> >> - if (!memory_region_is_ram(section.mr) ||
> >> - memory_region_is_rom(section.mr) ||
> >> - memory_region_is_romd(section.mr)) {
> >> - trace_virtio_balloon_bad_addr(pa);
> >> - memory_region_unref(section.mr);
> >> - continue;
> >> - }
> >> + if (virtio_has_feature(s->host_features,
> >> + VIRTIO_BALLOON_F_THP_ORDER))
> >> + handle_size = BALLOON_PAGE_SIZE << VIRTIO_BALLOON_THP_ORDER;
> >> +
> >> + while (handle_size > 0) {
> >> + section = memory_region_find(get_system_memory(), pa,
> >> + BALLOON_PAGE_SIZE);
> >> + if (!section.mr) {
> >> + trace_virtio_balloon_bad_addr(pa);
> >> + continue;
> >> + }
> >> + if (!memory_region_is_ram(section.mr) ||
> >> + memory_region_is_rom(section.mr) ||
> >> + memory_region_is_romd(section.mr)) {
> >> + trace_virtio_balloon_bad_addr(pa);
> >> + memory_region_unref(section.mr);
> >> + continue;
> >> + }
> >>
> >> - trace_virtio_balloon_handle_output(memory_region_name(section.mr),
> >> - pa);
> >> - if (!qemu_balloon_is_inhibited()) {
> >> - if (vq == s->ivq) {
> >> - balloon_inflate_page(s, section.mr,
> >> - section.offset_within_region, &pbp);
> >> - } else if (vq == s->dvq) {
> >> - balloon_deflate_page(s, section.mr, section.offset_within_region);
> >> - } else {
> >> - g_assert_not_reached();
> >> + trace_virtio_balloon_handle_output(memory_region_name(section.mr),
> >> + pa);
> >> + if (!qemu_balloon_is_inhibited()) {
> >> + if (vq == s->ivq) {
> >> + balloon_inflate_page(s, section.mr,
> >> + section.offset_within_region,
> >> + &pbp);
> >> + } else if (vq == s->dvq) {
> >> + balloon_deflate_page(s, section.mr,
> >> + section.offset_within_region);
> >> + } else {
> >> + g_assert_not_reached();
> >> + }
> >> }
> >> + memory_region_unref(section.mr);
> >> +
> >> + pa += BALLOON_PAGE_SIZE;
> >> + handle_size -= BALLOON_PAGE_SIZE;
> >> }
> >> - memory_region_unref(section.mr);
> >> }
> >>
> >> virtqueue_push(vq, elem, offset);
> >> @@ -693,6 +705,8 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
> >>
> >> memcpy(&config, config_data, virtio_balloon_config_size(dev));
> >> dev->actual = le32_to_cpu(config.actual);
> >> + if (virtio_has_feature(vdev->host_features, VIRTIO_BALLOON_F_THP_ORDER))
> >> + dev->actual <<= VIRTIO_BALLOON_THP_ORDER;
> >> if (dev->actual != oldactual) {
> >> qapi_event_send_balloon_change(vm_ram_size -
> >> ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
> >> @@ -728,6 +742,9 @@ static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
> >> }
> >> if (target) {
> >> dev->num_pages = (vm_ram_size - target) >> VIRTIO_BALLOON_PFN_SHIFT;
> >> + if (virtio_has_feature(dev->host_features,
> >> + VIRTIO_BALLOON_F_THP_ORDER))
> >> + dev->num_pages >>= VIRTIO_BALLOON_THP_ORDER;
> >> virtio_notify_config(vdev);
> >> }
> >> trace_virtio_balloon_to_target(target, dev->num_pages);
> >> @@ -916,6 +933,8 @@ static Property virtio_balloon_properties[] = {
> >> VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
> >> DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
> >> VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
> >> + DEFINE_PROP_BIT("thp-order", VirtIOBalloon, host_features,
> >> + VIRTIO_BALLOON_F_THP_ORDER, false),
> >> /* QEMU 4.0 accidentally changed the config size even when free-page-hint
> >> * is disabled, resulting in QEMU 3.1 migration incompatibility. This
> >> * property retains this quirk for QEMU 4.1 machine types.
> >> diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
> >> index 9375ca2..f54d613 100644
> >> --- a/include/standard-headers/linux/virtio_balloon.h
> >> +++ b/include/standard-headers/linux/virtio_balloon.h
> >> @@ -36,10 +36,14 @@
> >> #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
> >> #define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
> >> #define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */
> >> +#define VIRTIO_BALLOON_F_THP_ORDER 5 /* Set balloon page order to thp order */
> >>
> >> /* Size of a PFN in the balloon interface. */
> >> #define VIRTIO_BALLOON_PFN_SHIFT 12
> >>
> >> +/* The order of the balloon page */
> >> +#define VIRTIO_BALLOON_THP_ORDER 9
> >> +
> >> #define VIRTIO_BALLOON_CMD_ID_STOP 0
> >> #define VIRTIO_BALLOON_CMD_ID_DONE 1
> >> struct virtio_balloon_config {
> >> --
> >> 2.7.4
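
For reference, the unit conversion the patch applies to num_pages and
actual can be worked through with a small sketch. The helper names below
are hypothetical; only the two shift constants come from the quoted
header. With a 12-bit PFN shift and an order of 9, one balloon unit is
4096 << 9 = 2 MiB, and a right shift truncates any remainder smaller
than that.

```c
#include <assert.h>
#include <stdint.h>

#define PFN_SHIFT 12u   /* VIRTIO_BALLOON_PFN_SHIFT in the quoted header */
#define THP_ORDER 9u    /* VIRTIO_BALLOON_THP_ORDER proposed by the patch */

/* Hypothetical helper: bytes to balloon -> count of 2 MiB units (truncating). */
static uint32_t bytes_to_thp_units(uint64_t bytes)
{
    return (uint32_t)(bytes >> (PFN_SHIFT + THP_ORDER));
}

/* Hypothetical helper: count of 2 MiB units -> bytes. */
static uint64_t thp_units_to_bytes(uint32_t units)
{
    return (uint64_t)units << (PFN_SHIFT + THP_ORDER);
}
```

So a 1 GiB balloon target becomes 512 units, while a 3 MiB target rounds
down to a single 2 MiB unit.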