From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Wang, Wei W" <wei.w.wang@intel.com>
Cc: "virtio-dev@lists.oasis-open.org"
<virtio-dev@lists.oasis-open.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"virtualization@lists.linux-foundation.org"
<virtualization@lists.linux-foundation.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"david@redhat.com" <david@redhat.com>,
"Hansen, Dave" <dave.hansen@intel.com>,
"cornelia.huck@de.ibm.com" <cornelia.huck@de.ibm.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"mgorman@techsingularity.net" <mgorman@techsingularity.net>,
"aarcange@redhat.com" <aarcange@redhat.com>,
"amit.shah@redhat.com" <amit.shah@redhat.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"liliang.opensource@gmail.com" <liliang.opensource@gmail.com>
Subject: Re: [PATCH kernel v8 2/4] virtio-balloon: VIRTIO_BALLOON_F_CHUNK_TRANSFER
Date: Wed, 5 Apr 2017 06:53:58 +0300
Message-ID: <20170405065313-mutt-send-email-mst@kernel.org>
In-Reply-To: <286AC319A985734F985F78AFA26841F7391E1962@shsmsx102.ccr.corp.intel.com>
On Wed, Apr 05, 2017 at 03:31:36AM +0000, Wang, Wei W wrote:
> On Thursday, March 16, 2017 3:09 PM Wei Wang wrote:
> > The implementation of the current virtio-balloon is not very efficient, because
> > the ballooned pages are transferred to the host one by one. Here is the
> > breakdown of the time in percentage spent on each step of the balloon inflating
> > process (inflating 7GB of an 8GB idle guest).
> >
> > 1) allocating pages (6.5%)
> > 2) sending PFNs to host (68.3%)
> > 3) address translation (6.1%)
> > 4) madvise (19%)
> >
> > It takes about 4126ms for the inflating process to complete.
> > The above profiling shows that the bottlenecks are stage 2) and stage 4).
> >
> > This patch optimizes step 2) by transferring pages to the host in chunks. A chunk
> > consists of guest-physically contiguous pages, and it is offered to the host via a
> > base PFN (i.e. the start PFN of those physically contiguous pages) and the size
> > (i.e. the total number of the pages). A chunk is formatted as below:
> >
> > --------------------------------------------------------
> > | Base (52 bit)                        | Rsvd (12 bit) |
> > --------------------------------------------------------
> > --------------------------------------------------------
> > | Size (52 bit)                        | Rsvd (12 bit) |
> > --------------------------------------------------------
> >
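To make the chunk layout above concrete, here is a minimal userspace sketch of packing physically contiguous PFNs into base/size pairs. It is not the patch's code: the patch tracks pages in bitmaps, whereas this sketch assumes a sorted, duplicate-free PFN array. The names page_chunk, chunk_encode and pfns_to_chunks are hypothetical, the shift values mirror BALLOON_CHUNK_BASE_SHIFT and BALLOON_CHUNK_SIZE_SHIFT from the patch, and the cpu_to_le64() conversion the driver would need is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift values mirror BALLOON_CHUNK_BASE_SHIFT / BALLOON_CHUNK_SIZE_SHIFT. */
#define CHUNK_BASE_SHIFT 12
#define CHUNK_SIZE_SHIFT 12
/* Largest value that fits in a 52-bit field. */
#define CHUNK_FIELD_MAX ((UINT64_C(1) << 52) - 1)

/*
 * Userspace stand-in for struct balloon_page_chunk. The driver uses __le64
 * fields; the endianness conversion is omitted in this sketch.
 */
struct page_chunk {
	uint64_t base;
	uint64_t size;
};

/* Place the 52-bit value in the upper bits, leaving the low 12 bits reserved. */
static struct page_chunk chunk_encode(uint64_t base_pfn, uint64_t nr_pages)
{
	struct page_chunk c;

	assert(base_pfn <= CHUNK_FIELD_MAX && nr_pages <= CHUNK_FIELD_MAX);
	c.base = base_pfn << CHUNK_BASE_SHIFT;
	c.size = nr_pages << CHUNK_SIZE_SHIFT;
	return c;
}

/*
 * Coalesce a sorted, duplicate-free PFN array into chunks.
 * Returns the number of chunks written (at most max_chunks).
 */
static size_t pfns_to_chunks(const uint64_t *pfns, size_t n,
			     struct page_chunk *out, size_t max_chunks)
{
	size_t i = 0, used = 0;

	while (i < n && used < max_chunks) {
		uint64_t base = pfns[i];
		uint64_t len = 1;

		/* Extend the run while the next PFN is physically adjacent. */
		while (i + len < n && pfns[i + len] == base + len)
			len++;
		out[used++] = chunk_encode(base, len);
		i += len;
	}
	return used;
}
```

For the PFNs 100, 101, 102, 103, 200, 300, 301 this produces three chunks (base 100 with 4 pages, base 200 with 1 page, base 300 with 2 pages), so seven per-page entries shrink to three descriptors.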
> > By doing so, step 4) can also be optimized by doing address translation and
> > madvise() in chunks rather than page by page.
> >
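As a rough illustration of the host-side saving described above, the sketch below releases a whole chunk with a single madvise() call instead of one call per page. It rests on loud assumptions: guest RAM is a single contiguous host mapping starting at ram_base, guest and host page sizes match, and the host is Linux (where MADV_DONTNEED drops private anonymous pages). A real hypervisor has more translation layers. release_chunk and host_page_size are hypothetical helper names.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Query the host page size once per call; madvise() needs aligned ranges. */
static size_t host_page_size(void)
{
	long ps = sysconf(_SC_PAGESIZE);

	assert(ps > 0);
	return (size_t)ps;
}

/*
 * Hypothetical helper: release nr_pages guest pages starting at base_pfn.
 * Assumes guest PFN n lives at ram_base + n * page_size on the host.
 * Returns 0 on success and -1 on a bad range or a madvise() failure.
 */
static int release_chunk(void *ram_base, uint64_t ram_pages,
			 uint64_t base_pfn, uint64_t nr_pages)
{
	size_t ps = host_page_size();

	/* Reject empty chunks and chunks that run past the end of guest RAM. */
	if (nr_pages == 0 || base_pfn >= ram_pages ||
	    nr_pages > ram_pages - base_pfn)
		return -1;

	/* One translation and one system call for the whole chunk. */
	return madvise((char *)ram_base + base_pfn * ps,
		       nr_pages * ps, MADV_DONTNEED);
}
```

With per-page transfer the same work costs one translation and one madvise() per page, which is where the chunked form is expected to save time.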
> > This optimization requires the negotiation of a new feature bit,
> > VIRTIO_BALLOON_F_CHUNK_TRANSFER.
> >
> > With this new feature, the above ballooning process takes ~590ms resulting in
> > an improvement of ~85%.
> >
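The quoted figures can be cross-checked with a few lines of arithmetic. The helpers below are only a sketch with hypothetical names; the inputs are the numbers from the commit message (4126 ms before, roughly 590 ms after, and a 68.3% share for sending PFNs).

```c
#include <assert.h>

/* Percentage reduction in elapsed time going from before_ms to after_ms. */
static double time_reduction_percent(double before_ms, double after_ms)
{
	assert(before_ms > 0.0);
	return (before_ms - after_ms) / before_ms * 100.0;
}

/* Time attributed to one stage, given its share of the total in percent. */
static double stage_ms(double total_ms, double share_percent)
{
	return total_ms * share_percent / 100.0;
}
```

time_reduction_percent(4126.0, 590.0) comes out at about 85.7, which matches the stated ~85%, and stage_ms(4126.0, 68.3) puts sending PFNs at roughly 2818 ms of the original run.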
> > TODO: optimize stage 1) by allocating/freeing a chunk of pages instead of a
> > single page each time.
> >
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > drivers/virtio/virtio_balloon.c | 371 +++++++++++++++++++++++++++++++++---
> > include/uapi/linux/virtio_balloon.h | 9 +
> > 2 files changed, 353 insertions(+), 27 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index f59cb4f..3f4a161 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -42,6 +42,10 @@
> > #define OOM_VBALLOON_DEFAULT_PAGES 256
> > #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> >
> > +#define PAGE_BMAP_SIZE (8 * PAGE_SIZE)
> > +#define PFNS_PER_PAGE_BMAP (PAGE_BMAP_SIZE * BITS_PER_BYTE)
> > +#define PAGE_BMAP_COUNT_MAX 32
> > +
> > static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> > module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> > MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
> > @@ -50,6 +54,14 @@ MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
> > static struct vfsmount *balloon_mnt;
> > #endif
> >
> > +#define BALLOON_CHUNK_BASE_SHIFT 12
> > +#define BALLOON_CHUNK_SIZE_SHIFT 12
> > +struct balloon_page_chunk {
> > + __le64 base;
> > + __le64 size;
> > +};
> > +
> > +typedef __le64 resp_data_t;
> > struct virtio_balloon {
> > struct virtio_device *vdev;
> > struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
> > @@ -67,6 +79,31 @@ struct virtio_balloon {
> >
> > /* Number of balloon pages we've told the Host we're not using. */
> > unsigned int num_pages;
> > + /* Pointer to the response header. */
> > + struct virtio_balloon_resp_hdr *resp_hdr;
> > + /* Pointer to the start address of response data. */
> > + resp_data_t *resp_data;
>
> I think the implementation has an issue here - both the balloon pages and the unused pages use the same buffer ("resp_data" above) to store chunks. It would cause a race in this case: live migration starts while ballooning is also in progress. I plan to use separate buffers for CHUNKS_OF_BALLOON_PAGES and CHUNKS_OF_UNUSED_PAGES. Please let me know if you have a different suggestion. Thanks.
>
> Best,
> Wei
Is only one resp_data buffer ever in flight for each kind?
If not, you want as many buffers as the vq allows.
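One way to picture that suggestion is a small pool with one row of buffers per chunk kind and as many buffers per row as the queue can hold. This is a sketch under stated assumptions, not driver code: VQ_DEPTH and RESP_BUF_WORDS are made-up sizes, resp_pool, resp_buf_get and resp_buf_put are hypothetical names, the enum merely reuses the two kind names from the quoted mail, and the locking a real driver would need is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The two kinds named in the quoted mail; the enum itself is an assumption. */
enum chunk_kind {
	CHUNKS_OF_BALLOON_PAGES,
	CHUNKS_OF_UNUSED_PAGES,
	CHUNK_KIND_COUNT
};

#define VQ_DEPTH 4		/* hypothetical number of vq slots */
#define RESP_BUF_WORDS 64	/* hypothetical buffer size in 64-bit words */

struct resp_buf {
	bool in_flight;
	uint64_t data[RESP_BUF_WORDS];
};

/* One row of VQ_DEPTH buffers per kind, so the kinds never share storage. */
struct resp_pool {
	struct resp_buf bufs[CHUNK_KIND_COUNT][VQ_DEPTH];
};

/* Claim a free buffer of the given kind, or NULL if all are in flight. */
static struct resp_buf *resp_buf_get(struct resp_pool *pool,
				     enum chunk_kind kind)
{
	for (size_t i = 0; i < VQ_DEPTH; i++) {
		struct resp_buf *b = &pool->bufs[kind][i];

		if (!b->in_flight) {
			b->in_flight = true;
			return b;
		}
	}
	return NULL;
}

/* Return a buffer to the pool once the host has consumed it. */
static void resp_buf_put(struct resp_buf *b)
{
	assert(b->in_flight);
	b->in_flight = false;
}
```

If only one buffer of each kind is ever in flight, VQ_DEPTH can be 1 and this collapses to the two separate buffers proposed in the quoted mail.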
--
MST