From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1NCWfN-0000UD-HJ for qemu-devel@nongnu.org; Mon, 23 Nov 2009 06:01:09 -0500 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1NCWfH-0000KW-9R for qemu-devel@nongnu.org; Mon, 23 Nov 2009 06:01:07 -0500 Received: from [199.232.76.173] (port=55224 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NCWfF-0000Jt-Rw for qemu-devel@nongnu.org; Mon, 23 Nov 2009 06:01:01 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46003) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1NCWfF-0007qn-17 for qemu-devel@nongnu.org; Mon, 23 Nov 2009 06:01:01 -0500 Message-ID: <4B0A6B59.6030102@redhat.com> Date: Mon, 23 Nov 2009 13:00:41 +0200 From: Dor Laor MIME-Version: 1.0 Subject: Re: [Qemu-devel] Re: virtio: Add memory statistics reporting to the balloon driver (V3) References: <1258643169.3464.3.camel@aglitke> <1258643945.3464.5.camel@aglitke> <20091123094425.GB3748@redhat.com> In-Reply-To: <20091123094425.GB3748@redhat.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: quoted-printable Reply-To: dlaor@redhat.com List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: Anthony Liguori , qemu-devel@nongnu.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, Avi Kivity , Adam Litke , Vadim Rozenfeld On 11/23/2009 11:44 AM, Michael S. Tsirkin wrote: > On Thu, Nov 19, 2009 at 09:19:05AM -0600, Adam Litke wrote: >> Rusty and Anthony, >> If I've addressed all outstanding issues, please consider this patch f= or >> inclusion. Thanks. >> >> Changes since V2: >> - Increase stat field size to 64 bits >> - Report all sizes in kb (not pages) >> - Drop anon_pages stat and fix endianness conversion >> >> Changes since V1: >> - Use a virtqueue instead of the device config space >> >> When using ballooning to manage overcommitted memory on a host, a syst= em for >> guests to communicate their memory usage to the host can provide infor= mation >> that will minimize the impact of ballooning on the guests. The curren= t method >> employs a daemon running in each guest that communicates memory statis= tics to a >> host daemon at a specified time interval. The host daemon aggregates = this >> information and inflates and/or deflates balloons according to the lev= el of >> host memory pressure. This approach is effective but overly complex s= ince a >> daemon must be installed inside each guest and coordinated to communic= ate with >> the host. A simpler approach is to collect memory statistics in the v= irtio >> balloon driver and communicate them directly to the hypervisor. >> >> This patch enables the guest-side support by adding stats collection a= nd >> reporting to the virtio balloon driver. >> >> Signed-off-by: Adam Litke >> Cc: Rusty Russell >> Cc: Anthony Liguori >> Cc: virtualization@lists.linux-foundation.org >> Cc: linux-kernel@vger.kernel.org > > that's pretty clean. some comments: I wrote them while off-line, and no= w > that I was going to send it out, I see that there's some overlap with > what Rusty wrote. Still, here it is: > >> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_b= alloon.c >> index 200c22f..ebc9d39 100644 >> --- a/drivers/virtio/virtio_balloon.c >> +++ b/drivers/virtio/virtio_balloon.c >> @@ -29,7 +29,7 @@ >> struct virtio_balloon >> { >> struct virtio_device *vdev; >> - struct virtqueue *inflate_vq, *deflate_vq; >> + struct virtqueue *inflate_vq, *deflate_vq, *stats_vq; >> >> /* Where the ballooning thread waits for config to change. */ >> wait_queue_head_t config_change; >> @@ -50,6 +50,9 @@ struct virtio_balloon >> /* The array of pfns we tell the Host about. */ >> unsigned int num_pfns; >> u32 pfns[256]; >> + >> + /* Memory statistics */ >> + struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR]; >> }; >> >> static struct virtio_device_id id_table[] =3D { >> @@ -155,6 +158,57 @@ static void leak_balloon(struct virtio_balloon *v= b, size_t num) >> } >> } >> >> +static inline void update_stat(struct virtio_balloon *vb, int idx, >> + __le16 tag, __le64 val) >> +{ >> + BUG_ON(idx>=3D VIRTIO_BALLOON_S_NR); >> + vb->stats[idx].tag =3D cpu_to_le16(tag); >> + vb->stats[idx].val =3D cpu_to_le64(val); > > you should do le16_to_cpu etc on __le values. > Try running this patch through sparce checker. > >> +} >> + >> +#define K(x) ((x)<< (PAGE_SHIFT - 10)) > > can't this overflow? > also, won't it be simpler to just report memory is > in bytes, then just x * PAGE_SIZE instead of hairy constants? > >> +static void update_balloon_stats(struct virtio_balloon *vb) >> +{ >> + unsigned long events[NR_VM_EVENT_ITEMS]; > > that's about 1/2K worth of stack space on a 64 bit machine. > better keep it as part of struct virtio_balloon. > >> + struct sysinfo i; >> + int idx =3D 0; >> + >> + all_vm_events(events); >> + si_meminfo(&i); >> + >> + update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_IN, K(events[PSWPIN])); >> + update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_OUT, K(events[PSWPOUT])= ); >> + update_stat(vb, idx++, VIRTIO_BALLOON_S_MAJFLT, events[PGMAJFAULT]); >> + update_stat(vb, idx++, VIRTIO_BALLOON_S_MINFLT, events[PGFAULT]); >> + update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMFREE, K(i.freeram)); >> + update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMTOT, K(i.totalram)); Finally I found some data from our M$ neighbors. This is from=20 http://www.microsoft.com/whdc/system/sysperf/Perf_tun_srv-R2.mspx "When running Windows in the child partition, you can use the following=20 performance counters within a child partition to identify whether the=20 child partition is experiencing memory pressure and is likely to perform=20 better with a higher VM memory size: Memory =96 Standby Cache Reserve Bytes: Sum of Standby Cache Reserve Bytes and Free and Zero Page List Bytes=20 should be 200 MB or more on systems with 1 GB, and 300 MB or more on=20 systems with 2 GB or more of visible RAM. Memory =96 Free & Zero Page List Bytes: Sum of Standby Cache Reserve Bytes and Free and Zero Page List Bytes=20 should be 200 MB or more on systems with 1 GB, and 300 MB or more on=20 systems with 2 GB or more of visible RAM. Memory =96 Pages Input/Sec: Average over a 1-hour period is less than 10. " The biggest take out of it is that we may get #0-pages in windows. It's valuable information for the host since KSM will collapse all of=20 them anyway, so there is not point in ballooning them. Vadim, can you check if we can extract it from windows? >> +} >> + >> +/* >> + * While most virtqueues communicate guest-initiated requests to the = hypervisor, >> + * the stats queue operates in reverse. The driver initializes the v= irtqueue >> + * with a single buffer. From that point forward, all conversations = consist of >> + * a hypervisor request (a call to this function) which directs us to= refill >> + * the virtqueue with a fresh stats buffer. >> + */ >> +static void stats_ack(struct virtqueue *vq) >> +{ >> + struct virtio_balloon *vb; >> + unsigned int len; >> + struct scatterlist sg; >> + >> + vb =3D vq->vq_ops->get_buf(vq,&len); >> + if (!vb) >> + return; >> + >> + update_balloon_stats(vb); >> + >> + sg_init_one(&sg, vb->stats, sizeof(vb->stats)); >> + if (vq->vq_ops->add_buf(vq,&sg, 1, 0, vb)< 0) >> + BUG(); >> + vq->vq_ops->kick(vq); >> +} >> + >> static void virtballoon_changed(struct virtio_device *vdev) >> { >> struct virtio_balloon *vb =3D vdev->priv; >> @@ -205,10 +259,10 @@ static int balloon(void *_vballoon) >> static int virtballoon_probe(struct virtio_device *vdev) >> { >> struct virtio_balloon *vb; >> - struct virtqueue *vqs[2]; >> - vq_callback_t *callbacks[] =3D { balloon_ack, balloon_ack }; >> - const char *names[] =3D { "inflate", "deflate" }; >> - int err; >> + struct virtqueue *vqs[3]; >> + vq_callback_t *callbacks[] =3D { balloon_ack, balloon_ack, stats_ack= }; >> + const char *names[] =3D { "inflate", "deflate", "stats" }; >> + int err, nvqs; >> >> vdev->priv =3D vb =3D kmalloc(sizeof(*vb), GFP_KERNEL); >> if (!vb) { >> @@ -221,13 +275,28 @@ static int virtballoon_probe(struct virtio_devic= e *vdev) >> init_waitqueue_head(&vb->config_change); >> vb->vdev =3D vdev; >> >> - /* We expect two virtqueues. */ >> - err =3D vdev->config->find_vqs(vdev, 2, vqs, callbacks, names); >> + /* We expect two virtqueues: inflate and deflate, >> + * and optionally stat. */ >> + nvqs =3D virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3= : 2; >> + err =3D vdev->config->find_vqs(vdev, nvqs, vqs, callbacks, names); >> if (err) >> goto out_free_vb; >> >> vb->inflate_vq =3D vqs[0]; >> vb->deflate_vq =3D vqs[1]; >> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) { >> + struct scatterlist sg; >> + vb->stats_vq =3D vqs[2]; >> + >> + /* >> + * Prime this virtqueue with one buffer so the hypervisor can >> + * use it to signal us later. >> + */ >> + sg_init_one(&sg, vb->stats, sizeof(vb->stats)); > > should be "sizeof vb->stats" > >> + if (vb->stats_vq->vq_ops->add_buf(vb->stats_vq,&sg, 1, 0, vb)< 0) >> + BUG(); >> + vb->stats_vq->vq_ops->kick(vb->stats_vq); >> + } >> >> vb->thread =3D kthread_run(balloon, vb, "vballoon"); >> if (IS_ERR(vb->thread)) { >> @@ -265,7 +334,9 @@ static void virtballoon_remove(struct virtio_devic= e *vdev) >> kfree(vb); >> } >> >> -static unsigned int features[] =3D { VIRTIO_BALLOON_F_MUST_TELL_HOST = }; >> +static unsigned int features[] =3D { >> + VIRTIO_BALLOON_F_MUST_TELL_HOST, VIRTIO_BALLOON_F_STATS_VQ, > > one per line > >> +}; >> >> static struct virtio_driver virtio_balloon =3D { >> .feature_table =3D features, >> diff --git a/include/linux/virtio_balloon.h b/include/linux/virtio_bal= loon.h >> index 09d7300..a5c09e6 100644 >> --- a/include/linux/virtio_balloon.h >> +++ b/include/linux/virtio_balloon.h >> @@ -6,6 +6,7 @@ >> >> /* The feature bitmap for virtio balloon */ >> #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming = pages */ >> +#define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */ > > align with a tab >> >> /* Size of a PFN in the balloon interface. */ >> #define VIRTIO_BALLOON_PFN_SHIFT 12 >> @@ -17,4 +18,19 @@ struct virtio_balloon_config >> /* Number of pages we've actually got in balloon. */ >> __le32 actual; >> }; >> + >> +#define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in = */ >> +#define VIRTIO_BALLOON_S_SWAP_OUT 1 /* Amount of memory swapped out= */ >> +#define VIRTIO_BALLOON_S_MAJFLT 2 /* Number of major faults */ >> +#define VIRTIO_BALLOON_S_MINFLT 3 /* Number of minor faults */ >> +#define VIRTIO_BALLOON_S_MEMFREE 4 /* Total amount of free memory = */ >> +#define VIRTIO_BALLOON_S_MEMTOT 5 /* Total amount of memory */ > > this seems to imply the total amount of memory in bytes? > >> +#define VIRTIO_BALLOON_S_NR 6 >> + >> +struct virtio_balloon_stat >> +{ >> + __le16 tag; >> + __le64 val; > > This might create padding issues e.g. when guest > is 32 bit and host is 64 bit. Better add padding > manually, or pack the structure (but packed structures > often cause gcc to generate terrible code). > > >> +}; >> + >> #endif /* _LINUX_VIRTIO_BALLOON_H */ >> >> >> -- >> Thanks, >> Adam >> >> _______________________________________________ >> Virtualization mailing list >> Virtualization@lists.linux-foundation.org >> https://lists.linux-foundation.org/mailman/listinfo/virtualization > _______________________________________________ > Virtualization mailing list > Virtualization@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/virtualization