* [PATCH RFC] virtio_balloon: conservative balloon page shrinking
@ 2020-02-06 8:01 Wei Wang
2020-02-06 9:04 ` Michael S. Tsirkin
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Wei Wang @ 2020-02-06 8:01 UTC (permalink / raw)
To: linux-kernel, virtualization
Cc: tysand, mst, david, alexander.h.duyck, rientjes, mhocko, namit,
penguin-kernel, wei.w.wang
There are cases where users want balloon pages to be shrunk only after the
pagecache has been depleted. The conservative_shrinker module parameter lets
the shrinker shrink balloon pages only once all the pagecache has been reclaimed.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
drivers/virtio/virtio_balloon.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 93f995f6cf36..b4c5bb13a867 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -42,6 +42,10 @@
static struct vfsmount *balloon_mnt;
#endif
+static bool conservative_shrinker = true;
+module_param(conservative_shrinker, bool, 0644);
+MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
+
enum virtio_balloon_vq {
VIRTIO_BALLOON_VQ_INFLATE,
VIRTIO_BALLOON_VQ_DEFLATE,
@@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
{
unsigned long pages_freed = 0;
+ /* Balloon pages only get shrunk once the pagecache is depleted */
+ if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
+ return 0;
+
/*
* One invocation of leak_balloon can deflate at most
* VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
@@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
struct virtio_balloon, shrinker);
unsigned long count;
- count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+ if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
+ count = 0;
+ else
+ count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+
count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
return count;
--
2.17.1
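The effect of the patched count path can be modelled in user space. The following is only a sketch under stated assumptions: `struct balloon_model`, `model_shrinker_count()` and both `MODEL_*` constants are hypothetical stand-ins, not kernel names, and `nr_file_pages` takes the place of `global_node_page_state(NR_FILE_PAGES)`.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for the fields the count path reads. */
struct balloon_model {
	unsigned long num_pages;            /* inflated balloon pages */
	unsigned long num_free_page_blocks; /* free page hint blocks held */
};

/* Assumed values, for illustration only. */
#define MODEL_PAGES_PER_PAGE   1UL
#define MODEL_HINT_BLOCK_PAGES 1024UL

/*
 * Mirrors the logic of the patched virtio_balloon_shrinker_count():
 * balloon pages are reported as reclaimable only when the pagecache
 * counter reads zero. nr_file_pages models
 * global_node_page_state(NR_FILE_PAGES).
 */
static unsigned long model_shrinker_count(const struct balloon_model *vb,
					  bool conservative,
					  unsigned long nr_file_pages)
{
	unsigned long count;

	if (conservative && nr_file_pages)
		count = 0;
	else
		count = vb->num_pages / MODEL_PAGES_PER_PAGE;

	/* Free page hint blocks are always counted, as in the patch. */
	count += vb->num_free_page_blocks * MODEL_HINT_BLOCK_PAGES;
	return count;
}
```

In this model a single remaining file page is enough to hide every balloon page from the shrinker, while the free page hint blocks are still reported.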
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 8:01 [PATCH RFC] virtio_balloon: conservative balloon page shrinking Wei Wang
@ 2020-02-06 9:04 ` Michael S. Tsirkin
2020-02-06 9:27 ` Wang, Wei W
2020-02-06 9:09 ` David Hildenbrand
2020-02-08 12:32 ` Tetsuo Handa
2 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2020-02-06 9:04 UTC (permalink / raw)
To: Wei Wang
Cc: linux-kernel, virtualization, tysand, david, alexander.h.duyck,
rientjes, mhocko, namit, penguin-kernel
On Thu, Feb 06, 2020 at 04:01:47PM +0800, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
I'd rather avoid module parameters, but otherwise looks
like a reasonable idea.
Tyler, what do you think?
> ---
> drivers/virtio/virtio_balloon.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 93f995f6cf36..b4c5bb13a867 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
> static struct vfsmount *balloon_mnt;
> #endif
>
> +static bool conservative_shrinker = true;
> +module_param(conservative_shrinker, bool, 0644);
> +MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
> +
> enum virtio_balloon_vq {
> VIRTIO_BALLOON_VQ_INFLATE,
> VIRTIO_BALLOON_VQ_DEFLATE,
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> {
> unsigned long pages_freed = 0;
>
> + /* Balloon pages only gets shrunk when the pagecache depleted */
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + return 0;
> +
> /*
> * One invocation of leak_balloon can deflate at most
> * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> @@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> struct virtio_balloon, shrinker);
> unsigned long count;
>
> - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
I'd rather have an API for that in mm/. In particular, do we want other
shrinkers to run, not just pagecache? To pick an example I'm familiar
with, kvm mmu cache for nested virt?
> + count = 0;
> + else
> + count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
> count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
> return count;
> --
> 2.17.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 9:04 ` Michael S. Tsirkin
@ 2020-02-06 9:27 ` Wang, Wei W
2020-02-06 9:31 ` Michael S. Tsirkin
0 siblings, 1 reply; 17+ messages in thread
From: Wang, Wei W @ 2020-02-06 9:27 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, tysand@google.com,
david@redhat.com, alexander.h.duyck@linux.intel.com,
rientjes@google.com, mhocko@kernel.org, namit@vmware.com,
penguin-kernel@i-love.sakura.ne.jp
On Thursday, February 6, 2020 5:04 PM, Michael S. Tsirkin wrote:
> virtio_balloon_shrinker_count(struct shrinker *shrinker,
> > struct virtio_balloon, shrinker);
> > unsigned long count;
> >
> > - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> > + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
>
> I'd rather have an API for that in mm/. In particular, do we want other
> shrinkers to run, not just pagecache? To pick an example I'm familiar
> with, kvm mmu cache for nested virt?
We could make it extendable:
#define BALLOON_SHRINKER_AFTER_PAGE_CACHE (1 << 0)
#define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE (1 << 1)
...
uint64_t conservative_shrinker;
if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) && global_node_page_state(NR_FILE_PAGES))
return 0;
For now, we probably only need BALLOON_SHRINKER_AFTER_PAGE_CACHE.
Best,
Wei
^ permalink raw reply [flat|nested] 17+ messages in thread
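The extendable policy proposed above can be sketched as a bitmask check. This is a user-space illustration, not driver code: `balloon_shrink_deferred()` and the `nr_kvm_mmu_pages` argument are hypothetical, and only the two flag names come from the thread. A flag test of this kind needs a bitwise AND; OR-ing with a non-zero constant always yields a non-zero value.

```c
#include <stdbool.h>
#include <stdint.h>

#define BALLOON_SHRINKER_AFTER_PAGE_CACHE    (1ULL << 0)
#define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE (1ULL << 1)

/*
 * Returns true when the balloon shrinker should hold off because a cache
 * selected in `policy` still has pages. nr_file_pages models
 * global_node_page_state(NR_FILE_PAGES); nr_kvm_mmu_pages is a
 * hypothetical counter used only to show how a second flag would fit in.
 */
static bool balloon_shrink_deferred(uint64_t policy,
				    unsigned long nr_file_pages,
				    unsigned long nr_kvm_mmu_pages)
{
	if ((policy & BALLOON_SHRINKER_AFTER_PAGE_CACHE) && nr_file_pages)
		return true;
	if ((policy & BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE) && nr_kvm_mmu_pages)
		return true;
	return false;
}
```

With `policy` set to 0 the helper never defers, which corresponds to the non-conservative behaviour.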
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 9:27 ` Wang, Wei W
@ 2020-02-06 9:31 ` Michael S. Tsirkin
2020-02-06 9:43 ` Wang, Wei W
0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2020-02-06 9:31 UTC (permalink / raw)
To: Wang, Wei W
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, tysand@google.com,
david@redhat.com, alexander.h.duyck@linux.intel.com,
rientjes@google.com, mhocko@kernel.org, namit@vmware.com,
penguin-kernel@i-love.sakura.ne.jp
On Thu, Feb 06, 2020 at 09:27:04AM +0000, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:04 PM, Michael S. Tsirkin wrote:
> > virtio_balloon_shrinker_count(struct shrinker *shrinker,
> > > struct virtio_balloon, shrinker);
> > > unsigned long count;
> > >
> > > - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> > > + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> >
> > I'd rather have an API for that in mm/. In particular, do we want other
> > shrinkers to run, not just pagecache? To pick an example I'm familiar
> > with, kvm mmu cache for nested virt?
>
> We could make it extendable:
>
> #define BALLOON_SHRINKER_AFTER_PAGE_CACHE (1 << 0)
> #define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE (1 << 1)
> ...
>
> uint64_t conservative_shrinker;
> if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) && global_node_page_state(NR_FILE_PAGES))
> return 0;
>
> For now, we probably only need BALLOON_SHRINKER_AFTER_PAGE_CACHE.
>
> Best,
> Wei
How about just making this a last resort thing to be compatible with
existing hypervisors? if someone wants to change behaviour
that really should use a feature bit ...
--
MST
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 9:31 ` Michael S. Tsirkin
@ 2020-02-06 9:43 ` Wang, Wei W
2020-02-06 11:26 ` Michael S. Tsirkin
0 siblings, 1 reply; 17+ messages in thread
From: Wang, Wei W @ 2020-02-06 9:43 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, tysand@google.com,
david@redhat.com, alexander.h.duyck@linux.intel.com,
rientjes@google.com, mhocko@kernel.org, namit@vmware.com,
penguin-kernel@i-love.sakura.ne.jp
On Thursday, February 6, 2020 5:31 PM, Michael S. Tsirkin wrote:
>
> How about just making this a last resort thing to be compatible with existing
> hypervisors? if someone wants to change behaviour that really should use a
> feature bit ...
Yeah, sounds good to me to control via feature bits.
Best,
Wei
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 9:43 ` Wang, Wei W
@ 2020-02-06 11:26 ` Michael S. Tsirkin
0 siblings, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2020-02-06 11:26 UTC (permalink / raw)
To: Wang, Wei W
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, tysand@google.com,
david@redhat.com, alexander.h.duyck@linux.intel.com,
rientjes@google.com, mhocko@kernel.org, namit@vmware.com,
penguin-kernel@i-love.sakura.ne.jp
On Thu, Feb 06, 2020 at 09:43:10AM +0000, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:31 PM, Michael S. Tsirkin wrote:
> >
> > How about just making this a last resort thing to be compatible with existing
> > hypervisors? if someone wants to change behaviour that really should use a
> > feature bit ...
>
> Yeah, sounds good to me to control via feature bits.
>
> Best,
> Wei
To clarify, shrinker use could be a feature bit. OOM behaviour was
there for years and has been used to dynamically size guests.
--
MST
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 8:01 [PATCH RFC] virtio_balloon: conservative balloon page shrinking Wei Wang
2020-02-06 9:04 ` Michael S. Tsirkin
@ 2020-02-06 9:09 ` David Hildenbrand
2020-02-06 9:28 ` Wang, Wei W
2020-02-08 12:32 ` Tetsuo Handa
2 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2020-02-06 9:09 UTC (permalink / raw)
To: Wei Wang, linux-kernel, virtualization
Cc: tysand, mst, alexander.h.duyck, rientjes, mhocko, namit,
penguin-kernel
On 06.02.20 09:01, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> ---
> drivers/virtio/virtio_balloon.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 93f995f6cf36..b4c5bb13a867 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
> static struct vfsmount *balloon_mnt;
> #endif
>
> +static bool conservative_shrinker = true;
> +module_param(conservative_shrinker, bool, 0644);
> +MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
> +
> enum virtio_balloon_vq {
> VIRTIO_BALLOON_VQ_INFLATE,
> VIRTIO_BALLOON_VQ_DEFLATE,
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> {
> unsigned long pages_freed = 0;
>
> + /* Balloon pages only gets shrunk when the pagecache depleted */
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + return 0;
> +
> /*
> * One invocation of leak_balloon can deflate at most
> * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> @@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> struct virtio_balloon, shrinker);
> unsigned long count;
>
> - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + count = 0;
> + else
> + count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
> count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
> return count;
>
so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
deflate the balloon when conservative_shrinker=true?
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 9:09 ` David Hildenbrand
@ 2020-02-06 9:28 ` Wang, Wei W
2020-02-06 9:32 ` David Hildenbrand
0 siblings, 1 reply; 17+ messages in thread
From: Wang, Wei W @ 2020-02-06 9:28 UTC (permalink / raw)
To: David Hildenbrand, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org
Cc: tysand@google.com, mst@redhat.com,
alexander.h.duyck@linux.intel.com, rientjes@google.com,
mhocko@kernel.org, namit@vmware.com,
penguin-kernel@I-love.SAKURA.ne.jp
On Thursday, February 6, 2020 5:10 PM, David Hildenbrand wrote:
> so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
> deflate the balloon when conservative_shrinker=true?
>
Should be. Need Tyler's help to test it.
Best,
Wei
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 9:28 ` Wang, Wei W
@ 2020-02-06 9:32 ` David Hildenbrand
2020-02-06 9:44 ` Wang, Wei W
0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2020-02-06 9:32 UTC (permalink / raw)
To: Wang, Wei W, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org
Cc: tysand@google.com, mst@redhat.com,
alexander.h.duyck@linux.intel.com, rientjes@google.com,
mhocko@kernel.org, namit@vmware.com,
penguin-kernel@I-love.SAKURA.ne.jp
On 06.02.20 10:28, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:10 PM, David Hildenbrand wrote:
>> so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
>> deflate the balloon when conservative_shrinker=true?
>>
>
> Should be. Need Tyler's help to test it.
>
If the page cache is empty, a drop_slab() will deflate the whole balloon
if I am not wrong.
In particular, an echo 3 > /proc/sys/vm/drop_caches
will first drop the page cache and then call drop_slab().
While I like the general idea, it looks more like a hack to me, to try
to teach the shrinker something it was not built for/does not support yet.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 17+ messages in thread
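The sequence David describes can be modelled in a few lines. This is a user-space sketch under loud assumptions: `struct guest_model` and both functions are hypothetical, and step 1 is assumed to empty the pagecache completely, which a real guest will not generally achieve.

```c
#include <stdbool.h>

/* Hypothetical model of the two quantities the discussion is about. */
struct guest_model {
	unsigned long nr_file_pages; /* stand-in for NR_FILE_PAGES */
	unsigned long balloon_pages; /* pages currently held by the balloon */
};

/* Models the patched scan path: refuse while any pagecache remains. */
static unsigned long model_shrink_balloon(struct guest_model *g,
					  bool conservative,
					  unsigned long nr_to_scan)
{
	unsigned long freed;

	if (conservative && g->nr_file_pages)
		return 0;

	freed = nr_to_scan < g->balloon_pages ? nr_to_scan : g->balloon_pages;
	g->balloon_pages -= freed;
	return freed;
}

/* Models "echo 3 > /proc/sys/vm/drop_caches": pagecache, then shrinkers. */
static unsigned long model_drop_caches_3(struct guest_model *g,
					 bool conservative)
{
	/* Step 1: drop the pagecache (assumed here to empty it fully). */
	g->nr_file_pages = 0;
	/* Step 2: the drop_slab() pass, which now finds the gate open. */
	return model_shrink_balloon(g, conservative, g->balloon_pages);
}
```

In this model the conservative gate holds under ordinary pressure, but the pagecache drop in step 1 opens it, so step 2 deflates the whole balloon.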
* RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 9:32 ` David Hildenbrand
@ 2020-02-06 9:44 ` Wang, Wei W
2020-02-06 9:49 ` David Hildenbrand
0 siblings, 1 reply; 17+ messages in thread
From: Wang, Wei W @ 2020-02-06 9:44 UTC (permalink / raw)
To: David Hildenbrand, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org
Cc: tysand@google.com, mst@redhat.com,
alexander.h.duyck@linux.intel.com, rientjes@google.com,
mhocko@kernel.org, namit@vmware.com,
penguin-kernel@I-love.SAKURA.ne.jp
On Thursday, February 6, 2020 5:32 PM, David Hildenbrand wrote:
>
> If the page cache is empty, a drop_slab() will deflate the whole balloon if I
> am not wrong.
>
> Especially, a echo 3 > /proc/sys/vm/drop_caches
>
> will first drop the page cache and then drop_slab()
Then that's a problem with "echo 3 > /proc/sys/vm/drop_caches" itself. It invokes other shrinkers as well; if that is considered an issue, it needs to be tweaked in mm.
Best,
Wei
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 9:44 ` Wang, Wei W
@ 2020-02-06 9:49 ` David Hildenbrand
0 siblings, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2020-02-06 9:49 UTC (permalink / raw)
To: Wang, Wei W, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org
Cc: tysand@google.com, mst@redhat.com,
alexander.h.duyck@linux.intel.com, rientjes@google.com,
mhocko@kernel.org, namit@vmware.com,
penguin-kernel@I-love.SAKURA.ne.jp
On 06.02.20 10:44, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:32 PM, David Hildenbrand wrote:
>>
>> If the page cache is empty, a drop_slab() will deflate the whole balloon if I
>> am not wrong.
>>
>> Especially, a echo 3 > /proc/sys/vm/drop_caches
>>
>> will first drop the page cache and then drop_slab()
>
> Then that's a problem with "echo 3 > /proc/sys/vm/drop_caches" itself. It invokes other shrinkers as well; if that is considered an issue, it needs to be tweaked in mm.
In short, I don't like this approach as long as a drop_slab() can
deflate the whole balloon and don't think this is the right approach then.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-06 8:01 [PATCH RFC] virtio_balloon: conservative balloon page shrinking Wei Wang
2020-02-06 9:04 ` Michael S. Tsirkin
2020-02-06 9:09 ` David Hildenbrand
@ 2020-02-08 12:32 ` Tetsuo Handa
2020-02-10 3:13 ` Wang, Wei W
2 siblings, 1 reply; 17+ messages in thread
From: Tetsuo Handa @ 2020-02-08 12:32 UTC (permalink / raw)
To: Wei Wang, linux-kernel, virtualization
Cc: tysand, mst, david, alexander.h.duyck, rientjes, mhocko, namit
On 2020/02/06 17:01, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> {
> unsigned long pages_freed = 0;
>
> + /* Balloon pages only gets shrunk when the pagecache depleted */
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + return 0;
> +
Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and node-B's
NR_FILE_PAGES is not 0, but the allocation request which triggered this shrinker
wants to allocate only from node-B" happen? Can some thread keep this shrinker
non-functional by continually increasing NR_FILE_PAGES?
Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
the guest could start reclaiming memory based on host's request (like OOM
notifier chain) which is issued when host thinks that host is getting close
to OOM and thus guests should start returning their unused memory to host.
Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
condition" is also possible.
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-08 12:32 ` Tetsuo Handa
@ 2020-02-10 3:13 ` Wang, Wei W
2020-02-10 3:57 ` Tetsuo Handa
0 siblings, 1 reply; 17+ messages in thread
From: Wang, Wei W @ 2020-02-10 3:13 UTC (permalink / raw)
To: Tetsuo Handa, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org
Cc: mst@redhat.com, mhocko@kernel.org, tysand@google.com,
namit@vmware.com, rientjes@google.com,
alexander.h.duyck@linux.intel.com
On Saturday, February 8, 2020 8:33 PM, Tetsuo Handa wrote:
>
> Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and
> node-B's NR_FILE_PAGES is not 0, but allocation request which triggered this
> shrinker wants to allocate from only node-B" happen?
No, it's a global counter.
>Can some thread keep
> this shrinker defunctional by keep increasing NR_FILE_PAGES?
Yes. Actually that is our intention: as long as there are pagecache pages,
balloon pages are not reclaimed.
>
> Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
> the guest could start reclaiming memory based on host's request (like OOM
> notifier chain) which is issued when host thinks that host is getting close to
> OOM and thus guests should start returning their unused memory to host.
> Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
> condition" is also possible.
That's about host usage. The host-side management software decides when to issue a balloon request (either periodically or event driven), so I don't think there is anything we need to do in the balloon driver here.
Best,
Wei
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-10 3:13 ` Wang, Wei W
@ 2020-02-10 3:57 ` Tetsuo Handa
2020-02-10 7:27 ` Wang, Wei W
0 siblings, 1 reply; 17+ messages in thread
From: Tetsuo Handa @ 2020-02-10 3:57 UTC (permalink / raw)
To: Wang, Wei W
Cc: mst@redhat.com, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, tysand@google.com,
namit@vmware.com, rientjes@google.com,
alexander.h.duyck@linux.intel.com, mhocko@kernel.org
Wang, Wei W wrote:
> On Saturday, February 8, 2020 8:33 PM, Tetsuo Handa wrote:
> >
> > Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and
> > node-B's NR_FILE_PAGES is not 0, but allocation request which triggered this
> > shrinker wants to allocate from only node-B" happen?
>
> No, it's a global counter.
>
> >Can some thread keep
> > this shrinker defunctional by keep increasing NR_FILE_PAGES?
>
> Yes. Actually it's our intention - as long as there are pagecache pages,
> balloon pages are avoided to be reclaimed.
Then, "node-A's NR_FILE_PAGES is already 0 and node-B's NR_FILE_PAGES is not 0, but
allocation request which triggered this shrinker wants to allocate from only node-A"
would be confused by this change, for the pagecache pages for allocating thread's
interested node are already depleted but the balloon cannot shrink when it should
because the pagecache pages for allocating thread's uninterested nodes are not yet
depleted.
>
>
> >
> > Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
> > the guest could start reclaiming memory based on host's request (like OOM
> > notifier chain) which is issued when host thinks that host is getting close to
> > OOM and thus guests should start returning their unused memory to host.
> > Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
> > condition" is also possible.
>
> That's about the host usages. The host side management software decides when to
> issue a request to balloon (either periodically or event driven), I think there
> isn't anything we need to do in the balloon driver here.
Well, my comment is rather: "Do not try to reserve guest's memory. In other words,
do not try to maintain balloons on the guest side. Since host would be able to cache
file data on the host's cache, guests would be able to quickly fetch file data from
host's cache via normal I/O requests." ;-)
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-10 3:57 ` Tetsuo Handa
@ 2020-02-10 7:27 ` Wang, Wei W
2020-02-11 14:18 ` Tetsuo Handa
0 siblings, 1 reply; 17+ messages in thread
From: Wang, Wei W @ 2020-02-10 7:27 UTC (permalink / raw)
To: Tetsuo Handa
Cc: mst@redhat.com, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, tysand@google.com,
namit@vmware.com, rientjes@google.com,
alexander.h.duyck@linux.intel.com, mhocko@kernel.org
On Monday, February 10, 2020 11:57 AM, Tetsuo Handa wrote:
> Then, "node-A's NR_FILE_PAGES is already 0 and node-B's NR_FILE_PAGES is
> not 0, but allocation request which triggered this shrinker wants to allocate
> from only node-A"
> would be confused by this change, for the pagecache pages for allocating
> thread's interested node are already depleted but the balloon cannot shrink
> when it should because the pagecache pages for allocating thread's
> uninterested nodes are not yet depleted.
The existing balloon isn't NUMA aware. Regarding "but the balloon cannot shrink": even if we
let the balloon shrink, it could release pages from the uninterested node.
Once we have a NUMA-aware balloon, we could further update the shrinker
to check the per-node counter, node_page_state(NR_FILE_PAGES).
>
> >
> Well, my comment is rather: "Do not try to reserve guest's memory. In other
> words, do not try to maintain balloons on the guest side. Since host would
> be able to cache file data on the host's cache, guests would be able to
> quickly fetch file data from host's cache via normal I/O requests." ;-)
Didn't get this one. The discussion was about guest pagecache pages vs. guest balloon pages.
Why is the host's pagecache relevant here?
Best,
Wei
^ permalink raw reply [flat|nested] 17+ messages in thread
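The difference between the global check in the patch and a per-node check can be sketched as follows. This is a user-space model only: all three functions are hypothetical, the array stands in for per-node NR_FILE_PAGES counters, and nothing here reflects how a NUMA-aware balloon would actually be built.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sum of per-node counters; models global_node_page_state(NR_FILE_PAGES). */
static unsigned long model_global_file_pages(const unsigned long *per_node,
					     size_t nr_nodes)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < nr_nodes; i++)
		sum += per_node[i];
	return sum;
}

/* Global gate, as in the patch: any file page on any node blocks shrinking. */
static bool model_may_shrink_global(const unsigned long *per_node,
				    size_t nr_nodes)
{
	return model_global_file_pages(per_node, nr_nodes) == 0;
}

/* Per-node gate: only the pagecache of the node of interest matters. */
static bool model_may_shrink_node(const unsigned long *per_node, size_t nid)
{
	return per_node[nid] == 0;
}
```

With node 0 at zero file pages and node 1 at 300, the global gate stays closed even for an allocation that can only be satisfied from node 0, which is the case Tetsuo raises; the per-node gate would open for node 0.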
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-10 7:27 ` Wang, Wei W
@ 2020-02-11 14:18 ` Tetsuo Handa
2020-02-14 20:22 ` Tyler Sanderson via Virtualization
0 siblings, 1 reply; 17+ messages in thread
From: Tetsuo Handa @ 2020-02-11 14:18 UTC (permalink / raw)
To: Wang, Wei W
Cc: mst@redhat.com, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, tysand@google.com,
namit@vmware.com, rientjes@google.com,
alexander.h.duyck@linux.intel.com, mhocko@kernel.org
On 2020/02/10 16:27, Wang, Wei W wrote:
>> Well, my comment is rather: "Do not try to reserve guest's memory. In other
>> words, do not try to maintain balloons on the guest side. Since host would
>> be able to cache file data on the host's cache, guests would be able to
>> quickly fetch file data from host's cache via normal I/O requests." ;-)
>
> Didn't get this one. The discussion was about guest pagecache pages vs. guest balloon pages.
> Why is the host's pagecache relevant here?
I'm expecting a mode: "Guests should try to minimize pagecache pages (and teach
host to treat reclaimed pages as if POSIX_FADV_DONTNEED) instead of managing
guest balloon pages". In other words, as if
while :; do sleep 5; echo 1 > /proc/sys/vm/drop_caches; done
is running in the guest's kernel. And as if
echo 2 > /proc/sys/vm/drop_caches
is triggered in the guest's kernel when host requested guests to reclaim
memory. No long-life balloons. Guest balloons do not need to care about
NUMA. Just leave the management of pagecache pages to the host.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking
2020-02-11 14:18 ` Tetsuo Handa
@ 2020-02-14 20:22 ` Tyler Sanderson via Virtualization
0 siblings, 0 replies; 17+ messages in thread
From: Tyler Sanderson via Virtualization @ 2020-02-14 20:22 UTC (permalink / raw)
To: Tetsuo Handa
Cc: mst@redhat.com, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, namit@vmware.com,
rientjes@google.com, alexander.h.duyck@linux.intel.com,
mhocko@kernel.org
Sorry for the slow reply.
Re: Module parameters: I prefer not to have module parameters since they
are controlled by the guest. In general, in virtualized environments the
admins controlling the hypervisor are more knowledgeable about these things
than the users. A feature bit seems useful so that the host knows what the
guest behavior will be, and can change the host side implementation to make
the experience good for the guest.
I worry that requiring global_node_page_state(NR_FILE_PAGES) == 0 before
allowing deflation is too strict. One of the benefits of the shrinker API
is that it is invoked before vmscan.c has gone through heroic efforts to
reclaim the world. I'm not familiar enough with the code to judge how this
patch impacts this, but would it be beneficial to allow deflation when
vmscan.c is trying "too hard" to reclaim pages? Is there some softer
condition than "global_node_page_state(NR_FILE_PAGES) == 0"?
For my own understanding, does this patch work by returning 0 pages when
asked for pages? Are there cases where that results in an unnecessary OOM?
For example, if global_node_page_state(NR_FILE_PAGES) == 1, and the guest
needs 2?
Regarding other shrinkers (like KVM MMU cache): Reclaiming other shrinkers
first would match the behavior of DEFLATE_ON_OOM when it was using the OOM
notifier callback. On the other hand (awkwardly), the memory stats reported
on the stats queue for "available memory" do not count shrinker memory as
"available". So a balloon implementation that aims to reclaim some amount
of available memory would not be able to tell how much memory was in the
shrinkers and probably doesn't expect to reclaim them. For this reason, I
think only looking at page cache size is the right choice. There should be
a 1:1 relationship between stats reported and when DEFLATE_ON_OOM is
invoked. Maybe in the future we add another stat that reports shrinker
sizes, in which case we should also add a feature bit that allows other
shrinkers to be pressured.
Regarding NUMA awareness: I agree it's out of scope for this patch since
all implementations so far are not NUMA aware.
Would it be possible to backport this patch to 4.19, which is when the change
to the shrinker API was made?
On Tue, Feb 11, 2020 at 6:20 AM Tetsuo Handa <
penguin-kernel@i-love.sakura.ne.jp> wrote:
> On 2020/02/10 16:27, Wang, Wei W wrote:
> >> Well, my comment is rather: "Do not try to reserve guest's memory. In
> other
> >> words, do not try to maintain balloons on the guest side. Since host
> would
> >> be able to cache file data on the host's cache, guests would be able to
> >> quickly fetch file data from host's cache via normal I/O requests." ;-)
> >
> > Didn't get this one. The discussion was about guest pagecache pages vs.
> guest balloon pages.
> > Why is the host's pagecache relevant here?
>
> I'm expecting a mode: "Guests should try to minimize pagecache pages (and
> teach
> host to treat reclaimed pages as if POSIX_FADV_DONTNEED) instead of
> managing
> guest balloon pages". In other words, as if
>
> while :; do sleep 5; echo 1 > /proc/sys/vm/drop_caches; done
>
> is running in the guest's kernel. And as if
>
> echo 2 > /proc/sys/vm/drop_caches
>
> is triggered in the guest's kernel when host requested guests to reclaim
> memory. No long-life balloons. Guest balloons do not need to care about
> NUMA. Just leave the management of pagecache pages to the host.
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread