* Re: [PATCH RESEND] mm: don't zero ballooned pages
2017-08-03 11:59 [PATCH RESEND] mm: don't zero ballooned pages Wei Wang
@ 2017-08-03 12:24 ` Michael S. Tsirkin
2017-08-03 12:54 ` Michal Hocko
2017-08-07 8:44 ` David Hildenbrand
2 siblings, 0 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2017-08-03 12:24 UTC (permalink / raw)
To: Wei Wang
Cc: linux-kernel, linux-mm, virtualization, mhocko, zhenwei.pi, akpm,
dave.hansen, mawilcox
On Thu, Aug 03, 2017 at 07:59:17PM +0800, Wei Wang wrote:
> This patch is a revert of 'commit bb01b64cfab7 ("mm/balloon_compaction.c:
> enqueue zero page to balloon device")'
>
> Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> shouldn't be given to the host ksmd to scan. Therefore, it is not
> necessary to zero ballooned pages, which is very time consuming when
> the page amount is large. The ongoing fast balloon tests show that the
> time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
> __GFP_ZERO added. So, this patch removes the flag.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Michael S. Tsirkin <mst@redhat.com>
Fixes: bb01b64cfab7 ("mm/balloon_compaction.c: enqueue zero page to balloon device")
Looks like hypervisor is better placed to zero these if it wants to.
If it can't for some reason, this change would need a feature bit
to avoid adding extra work for all guests.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
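To make the feature-bit idea concrete, here is a minimal userspace sketch of how a guest driver could zero ballooned pages only when the host has asked for it. Every name and value below (DEMO_GFP_*, DEMO_BALLOON_F_ZERO_PAGES, demo_balloon_gfp) is hypothetical and chosen for illustration; none of it is an existing kernel or virtio-balloon definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only; they are NOT the
 * kernel's real __GFP_* constants. */
#define DEMO_GFP_NOMEMALLOC (1u << 0)
#define DEMO_GFP_NORETRY    (1u << 1)
#define DEMO_GFP_ZERO       (1u << 2)

/* Hypothetical feature bit meaning "host wants ballooned pages zeroed". */
#define DEMO_BALLOON_F_ZERO_PAGES 5

/* Build the allocation flags for a balloon page. Zeroing is requested
 * only if the hypothetical feature bit was negotiated. */
uint32_t demo_balloon_gfp(uint64_t negotiated_features)
{
    uint32_t gfp = DEMO_GFP_NOMEMALLOC | DEMO_GFP_NORETRY;

    assert(DEMO_BALLOON_F_ZERO_PAGES < 64);
    if (negotiated_features & (UINT64_C(1) << DEMO_BALLOON_F_ZERO_PAGES))
        gfp |= DEMO_GFP_ZERO;
    return gfp;
}
```

With this shape, guests whose host never sets the bit skip the zeroing cost entirely.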
> ---
> mm/balloon_compaction.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> index 9075aa5..b06d9fe 100644
> --- a/mm/balloon_compaction.c
> +++ b/mm/balloon_compaction.c
> @@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
> {
> unsigned long flags;
> struct page *page = alloc_page(balloon_mapping_gfp_mask() |
> - __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
> + __GFP_NOMEMALLOC | __GFP_NORETRY);
> if (!page)
> return NULL;
>
> --
> 2.7.4
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH RESEND] mm: don't zero ballooned pages
2017-08-03 11:59 [PATCH RESEND] mm: don't zero ballooned pages Wei Wang
2017-08-03 12:24 ` Michael S. Tsirkin
@ 2017-08-03 12:54 ` Michal Hocko
2017-08-03 13:18 ` Wei Wang
2017-08-07 8:44 ` David Hildenbrand
2 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2017-08-03 12:54 UTC (permalink / raw)
To: Wei Wang
Cc: linux-kernel, linux-mm, virtualization, mst, zhenwei.pi, akpm,
dave.hansen, mawilcox
On Thu 03-08-17 19:59:17, Wei Wang wrote:
> This patch is a revert of 'commit bb01b64cfab7 ("mm/balloon_compaction.c:
> enqueue zero page to balloon device")'
>
> Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> shouldn't be given to the host ksmd to scan.
I find MADV_DONTNEED reference still quite confusing. What do you think
about the following wording instead:
"
Zeroing balloon pages is rather time consuming, especially when a lot of
pages are in flight. E.g. 7GB worth of ballooned memory takes 2.8s with
__GFP_ZERO while it takes ~491ms without it. The original commit argued
that zeroing will help ksmd to merge these pages on the host, but this
argument assumes that the host actually marks balloon pages for ksm,
which is not universally true. So we pay a performance penalty for
something that might not even be used in the end, which is wrong. The
host can zero out pages on its own when there is a need.
"
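For a sense of scale, the two timings quoted above can be turned into a per-page cost. The sketch below assumes 4 KiB guest pages and reads "7GB" as 7 GiB; neither is stated in the thread. The helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole pages in a region of the given size. */
uint64_t pages_in(uint64_t bytes, uint64_t page_size)
{
    assert(page_size > 0);
    return bytes / page_size;
}

/* Extra nanoseconds spent per page when zeroing is enabled, given the
 * total inflate time (in ms) with and without __GFP_ZERO. */
double extra_ns_per_page(double with_zero_ms, double without_zero_ms,
                         uint64_t pages)
{
    assert(pages > 0);
    return (with_zero_ms - without_zero_ms) * 1e6 / (double)pages;
}
```

Under those assumptions, pages_in(7ULL << 30, 4096) is 1835008 pages, and extra_ns_per_page(2800.0, 491.0, 1835008) comes to roughly 1258 ns, i.e. about 1.26 microseconds of zeroing per page, or an overall slowdown of about 5.7x.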
> Therefore, it is not
> necessary to zero ballooned pages, which is very time consuming when
> the page amount is large. The ongoing fast balloon tests show that the
> time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
> __GFP_ZERO added. So, this patch removes the flag.
The only reason why unconditional zeroing makes some sense is
data leak protection (a guest doesn't want to leak potentially sensitive
data to a malicious guest). I am not sure such a threat applies here
though.
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Michael S. Tsirkin <mst@redhat.com>
other than that
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/balloon_compaction.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> index 9075aa5..b06d9fe 100644
> --- a/mm/balloon_compaction.c
> +++ b/mm/balloon_compaction.c
> @@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
> {
> unsigned long flags;
> struct page *page = alloc_page(balloon_mapping_gfp_mask() |
> - __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
> + __GFP_NOMEMALLOC | __GFP_NORETRY);
> if (!page)
> return NULL;
>
> --
> 2.7.4
--
Michal Hocko
SUSE Labs
* Re: [PATCH RESEND] mm: don't zero ballooned pages
2017-08-03 12:54 ` Michal Hocko
@ 2017-08-03 13:18 ` Wei Wang
0 siblings, 0 replies; 8+ messages in thread
From: Wei Wang @ 2017-08-03 13:18 UTC (permalink / raw)
To: Michal Hocko
Cc: linux-kernel, linux-mm, virtualization, mst, zhenwei.pi, akpm,
dave.hansen, mawilcox
On 08/03/2017 08:54 PM, Michal Hocko wrote:
> On Thu 03-08-17 19:59:17, Wei Wang wrote:
>> This patch is a revert of 'commit bb01b64cfab7 ("mm/balloon_compaction.c:
>> enqueue zero page to balloon device")'
>>
>> Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
>> shouldn't be given to the host ksmd to scan.
> I find MADV_DONTNEED reference still quite confusing. What do you think
> about the following wording instead:
> "
> Zeroing balloon pages is rather time consuming, especially when a lot of
> pages are in flight. E.g. 7GB worth of ballooned memory takes 2.8s with
> __GFP_ZERO while it takes ~491ms without it. The original commit argued
> that zeroing will help ksmd to merge these pages on the host, but this
> argument assumes that the host actually marks balloon pages for ksm,
> which is not universally true. So we pay a performance penalty for
> something that might not even be used in the end, which is wrong. The
> host can zero out pages on its own when there is a need.
> "
I think it looks good. Thanks.
>> Therefore, it is not
>> necessary to zero ballooned pages, which is very time consuming when
>> the page amount is large. The ongoing fast balloon tests show that the
>> time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
>> __GFP_ZERO added. So, this patch removes the flag.
> The only reason why unconditional zeroing makes some sense is
> data leak protection (a guest doesn't want to leak potentially sensitive
> data to a malicious guest). I am not sure such a threat applies here
> though.
I think the unwashed contents left in the balloon pages (and in free pages)
should be treated as non-confidential: if the guest application has
confidential content in its memory, the application itself should zero it
before giving that memory back to the guest kernel.
Best,
Wei
* Re: [PATCH RESEND] mm: don't zero ballooned pages
2017-08-03 11:59 [PATCH RESEND] mm: don't zero ballooned pages Wei Wang
2017-08-03 12:24 ` Michael S. Tsirkin
2017-08-03 12:54 ` Michal Hocko
@ 2017-08-07 8:44 ` David Hildenbrand
2017-08-07 9:25 ` Michal Hocko
2017-08-07 9:35 ` Wei Wang
2 siblings, 2 replies; 8+ messages in thread
From: David Hildenbrand @ 2017-08-07 8:44 UTC (permalink / raw)
To: Wei Wang, linux-kernel, linux-mm, virtualization, mhocko, mst,
zhenwei.pi
Cc: dave.hansen, akpm, mawilcox, Andrea Arcangeli
On 03.08.2017 13:59, Wei Wang wrote:
> This patch is a revert of 'commit bb01b64cfab7 ("mm/balloon_compaction.c:
> enqueue zero page to balloon device")'
>
> Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> shouldn't be given to the host ksmd to scan. Therefore, it is not
> necessary to zero ballooned pages, which is very time consuming when
> the page amount is large. The ongoing fast balloon tests show that the
> time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
> __GFP_ZERO added. So, this patch removes the flag.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> ---
> mm/balloon_compaction.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> index 9075aa5..b06d9fe 100644
> --- a/mm/balloon_compaction.c
> +++ b/mm/balloon_compaction.c
> @@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
> {
> unsigned long flags;
> struct page *page = alloc_page(balloon_mapping_gfp_mask() |
> - __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
> + __GFP_NOMEMALLOC | __GFP_NORETRY);
> if (!page)
> return NULL;
>
>
Your assumption here is that the hypervisor will always supply a zero
page. Unfortunately, this assumption is wrong (and it stems from the
lack of different page size support in virtio-balloon).
Think about these examples:
1. Guest is backed by huge pages (hugetlbfs). Ballooning kicks in.
MADV_DONTNEED is simply ignored in the hypervisor (hugetlbfs requires
fallocate punch hole). Also, trying to zap 4k on e.g. 1MB pages will
simply be ignored.
2. Guest on PPC uses 4k pages. Hypervisor uses 64k pages. Trying to
MADV_DONTNEED 4k on 64k pages will simply be ignored.
So unfortunately, zeroing the page is the right thing to do to cover all
cases.
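For reference, the base-page behaviour that the patch description relies on can be observed from userspace. The sketch below is not hypervisor code and does not cover the hugetlbfs or mixed page size cases listed above; it only checks that, on Linux, a private anonymous mapping reads back as zeros after madvise(MADV_DONTNEED), as documented in madvise(2). The function name is made up for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a dirtied private anonymous page reads back as all zeros
 * after MADV_DONTNEED, 0 if stale data is still visible, -1 on error. */
int dontneed_zeroes_page(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0)
        return -1;
    size_t len = (size_t)psz;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);                 /* dirty the page */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int all_zero = 1;
    for (size_t i = 0; i < len; i++) {    /* next access refaults the page */
        if (p[i] != 0) {
            all_zero = 0;
            break;
        }
    }
    munmap(p, len);
    return all_zero;
}
```

The cases raised in this message are exactly the ones where the range handed to MADV_DONTNEED is smaller than the host's backing page, so this check says nothing about them.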
--
Thanks,
David
* Re: [PATCH RESEND] mm: don't zero ballooned pages
2017-08-07 8:44 ` David Hildenbrand
@ 2017-08-07 9:25 ` Michal Hocko
2017-08-07 9:37 ` David Hildenbrand
2017-08-07 9:35 ` Wei Wang
1 sibling, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2017-08-07 9:25 UTC (permalink / raw)
To: David Hildenbrand
Cc: Wei Wang, linux-kernel, linux-mm, virtualization, mst, zhenwei.pi,
dave.hansen, akpm, mawilcox, Andrea Arcangeli
On Mon 07-08-17 10:44:50, David Hildenbrand wrote:
> On 03.08.2017 13:59, Wei Wang wrote:
> > This patch is a revert of 'commit bb01b64cfab7 ("mm/balloon_compaction.c:
> > enqueue zero page to balloon device")'
> >
> > Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> > shouldn't be given to the host ksmd to scan. Therefore, it is not
> > necessary to zero ballooned pages, which is very time consuming when
> > the page amount is large. The ongoing fast balloon tests show that the
> > time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
> > __GFP_ZERO added. So, this patch removes the flag.
> >
> > Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > mm/balloon_compaction.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> > index 9075aa5..b06d9fe 100644
> > --- a/mm/balloon_compaction.c
> > +++ b/mm/balloon_compaction.c
> > @@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
> > {
> > unsigned long flags;
> > struct page *page = alloc_page(balloon_mapping_gfp_mask() |
> > - __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
> > + __GFP_NOMEMALLOC | __GFP_NORETRY);
> > if (!page)
> > return NULL;
> >
> >
>
> Your assumption here is that the hypervisor will always supply a zero
> page. Unfortunately, this assumption is wrong (and it stems from the
> lack of different page size support in virtio-balloon).
>
> Think about these examples:
>
> 1. Guest is backed by huge pages (hugetlbfs). Ballooning kicks in.
>
> MADV_DONTNEED is simply ignored in the hypervisor (hugetlbfs requires
> fallocate punch hole). Also, trying to zap 4k on e.g. 1MB pages will
> simply be ignored.
>
> 2. Guest on PPC uses 4k pages. Hypervisor uses 64k pages. Trying to
> MADV_DONTNEED 4k on 64k pages will simply be ignored.
>
> So unfortunately, zeroing the page is the right thing to do to cover all
> cases.
Maybe it is my absolute lack of familiarity with what the host actually
does with balloon pages but I fail to see why the above matters at all.
ksm will not try to merge sub page units (4k for hugetlb or a large base
page). And if you need to hide the guest contents then the host can
clear the respective subpage just fine. So could you be more explicit
why MADV_DONTNEED matters at all? Also does any host actually share sub
pages between different guests? This sounds like a bad idea to me in
general.
--
Michal Hocko
SUSE Labs
* Re: [PATCH RESEND] mm: don't zero ballooned pages
2017-08-07 9:25 ` Michal Hocko
@ 2017-08-07 9:37 ` David Hildenbrand
0 siblings, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2017-08-07 9:37 UTC (permalink / raw)
To: Michal Hocko
Cc: Wei Wang, linux-kernel, linux-mm, virtualization, mst, zhenwei.pi,
dave.hansen, akpm, mawilcox, Andrea Arcangeli
> Maybe it is my absolute lack of familiarity with what the host actually
> does with balloon pages but I fail to see why the above matters at all.
> ksm will not try to merge sub page units (4k for hugetlb or a large base
> page). And if you need to hide the guest contents then the host can
> clear the respective subpage just fine. So could you be more explicit
> why MADV_DONTNEED matters at all? Also does any host actually share sub
> pages between different guests? This sounds like a bad idea to me in
> general.
>
Okay, I think I got the issue wrong. I thought that the original patch
tried to also fix a corner case where the guest would assume that it
would get supplied zero pages afterwards. Please ignore the noise. :)
--
Thanks,
David
* Re: [PATCH RESEND] mm: don't zero ballooned pages
2017-08-07 8:44 ` David Hildenbrand
2017-08-07 9:25 ` Michal Hocko
@ 2017-08-07 9:35 ` Wei Wang
1 sibling, 0 replies; 8+ messages in thread
From: Wei Wang @ 2017-08-07 9:35 UTC (permalink / raw)
To: David Hildenbrand, linux-kernel, linux-mm, virtualization, mhocko,
mst, zhenwei.pi
Cc: dave.hansen, akpm, mawilcox, Andrea Arcangeli
On 08/07/2017 04:44 PM, David Hildenbrand wrote:
> On 03.08.2017 13:59, Wei Wang wrote:
>> This patch is a revert of 'commit bb01b64cfab7 ("mm/balloon_compaction.c:
>> enqueue zero page to balloon device")'
>>
>> Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
>> shouldn't be given to the host ksmd to scan. Therefore, it is not
>> necessary to zero ballooned pages, which is very time consuming when
>> the page amount is large. The ongoing fast balloon tests show that the
>> time to balloon 7G pages is increased from ~491ms to 2.8 seconds with
>> __GFP_ZERO added. So, this patch removes the flag.
>>
>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> ---
>> mm/balloon_compaction.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
>> index 9075aa5..b06d9fe 100644
>> --- a/mm/balloon_compaction.c
>> +++ b/mm/balloon_compaction.c
>> @@ -24,7 +24,7 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
>> {
>> unsigned long flags;
>> struct page *page = alloc_page(balloon_mapping_gfp_mask() |
>> - __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_ZERO);
>> + __GFP_NOMEMALLOC | __GFP_NORETRY);
>> if (!page)
>> return NULL;
>>
>>
> Your assumption here is that the hypervisor will always supply a zero
> page. Unfortunately, this assumption is wrong (and it stems from the
> lack of different page size support in virtio-balloon).
I think this is something we can improve in the balloon.
For example, the balloon request from the device should be aligned
to the host page size before it is sent to the guest driver.
On PPC, if the command requests 140K of memory to inflate, it can
be aligned down to 128K.
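The alignment described here amounts to rounding the request down to a multiple of the host page size. A minimal sketch, assuming the host page size is a power of two; balloon_align_down is a name invented for this example, not an existing balloon interface:

```c
#include <assert.h>
#include <stdint.h>

/* Round an inflate request down to a whole number of host pages.
 * host_page_size must be a nonzero power of two. */
uint64_t balloon_align_down(uint64_t request_bytes, uint64_t host_page_size)
{
    assert(host_page_size != 0 &&
           (host_page_size & (host_page_size - 1)) == 0);
    return request_bytes & ~(host_page_size - 1);
}
```

For the PPC example, balloon_align_down(140 * 1024, 64 * 1024) yields 131072 bytes, i.e. 128K. A request smaller than one host page rounds down to 0, so such a request would not inflate the balloon at all.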
>
> Think about these examples:
>
> 1. Guest is backed by huge pages (hugetlbfs). Ballooning kicks in.
>
> MADV_DONTNEED is simply ignored in the hypervisor (hugetlbfs requires
> fallocate punch hole). Also, trying to zap 4k on e.g. 1MB pages will
> simply be ignored.
For the hugetlbfs case, I think the balloon size can be aligned to
the huge page size (e.g. 2M or 1GB).
Best,
Wei