From: "Michael S. Tsirkin" <mst@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: mawilcox@microsoft.com, dave.hansen@intel.com,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, ZhenweiPi <zhenwei.pi@youruncloud.com>
Subject: Re: [PATCH] mm: don't zero ballooned pages
Date: Tue, 1 Aug 2017 18:38:54 +0300
Message-ID: <20170801183518-mutt-send-email-mst@kernel.org>
In-Reply-To: <20170731083724.GF15767@dhcp22.suse.cz>

On Mon, Jul 31, 2017 at 10:37:24AM +0200, Michal Hocko wrote:
> On Mon 31-07-17 16:23:26, ZhenweiPi wrote:
> > On 07/31/2017 03:51 PM, Michal Hocko wrote:
> > 
> > >On Mon 31-07-17 15:41:49, Wei Wang wrote:
> > >>>On 07/31/2017 02:55 PM, Michal Hocko wrote:
> > >>>> >On Mon 31-07-17 12:13:33, Wei Wang wrote:
> > >>>>> >>Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> > >>>>> >>shouldn't be given to the host ksmd to scan.
> > >>>> >Could you point me where this MADV_DONTNEED is done, please?
> > >>>
> > >>>Sure. It's done in the hypervisor when the balloon pages are received.
> > >>>
> > >>>Please see line 40 at
> > >>>https://github.com/qemu/qemu/blob/master/hw/virtio/virtio-balloon.c
> > >And one more thing. I am not familiar with ksm much. But how is
> > >MADV_DONTNEED even helping? This madvise is not sticky - aka it will
> > >unmap the range without leaving any note behind. AFAICS the only way
> > >to have vma scanned is to have VM_MERGEABLE and that is an opt in:
> > >See Documentation/vm/ksm.txt
> > >"
> > >KSM only operates on those areas of address space which an application
> > >has advised to be likely candidates for merging, by using the madvise(2)
> > >system call: int madvise(addr, length, MADV_MERGEABLE).
> > >"
> > >
> > >So what exactly is going on here? The original patch looks highly
> > >suspicious as well. If somebody wants to make that memory mergeable then
> > >the user of that memory should zero them out.
> > 
> > The kernel starts a kthread named "ksmd". ksmd scans VM_MERGEABLE
> > memory and merges identical pages (identical meaning memcmp(page1,
> > page2, PAGESIZE) == 0).
> > 
> > The guest cannot use ballooned pages, and these pages will not be accessed
> > for a long time. kswapd on the host will swap these pages out to free
> > more memory.
> > 
> > KSM performs better than swapping. Presently, pages in the balloon
> > device hold random values, so they usually cannot be merged.
> > Enqueuing zero pages resolves this problem.
> > 
> > Because MADV_DONTNEED depends on host OS and hypervisor capabilities,
> > I prefer to enqueue zero pages to the balloon device, so I made this patch.

I think you should have the hypervisor zero them out then, if it wants to. Seems cleaner.

> 
> So why exactly are we zeroing pages (and paying some cost for that) in
> the guest when we do not know what the host actually does with them?

I suspect this is some special hypervisor that somehow benefits from
this patch. It should just use a feature bit for its special needs
I think.

Michal is also exactly right that patches like this should come
with some performance numbers.
I'll post a patch adding virtio lists for mm/balloon_compaction.c
so that we notice when people tweak it like that.

> -- 
> Michal Hocko
> SUSE Labs
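
For readers following the madvise(2) points quoted above, here is a minimal
userspace sketch in C. It is an illustration only, not code from the patch or
from QEMU; the helper names (pages_identical, dontneed_reads_back_zero,
mark_mergeable) are hypothetical. It assumes Linux and a private anonymous
mapping. It shows three things: the memcmp-style notion of "same page",
that a range dropped with MADV_DONTNEED reads back as zeros on the next
access, and that KSM scanning is an explicit opt-in via MADV_MERGEABLE.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* "Same page" in the sense quoted above: byte-for-byte equal contents. */
int pages_identical(const void *a, const void *b, size_t page_size)
{
    return memcmp(a, b, page_size) == 0;
}

/*
 * Map one private anonymous page, dirty it, drop it with MADV_DONTNEED,
 * then read it back. Returns 1 if the page reads as all zeros, 0 if not,
 * and -1 if any system call fails.
 */
int dontneed_reads_back_zero(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1;
    size_t len = (size_t)ps;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xA5, len);               /* "random" guest contents */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int zero = 1;
    for (size_t i = 0; i < len; i++) {  /* touching the page refaults it */
        if (p[i] != 0) {
            zero = 0;
            break;
        }
    }
    munmap(p, len);
    return zero;
}

/*
 * Opt a range in to KSM scanning. Returns 0 on success and -1 if the
 * call fails or is unavailable (for example, a kernel without CONFIG_KSM).
 */
int mark_mergeable(void *addr, size_t len)
{
#ifdef MADV_MERGEABLE
    return madvise(addr, len, MADV_MERGEABLE);
#else
    (void)addr;
    (void)len;
    return -1;
#endif
}
```

Calling dontneed_reads_back_zero() from a small main() should return 1 on
Linux. Whether mark_mergeable() succeeds depends on the kernel
configuration, so its result should be checked rather than assumed.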
