Re: [RFC/T/D][PATCH 2/2] Linux/Guest cooperative unmapped page cache control

From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Avi Kivity <avi@redhat.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>, kvm <kvm@vger.kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC/T/D][PATCH 2/2] Linux/Guest cooperative unmapped page cache control
Date: Tue, 15 Jun 2010 15:48:55 +0530
Message-ID: <20100615101855.GC4306@balbir.in.ibm.com>
In-Reply-To: <4C174B7F.8070504@redhat.com>

* Avi Kivity <avi@redhat.com> [2010-06-15 12:44:31]:

> On 06/15/2010 10:49 AM, Balbir Singh wrote:
> >
> >>All we need is to select the right page to drop.
> >>
> >Do we need to drop to the granularity of the page to drop? I think
> >figuring out the class of pages and making sure that we don't write
> >our own reclaim logic, but work with what we have to identify the
> >class of pages is a good start.
> 
> Well, the class of pages are 'pages that are duplicated on the
> host'.  Unmapped page cache pages are 'pages that might be
> duplicated on the host'.  IMO, that's not close enough.
>

Agreed, but in practice the code drops not-so-frequently-used cache
pages. It still reuses the existing reclaim mechanism, but
prioritizes cached memory.
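
To make the selection order concrete, here is a toy sketch in Python.
It is not the patch and not kernel code; the Page fields and the
pick_victims() name are made up for illustration. It assumes the
candidate pages arrive in LRU order (coldest first) and simply prefers
unmapped page cache pages before falling back to anything else.

```python
from dataclasses import dataclass


@dataclass
class Page:
    pfn: int             # hypothetical page identifier
    is_page_cache: bool  # page belongs to the page cache
    mapped: bool         # page is mapped into some address space


def pick_victims(lru, nr_to_reclaim):
    """Pick pages to drop from `lru`, which is ordered coldest first.

    Pass 1 takes unmapped page cache pages in LRU order.
    Pass 2 falls back to the remaining pages, still in LRU order.
    """
    pages = list(lru)
    victims = [p for p in pages if p.is_page_cache and not p.mapped]
    victims = victims[:nr_to_reclaim]
    if len(victims) < nr_to_reclaim:
        chosen = {p.pfn for p in victims}
        for p in pages:
            if len(victims) == nr_to_reclaim:
                break
            if p.pfn not in chosen:
                victims.append(p)
    return victims
```

The ordering within each pass is still the LRU ordering; only the
class of page considered first changes.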
 
> >>How can the host tell if there is duplication?  It may know it has
> >>some pagecache, but it has no idea whether or to what extent guest
> >>pagecache duplicates host pagecache.
> >>
> >Well it is possible in host user space, I for example use memory
> >cgroup and through the stats I have a good idea of how much is duplicated.
> >I am of course making an assumption with my setup of the cached mode,
> >that the data in the guest page cache and page cache in the cgroup
> >will be duplicated to a large extent. I did some trivial experiments
> >like drop the data from the guest and look at the cost of bringing it
> >in and dropping the data from both guest and host and look at the
> >cost. I could see a difference.
> >
> >Unfortunately, I did not save the data, so I'll need to redo the
> >experiment.
> 
> I'm sure we can detect it experimentally, but how do we do it
> programmatically at run time (without dropping all the pages)?
> Situations change, and I don't think we can infer from a few
> experiments that we'll have a similar amount of sharing.  The cost
> of an incorrect decision is too high IMO (not that I think the
> kernel always chooses the right pages now, but I'd like to avoid
> regressions from the unvirtualized state).
> 
> btw, when running with a disk controller that has a very large
> cache, we might also see duplication between "guest" and host.  So,
> if this is a good idea, it shouldn't be enabled just for
> virtualization, but for any situation where we have a sizeable cache
> behind us.
> 

It depends. Once the disk controller has the cache and the pages in
the guest are not-so-frequently-used, we can drop them. Please remember
that we still use the LRU to identify these pages.

> >>It doesn't, really.  The host only has aggregate information about
> >>itself, and no information about the guest.
> >>
> >>Dropping duplicate pages would be good if we could identify them.
> >>Even then, it's better to drop the page from the host, not the
> >>guest, unless we know the same page is cached by multiple guests.
> >>
> >On the exact pages to drop, please see my comments above on the class
> >of pages to drop.
> 
> Well, we disagree about that.  There is some value in dropping
> duplicated pages (not always), but that's not what the patch does.
> It drops unmapped pagecache pages, which may or may not be
> duplicated.
> 
> >There are reasons for wanting to get the host to cache the data
> 
> There are also reasons to get the guest to cache the data - it's
> more efficient to access it in the guest.
> 
> >Unless the guest is using cache = none, the data will still hit the
> >host page cache
> >The host can do a better job of optimizing the writeouts
> 
> True, especially for non-raw storage.  But even there we have to
> fsync all the time to keep the metadata right.
> 
> >>But why would the guest voluntarily drop the cache?  If there is no
> >>memory pressure, dropping caches increases cpu overhead and latency
> >>even if the data is still cached on the host.
> >>
> >So, there are basically two approaches
> >
> >1. First patch, proactive - enabled by a boot option
> >2. When ballooned, we try to (please NOTE try to) reclaim cached pages
> >first. Failing which, we go after regular pages in the alloc_page()
> >call in the balloon driver.
> 
> Doesn't that mean you may evict a RU mapped page ahead of an LRU
> unmapped page, just in the hope that it is double-cached?
> 
> Maybe we need the guest and host to talk to each other about which
> pages to keep.
> 

Yeah.. I guess that falls into the domain of CMM.

> >>>2. Drop the cache on either a special balloon option, again the host
> >>>knows it caches that very same information, so it prefers to free that
> >>>up first.
> >>Dropping in response to pressure is good.  I'm just not convinced
> >>the patch helps in selecting the correct page to drop.
> >>
> >That is why I've presented data on the experiments I've run and
> >provided more arguments to back up the approach.
> 
> I'm still unconvinced, sorry.
> 

The reason for making this optional is to let administrators decide
how they want to use the memory in the system. In some situations it
might be a big no-no to waste memory; in others it might be
acceptable.
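
As a rough illustration of the optional behaviour during ballooning,
here is a toy Python model. It is not the balloon driver. The
function name, the prefer_cache flag and the two callbacks are
hypothetical stand-ins for "try to reclaim cached pages first, and if
that fails go after regular pages via alloc_page()".

```python
def inflate_balloon(nr_pages, reclaim_unmapped_cache, alloc_page,
                    prefer_cache=True):
    """Toy model of inflating a balloon by `nr_pages`.

    reclaim_unmapped_cache(n) -> number of cached pages actually freed
    alloc_page()              -> True if a regular page was obtained
    Both callbacks are assumptions made for this sketch.

    Returns (pages_from_cache, pages_from_alloc).
    """
    from_cache = 0
    if prefer_cache:
        # This is only a "try": reclaim may free fewer pages than asked.
        from_cache = min(nr_pages, reclaim_unmapped_cache(nr_pages))

    from_alloc = 0
    for _ in range(nr_pages - from_cache):
        if not alloc_page():
            break  # out of memory, stop inflating
        from_alloc += 1
    return from_cache, from_alloc
```

With prefer_cache=False the model degenerates to the plain
allocation path, which is the choice left to the administrator.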

-- 
	Three Cheers,
	Balbir
