From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
virtualization@lists.osdl.org, frankeh@watson.ibm.com,
akpm@osdl.org, nickpiggin@yahoo.com.au, hugh@veritas.com,
riel@redhat.com
Subject: Re: [patch 0/6] Guest page hinting version 7.
Date: Mon, 30 Mar 2009 18:34:05 +0200
Message-ID: <20090330183405.750440da@skybase>
In-Reply-To: <1238428495.8286.638.camel@nimitz>
On Mon, 30 Mar 2009 08:54:55 -0700
Dave Hansen <dave@linux.vnet.ibm.com> wrote:
> On Sun, 2009-03-29 at 16:12 +0200, Martin Schwidefsky wrote:
> > > Can we persuade the hypervisor to tell us which pages it decided to page
> > > out and just skip those when we're scanning the LRU?
> >
> > One principle of the whole approach is that the hypervisor does not
> > call into an otherwise idle guest. The cost of scheduling the virtual
> > cpu is just too high. So we would need a means to store the information where
> > the guest can pick it up when it happens to do LRU. I don't think that
> > this will work out.
>
> I didn't mean for it to actively notify the guest. Perhaps, as Rik
> said, have a bitmap where the host can set or clear a bit for the guest to
> see.
Yes, agreed.
> As the guest is scanning the LRU, it checks the structure (or makes an
> hcall or whatever) and sees that the hypervisor has already taken care
> of the page. It skips these pages in the first round of scanning.
As long as we make this optional I'm fine with it. On s390 with the
current implementation that translates to an ESSA call, which is not
exactly inexpensive: we are talking about more than 100 cycles. The better
solution for us is to age the page with the standard active/inactive
processing.
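For illustration only, a minimal sketch of the optional guest-side check
discussed above. It assumes a hypothetical bitmap that the host writes and
the guest only reads; struct evicted_map, host_evicted() and
scan_lru_first_pass() are made-up names, not part of the patch set, and the
LRU is reduced to a plain array of page frame numbers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical host/guest shared bitmap: bit N set means the host has
 * already paged out guest page frame N. A real implementation would need
 * a proper shared-memory read (volatile access, barriers); this is omitted.
 */
struct evicted_map {
	const unsigned long *bits;	/* written by host, read by guest */
	size_t nr_pages;		/* number of valid bits */
};

static bool host_evicted(const struct evicted_map *map, size_t pfn)
{
	assert(pfn < map->nr_pages);
	return (map->bits[pfn / BITS_PER_WORD] >> (pfn % BITS_PER_WORD)) & 1UL;
}

/*
 * First round of scanning: copy reclaim candidates from lru[] to out[],
 * skipping pages the host has already taken care of.
 * Returns the number of candidates written to out[].
 */
static size_t scan_lru_first_pass(const struct evicted_map *map,
				  const size_t *lru, size_t n, size_t *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (host_evicted(map, lru[i]))
			continue;	/* host already paged this one out */
		out[kept++] = lru[i];
	}
	return kept;
}
```

With a bitmap the check is a memory read per page. If the lookup had to go
through an ESSA call instead, every iteration would pay the cost mentioned
above, which is why it should stay optional.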
> I do see what you're saying about this saving the page-*out* operation
> on the hypervisor side. It can simply toss out pages instead of paging
> them itself. That's a pretty advanced optimization, though. What would
> this code look like if we didn't optimize to that level?
Why? It is just a simple test in the host's LRU scan. If the page is at
the end of the inactive list AND has the volatile state then don't
bother with writeback, just throw it away. This is the only place where
the host has to check for the page state.
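As a rough sketch of that test (not the actual host code), assume the host
keeps a per-page hint state and knows whether the page sits at the tail of
its inactive list. All type and function names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page hint as seen by the host. */
enum guest_page_state { PAGE_STABLE, PAGE_VOLATILE };

enum reclaim_action {
	RECLAIM_KEEP,		/* not a reclaim candidate yet */
	RECLAIM_DISCARD,	/* throw away, the guest can recreate it */
	RECLAIM_WRITEBACK	/* normal path: page it out */
};

struct host_page {
	enum guest_page_state state;
	bool at_inactive_tail;	/* page is at the end of the inactive list */
};

/* The single place where the host looks at the guest page state. */
static enum reclaim_action host_reclaim_action(const struct host_page *page)
{
	assert(page != NULL);

	if (!page->at_inactive_tail)
		return RECLAIM_KEEP;
	if (page->state == PAGE_VOLATILE)
		return RECLAIM_DISCARD;	/* skip writeback entirely */
	return RECLAIM_WRITEBACK;
}
```

Everything else in the host's reclaim path stays unchanged in this sketch;
only the decision at the tail of the inactive list differs.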
> It also occurs to me that the hypervisor could be doing a lot of this
> internally. This whole scheme is about telling the hypervisor about
> pages that we (the kernel) know we can regenerate. The hypervisor
> should know a lot of that information, too. We ask it to populate a
> page with stuff from virtual I/O devices or write a page out to those
> devices. The page remains volatile until something from the guest
> writes to it. The hypervisor could keep a record of how to recreate the
> page as long as it remains volatile and clean.
Unfortunately it is not that simple. There are quite a few reasons why
a page has to be made stable. You'd have to pass that information back
and forth between the guest and the host, otherwise the host will throw
away e.g. an mlocked page because it is still marked as volatile in the
virtual block device.
> That wouldn't cover things like page cache from network filesystems,
> though.
Yes, there are pages with a backing the host knows nothing about.
> This patch does look like the full monty but I have to wonder what other
> partial approaches are out there.
I am open to suggestions. The simplest partial approach is already
implemented for s390: unused/stable transitions in the buddy allocator.
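To show the idea of unused/stable transitions (this is a toy model, not the
s390 code), assume the allocator calls one hook when it frees a block of
2^order pages and another when it hands the block out again. The hint[]
array stands in for whatever mechanism actually tells the host; the names
hint_on_free() and hint_on_alloc() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum page_hint { HINT_STABLE, HINT_UNUSED };

#define NR_PAGES 16

/* Stand-in for the state the host can see; starts out all HINT_STABLE. */
static enum page_hint hint[NR_PAGES];

static void hint_set_range(size_t pfn, unsigned int order, enum page_hint h)
{
	size_t n = (size_t)1 << order;

	assert(pfn + n <= NR_PAGES);
	for (size_t i = 0; i < n; i++)
		hint[pfn + i] = h;
}

/* Block goes back to the allocator: the host may drop the contents. */
static void hint_on_free(size_t pfn, unsigned int order)
{
	hint_set_range(pfn, order, HINT_UNUSED);
}

/* Block is handed out again: the host must preserve the contents. */
static void hint_on_alloc(size_t pfn, unsigned int order)
{
	hint_set_range(pfn, order, HINT_STABLE);
}
```

In this model no page cache, swap cache or mlock handling is involved; only
the allocator needs to know about the hints.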
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.