Re: [RFC 7/8]KVM: swap out guest pages

From: Avi Kivity <avi@qumranet.com>
To: Shaohua Li <shaohua.li@intel.com>
Cc: kvm-devel <kvm-devel@lists.sourceforge.net>,
	lkml <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>
Subject: Re: [RFC 7/8]KVM: swap out guest pages
Date: Tue, 24 Jul 2007 17:55:04 +0300
Message-ID: <46A612C8.6090804@qumranet.com>
In-Reply-To: <1185173505.2645.71.camel@sli10-conroe.sh.intel.com>

Shaohua Li wrote:
> Make KVM guest pages be allocated dynamically and able to be swapped out.
>
> One issue: all inodes returned from anon_inode_getfd are shared,
> if one module changes a field of the inode, other modules might break.
> Should we introduce a new API to not share inode?
>
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> ---
>  
> +static int kvm_set_page_dirty(struct page *page)
> +{
> +	if (!PageDirty(page))
> +		SetPageDirty(page);
> +	return 0;
> +}
> +
> +static int kvm_writepage(struct page *page, struct writeback_control *wbc)
> +{
> +	struct address_space *mapping = page->mapping;
> +	struct kvm *kvm = address_space_to_kvm(mapping);
> +	int ret = 0;
> +
> +	/*
> +	 * gfn_to_page is called with kvm->lock hold, which might invoke page
> +	 * reclaim. So the .writepage should check if we already hold the lock
> +	 * to avoid deadlock.
> +	 */
> +	if (!mutex_trylock(&kvm->lock)) {
> +		set_page_dirty(page);
> +		return AOP_WRITEPAGE_ACTIVATE;
> +	}
> +
> +	/*
> +	 * We just zap vcpu 0's page table. For a SMP guest, we should zap all
> + 	 * vcpus'. It's better shadow page table is per-vm.
> +	 */
> +	if (PagePrivate(page))
> +		kvm_mmu_zap_pagetbl(&kvm->vcpus[0], page->index);
> +
> +	ret = kvm_move_to_swap(page);
> +	if (ret) {
> +		set_page_dirty(page);
> +		goto out;
> +	}
> +	unlock_page(page);
> +out:
> +	mutex_unlock(&kvm->lock);
> +
> +	return ret;
> +}
> +
>   

Perhaps we can use this as a base for userspace-allocated memory.  We 
still have a kvm inode and address_space; but instead of calling 
kvm_move_to_swap(), we use the memory slot and virtual address offset to 
locate the underlying address_space and call that ->writepage().

So:
  kvm_writepage() removes any shadow page table references
  the underlying ->writepage() does the work of paging to the underlying 
store

We need to figure out how to prevent the underlying ->writepage() from 
being called outside the context of kvm_writepage().  Maybe have a page 
flag signifying layered address spaces?

[it probably violates fifteen different mm assumptions; I need to study 
that code]
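
To make the layering idea concrete, here is a rough userspace model, not 
kernel code.  Everything in it is an assumption made for illustration: 
the cut-down struct page and struct address_space, the ->lower pointer, 
the layered and in_upper_writepage fields (standing in for the page flag 
idea), and the function names.  It only shows the ordering: the kvm layer 
drops shadow references first, then hands the page to the underlying 
->writepage(), which refuses a direct call on a layered page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct page;

struct address_space {
	int (*writepage)(struct page *page);
	struct address_space *lower;	/* hypothetical: backing mapping */
};

struct page {
	struct address_space *mapping;
	bool shadow_mapped;		/* stands in for PagePrivate() */
	bool layered;			/* hypothetical "layered" page flag */
	bool in_upper_writepage;	/* set while the kvm layer is calling down */
	bool written;			/* set once the backing store has the data */
};

/* Underlying store: rejects direct calls on pages owned by a layered mapping. */
static int lower_writepage(struct page *page)
{
	if (page->layered && !page->in_upper_writepage)
		return -EBUSY;
	page->written = true;
	return 0;
}

/* kvm layer: remove shadow references, then delegate the actual paging. */
static int kvm_writepage_model(struct page *page)
{
	struct address_space *lower = page->mapping->lower;
	int ret;

	assert(lower != NULL && lower->writepage != NULL);

	page->shadow_mapped = false;		/* step 1: zap shadow ptes */
	page->in_upper_writepage = true;	/* mark the layered call */
	ret = lower->writepage(page);		/* step 2: page to the store */
	page->in_upper_writepage = false;
	return ret;
}
```

In this model a direct lower_writepage() on a layered page fails with 
-EBUSY, while going through kvm_writepage_model() succeeds and leaves no 
shadow reference behind.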

An alternative would be to have kvm set a page flag signifying it has 
references to the page when it installs it in a shadow pte.  The mm 
would notice the flag and call kvm to clear it before proceeding with 
normal ->writepage().
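
The flag-based alternative could look roughly like the following 
userspace model.  Again this is only a sketch under assumptions: 
PG_KVM_REF is a made-up flag (not an existing page flag), the struct page 
is a toy, and kvm_install_shadow_pte(), kvm_release_page(), 
backing_writepage() and mm_pageout() are hypothetical names.  The point is 
just that the mm side needs no knowledge of kvm beyond testing one flag 
and making one callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag: "kvm holds shadow pte references to this page". */
#define PG_KVM_REF (1u << 0)

/* Toy page, not the kernel's struct page. */
struct page {
	unsigned int flags;
	int shadow_refs;	/* number of shadow ptes pointing at the page */
	bool written;		/* set once the backing store has the data */
};

/* kvm side: installing a shadow pte marks the page. */
static void kvm_install_shadow_pte(struct page *page)
{
	page->shadow_refs++;
	page->flags |= PG_KVM_REF;
}

/* kvm callback: drop every shadow reference and clear the flag. */
static void kvm_release_page(struct page *page)
{
	page->shadow_refs = 0;
	page->flags &= ~PG_KVM_REF;
}

/* Normal ->writepage() of whatever backs the memory. */
static int backing_writepage(struct page *page)
{
	assert(page->shadow_refs == 0);	/* no guest mapping may remain */
	page->written = true;
	return 0;
}

/* mm side: notice the flag, let kvm clear it, then write as usual. */
static int mm_pageout(struct page *page)
{
	if (page->flags & PG_KVM_REF)
		kvm_release_page(page);
	return backing_writepage(page);
}
```

Compared with the layered address_space approach, this keeps a single 
->writepage() path and moves the kvm-specific work into a callback.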

-- 
error compiling committee.c: too many arguments to function
