From: Avi Kivity <avi@qumranet.com>
To: Shaohua Li <shaohua.li@intel.com>
Cc: kvm-devel <kvm-devel@lists.sourceforge.net>,
	lkml <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>
Subject: Re: [RFC 0/8]KVM: swap out guest pages
Date: Tue, 24 Jul 2007 08:42:29 +0300
Message-ID: <46A59145.3040705@qumranet.com>
In-Reply-To: <1185241357.24201.12.camel@sli10-conroe.sh.intel.com>

Shaohua Li wrote:
> On Mon, 2007-07-23 at 18:27 +0800, Avi Kivity wrote:
>   
>> Shaohua Li wrote:
>>     
>>> This patch series makes kvm guest pages swappable and dynamically
>>> allocated. Without it, all guest memory is allocated at guest start
>>> time.
>>>
>>> Patches are against the latest git, and you first need to apply Avi's
>>> kvm-sch integration patch
>>> (http://sourceforge.net/mailarchive/forum.php?thread_name=11841693332609-git-send-email-avi%40qumranet.com&forum_name=kvm-devel ).
>>>
>>> The patch is quite stable in my tests. With the patch, I can run a
>>> 256M memory guest on a 300M memory host.
>>>       
>> What about the opposite?
>>
>>     
>>> If the guest is idle, the memory it uses can be less than 10M. I did
>>> a simple performance test (measuring kernel build time in the guest):
>>> if there is little swapping, the performance difference with and
>>> without the patch isn't significant. If you have a better measurement
>>> approach, please let me try it.
>>>
>>> Unresolved issues:
>>> 1. swapoff doesn't work; we need a hook.
>>> 2. SMP guests might not work, as kvm doesn't support SMP yet.
>>> 3. A better algorithm is needed to select which guest pages to swap
>>> out according to the guest's memory usage.
>>> Maybe more.
>>>
>>> Any suggestions and comments are appreciated.
>>>  
>>>       
>> The big question is whether to have kvm's own address_space or not.
>>
>> Having an address_space (like your patch does) is remarkably simple,
>> and requires few hooks from the current vm.  However using existing
>> vmas mapped by the user has many advantages:
>>
>> - compatible with s390 requirements
>> - allows the user to use hugetlbfs pages, which have a performance
>> advantage using ept/npt (but which are unswappable)
>> - allows the user to map a file (which can be regarded as a way to
>> specify the swap device)
>> - better integration with the rest of the vm
>>
>> I am quite torn between the simplicity of your approach and the
>> advantages of using generic vmas.  However, s390 pretty much forces
>> our hand.
>>
>> What is your opinion of extending generic vmas to back kvm guest
>> memory?
>>     
> Several issues:
> 1. A vma manages userspace addresses, but a kvm guest uses the full
> address space.
> 2. qemu itself must use some address space.
>   

My idea is to keep the current slot concept, but instead of having kvm
allocate pages for a slot, it would call get_user_pages() for a virtual
address range.  Userspace doesn't directly talk about vmas, just virtual
address ranges.
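
To make the idea concrete, here is a minimal userspace sketch of a slot
that records a user virtual address range instead of owning pages.  The
names (struct mem_slot, make_slot, gfn_to_hva) and the 4 KiB page size
are assumptions for illustration only, not existing kvm code.  In the
kernel, the address computed here is what would be handed to
get_user_pages(); that call is kernel-only and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical slot: a run of guest frames backed by a user virtual range. */
struct mem_slot {
    uint64_t  base_gfn;       /* first guest frame number in the slot */
    uint64_t  npages;         /* number of pages the slot covers */
    uintptr_t userspace_addr; /* start of the backing range in user space */
};

/* Build a slot; the backing range is required to be page aligned. */
static struct mem_slot make_slot(uint64_t base_gfn, uint64_t npages,
                                 uintptr_t userspace_addr)
{
    assert((userspace_addr & (((uintptr_t)1 << PAGE_SHIFT) - 1)) == 0);
    struct mem_slot s = { base_gfn, npages, userspace_addr };
    return s;
}

/*
 * Translate a guest frame number to the user virtual address backing it.
 * Returns 0 when no slot covers the frame.
 */
static uintptr_t gfn_to_hva(const struct mem_slot *slots, size_t nslots,
                            uint64_t gfn)
{
    for (size_t i = 0; i < nslots; i++) {
        const struct mem_slot *s = &slots[i];
        if (gfn >= s->base_gfn && gfn - s->base_gfn < s->npages)
            return s->userspace_addr +
                   ((uintptr_t)(gfn - s->base_gfn) << PAGE_SHIFT);
    }
    return 0;
}
```

With a slot covering guest frames 0-15 at user address 0x10000000,
frame 3 resolves to 0x10003000.  The point of the sketch is that the
slot only names a range; what backs the range (anonymous memory, a
mapped file, hugetlbfs) is left to userspace and the generic vm.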


> 3. kvm needs special page fault handling for the shadow page table;
> generic page table operations can't be used directly for the guest.
> I have no idea if your idea is feasible. The s390 guys said their shadow
> page table is the same as the host's, which is why they can easily
> implement swap; x86 is hard.
>   

No question that it is hard.  I'd like to explore just how hard it is.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


