KVM architecture docs

public inbox for kvm@vger.kernel.org

* KVM architecture docs
@ 2008-03-03 15:51 Alessandro Sardo
  2008-03-03 18:23 ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Alessandro Sardo @ 2008-03-03 15:51 UTC (permalink / raw)
  To: kvm-devel

Hello,

I'm interested in learning the technical details of KVM, possibly up to 
date with latest versions (KVM changes so fast!). I'm not really geared 
to development (yet), rather I would like to study its architecture from 
a security point of view.

I've searched everywhere and all I could find was some basic/marketing 
stuff, or simple white papers explaining HW virtualization. The wiki 
currently has some details, but none of them are satisfying enough for 
my needs :-)

If you're familiar with Xen, I'm looking for the KVM equivalent versions 
of the following docs:

http://www.cl.cam.ac.uk/research/srg/netos/papers/2003-xensosp.pdf
http://wiki.xensource.com/xenwiki/XenArchitecture?action=AttachFile&do=get&target=Xen+Architecture_Q1+2008.pdf

Does anything like that exist, or should I go the long way studying QEMU 
and KVM sources?

Thanks,

- Alessandro Sardo

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/


* Re: KVM architecture docs
  2008-03-03 15:51 KVM architecture docs Alessandro Sardo
@ 2008-03-03 18:23 ` Avi Kivity
  2008-03-04  4:12   ` Zhao Forrest
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2008-03-03 18:23 UTC (permalink / raw)
  To: Alessandro Sardo; +Cc: kvm-devel

Alessandro Sardo wrote:
> Hello,
>
> I'm interested in learning the technical details of KVM, possibly up to 
> date with latest versions (KVM changes so fast!). I'm not really geared 
> to development (yet), rather I would like to study its architecture from 
> a security point of view.
>
> I've searched anywhere and all I could find was some basic/marketing 
> stuff, or simple white papers explaining HW virtualization. The wiki 
> currently has some details, but none of them are satisfying enough for 
> my needs :-)
>
> If you're familiar with Xen, I'm looking for the KVM equivalent versions 
> of the following docs:
>
> http://www.cl.cam.ac.uk/research/srg/netos/papers/2003-xensosp.pdf
> http://wiki.xensource.com/xenwiki/XenArchitecture?action=AttachFile&do=get&target=Xen+Architecture_Q1+2008.pdf
>
> Does anything like that exist, or should I go the long way studying QEMU 
> and KVM sources?
>
>   

http://ols.108.redhat.com/2007/Reprints/kivity-Reprint.pdf


Outdated, but as a high level overview should suffice.  The biggest 
change in terms of architecture is that we now use regular Linux 
userspace memory for the guest, which means that normal Linux memory 
management applies to the guest.  This includes swapping, memory 
sharing, large page support, NUMA page placement and migration, etc.
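
To make the point about guest memory concrete: because guest RAM is ordinary anonymous memory in the qemu process, it can be allocated like any other buffer and then described to kvm. The following is an illustrative sketch only, not QEMU code. It allocates the memory with mmap and packs the fields of struct kvm_userspace_memory_region (the argument of the KVM_SET_USER_MEMORY_REGION ioctl). It deliberately does not open /dev/kvm or issue the ioctl, and the helper names are hypothetical.

```python
import ctypes
import mmap
import struct

GUEST_RAM_SIZE = 2 * 1024 * 1024  # 2 MiB of pretend guest RAM


def allocate_guest_ram(size):
    """Allocate anonymous private memory, exactly as any process could."""
    return mmap.mmap(-1, size,
                     flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                     prot=mmap.PROT_READ | mmap.PROT_WRITE)


def describe_memory_slot(slot, guest_phys_addr, ram):
    """Pack slot, flags, guest_phys_addr, memory_size, userspace_addr.

    The layout mirrors struct kvm_userspace_memory_region; flags is 0 here.
    """
    host_addr = ctypes.addressof(ctypes.c_char.from_buffer(ram))
    return struct.pack("=IIQQQ", slot, 0, guest_phys_addr, len(ram), host_addr)


ram = allocate_guest_ram(GUEST_RAM_SIZE)
ram[0x1000:0x1004] = b"boot"   # the host process writes "guest" memory directly
region = describe_memory_slot(0, 0, ram)
```

Since the region is plain process memory, the host kernel is free to swap, share or migrate it without any kvm-specific allocator.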



-- 
error compiling committee.c: too many arguments to function



* Re: KVM architecture docs
  2008-03-03 18:23 ` Avi Kivity
@ 2008-03-04  4:12   ` Zhao Forrest
  2008-03-04  8:25     ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Zhao Forrest @ 2008-03-04  4:12 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel

>
> http://ols.108.redhat.com/2007/Reprints/kivity-Reprint.pdf
>
Hi Avi,

I have a question about KVM architecture after reading your paper.
It reads:
......
At the kernel level, the kernel causes the hardware
to enter guest mode. If the processor exits guest
mode due to an event such as an external interrupt
or a shadow page table fault, the kernel performs
the necessary handling and resumes guest execution.
If the exit reason is due to an I/O instruction
or a signal queued to the process, then the kernel
exits to userspace.
......
After reading your paper, my understanding of the KVM architecture is that,
for a particular VM, user mode (QEMU), kernel mode and guest mode share
the same process context from the host Linux kernel's point of view, right?
If this is the case, consider the following example:
1. A physical NIC interrupt is received on physical CPU 0 and the host
   kernel determines that this is a network packet targeted at the
   emulated NIC of a VM.
2. At the same time, this VM is running in guest mode on physical CPU 1.
My question is: at this time, can the host kernel *actively* interrupt the
VM and make it run in user mode to handle the incoming network packet in
QEMU? Or does the host kernel have to wait for the VM to leave guest mode
(because of an external interrupt, a shadow page table fault or an I/O
instruction) and then for the VM to voluntarily detect that an incoming
network packet is pending and switch to user space?
A further question: how does a VM detect the pending incoming network
packet? In kernel space or in user space?

Thanks,
Forrest


* Re: KVM architecture docs
  2008-03-04  4:12   ` Zhao Forrest
@ 2008-03-04  8:25     ` Avi Kivity
  2008-03-04  9:02       ` Zhao Forrest
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2008-03-04  8:25 UTC (permalink / raw)
  To: Zhao Forrest; +Cc: kvm-devel

Zhao Forrest wrote:
>> http://ols.108.redhat.com/2007/Reprints/kivity-Reprint.pdf
>>
>>     
> Hi Avi,
>
> I have a question about KVM architecture after reading your paper.
> It reads:
> ......
> At the kernel level, the kernel causes the hardware
> to enter guest mode. If the processor exits guest
> mode due to an event such as an external interrupt
> or a shadow page table fault, the kernel performs
> the necessary handling and resumes guest execution.
> If the exit reason is due to an I/O instruction
> or a signal queued to the process, then the kernel
> exits to userspace.
> ......
> After reading your paper my understanding of KVM architecture is that
> for a particular VM the user mode(QEMU), kernel mode and guest mode share
> the same process context from host linux kernel's point of view, right?
>   

Correct.  Virtual machine == process, virtual cpu == thread.

> If this is the case, see the below example:
> 1 physical NIC interrupt is received on physical CPU 0 and host kernel
> determines that this is a network packet targeted to the emulated NIC
> for a VM
> 2 at the same time this VM is running in guest mode on physical CPU 1
> My question is: at this time can host kernel *actively* interrupt VM
> and make it run in user mode to handle the incoming network data
> packet in QEMU? Or host kernel has to wait for
> VM(because of external interrupt or shadow page table fault or I/O
> instruction) to quit guest mode and wait for VM to voluntarily detect
> that incoming network packet is pending and switch to user space?
>   

The incoming packet is processed by the host ethernet stack; it is 
forwarded to the bridge, which forwards it to the tap.  When the tap 
queues the packet, it sends a signal to qemu (since the tap file 
descriptor has a signal associated).  When the kernel delivers the 
signal, it notices the qemu thread is running on cpu 1, so it sends an 
inter-processor interrupt to cpu 1.  The interrupt causes the processor 
to leave guest mode and exit to the hypervisor, which notices that a 
signal is pending, so it exits to qemu which dequeues the packet and 
notifies the guest (if necessary) by injecting an interrupt.

Note that most of this path (including the IPI) is regular Linux code, 
not kvm related, and would happen for any other application in the same way.
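
The sequence described above can be sketched as a small deterministic model. This is a toy illustration, not kvm or qemu code: the class and function names are hypothetical, the exit reasons are plain strings, and the inter-processor interrupt is represented as an "external_interrupt" exit. The one property it tries to capture is that the kernel handles some exits itself and returns to userspace when a signal is pending or an I/O exit occurs.

```python
from collections import deque

# Exit reasons the kernel handles itself before resuming the guest.
KERNEL_HANDLED = {"external_interrupt", "shadow_page_fault"}


class Tap:
    """Toy tap device: queuing a packet raises a signal for qemu."""

    def __init__(self):
        self.queue = deque()
        self.signal_pending = False

    def receive(self, packet):
        self.queue.append(packet)
        self.signal_pending = True


def run_vcpu(guest_exits, tap):
    """Model one trip through the kernel run loop.

    Returns (reason for returning to userspace, exits handled in kernel).
    """
    handled = []
    for reason in guest_exits:
        if reason in KERNEL_HANDLED:
            handled.append(reason)
            # Before re-entering the guest, check for a pending signal.
            if tap.signal_pending:
                return "signal", handled
            continue
        return reason, handled      # e.g. "io" always goes to userspace
    return "no_more_exits", handled


def qemu_handle_signal(tap):
    """Userspace side: dequeue the packets and decide whether to inject."""
    tap.signal_pending = False
    packets = list(tap.queue)
    tap.queue.clear()
    return packets, ("inject_irq" if packets else None)


tap = Tap()
quiet = run_vcpu(["shadow_page_fault", "io"], tap)
tap.receive(b"frame")               # packet arrives while the guest runs
kicked = run_vcpu(["external_interrupt", "io"], tap)
delivered = qemu_handle_signal(tap)
```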

> A further question is, how a VM detect the incoming pending network
> packet? In kernel space or in user space?
>   

Are you talking about the host or guest?  If the host, the packet is 
received by the kernel, and further processing is done in userspace.

-- 
error compiling committee.c: too many arguments to function



* Re: KVM architecture docs
  2008-03-04  8:25     ` Avi Kivity
@ 2008-03-04  9:02       ` Zhao Forrest
  2008-03-04  9:21         ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Zhao Forrest @ 2008-03-04  9:02 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel

Thanks for your detailed explanation :). That's quite helpful for me
to understand KVM internals.

>
> > If this is the case, see the below example:
> > 1 physical NIC interrupt is received on physical CPU 0 and host kernel
> > determines that this is a network packet targeted to the emulated NIC
> > for a VM
> > 2 at the same time this VM is running in guest mode on physical CPU 1
> > My question is: at this time can host kernel *actively* interrupt VM
> > and make it run in user mode to handle the incoming network data
> > packet in QEMU? Or host kernel has to wait for
> > VM(because of external interrupt or shadow page table fault or I/O
> > instruction) to quit guest mode and wait for VM to voluntarily detect
> > that incoming network packet is pending and switch to user space?
> >
>
> The incoming packet is processed by the host ethernet stack; it is
> forwarded to the bridge, which forwards it to the tap. When the tap
> queues the packet, it sends a signal to qemu (since the tap file
> descriptor has a signal associated). When the kernel delivers the
> signal, it notices the qemu thread is running on cpu 1, so it sends an
> inter-processor interrupt to cpu 1. The interrupt causes the processor
Is the behavior "When the kernel delivers the signal, it notices the
qemu thread is running on cpu 1, so it sends an IPI to cpu 1" generic
signal-delivery behavior in the Linux kernel? Or does KVM need to add
some hook to achieve this?

> to leave guest mode and exit to the hypervisor, which notices that a
> signal is pending, so it exits to qemu which dequeues the packet and
> notifies the guest (if necessary) by injecting an interrupt.
>
> Note that most of this path (including the IPI) is regular Linux code,
> not kvm related, and would happen for any other application in the same way.
>

Thanks,
Forrest


* Re: KVM architecture docs
  2008-03-04  9:02       ` Zhao Forrest
@ 2008-03-04  9:21         ` Avi Kivity
  2008-03-04 14:17           ` Javier Guerra
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2008-03-04  9:21 UTC (permalink / raw)
  To: Zhao Forrest; +Cc: kvm-devel

Zhao Forrest wrote:

>> The incoming packet is processed by the host ethernet stack; it is
>> forwarded to the bridge, which forwards it to the tap. When the tap
>> queues the packet, it sends a signal to qemu (since the tap file
>> descriptor has a signal associated). When the kernel delivers the
>> signal, it notices the qemu thread is running on cpu 1, so it sends an
>> inter-processor interrupt to cpu 1. The interrupt causes the processor
>>     
> Is the behavior of "When the kernel delivers the signal, it notices the
> qemu thread is running on cpu 1, so it sends an IPI to cpu 1" a generic
> signal-delivery behavior in Linux kernel? Or KVM need to add some hook
> to achieve this?
>
>   

This is generic behavior.  The same thing happens when running plain 
qemu, except that instead of breaking out from guest mode, the IPI 
causes the processor to break out from user mode and trap into the kernel.

This is one of the most powerful characteristics of kvm: it works with 
ordinary kernel mechanisms, so that all the properties and features of 
the kernel apply to kvm automatically.  In this particular case, it 
means that all the scheduler features, including real-time scheduling, 
apply to kvm guests.  With mmu notifiers, the trend will grow even stronger.
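
As a small illustration of "virtual cpu == thread": any ordinary thread can be controlled with the standard Linux scheduler interfaces, and the same would hold for a vcpu thread. The sketch below is an assumption-laden example rather than qemu code. It pins a plain Python worker thread (standing in for a vcpu thread) to one CPU using os.sched_setaffinity, which is available on Linux only; the function name run_pinned is hypothetical.

```python
import os
import threading


def run_pinned(cpu, work):
    """Run work() in a thread pinned to one CPU, like a pinned vcpu thread."""
    result = {}

    def vcpu_thread():
        # pid 0 means "the calling thread" for the Linux affinity calls.
        os.sched_setaffinity(0, {cpu})
        result["affinity"] = os.sched_getaffinity(0)
        result["value"] = work()

    thread = threading.Thread(target=vcpu_thread, name="vcpu0")
    thread.start()
    thread.join()
    return result


# Pick a CPU this process is already allowed to run on.
first_cpu = min(os.sched_getaffinity(0))
outcome = run_pinned(first_cpu, lambda: 42)
```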

-- 
error compiling committee.c: too many arguments to function


* Re: KVM architecture docs
  2008-03-04  9:21         ` Avi Kivity
@ 2008-03-04 14:17           ` Javier Guerra
  2008-03-04 15:08             ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Javier Guerra @ 2008-03-04 14:17 UTC (permalink / raw)
  To: kvm-devel

On 3/4/08, Avi Kivity <avi@qumranet.com> wrote:
>  apply to kvm guests.  With mmu notifiers, the trend will grow even stronger.

could you (or anybody) elaborate on that? the mmu-related threads show
lots of progress, but it's way (way) out of my league.

AFAICT, it's about the infrastructure to later write drivers (virtio?)
to DMA-heavy hardware (IB, RDMA, etc).  am i wrong?  or is it
something more complete (like a ready to use driver)?

-- 
Javier


* Re: KVM architecture docs
  2008-03-04 14:17           ` Javier Guerra
@ 2008-03-04 15:08             ` Avi Kivity
  2008-03-05  4:06               ` Zhao Forrest
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2008-03-04 15:08 UTC (permalink / raw)
  To: Javier Guerra; +Cc: kvm-devel

Javier Guerra wrote:
> On 3/4/08, Avi Kivity <avi@qumranet.com> wrote:
>   
>>  apply to kvm guests.  With mmu notifiers, the trend will grow even stronger.
>>     
>
> could you (or anybody) elaborate on that? the mmu-related threads show
> lots of progress, but it's way (way) out of my league.
>
> AFAICT, it's about the infrastructure to later write drivers (virtio?)
> to DMA-heavy hardware (IB, RDMA, etc).  am i wrong?  or is it
> something more complete (like a ready to use driver)?
>
>   

mmu notifiers provide a way for the core Linux memory management code to 
propagate changes in how Linux views a process' memory map to external 
memory management units that are also interested in that memory map.  
These changes include things like swapping, page migration, changes to 
memory protection, defragmentation, and copy-on-write.  In this context, 
kvm appears as a dma capable memory controller, like RDMA NICs or GPUs.

For kvm, this is important as it allows all those features to be used 
transparently with guests.

- swapping allows you to overcommit memory
- page migration allows optimization of memory placement within the host 
in response to changing workloads
- defragmentation will allow (if/when it is merged into Linux) more 
widespread use of large pages, which improve performance
- copy-on-write allows sharing identical pages of memory among guests, 
increasing guest density
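
The notification idea described above can be sketched as a callback registry. This is a toy model under stated assumptions, not the kernel interface: the class names, the invalidate_page method and the dictionaries standing in for page tables are all hypothetical. It shows only the ordering, in which the core memory manager tells every registered secondary MMU to drop its translation before it takes the page away.

```python
class HostMemoryManager:
    """Toy stand-in for the core Linux memory manager."""

    def __init__(self):
        self.page_table = {}   # host virtual page -> host physical frame
        self.notifiers = []    # secondary MMUs watching this address space

    def register_notifier(self, notifier):
        self.notifiers.append(notifier)

    def map_page(self, vpage, frame):
        self.page_table[vpage] = frame

    def swap_out(self, vpage):
        # Secondary MMUs must forget the page before the primary mapping goes.
        for notifier in self.notifiers:
            notifier.invalidate_page(vpage)
        return self.page_table.pop(vpage)


class ShadowMmu:
    """Toy stand-in for kvm's shadow page tables."""

    def __init__(self, mm):
        self.mm = mm
        self.backing = {}      # guest page -> host virtual page
        self.shadow = {}       # guest page -> host physical frame
        mm.register_notifier(self)

    def map_guest_page(self, gpage, vpage):
        self.backing[gpage] = vpage
        self.shadow[gpage] = self.mm.page_table[vpage]

    def invalidate_page(self, vpage):
        # Drop every shadow translation that is backed by this host page.
        for gpage, backing_vpage in self.backing.items():
            if backing_vpage == vpage:
                self.shadow.pop(gpage, None)


mm = HostMemoryManager()
mm.map_page(0x10, 7)
kvm = ShadowMmu(mm)
kvm.map_guest_page(0x2, 0x10)
shadow_before = dict(kvm.shadow)
freed_frame = mm.swap_out(0x10)
```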

-- 
error compiling committee.c: too many arguments to function



* Re: KVM architecture docs
  2008-03-04 15:08             ` Avi Kivity
@ 2008-03-05  4:06               ` Zhao Forrest
  2008-03-05  5:19                 ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Zhao Forrest @ 2008-03-05  4:06 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, Javier Guerra

> >
> > could you (or anybody) elaborate on that? the mmu-related threads show
> > lots of progress, but it's way (way) out of my league.
> >
> > AFAICT, it's about the infrastructure to later write drivers (virtio?)
> > to DMA-heavy hardware (IB, RDMA, etc). am i wrong? or is it
> > something more complete (like a ready to use driver)?
> >
> >
>
> mmu notifiers provide a way for the core Linux memory management code to
> propagate changes in how Linux views a process' memory map to external
> memory management units that are also interested in that memory map.
> These changes include things like swapping, page migration, changes to
> memory protection, defragmentation, and copy-on-write. In this context,
> kvm appears as a dma capable memory controller, like RDMA NICs or GPUs.
>
> For kvm, this is important as it allows all those features to be used
> transparently with guests.
>
> - swapping allows you to overcommit memory

Normally the swapping mechanism chooses the least recently used (LRU)
pages of a process to be swapped out. When KVM uses MMU notifiers in the
Linux kernel to implement swapping for a VM, can KVM choose the LRU pages
of a VM to swap out? If so, could you briefly describe how this is
implemented?

> - page migration allows optimization of memory placement within the host
> in response to changing workloads
> - defragmentation will allow (if/when it is merged into Linux) more
> widespread use of large pages, which improve performance
> - copy-on-write allows sharing identical pages of memory among guests,
> increasing guest density

Thanks,
Forrest


* Re: KVM architecture docs
  2008-03-05  4:06               ` Zhao Forrest
@ 2008-03-05  5:19                 ` Avi Kivity
  2008-03-05  6:05                   ` Zhao Forrest
  2008-03-05  6:51                   ` Zhao Forrest
  0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2008-03-05  5:19 UTC (permalink / raw)
  To: Zhao Forrest; +Cc: kvm-devel, Javier Guerra

Zhao Forrest wrote:
>> - swapping allows you to overcommit memory
>>     
>
> Normally swapping mechanism choose the Least Recently Used(LRU) pages
> of a process to be swapped out. When KVM uses MMU notifier in linux
> kernel to implement swapping for VM, could KVM choose LRU pages of a
> VM to swap out? If so, could you give a brief description about how
> this is implemented?
>   

The Linux memory manager approximates LRU by scanning pages for the 
accessed bit, which is set in the pte by the processor when a page is 
accessed through that pte.  mmu notifiers provide a callback for the 
check, so that kvm can check the accessed bit on the shadow ptes.
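
The accessed-bit scan can be illustrated with a toy model. The names here (Pte, read_and_clear_accessed, find_cold_pages, make_shadow_check) are hypothetical and the code is a sketch of the idea, not of the kernel's reclaim code: a page counts as recently used if its accessed bit is set in the host pte or in any secondary pte reported through a callback, and every bit is cleared as it is read.

```python
class Pte:
    """Toy page table entry carrying only an accessed bit."""

    def __init__(self):
        self.accessed = False


def read_and_clear_accessed(pte):
    was_accessed = pte.accessed
    pte.accessed = False
    return was_accessed


def make_shadow_check(shadow_ptes):
    """Build the callback a secondary MMU would register."""
    def check(page):
        pte = shadow_ptes.get(page)
        return pte is not None and read_and_clear_accessed(pte)
    return check


def find_cold_pages(host_ptes, secondary_checks):
    """Return pages not accessed through any pte since the last scan."""
    cold = []
    for page, pte in host_ptes.items():
        young = read_and_clear_accessed(pte)
        for check in secondary_checks:
            # Always call the callback so the secondary bit is cleared too.
            young = check(page) or young
        if not young:
            cold.append(page)
    return cold


host = {1: Pte(), 2: Pte(), 3: Pte()}
shadow = {2: Pte(), 3: Pte()}
host[1].accessed = True      # touched by qemu itself
shadow[2].accessed = True    # touched by the guest through a shadow pte
checks = [make_shadow_check(shadow)]
first_scan = find_cold_pages(host, checks)
second_scan = find_cold_pages(host, checks)
```

Without the callback, page 2 would look idle to the host even though the guest is using it.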

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: KVM architecture docs
  2008-03-05  5:19                 ` Avi Kivity
@ 2008-03-05  6:05                   ` Zhao Forrest
  2008-03-05  6:10                     ` Avi Kivity
  2008-03-05  6:38                     ` Avi Kivity
  2008-03-05  6:51                   ` Zhao Forrest
  1 sibling, 2 replies; 15+ messages in thread
From: Zhao Forrest @ 2008-03-05  6:05 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, Javier Guerra

> > Normally swapping mechanism choose the Least Recently Used(LRU) pages
> > of a process to be swapped out. When KVM uses MMU notifier in linux
> > kernel to implement swapping for VM, could KVM choose LRU pages of a
> > VM to swap out? If so, could you give a brief description about how
> > this is implemented?
> >
>
> The Linux memory manager approximates LRU by scanning pages for the
> accessed bit, which is set in the pte by the processor when a page is
> accessed through that pte. mmu notifiers provide a callback for the
> check, so that kvm can check the accessed bit on the shadow ptes.

If I understand correctly, when NPT is used by KVM in the future, this mmu
notifier can't help much with swapping out pages used by a VM, right?
That is, when NPT is used, a balloon para-virt driver running in the guest
OS might be more efficient for swapping, am I right?

Thanks,
Forrest


* Re: KVM architecture docs
  2008-03-05  6:05                   ` Zhao Forrest
@ 2008-03-05  6:10                     ` Avi Kivity
  2008-03-05  6:38                     ` Avi Kivity
  1 sibling, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2008-03-05  6:10 UTC (permalink / raw)
  To: Zhao Forrest; +Cc: kvm-devel, Javier Guerra

Zhao Forrest wrote:
>>> Normally swapping mechanism choose the Least Recently Used(LRU) pages
>>> of a process to be swapped out. When KVM uses MMU notifier in linux
>>> kernel to implement swapping for VM, could KVM choose LRU pages of a
>>> VM to swap out? If so, could you give a brief description about how
>>> this is implemented?
>>>
>>>       
>> The Linux memory manager approximates LRU by scanning pages for the
>> accessed bit, which is set in the pte by the processor when a page is
>> accessed through that pte. mmu notifiers provide a callback for the
>> check, so that kvm can check the accessed bit on the shadow ptes.
>>     
>
> If I understand correctly, when NPT is used by KVM in the future, this mmu
> notifier can't help much for swapping out pages used by VM, right?
>   

No, NPT does not change things materially.  Shadow page tables are still 
used, though instead of mapping guest virtual addresses to host physical 
addresses, they now translate guest physical addresses to host physical 
addresses.  Swapping and all the other goodies still work.
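
The difference between the two modes can be shown with dictionaries standing in for page tables. This is a toy sketch with hypothetical names and made-up page numbers, not kvm code. Without NPT, kvm maintains a single table from guest virtual to host physical pages; with NPT, the guest's own table maps guest virtual to guest physical, and the kvm-maintained table maps guest physical to host physical.

```python
PAGE_SIZE = 4096


def translate(table, addr):
    """Look up the page number in a table and keep the offset in the page."""
    page, offset = divmod(addr, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


def build_shadow(guest_table, gpa_to_hpa):
    """Classic shadow paging: fold both stages into one gva -> hpa table."""
    return {gva_page: gpa_to_hpa[gpa_page]
            for gva_page, gpa_page in guest_table.items()}


guest_table = {0x10: 0x2}    # guest virtual page -> guest physical page
gpa_to_hpa = {0x2: 0x99}     # guest physical page -> host physical page
gva = 0x10 * PAGE_SIZE + 0x34

# With NPT the hardware walks both tables in sequence.
hpa_two_stage = translate(gpa_to_hpa, translate(guest_table, gva))

# Without NPT kvm precomputes the combined mapping.
hpa_shadow = translate(build_shadow(guest_table, gpa_to_hpa), gva)
```

In both cases the host still owns the table that ends in host physical addresses, which is why swapping continues to work.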

> That is, when NPT is used, a balloon para-virt driver running in the guest
> OS might be more efficient for swapping, am I right?
>   

Ballooning is more efficient than swapping both with and without NPT.  
The problem with ballooning is that it requires guest cooperation.  The 
guest may not be able to balloon, or it may take a long time to balloon, 
while the host may need the memory immediately.  A rebooting guest also 
implicitly deflates its balloon, creating a large and unpredictable 
memory demand on the host.

A good solution needs to use ballooning with swapping as a fallback for 
guaranteeing that the system does not run out of memory.
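
A minimal sketch of that policy, assuming a hypothetical host-side reclaim function: the guest is asked to balloon first, and whatever it fails to give back is covered by swapping. The callables balloon_request and swap_out are placeholders for illustration, not real kvm or qemu interfaces.

```python
def reclaim(needed_pages, balloon_request, swap_out):
    """Try ballooning first, then swap the shortfall.

    balloon_request(n) returns how many pages the guest actually released;
    swap_out(n) returns how many pages the host swapped.
    """
    released = balloon_request(needed_pages)
    # Never trust the guest to report more than was asked, or less than zero.
    freed_by_balloon = min(needed_pages, max(0, released))
    remaining = needed_pages - freed_by_balloon
    swapped = swap_out(remaining) if remaining else 0
    return freed_by_balloon, swapped


def host_swap(pages):
    return pages             # the host can always swap what it needs


cooperative = reclaim(100, lambda n: n, host_swap)
unresponsive = reclaim(100, lambda n: 0, host_swap)
partial = reclaim(100, lambda n: 30, host_swap)
```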

A nice feature in 2.6.25 is the ability to select which guests will 
swap, via the memory controller feature (mlock() also works, but is 
relatively crude).

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


* Re: KVM architecture docs
  2008-03-05  6:05                   ` Zhao Forrest
  2008-03-05  6:10                     ` Avi Kivity
@ 2008-03-05  6:38                     ` Avi Kivity
  1 sibling, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2008-03-05  6:38 UTC (permalink / raw)
  To: Zhao Forrest; +Cc: kvm-devel, Javier Guerra

Zhao Forrest wrote:
> when NPT is used by KVM in the future, this mmu
>   

btw, NPT support is already integrated.


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: KVM architecture docs
  2008-03-05  5:19                 ` Avi Kivity
  2008-03-05  6:05                   ` Zhao Forrest
@ 2008-03-05  6:51                   ` Zhao Forrest
  2008-03-05  6:58                     ` Avi Kivity
  1 sibling, 1 reply; 15+ messages in thread
From: Zhao Forrest @ 2008-03-05  6:51 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, Javier Guerra

> >
> > Normally swapping mechanism choose the Least Recently Used(LRU) pages
> > of a process to be swapped out. When KVM uses MMU notifier in linux
> > kernel to implement swapping for VM, could KVM choose LRU pages of a
> > VM to swap out? If so, could you give a brief description about how
> > this is implemented?
> >
>
> The Linux memory manager approximates LRU by scanning pages for the
> accessed bit, which is set in the pte by the processor when a page is
> accessed through that pte. mmu notifiers provide a callback for the
> check, so that kvm can check the accessed bit on the shadow ptes.

The Linux kernel maintains a reverse mapping from a page frame to all page
tables pointing to this page frame. Does KVM need to maintain a similar
reverse mapping from a page frame to all shadow page tables pointing to
this page frame? I should have read the code to find the answer, but I'd
appreciate it if you could give a quick answer :)

Thanks,
Forrest


* Re: KVM architecture docs
  2008-03-05  6:51                   ` Zhao Forrest
@ 2008-03-05  6:58                     ` Avi Kivity
  0 siblings, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2008-03-05  6:58 UTC (permalink / raw)
  To: Zhao Forrest; +Cc: kvm-devel, Javier Guerra

Zhao Forrest wrote:
>>> Normally swapping mechanism choose the Least Recently Used(LRU) pages
>>> of a process to be swapped out. When KVM uses MMU notifier in linux
>>> kernel to implement swapping for VM, could KVM choose LRU pages of a
>>> VM to swap out? If so, could you give a brief description about how
>>> this is implemented?
>>>
>>>       
>> The Linux memory manager approximates LRU by scanning pages for the
>> accessed bit, which is set in the pte by the processor when a page is
>> accessed through that pte. mmu notifiers provide a callback for the
>> check, so that kvm can check the accessed bit on the shadow ptes.
>>     
>
> Linux kernel maintains a reverse mapping from a page frame to all page tables
> pointing to this page frame. Does KVM need to maintain a similar reverse mapping
> from a page frame to all shadow page tables pointing to this page frame?
>   

Yes, look for 'rmap' in mmu.c.  The purpose was initially to be able to 
write-protect shadowed guest page tables without horrible worst-case 
performance, and was later extended to swapping.

With mmu notifiers, when the kernel swaps a page, it first scans its own 
rmap, then calls kvm which scans the kvm rmap.  So one way to look at 
mmu notifiers is as rmap extenders (that's not the whole story -- kvm 
ptes are in a different format than Linux ptes, so the code has to be 
different).
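
The "rmap extender" view can be sketched with two dictionaries. This is a toy model, not the code in mmu.c: the function name unmap_frame and the string labels for ptes are hypothetical. Each reverse map goes from a page frame to the set of entries pointing at it, and unmapping a frame walks the Linux rmap first and the kvm rmap second.

```python
def unmap_frame(frame, linux_rmap, kvm_rmap):
    """Clear every pte that points at a frame and report what was cleared.

    The Linux rmap is scanned first; the kvm rmap is scanned afterwards,
    standing in for the mmu notifier call into kvm.
    """
    cleared = [("linux", pte) for pte in sorted(linux_rmap.pop(frame, set()))]
    cleared += [("kvm", spte) for spte in sorted(kvm_rmap.pop(frame, set()))]
    return cleared


# frame number -> set of page table entries that map it
linux_rmap = {7: {"qemu:0x7f00"}}
kvm_rmap = {7: {"spte:a", "spte:b"}, 8: {"spte:c"}}

cleared = unmap_frame(7, linux_rmap, kvm_rmap)
```

The same lookup serves the original purpose mentioned above: finding all shadow ptes for a guest page table page so they can be write-protected without scanning every shadow page table.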

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


