* [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
@ 2012-03-08 16:29 Akshay Karle
2012-03-15 16:42 ` Konrad Rzeszutek Wilk
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Akshay Karle @ 2012-03-08 16:29 UTC (permalink / raw)
To: linux-kernel
Cc: Dan Magenheimer, konrad.wilk, kvm, ashu tripathi, nishant gulhane,
amarmore2006, Shreyas Mahure, mahesh mohan
Hi,
We are undergraduate engineering students at Maharashtra Academy of
Engineering, Pune, India, working on a project entitled
'Transcendent Memory on KVM' as part of our academic curriculum.
The project members are:
1. Ashutosh Tripathi
2. Shreyas Mahure
3. Nishant Gulhane
4. Akshay Karle
---
Project Description:
What is Transcendent Memory (tmem for short)?
Transcendent Memory is a memory optimization technique for virtualized
environments. It collects the underutilized memory of the guests and the
unassigned (fallow) memory of the host into a central tmem pool, and then
provides the guests indirect access to this pool.
For further information on tmem, please refer to the LWN article by
Dan Magenheimer:
http://lwn.net/Articles/454795/
Since KVM is one of the most popular hypervisors available,
we decided to implement this technique for KVM.
---
kvm-tmem Patch details:
This patch adds shims in the guest that invoke the KVM
hypercalls; on the host, zcache pools implement the required
functions.
To enable tmem on the 'kvm host' add the boot parameter:
"kvmtmem"
And to enable tmem in the 'kvm guests' add the boot parameter:
"tmem"
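For example, one way to set such a parameter persistently on a GRUB2-based distro; the file layout and regeneration commands below are illustrative assumptions about the reader's system, not part of the patch, so the sketch edits a temporary copy of the config rather than the real file:

```shell
# Hypothetical sketch (not from the patch): appending the "kvmtmem" host boot
# parameter to a GRUB2-style config line. Works on a temporary copy here.
cfg=$(mktemp)
echo 'GRUB_CMDLINE_LINUX="quiet"' > "$cfg"
# Append "kvmtmem" to the kernel command line (use "tmem" inside a guest).
sed -i 's/^GRUB_CMDLINE_LINUX="\([^"]*\)"/GRUB_CMDLINE_LINUX="\1 kvmtmem"/' "$cfg"
grep '^GRUB_CMDLINE_LINUX' "$cfg"
# On a real system you would edit /etc/default/grub instead, regenerate the
# config (e.g. update-grub, or grub2-mkconfig -o /boot/grub2/grub.cfg), and
# reboot; afterwards the parameter should show up in /proc/cmdline.
```

After a reboot, `grep -o kvmtmem /proc/cmdline` would confirm the parameter took effect.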
The diffstat details for this patch are given below:
 arch/x86/include/asm/kvm_host.h      |    1
 arch/x86/kvm/x86.c                   |    4
 drivers/staging/zcache/Makefile      |    2
 drivers/staging/zcache/kvm-tmem.c    |  356 +++++++++++++++++++++++++++++++++++
 drivers/staging/zcache/kvm-tmem.h    |   55 +++++
 drivers/staging/zcache/zcache-main.c |   98 ++++++++-
 include/linux/kvm_para.h             |    1
 7 files changed, 508 insertions(+), 9 deletions(-)
We have already uploaded our work, along with the 'Frontswap' patches
submitted by Dan, at the following link:
https://github.com/akshaykarle/kvm-tmem
Any comments/feedback would be appreciated and will help us a lot with our work.
Regards,
Akshay
^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Konrad Rzeszutek Wilk @ 2012-03-15 16:42 UTC (permalink / raw)
To: Akshay Karle
Cc: linux-kernel, Dan Magenheimer, kvm, ashu tripathi, nishant gulhane,
    amarmore2006, Shreyas Mahure, mahesh mohan

On Thu, Mar 08, 2012 at 09:59:41PM +0530, Akshay Karle wrote:
> [...]
> kvm-tmem Patch details:
> This patch adds appropriate shims at the guest that invokes the kvm
> hypercalls, and the host uses zcache pools to implement the required
> functions.

Great!

> [...]
> We have already uploaded our work alongwith the 'Frontswap' submitted by Dan,
> on the following link:
> https://github.com/akshaykarle/kvm-tmem
>
> Any comments/feedback would be appreciated and will help us a lot with our work.

Great. Will do.

> Regards,
> Akshay

* Re: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Konrad Rzeszutek Wilk @ 2012-03-15 16:48 UTC (permalink / raw)
To: Akshay Karle
Cc: linux-kernel, Dan Magenheimer, kvm, ashu tripathi, nishant gulhane,
    amarmore2006, Shreyas Mahure, mahesh mohan

> [...]
> We have already uploaded our work alongwith the 'Frontswap' submitted by Dan,
> on the following link:
> https://github.com/akshaykarle/kvm-tmem

Is there a way for these patches to be posted on LKML? It is rather
difficult to copy-and-paste patches from emails and send them. Or, if
you want to, you can email them directly to me. To do that, use
'git format-patch' to prep the git commits into patches and
'git send-email' to send them.

Also, the title says 'RFC 0/2' but I am not seeing patches 1 or 2?

* Re: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Avi Kivity @ 2012-03-15 16:58 UTC (permalink / raw)
To: Akshay Karle
Cc: linux-kernel, Dan Magenheimer, konrad.wilk, kvm, ashu tripathi,
    nishant gulhane, amarmore2006, Shreyas Mahure, mahesh mohan

On 03/08/2012 06:29 PM, Akshay Karle wrote:
> [...]
> Since kvm is one of the most popular hypervisors available,
> we decided to implement this technique for kvm.
>
> Any comments/feedback would be appreciated and will help us a lot with our work.

One of the potential problems with tmem is a reduction in performance
when the cache hit rate is low, for example when streaming.

Can you test this by creating a large file, for example with

  dd < /dev/urandom > file bs=1M count=100000

and then measuring the time to stream it, using

  time dd < file > /dev/null

with and without the patch?

This should be done on a cleancache-enabled guest filesystem backed by
a virtio disk with cache=none.

It would be interesting to compare kvm_stat during the streaming, with
and without the patch.

-- 
error compiling committee.c: too many arguments to function
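The suggested test is easy to script; a scaled-down sketch follows (100 MB rather than the ~100 GB above, so it finishes quickly — the principle is the same). The cache-drop step needs root and is best-effort here; run it once on a kernel with the kvm-tmem patch and once without, and compare the times and the host's kvm_stat output:

```shell
# Scaled-down version of the streaming benchmark suggested above: create a
# file of random data, drop the page cache, then time a sequential read.
file=$(mktemp)
dd if=/dev/urandom of="$file" bs=1M count=100 iflag=fullblock 2>/dev/null
size=$(stat -c %s "$file")   # should be exactly 100 MiB
sync
# Dropping caches needs root; ignore the failure when run unprivileged.
{ echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
time dd if="$file" of=/dev/null bs=1M 2>/dev/null
```

Without the drop_caches step the second dd mostly measures the guest's own page cache rather than the cleancache path, so run it as root for a meaningful comparison.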

* RE: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Dan Magenheimer @ 2012-03-15 17:49 UTC (permalink / raw)
To: Avi Kivity, Akshay Karle
Cc: linux-kernel, Konrad Wilk, kvm, ashu tripathi, nishant gulhane,
    amarmore2006, Shreyas Mahure, mahesh mohan

> From: Avi Kivity [mailto:avi@redhat.com]
> Subject: Re: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
>
> One of the potential problems with tmem is reduction in performance when
> the cache hit rate is low, for example when streaming.
> [...]
> It would be interesting to compare kvm_stat during the streaming, with
> and without the patch.

Hi Avi --

The "WasActive" patch (https://lkml.org/lkml/2012/1/25/300)
is intended to avoid the streaming situation you are creating here.
It increases the "quality" of cached pages placed into zcache
and should probably also be used in the guest-side stubs (and/or maybe
the host-side zcache... I don't know KVM well enough to determine
whether that would work).

As Dave Hansen pointed out, the WasActive patch is not yet correct
and, as akpm points out, pageflag bits are scarce on 32-bit systems,
so it remains to be seen whether the WasActive patch can be upstreamed.
Or maybe there is a different way to achieve the same goal.
But I wanted to let you know that the streaming issue is understood
and needs to be resolved for some cleancache backends, just as it was
resolved in the core mm code.

The measurement you suggest would still be interesting even
without the WasActive patch, as it measures a "worst case".

Dan

* Re: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Avi Kivity @ 2012-03-15 18:01 UTC (permalink / raw)
To: Dan Magenheimer
Cc: Akshay Karle, linux-kernel, Konrad Wilk, kvm, ashu tripathi,
    nishant gulhane, amarmore2006, Shreyas Mahure, mahesh mohan

On 03/15/2012 07:49 PM, Dan Magenheimer wrote:
> [...]
> But I wanted to let you know that the streaming issue is understood
> and needs to be resolved for some cleancache backends just as it was
> resolved in the core mm code.

Nice. This takes care of the tail end of the streaming (the more
important one - since it always involves a cold copy).

What about the other side? Won't the read code invoke
cleancache_get_page() for every page? (This one is just a null
hypercall, so it's cheaper, but still expensive.)

> The measurement you suggest would still be interesting even
> without the WasActive patch as it measures a "worst case".

It can provide the justification for that patch, yes.

-- 
error compiling committee.c: too many arguments to function

* Re: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Konrad Rzeszutek Wilk @ 2012-03-15 18:02 UTC (permalink / raw)
To: Avi Kivity
Cc: Dan Magenheimer, Akshay Karle, linux-kernel, kvm, ashu tripathi,
    nishant gulhane, amarmore2006, Shreyas Mahure, mahesh mohan

On Thu, Mar 15, 2012 at 08:01:52PM +0200, Avi Kivity wrote:
> [...]
> Nice. This takes care of the tail-end of the streaming (the more
> important one - since it always involves a cold copy). What about the
> other side? Won't the read code invoke cleancache_get_page() for every
> page? (this one is just a null hypercall, so it's cheaper, but still
> expensive).

That is something we should fix - I think the need for batching was
mentioned in the frontswap email thread, and it certainly seems
required, as those hypercalls aren't that cheap.

* Re: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Avi Kivity @ 2012-03-15 18:10 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk
Cc: Dan Magenheimer, Akshay Karle, linux-kernel, kvm, ashu tripathi,
    nishant gulhane, amarmore2006, Shreyas Mahure, mahesh mohan

On 03/15/2012 08:02 PM, Konrad Rzeszutek Wilk wrote:
> That is something we should fix - I think it was mentioned in the frontswap
> email thread the need for batching and it certainly seems required as those
> hypercalls aren't that cheap.

In fact, when tmem was first proposed I asked for two changes - make it
batchable, and make it asynchronous (so we can offload copies to a DMA
engine, etc.). Of course, that would have made tmem significantly more
complicated.

-- 
error compiling committee.c: too many arguments to function

* RE: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Dan Magenheimer @ 2012-03-15 19:36 UTC (permalink / raw)
To: Avi Kivity, Konrad Wilk
Cc: Akshay Karle, linux-kernel, kvm, ashu tripathi, nishant gulhane,
    amarmore2006, Shreyas Mahure, mahesh mohan

> From: Avi Kivity [mailto:avi@redhat.com]
> [...]
> In fact when tmem was first proposed I asked for two changes - make it
> batchable, and make it asynchronous (so we can offload copies to a dma
> engine, etc). Of course that would have made tmem significantly more
> complicated.

(Sorry, I'm not typing fast enough to keep up with the thread...)

Hi Avi --

In case it wasn't clear from my last reply, RAMster shows
that tmem CAN be used asynchronously... by making it more
complicated, but without making the core kernel changes more
complicated.

In RAMster, pages are locally cached (compressed using zcache)
and then, depending on policy, a separate thread sends the pages
to a remote machine. So the first part (compress and store locally)
still must be synchronous, but the second part (transmit to
another -- remote or possibly host? -- system) can be done
asynchronously. The RAMster code has to handle all the race
conditions, which is a pain but seems to work.

This is all working today in RAMster (which is in linux-next).
Batching is still not implemented by any tmem backend, but RAMster
demonstrates how the backend implementation COULD do batching without
any additional core kernel changes, i.e. no changes necessary
to frontswap or cleancache.

So, you see, I *was* listening. I just wasn't willing to fight
the uphill battle of much more complexity in the core kernel
for a capability that could be implemented differently.

That said, I still think it remains to be proven that
reducing the number of hypercalls by 2x or 3x (or whatever
batching factor you choose) will make a noticeable performance
difference. But if it does, batching can be done... and
completely hidden in the backend.

(I hope Andrea is listening ;-)

Dan

* Re: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Konrad Rzeszutek Wilk @ 2012-03-15 19:46 UTC (permalink / raw)
To: Dan Magenheimer
Cc: Avi Kivity, Akshay Karle, linux-kernel, kvm, ashu tripathi,
    nishant gulhane, amarmore2006, Shreyas Mahure, mahesh mohan

On Thu, Mar 15, 2012 at 12:36:48PM -0700, Dan Magenheimer wrote:
> [...]
> So, you see, I *was* listening. I just wasn't willing to fight
> the uphill battle of much more complexity in the core kernel
> for a capability that could be implemented differently.

Dan, please stop this. The frontswap work is going through me, and my
goal is to provide the batching and asynchronous option. It might take
longer than anticipated b/c it might require redoing some of the code -
that is OK. We can do this in steps too - first do the synchronous
version (as in the implementation right now) and then add on the
batching and asynchronous work. This means breaking the ABI/API, and I
believe Avi would like the ABI to be as baked as possible so that he
does not have to provide a v2 (or v3) of the tmem support in KVM.

I appreciate you having done that in RAMster, but the "transmit" part
is what we need to batch. Think of scatter-gather DMA.

> That said, I still think it remains to be proven that
> reducing the number of hypercalls by 2x or 3x (or whatever
> the batching factor you choose) will make a noticeable

I was thinking 32 - about the same number that we do in Xen with PV MMU
upcalls. We also batch them there with multicalls.

> performance difference. But if it does, batching can
> be done... and completely hidden in the backend.

* RE: [RFC 0/2] kvm: Transcendent Memory (tmem) on KVM
From: Dan Magenheimer @ 2012-03-15 19:16 UTC (permalink / raw)
To: Konrad Wilk, Avi Kivity
Cc: Akshay Karle, linux-kernel, kvm, ashu tripathi, nishant gulhane,
    amarmore2006, Shreyas Mahure, mahesh mohan

> From: Konrad Rzeszutek Wilk
> [...]
> That is something we should fix - I think it was mentioned in the frontswap
> email thread the need for batching and it certainly seems required as those
> hypercalls aren't that cheap.

And exactly how expensive ARE hypercalls these days? On the first
VT/SVM systems they were tens of thousands of cycles... now they are
closer to sub-thousand, are they not? (I remember seeing a graph of
hypercall overhead dropping across generations of CPUs... anybody have
a pointer to a public graph of this?)

One of my favorite papers these days is "When Poll is Better than
Interrupt"
(http://static.usenix.org/events/fast12/tech/full_papers/Yang.pdf),
which argues that wasting some CPU cycles doing a busy-wait is often
more efficient than slogging through the block I/O subsystem to set up
and respond to an interrupt, if the device is fast enough. I wonder if
the same might be true comparing hypercall overhead for tmem vs. the
path for KVM to get a page from the host via its normal path?

Ignoring that for now, if excessive hypercalls are a problem, a better
solution than batching may be to modify the Maharashtra approach to be
more like RAMster: put zcache on the guest side and treat the host like
a "remote" system.

But let's wait for the Maharashtra team to do some measurements first
before we make any assumptions or change any designs...