* [PATCH 0/4] Swapping
From: Izik Eidus @ 2007-10-13 2:06 UTC (permalink / raw)
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
These patches allow guest memory that is not shadowed to be swapped out.
To make it most effective you should run with -kvm-shadow-memory 1
(which will make your machine slow).
With -kvm-shadow-memory 1, a guest with 3 GB of memory can shrink to
just 32 MB on the physical host!
When not using -kvm-shadow-memory, I saw a 4100 MB machine get as low as
168 MB on the physical host (not as bad as I thought it would be, and
surely not as bad as it could be with 41 MB of shadow pages :))
It seems very stable; it hasn't crashed on me once, and I was able to
run:
two Windows XP guests with 3 GB each, plus a 5 GB Linux guest,
and
two Windows XP guests with 4.1 GB each, plus two Windows XP guests with
2 GB each.
A few things to note:
Ignore the ugly messages in dmesg for now; they happen because
gfn_to_page tries to sleep while local interrupts are disabled (we have
to split some emulator functions so it won't do that).
I also saw an issue with the new rmap on the Fedora 7 live CD: for some
reason, in nonpaging mode rmap_remove gets called about 50 times less
often than it should.
It doesn't happen with other Linux guests; I need to check this. (For
now it means you might leak about 200 KB of memory for each Fedora 7
live CD you run.)
Also note that KVM now loads much faster, because no memset over all of
guest memory is needed (gfn_to_page is called at run time).
(Avi and Dor: note that this series includes a small fix for a bug in
the patch I sent you.)
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
* Re: [PATCH 0/4] Swapping
From: Anthony Liguori @ 2007-10-13 19:48 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Izik Eidus wrote:
> this patchs allow the guest not shadowed memory to be swapped out.
>
> to make it the must effective you should run -kvm-shadow-memory 1 (witch
> will make your machine slow)
> with -kvm-shadow-memory 1, 3giga memory guest can get to be just 32mb
> on physical host!
>
> when not using -kvm-shadow-memory, i saw 4100mb machine getting to as
> low as 168mb on the physical host (not as bad as i thought it would be,
> and surely not as bad as it can be with 41mb of shadow pages :))
>
So what exactly does this option do? Is it really worth exposing it as
an option if it slows down guests so much?
At least, a better name for the option would be nice.
Regards,
Anthony Liguori
* Re: [PATCH 0/4] Swapping
From: Izik Eidus @ 2007-10-13 20:06 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> Izik Eidus wrote:
>> this patchs allow the guest not shadowed memory to be swapped out.
>>
>> to make it the must effective you should run -kvm-shadow-memory 1
>> (witch will make your machine slow)
>> with -kvm-shadow-memory 1, 3giga memory guest can get to be just
>> 32mb on physical host!
>>
>> when not using -kvm-shadow-memory, i saw 4100mb machine getting to as
>> low as 168mb on the physical host (not as bad as i thought it would
>> be, and surely not as bad as it can be with 41mb of shadow pages :))
>>
>
> So what exactly does this option do? Is it really worth exposing it
> as an option if it slows down guests so much?
>
> At least, a better name for the option would be nice.
>
> Regards,
>
> Anthony Liguori
-kvm-shadow-memory has been included in kvm since version 46.
It controls how many pages are allocated to the shadow cache. (Since
version 46 the default is 2% of guest memory, so a 4.1 GB guest has
about 80 MB of shadow cache by default.)
The problem is that pages present in the shadow cache (MMU pages) can't
be swapped out, so the fewer shadow pages you have, the more memory is
swappable.
(This option was added so that as little shadow cache memory as possible
has to be discarded, because losing this cache carries a big overhead.)
* Re: [PATCH 0/4] Swapping
From: Izik Eidus @ 2007-10-13 20:21 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Izik Eidus wrote:
> Anthony Liguori wrote:
>> Izik Eidus wrote:
>>> this patchs allow the guest not shadowed memory to be swapped out.
>>>
>>> to make it the must effective you should run -kvm-shadow-memory 1
>>> (witch will make your machine slow)
>>> with -kvm-shadow-memory 1, 3giga memory guest can get to be just
>>> 32mb on physical host!
>>>
>>> when not using -kvm-shadow-memory, i saw 4100mb machine getting to
>>> as low as 168mb on the physical host (not as bad as i thought it
>>> would be, and surely not as bad as it can be with 41mb of shadow
>>> pages :))
>>>
>>
>> So what exactly does this option do? Is it really worth exposing it
>> as an option if it slows down guests so much?
>>
>> At least, a better name for the option would be nice.
>>
>> Regards,
>>
>> Anthony Liguori
> -kvm-shadow-memory was included in kvm from version 46.
> it control how much pages will be allocated to the shadow cache, (from
> version 46 the default is 2% of the system (so 4.1giga have about 80
> mbs of shadow cache by default))
> the problem is that each page that present at the shadow cache ( mmu
> pages ) cant be swapped out, so as lower as the number of shadow
> caches you have as more swappable will be
> useful.
>
> (this option (kvm-shadow-memory) was included to make it possible that
> as less as possible shadow cache memory will be deleated ( beacuse it
> have big overhead to lose this cache)
>
Just to make it clearer:
the main purpose of -kvm-shadow-memory is to speed up the guest (the
more MMU pages you give the shadow cache, the faster your guest will be
while it is working with a lot of memory).
(The main goal is for kvm_free_some_pages() to be called as rarely as
possible.)
* Re: [PATCH 0/4] Swapping
From: Anthony Liguori @ 2007-10-13 23:17 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Izik Eidus wrote:
> Izik Eidus wrote:
>> Anthony Liguori wrote:
>>> Izik Eidus wrote:
>>>> this patchs allow the guest not shadowed memory to be swapped out.
>>>>
>>>> to make it the must effective you should run -kvm-shadow-memory 1
>>>> (witch will make your machine slow)
>>>> with -kvm-shadow-memory 1, 3giga memory guest can get to be just
>>>> 32mb on physical host!
>>>>
>>>> when not using -kvm-shadow-memory, i saw 4100mb machine getting to
>>>> as low as 168mb on the physical host (not as bad as i thought it
>>>> would be, and surely not as bad as it can be with 41mb of shadow
>>>> pages :))
>>>>
>>>
>>> So what exactly does this option do? Is it really worth exposing it
>>> as an option if it slows down guests so much?
>>>
>>> At least, a better name for the option would be nice.
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>> -kvm-shadow-memory was included in kvm from version 46.
>> it control how much pages will be allocated to the shadow cache,
>> (from version 46 the default is 2% of the system (so 4.1giga have
>> about 80 mbs of shadow cache by default))
>> the problem is that each page that present at the shadow cache ( mmu
>> pages ) cant be swapped out, so as lower as the number of shadow
>> caches you have as more swappable will be
>> useful.
>>
>> (this option (kvm-shadow-memory) was included to make it possible
>> that as less as possible shadow cache memory will be deleated (
>> beacuse it have big overhead to lose this cache)
>>
> just to make it clearer,
> -kvm-shadow-memory main purpose is to speed up the guest, ( as more
> mmu pages you have to the shadow cache, as faster as your guest will
> be while
> it working with alot of memory)
> (the main mission is that kvm_free_some_pages() will get called as
> less as possible)
Yeah, I realized what -kvm-shadow-memory does shortly after I sent my
first note :-) Sorry for the noise.
Regards,
Anthony Liguori
* Re: [PATCH 0/4] Swapping
From: Anthony Liguori @ 2007-10-14 0:10 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Very nice!
I've tested this series (with your new 3/4) with win2k, winxp, ubuntu
7.10, and opensuse. Everything seemed to work just fine.
I was also able to create four 1 GB VMs on my 2 GB laptop :-) That was
very neat.
Regards,
Anthony Liguori
Izik Eidus wrote:
> this patchs allow the guest not shadowed memory to be swapped out.
>
> to make it the must effective you should run -kvm-shadow-memory 1 (witch
> will make your machine slow)
> with -kvm-shadow-memory 1, 3giga memory guest can get to be just 32mb
> on physical host!
>
> when not using -kvm-shadow-memory, i saw 4100mb machine getting to as
> low as 168mb on the physical host (not as bad as i thought it would be,
> and surely not as bad as it can be with 41mb of shadow pages :))
>
>
> it seems to be very stable, it didnt crushed to me once, and i was able
> to run:
> 2 3giga each windows xp + 5giga linux guest
>
> and
> 2 4.1 giga each windows xp and 2 2giga each windows xp.
>
> few things to note:
> ignore for now the ugly messages at dmesg, it is due to the fact that
> gfn_to_page try to sleep while local intrreupts disabled ( we have to
> split some emulator function so it wont do it)
>
> and i saw some issue with the new rmapp at fedora 7 live cd, for some
> reason , in the nonpaging mode rmap_remove getting called about 50 times
> less than it need
> it doesnt happen at other linux guests, need to check this... (for now
> it mean you might have about 200k of memory leak for each fedora 7 live
> cd you are runing )
>
> also note that now kvm load much faster, beacuse no memset on all the
> memory is needed (beacuse gfn_to_page get called at run time)
>
> (avi, and dor, note that this patch include small fix to a bug in the
> patch that i sent you)
>
* Re: [PATCH 0/4] Swapping
From: Anthony Liguori @ 2007-10-14 0:14 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> Very nice!
>
> I've tested this series (with your new 3/4) with win2k, winxp, ubuntu
> 7.10, and opensuse. Everything seemed to work just fine.
Spoke too soon, found the following in dmesg:
[35078.913071] BUG: scheduling while atomic:
qemu-system-x86/0x10000001/21612
[35078.913077]
[35078.913079] Call Trace:
[35078.913112] [<ffffffff804301c5>] thread_return+0x21e/0x6c9
[35078.913129] [<ffffffff8027a00d>] zone_statistics+0x7d/0x80
[35078.913139] [<ffffffff80273691>] get_page_from_freelist+0x441/0x5b0
[35078.913168] [<ffffffff8022ffec>] __cond_resched+0x1c/0x50
[35078.913174] [<ffffffff804306f2>] cond_resched+0x32/0x40
[35078.913181] [<ffffffff8024e4d9>] down_read+0x9/0x20
[35078.913199] [<ffffffff8839a87c>] :kvm:gfn_to_page+0x4c/0x130
[35078.913207] [<ffffffff8027b76d>] vm_normal_page+0x3d/0xc0
[35078.913230] [<ffffffff8839ff94>] :kvm:gpa_to_hpa+0x24/0x70
[35078.913249] [<ffffffff883a007e>] :kvm:paging32_set_pte_common+0x9e/0x2b0
[35078.913285] [<ffffffff883a02d9>] :kvm:paging32_set_pte+0x49/0x50
[35078.913308] [<ffffffff883a091d>] :kvm:kvm_mmu_pte_write+0x33d/0x3b0
[35078.913350] [<ffffffff883a0ca2>] :kvm:paging32_walk_addr+0x292/0x310
[35078.913383] [<ffffffff883a0e30>] :kvm:paging32_page_fault+0xc0/0x300
[35078.913399] [<ffffffff883a294c>] :kvm:x86_emulate_insn+0x11c/0x4190
[35078.913448] [<ffffffff883ba36b>] :kvm_intel:handle_exception+0x21b/0x2a0
[35078.913474] [<ffffffff8839ca5c>] :kvm:kvm_vcpu_ioctl+0xddc/0x1130
[35078.913488] [<ffffffff8022cacc>] task_rq_lock+0x4c/0x90
[35078.913494] [<ffffffff8022c599>] __activate_task+0x29/0x50
[35078.913504] [<ffffffff8022f30c>] try_to_wake_up+0x5c/0x3f0
[35078.913511] [<ffffffff8025240f>] futex_wait+0x2df/0x3c0
[35078.913521] [<ffffffff8022cacc>] task_rq_lock+0x4c/0x90
[35078.913528] [<ffffffff8022c599>] __activate_task+0x29/0x50
[35078.913545] [<ffffffff8022c307>] __wake_up_common+0x47/0x80
[35078.913561] [<ffffffff8022ca03>] __wake_up+0x43/0x70
[35078.913575] [<ffffffff80326971>] __up_read+0x21/0xb0
[35078.913585] [<ffffffff802528a0>] futex_wake+0xd0/0xf0
[35078.913617] [<ffffffff80240810>] __dequeue_signal+0x110/0x1d0
[35078.913633] [<ffffffff8023ffce>] recalc_sigpending+0xe/0x30
[35078.913638] [<ffffffff8024212c>] dequeue_signal+0x5c/0x190
[35078.913662] [<ffffffff802a6e05>] do_ioctl+0x35/0xe0
[35078.913675] [<ffffffff802a6f24>] vfs_ioctl+0x74/0x2d0
[35078.913680] [<ffffffff8023ffce>] recalc_sigpending+0xe/0x30
[35078.913684] [<ffffffff80240057>] sigprocmask+0x67/0xf0
[35078.913697] [<ffffffff802a7215>] sys_ioctl+0x95/0xb0
[35078.913715] [<ffffffff80209e8e>] system_call+0x7e/0x83
[35078.913743]
Regards,
Anthony Liguori
> I also was able to create four 1G VMs on my 2G laptop :-) That was
> very neat.
>
> Regards,
>
> Anthony Liguori
>
> Izik Eidus wrote:
>> this patchs allow the guest not shadowed memory to be swapped out.
>>
>> to make it the must effective you should run -kvm-shadow-memory 1
>> (witch will make your machine slow)
>> with -kvm-shadow-memory 1, 3giga memory guest can get to be just
>> 32mb on physical host!
>>
>> when not using -kvm-shadow-memory, i saw 4100mb machine getting to as
>> low as 168mb on the physical host (not as bad as i thought it would
>> be, and surely not as bad as it can be with 41mb of shadow pages :))
>>
>>
>> it seems to be very stable, it didnt crushed to me once, and i was
>> able to run:
>> 2 3giga each windows xp + 5giga linux guest
>>
>> and
>> 2 4.1 giga each windows xp and 2 2giga each windows xp.
>>
>> few things to note:
>> ignore for now the ugly messages at dmesg, it is due to the fact that
>> gfn_to_page try to sleep while local intrreupts disabled ( we have to
>> split some emulator function so it wont do it)
>>
>> and i saw some issue with the new rmapp at fedora 7 live cd, for some
>> reason , in the nonpaging mode rmap_remove getting called about 50
>> times less than it need
>> it doesnt happen at other linux guests, need to check this... (for
>> now it mean you might have about 200k of memory leak for each fedora
>> 7 live cd you are runing )
>>
>> also note that now kvm load much faster, beacuse no memset on all the
>> memory is needed (beacuse gfn_to_page get called at run time)
>>
>> (avi, and dor, note that this patch include small fix to a bug in the
>> patch that i sent you)
>>
* Re: [PATCH 0/4] Swapping
From: Izik Eidus @ 2007-10-14 6:10 UTC (permalink / raw)
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> Anthony Liguori wrote:
>> Very nice!
>>
>> I've tested this series (with your new 3/4) with win2k, winxp, ubuntu
>> 7.10, and opensuse. Everything seemed to work just fine.
>
> Spoke too soon, found the following in dmesg:
>
> [35078.913071] BUG: scheduling while atomic:
> qemu-system-x86/0x10000001/21612
> [35078.913077]
> [35078.913079] Call Trace:
> [35078.913112] [<ffffffff804301c5>] thread_return+0x21e/0x6c9
> [35078.913129] [<ffffffff8027a00d>] zone_statistics+0x7d/0x80
> [35078.913139] [<ffffffff80273691>] get_page_from_freelist+0x441/0x5b0
> [35078.913168] [<ffffffff8022ffec>] __cond_resched+0x1c/0x50
> [35078.913174] [<ffffffff804306f2>] cond_resched+0x32/0x40
> [35078.913181] [<ffffffff8024e4d9>] down_read+0x9/0x20
> [35078.913199] [<ffffffff8839a87c>] :kvm:gfn_to_page+0x4c/0x130
> [35078.913207] [<ffffffff8027b76d>] vm_normal_page+0x3d/0xc0
> [35078.913230] [<ffffffff8839ff94>] :kvm:gpa_to_hpa+0x24/0x70
> [35078.913249] [<ffffffff883a007e>]
> :kvm:paging32_set_pte_common+0x9e/0x2b0
> [35078.913285] [<ffffffff883a02d9>] :kvm:paging32_set_pte+0x49/0x50
> [35078.913308] [<ffffffff883a091d>] :kvm:kvm_mmu_pte_write+0x33d/0x3b0
> [35078.913350] [<ffffffff883a0ca2>] :kvm:paging32_walk_addr+0x292/0x310
> [35078.913383] [<ffffffff883a0e30>] :kvm:paging32_page_fault+0xc0/0x300
> [35078.913399] [<ffffffff883a294c>] :kvm:x86_emulate_insn+0x11c/0x4190
> [35078.913448] [<ffffffff883ba36b>]
> :kvm_intel:handle_exception+0x21b/0x2a0
> [35078.913474] [<ffffffff8839ca5c>] :kvm:kvm_vcpu_ioctl+0xddc/0x1130
> [35078.913488] [<ffffffff8022cacc>] task_rq_lock+0x4c/0x90
> [35078.913494] [<ffffffff8022c599>] __activate_task+0x29/0x50
> [35078.913504] [<ffffffff8022f30c>] try_to_wake_up+0x5c/0x3f0
> [35078.913511] [<ffffffff8025240f>] futex_wait+0x2df/0x3c0
> [35078.913521] [<ffffffff8022cacc>] task_rq_lock+0x4c/0x90
> [35078.913528] [<ffffffff8022c599>] __activate_task+0x29/0x50
> [35078.913545] [<ffffffff8022c307>] __wake_up_common+0x47/0x80
> [35078.913561] [<ffffffff8022ca03>] __wake_up+0x43/0x70
> [35078.913575] [<ffffffff80326971>] __up_read+0x21/0xb0
> [35078.913585] [<ffffffff802528a0>] futex_wake+0xd0/0xf0
> [35078.913617] [<ffffffff80240810>] __dequeue_signal+0x110/0x1d0
> [35078.913633] [<ffffffff8023ffce>] recalc_sigpending+0xe/0x30
> [35078.913638] [<ffffffff8024212c>] dequeue_signal+0x5c/0x190
> [35078.913662] [<ffffffff802a6e05>] do_ioctl+0x35/0xe0
> [35078.913675] [<ffffffff802a6f24>] vfs_ioctl+0x74/0x2d0
> [35078.913680] [<ffffffff8023ffce>] recalc_sigpending+0xe/0x30
> [35078.913684] [<ffffffff80240057>] sigprocmask+0x67/0xf0
> [35078.913697] [<ffffffff802a7215>] sys_ioctl+0x95/0xb0
> [35078.913715] [<ffffffff80209e8e>] system_call+0x7e/0x83
> [35078.913743]
>
This is funny, but that message is "ok";
I mentioned it when I sent the patch.
It happens because of the local-irq disabling in kvm: in this path some
emulator functions get called and do gfn_to_page().
We have to split those functions.
So it isn't really a bug in the swapping; it's because get_user_pages()
does cond_resched(). Once we split the emulator functions, we won't see
this message. :)
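The offending path can be read straight out of Anthony's trace; as a
sketch (simplified, not the actual kvm source):

```
local_irq_disable();                     /* emulator entered atomically */
x86_emulate_insn()
  -> paging32_page_fault() / paging32_set_pte_common()
    -> gpa_to_hpa()
      -> gfn_to_page()                   /* now backed by get_user_pages() */
        -> down_read(&mm->mmap_sem)      /* may sleep */
        -> get_user_pages()              /* may call cond_resched() */
                                         /* => "scheduling while atomic" */
local_irq_enable();
```

Splitting the emulator functions so gfn_to_page() is only called with
interrupts enabled removes the sleeping call from the atomic section.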
> Regards,
>
> Anthony Liguori
>
>> I also was able to create four 1G VMs on my 2G laptop :-) That was
>> very neat.
>>
>> Regards,
>>
>> Anthony Liguori
>>
>> Izik Eidus wrote:
>>> this patchs allow the guest not shadowed memory to be swapped out.
>>>
>>> to make it the must effective you should run -kvm-shadow-memory 1
>>> (witch will make your machine slow)
>>> with -kvm-shadow-memory 1, 3giga memory guest can get to be just
>>> 32mb on physical host!
>>>
>>> when not using -kvm-shadow-memory, i saw 4100mb machine getting to
>>> as low as 168mb on the physical host (not as bad as i thought it
>>> would be, and surely not as bad as it can be with 41mb of shadow
>>> pages :))
>>>
>>>
>>> it seems to be very stable, it didnt crushed to me once, and i was
>>> able to run:
>>> 2 3giga each windows xp + 5giga linux guest
>>>
>>> and
>>> 2 4.1 giga each windows xp and 2 2giga each windows xp.
>>>
>>> few things to note:
>>> ignore for now the ugly messages at dmesg, it is due to the fact
>>> that gfn_to_page try to sleep while local intrreupts disabled ( we
>>> have to split some emulator function so it wont do it)
>>>
>>> and i saw some issue with the new rmapp at fedora 7 live cd, for
>>> some reason , in the nonpaging mode rmap_remove getting called about
>>> 50 times less than it need
>>> it doesnt happen at other linux guests, need to check this... (for
>>> now it mean you might have about 200k of memory leak for each fedora
>>> 7 live cd you are runing )
>>>
>>> also note that now kvm load much faster, beacuse no memset on all
>>> the memory is needed (beacuse gfn_to_page get called at run time)
>>>
>>> (avi, and dor, note that this patch include small fix to a bug in
>>> the patch that i sent you)
>>>
* Re: [PATCH 0/4] Swapping
From: Carsten Otte @ 2007-10-15 9:13 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Izik Eidus wrote:
> this patchs allow the guest not shadowed memory to be swapped out.
This patch has greatly improved since I last read the swapping code.
While I haven't had time for a deep review, it looks very clean and sane
to me at a glance.
* Re: [PATCH 0/4] Swapping
From: Izik Eidus @ 2007-10-15 12:18 UTC (permalink / raw)
To: carsteno-tA70FqPdS9bQT0dZR+AlfA
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Mon, 2007-10-15 at 11:13 +0200, Carsten Otte wrote:
> Izik Eidus wrote:
> > this patchs allow the guest not shadowed memory to be swapped out.
> This patch has greatly improved since I've read the swapping code last
> time. While not having time for a deep review, it looks very clean and
> sane to me when scrolling over.
thanks.
* Re: [PATCH 0/4] Swapping
From: Anthony Liguori @ 2007-10-15 18:10 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
I've been playing around with these patches. If I do an
madvise(MADV_DONTNEED) in userspace, I get the following bug when I
close the VM. My knowledge of the mm is limited, but since
madvise(MADV_DONTNEED) effectively does a zap_page_range(), I wonder if
we're lacking the necessary callback to also remove any potential GPA
covered by that range from the shadow page cache.
Regards,
Anthony Liguori
[ 860.724555] rmap_remove: ffff81004c48cf00 506d1025 0->BUG
[ 860.724603] ------------[ cut here ]------------
[ 860.724606] kernel BUG at
/home/anthony/git/fresh/kvm-userspace/kernel/mmu.c:433!
[ 860.724608] invalid opcode: 0000 [1] SMP
[ 860.724611] CPU 0
[ 860.724613] Modules linked in: kvm_intel kvm i915 drm af_packet
rfcomm l2cap bluetooth nbd thinkpad_acpi ppdev acpi_cpufreq
cpufreq_userspace cpufreq_conservative cpufreq_powersave cpufreq_stats
cpufreq_ondemand freq_table ac bay battery container video sbs button
dock ipv6 bridge ipt_REJECT xt_state xt_tcpudp iptable_filter
ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack
nfnetlink ip_tables x_tables deflate zlib_deflate twofish twofish_common
camellia serpent blowfish des cbc aes xcbc sha256 sha1 crypto_null
af_key sbp2 lp joydev arc4 ecb blkcipher snd_hda_intel snd_pcm_oss
snd_mixer_oss iwl4965 snd_pcm iwlwifi_mac80211 pcmcia snd_seq_dummy
sdhci snd_seq_oss cfg80211 parport_pc parport serio_raw psmouse mmc_core
pcspkr yenta_socket rsrc_nonstatic pcmcia_core intel_agp snd_seq_midi
snd_rawmidi snd_seq_midi_event snd_seq shpchp pci_hotplug snd_timer
snd_seq_device snd soundcore snd_page_alloc evdev ext3 jbd mbcache sg
sr_mod cdrom sd_mod usbhid hid ata_piix ata_generic libata scsi_mod
ohci1394 ieee1394 ehci_hcd e1000 uhci_hcd usbcore dm_mirror dm_snapshot
dm_mod thermal processor fan fuse apparmor commoncap
[ 860.724688] Pid: 7372, comm: qemu-system-x86 Not tainted
2.6.22-14-generic #1
[ 860.724690] RIP: 0010:[<ffffffff88384ef3>] [<ffffffff88384ef3>]
:kvm:rmap_remove+0xb3/0x190
[ 860.724704] RSP: 0018:ffff81004f079d28 EFLAGS: 00010292
[ 860.724706] RAX: 0000000000000040 RBX: ffff81004ccc9580 RCX:
ffffffff80534b68
[ 860.724709] RDX: ffffffff80534b68 RSI: 0000000000000086 RDI:
ffffffff80534b60
[ 860.724711] RBP: ffff81004c48cf00 R08: 0000000000000000 R09:
0000000000000000
[ 860.724714] R10: ffffffff805ce880 R11: ffffffff8021e2c0 R12:
ffff81004cda0000
[ 860.724716] R13: ffff81004ccc9580 R14: ffff81004cda0000 R15:
000ffffffffff000
[ 860.724719] FS: 00002b55f14e6d30(0000) GS:ffffffff80560000(0000)
knlGS:0000000000000000
[ 860.724721] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[ 860.724724] CR2: 00002b55f0129680 CR3: 0000000000201000 CR4:
00000000000026e0
[ 860.724726] Process qemu-system-x86 (pid: 7372, threadinfo
ffff81004f078000, task ffff810056d974a0)
[ 860.724728] Stack: ffff81004c48cf00 00000000000001e0
0000000000000000 ffffffff883851e4
[ 860.724734] ffff8100672cf650 ffff81004c63a000 ffff81004c63a000
ffff81004cda0000
[ 860.724739] ffff8100512056a8 ffff810050c75100 ffff81004dfb9a90
ffffffff88385453
[ 860.724743] Call Trace:
[ 860.724755] [<ffffffff883851e4>] :kvm:kvm_mmu_zap_page+0x214/0x250
[ 860.724769] [<ffffffff88385453>] :kvm:free_mmu_pages+0x23/0x50
[ 860.724777] [<ffffffff8838549d>] :kvm:kvm_mmu_destroy+0x1d/0x70
[ 860.724788] [<ffffffff883819e1>] :kvm:kvm_vcpu_uninit+0x11/0x30
[ 860.724795] [<ffffffff8839fc7b>] :kvm_intel:vmx_free_vcpu+0x5b/0x70
[ 860.724803] [<ffffffff88382d4a>] :kvm:kvm_destroy_vm+0xca/0x130
[ 860.724813] [<ffffffff88382f60>] :kvm:kvm_vm_release+0x10/0x20
[ 860.724820] [<ffffffff8029a3c1>] __fput+0xc1/0x1e0
[ 860.724834] [<ffffffff8837f9ea>] :kvm:kvm_vcpu_release+0x1a/0x30
[ 860.724838] [<ffffffff8029a3c1>] __fput+0xc1/0x1e0
[ 860.724848] [<ffffffff80297334>] filp_close+0x54/0x90
[ 860.724854] [<ffffffff80237c8d>] put_files_struct+0xed/0x120
[ 860.724864] [<ffffffff80239051>] do_exit+0x1a1/0x940
[ 860.724878] [<ffffffff8023981c>] do_group_exit+0x2c/0x80
[ 860.724884] [<ffffffff80209e8e>] system_call+0x7e/0x83
[ 860.724899]
[ 860.724900]
[ 860.724901] Code: 0f 0b eb fe 48 89 c7 48 83 e7 fe 0f 84 a1 00 00 00 45 31 c0
[ 860.724911] RIP [<ffffffff88384ef3>] :kvm:rmap_remove+0xb3/0x190
[ 860.724919] RSP <ffff81004f079d28>
[ 860.724921] Fixing recursive fault but reboot is needed!
Izik Eidus wrote:
> These patches allow the guest's non-shadowed memory to be swapped out.
>
> To make it most effective you should run -kvm-shadow-memory 1 (which
> will make your machine slow). With -kvm-shadow-memory 1, a guest with
> 3 GB of memory can shrink to just 32 MB on the physical host!
>
> When not using -kvm-shadow-memory, I saw a 4100 MB machine getting as
> low as 168 MB on the physical host (not as bad as I thought it would
> be, and surely not as bad as it can be with 41 MB of shadow pages :))
>
>
> It seems to be very stable; it hasn't crashed on me once, and I was
> able to run:
> 2 Windows XP guests with 3 GB each + a 5 GB Linux guest
>
> and
> 2 Windows XP guests with 4.1 GB each and 2 Windows XP guests with 2 GB each.
>
> A few things to note:
> Ignore for now the ugly messages in dmesg; they are due to the fact
> that gfn_to_page tries to sleep while local interrupts are disabled
> (we have to split some emulator functions so it won't do that).
>
> I also saw an issue with the new rmap on the Fedora 7 live CD: for
> some reason, in nonpaging mode rmap_remove gets called about 50 times
> less often than it needs to be.
> It doesn't happen with other Linux guests; I need to check this. (For
> now it means you might leak about 200 KB of memory for each Fedora 7
> live CD you are running.)
>
> Also note that kvm now loads much faster, because no memset over all
> the memory is needed (because gfn_to_page gets called at run time).
>
> (Avi and Dor, note that this patch set includes a small fix for a bug
> in the patch that I sent you.)
>
> _______________________________________________
> kvm-devel mailing list
> kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
> https://lists.sourceforge.net/lists/listinfo/kvm-devel
>
>
* Re: [PATCH 0/4] Swapping
[not found] ` <4713ACF8.6010809-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
@ 2007-10-15 18:21 ` Izik Eidus
[not found] ` <4713AF9C.8000609-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-16 7:36 ` Avi Kivity
1 sibling, 1 reply; 24+ messages in thread
From: Izik Eidus @ 2007-10-15 18:21 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> I've been playing around with these patches. If I do an
> madvise(MADV_DONTNEED) in userspace, when I close the VM, I get the
> following bug. My knowledge of the mm is limited but since
> madvise(MADV_DONTNEED) effectively does a zap_page_range() I wonder if
> we're lacking the necessary callback to also remove any potential GPA
> covered by that range from shadow page cache.
>
> Regards,
>
> Anthony Liguori
It is probably because of the changes in the first patch (which makes
all present shadow pages rmapped).
Anthony, can you please check what happens if you run the latest kvm
with the rmap patch and then without it?
* Re: [PATCH 0/4] Swapping
[not found] ` <4713AF9C.8000609-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-10-15 19:03 ` Anthony Liguori
[not found] ` <4713B97F.7090403-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
0 siblings, 1 reply; 24+ messages in thread
From: Anthony Liguori @ 2007-10-15 19:03 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Izik Eidus wrote:
> Anthony Liguori wrote:
>> I've been playing around with these patches. If I do an
>> madvise(MADV_DONTNEED) in userspace, when I close the VM, I get the
>> following bug. My knowledge of the mm is limited but since
>> madvise(MADV_DONTNEED) effectively does a zap_page_range() I wonder
>> if we're lacking the necessary callback to also remove any potential
>> GPA covered by that range from shadow page cache.
>>
>> Regards,
>>
>> Anthony Liguori
> It is probably because of the changes in the first patch (which makes
> all present shadow pages rmapped).
> Anthony, can you please check what happens if you run the latest kvm
> with the rmap patch and then without it?
It looks like it's my patch for doing an in-kernel mmap() to support
older userspaces. I'll figure out what the problem is.
But at any rate, would madvise() be able to evict the current contents
of something in the shadow page cache or will the guest not pick up the
new memory until the old gets evicted from the shadow page cache?
Regards,
Anthony Liguori
* Re: [PATCH 0/4] Swapping
[not found] ` <4713B97F.7090403-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
@ 2007-10-15 19:16 ` Izik Eidus
[not found] ` <4713BCA4.3080103-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-16 7:38 ` Avi Kivity
1 sibling, 1 reply; 24+ messages in thread
From: Izik Eidus @ 2007-10-15 19:16 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> Izik Eidus wrote:
>> Anthony Liguori wrote:
>>> I've been playing around with these patches. If I do an
>>> madvise(MADV_DONTNEED) in userspace, when I close the VM, I get the
>>> following bug. My knowledge of the mm is limited but since
>>> madvise(MADV_DONTNEED) effectively does a zap_page_range() I wonder
>>> if we're lacking the necessary callback to also remove any potential
>>> GPA covered by that range from shadow page cache.
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>> It is probably because of the changes in the first patch (which makes
>> all present shadow pages rmapped).
>> Anthony, can you please check what happens if you run the latest kvm
>> with the rmap patch and then without it?
>
> It looks like it's my patch for doing an in kernel mmap() to support
> older userspaces. I'll figure out what the problem is.
>
> But at any rate, would madvise() be able to evict the current contents
> of something in the shadow page cache or will the guest not pick up
> the new memory until the old gets evicted from the shadow page cache?
If I understand you correctly, madvise() won't harm us, because we
protect all our shadow memory by removing the writable bit from it;
therefore the guest can't change anything without us knowing.
>
> Regards,
>
> Anthony Liguori
>
* Re: [PATCH 0/4] Swapping
[not found] ` <4713BCA4.3080103-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-10-15 19:29 ` Anthony Liguori
[not found] ` <4713BFB3.8060701-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-16 7:37 ` Avi Kivity
1 sibling, 1 reply; 24+ messages in thread
From: Anthony Liguori @ 2007-10-15 19:29 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Izik Eidus wrote:
> Anthony Liguori wrote:
>> Izik Eidus wrote:
>>> Anthony Liguori wrote:
>>>> I've been playing around with these patches. If I do an
>>>> madvise(MADV_DONTNEED) in userspace, when I close the VM, I get the
>>>> following bug. My knowledge of the mm is limited but since
>>>> madvise(MADV_DONTNEED) effectively does a zap_page_range() I wonder
>>>> if we're lacking the necessary callback to also remove any
>>>> potential GPA covered by that range from shadow page cache.
>>>>
>>>> Regards,
>>>>
>>>> Anthony Liguori
>>> It is probably because of the changes in the first patch (which makes
>>> all present shadow pages rmapped).
>>> Anthony, can you please check what happens if you run the latest kvm
>>> with the rmap patch and then without it?
>>
>> It looks like it's my patch for doing an in kernel mmap() to support
>> older userspaces. I'll figure out what the problem is.
>>
>> But at any rate, would madvise() be able to evict the current
>> contents of something in the shadow page cache or will the guest not
>> pick up the new memory until the old gets evicted from the shadow
>> page cache?
> If I understand you correctly, madvise() won't harm us, because we
> protect all our shadow memory by removing the writable bit from it;
> therefore the guest can't change anything without us knowing.
That's not quite what I was wondering.
When you do an madvise() in userspace, the result is that when that
memory is accessed again, Linux will demand-fault in a zero page and COW
it appropriately. If we do madvise() on the VA representing guest
physical memory, what I'm curious about is whether the guest will
actually see this change. If the guest happens to have the page mapped
before we do the madvise(), what triggers KVM to kick any shadow page
table entries out of its cache?
IIUC, today, after the madvise, the guest will have access to the old
page until that entry gets evicted and reloaded from the shadow page
table cache.
Regards,
Anthony Liguori
>>
>> Regards,
>>
>> Anthony Liguori
>>
>
>
* Re: [PATCH 0/4] Swapping
[not found] ` <4713BFB3.8060701-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
@ 2007-10-15 19:50 ` Izik Eidus
[not found] ` <4713C46E.9020107-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 24+ messages in thread
From: Izik Eidus @ 2007-10-15 19:50 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> Izik Eidus wrote:
>> Anthony Liguori wrote:
>>> Izik Eidus wrote:
>>>> Anthony Liguori wrote:
>>>>> I've been playing around with these patches. If I do an
>>>>> madvise(MADV_DONTNEED) in userspace, when I close the VM, I get
>>>>> the following bug. My knowledge of the mm is limited but since
>>>>> madvise(MADV_DONTNEED) effectively does a zap_page_range() I
>>>>> wonder if we're lacking the necessary callback to also remove any
>>>>> potential GPA covered by that range from shadow page cache.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Anthony Liguori
>>>> It is probably because of the changes in the first patch (which makes
>>>> all present shadow pages rmapped).
>>>> Anthony, can you please check what happens if you run the latest kvm
>>>> with the rmap patch and then without it?
>>>
>>> It looks like it's my patch for doing an in kernel mmap() to support
>>> older userspaces. I'll figure out what the problem is.
>>>
>>> But at any rate, would madvise() be able to evict the current
>>> contents of something in the shadow page cache or will the guest not
>>> pick up the new memory until the old gets evicted from the shadow
>>> page cache?
>> If I understand you correctly, madvise() won't harm us, because we
>> protect all our shadow memory by removing the writable bit from it;
>> therefore the guest can't change anything without us knowing.
>
> That's not quite what I was wondering.
>
> When you do an madvise() in userspace, the result is that when that
> memory is accessed again, linux will demand-fault in a zero page and
> COW it appropriately. If we do madvise() on the VA representing guest
> physical memory, what I'm curious about is whether the guest will
> actually see this change. If the guest happens to have the page
> mapped before we do the madvise(), what triggers KVM to kick any
> shadow page table entries out of its cache?
>
> IIUC, today, after the madvise, the guest will have access to the old
> page until that entry gets evicted and reloaded from the shadow page
> table cache.
OK, I am not familiar with madvise(), so I might be talking nonsense,
but if the guest has the page mapped before the madvise(), that means
we hold an elevated reference to it; that is our only protection, and
as far as I understand it should be enough.
* Re: [PATCH 0/4] Swapping
[not found] ` <4713C46E.9020107-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-10-15 20:03 ` Anthony Liguori
[not found] ` <4713C7A3.4050805-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
0 siblings, 1 reply; 24+ messages in thread
From: Anthony Liguori @ 2007-10-15 20:03 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Izik Eidus wrote:
>>
>> That's not quite what I was wondering.
>>
>> When you do an madvise() in userspace, the result is that when that
>> memory is accessed again, linux will demand-fault in a zero page and
>> COW it appropriately. If we do madvise() on the VA representing
>> guest physical memory, what I'm curious about is whether the guest
>> will actually see this change. If the guest happens to have the page
>> mapped before we do the madvise(), what triggers KVM to kick any
>> shadow page table entries out of its cache?
>>
>> IIUC, today, after the madvise, the guest will have access to the old
>> page until that entry gets evicted and reloaded from the shadow page
>> table cache.
> OK, I am not familiar with madvise(), so I might be talking nonsense,
> but if the guest has the page mapped before the madvise(), that means
> we hold an elevated reference to it; that is our only protection, and
> as far as I understand it should be enough.
Right, we will have a reference to the page. But we want to propagate
this change to the guest. So madvise() may be a bad example.
What if you wanted to share memory among multiple guests? You start
out with an anonymous mmap(), and you now want to mmap() a file in
/dev/shm to be shared among multiple guests, so you mmap(MAP_FIXED) at
phys_ram_base + guest_pa in each guest.
So what if guest_pa was in the guest's shadow page cache? In order for
the guest to see the right hpa (the new shared memory), we have to be
able to evict guest_pa. We could do this with something like
mmu_unshadow().
What I don't understand is how we can have something like
mmu_unshadow() called automatically when an mmap() is initiated from
userspace. We could just add an ioctl() to do it from userspace but I
think it would be nicer if it Just Worked.
Regards,
Anthony Liguori
>
>
>
>
* Re: [PATCH 0/4] Swapping
[not found] ` <4713C7A3.4050805-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
@ 2007-10-15 20:15 ` Izik Eidus
2007-10-16 9:35 ` Avi Kivity
1 sibling, 0 replies; 24+ messages in thread
From: Izik Eidus @ 2007-10-15 20:15 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> Izik Eidus wrote:
>>>
>>> That's not quite what I was wondering.
>>>
>>> When you do an madvise() in userspace, the result is that when that
>>> memory is accessed again, linux will demand-fault in a zero page and
>>> COW it appropriately. If we do madvise() on the VA representing
>>> guest physical memory, what I'm curious about is whether the guest
>>> will actually see this change. If the guest happens to have the
>>> page mapped before we do the madvise(), what triggers KVM to kick
>>> any shadow page table entries out of its cache?
>>>
>>> IIUC, today, after the madvise, the guest will have access to the
>>> old page until that entry gets evicted and reloaded from the shadow
>>> page table cache.
>> OK, I am not familiar with madvise(), so I might be talking nonsense,
>> but if the guest has the page mapped before the madvise(), that means
>> we hold an elevated reference to it; that is our only protection, and
>> as far as I understand it should be enough.
>
> Right, we will have a reference to the page. But we want to propagate
> this change to the guest. So madvise() may be a bad example.
>
> What if you wanted to do shared memory for multiple guests. You start
>> out with an anonymous mmap(), and you now want to mmap() a file in
> /dev/shm to be shared among multiple guests so you mmap(MAP_FIXED) to
> phys_ram_base + guest_pa in each guest.
>
> So what if guest_pa was in the guest's shadow page cache? In order
> for the guest to see the right hpa (the new shared memory), we have to
> be able to evict guest_pa. We could do this with something like
> mmu_unshadow().
>
> What I don't understand, is how we can have something like
> mmu_unshadow() called automatically when an mmap() is initiated from
> userspace. We could just add an ioctl() to do it from userspace but I
> think it would be nicer if it Just Worked.
Yeah, I agree with you, but I don't know how to do it either; we will
probably have to look around to see how it can be done.
Avi, any ideas?
>
> Regards,
>
> Anthony Liguori
>
>>
>>
>>
>>
>
* Re: [PATCH 0/4] Swapping
[not found] ` <4713ACF8.6010809-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-15 18:21 ` Izik Eidus
@ 2007-10-16 7:36 ` Avi Kivity
1 sibling, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2007-10-16 7:36 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> I've been playing around with these patches. If I do an
> madvise(MADV_DONTNEED) in userspace, when I close the VM, I get the
> following bug. My knowledge of the mm is limited but since
> madvise(MADV_DONTNEED) effectively does a zap_page_range() I wonder if
> we're lacking the necessary callback to also remove any potential GPA
> covered by that range from shadow page cache.
>
> Regards,
>
> Anthony Liguori
>
> [ 860.724555] rmap_remove: ffff81004c48cf00 506d1025 0->BUG
>
The mmu should keep a page's refcount elevated while it's mapped in, so
MADV_DONTNEED should not affect it. Looks like there is a bug where kvm
looks at the host pagetables for a page which is already in the shadow
pagetables. We need to avoid this as long as we don't have pte notifiers.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH 0/4] Swapping
[not found] ` <4713BCA4.3080103-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-15 19:29 ` Anthony Liguori
@ 2007-10-16 7:37 ` Avi Kivity
1 sibling, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2007-10-16 7:37 UTC (permalink / raw)
To: Izik Eidus; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Izik Eidus wrote:
> Anthony Liguori wrote:
>
>> Izik Eidus wrote:
>>
>>> Anthony Liguori wrote:
>>>
>>>> I've been playing around with these patches. If I do an
>>>> madvise(MADV_DONTNEED) in userspace, when I close the VM, I get the
>>>> following bug. My knowledge of the mm is limited but since
>>>> madvise(MADV_DONTNEED) effectively does a zap_page_range() I wonder
>>>> if we're lacking the necessary callback to also remove any potential
>>>> GPA covered by that range from shadow page cache.
>>>>
>>>> Regards,
>>>>
>>>> Anthony Liguori
>>>>
>>> It is probably because of the changes in the first patch (which makes
>>> all present shadow pages rmapped).
>>> Anthony, can you please check what happens if you run the latest kvm
>>> with the rmap patch and then without it?
>>>
>> It looks like it's my patch for doing an in kernel mmap() to support
>> older userspaces. I'll figure out what the problem is.
>>
>> But at any rate, would madvise() be able to evict the current contents
>> of something in the shadow page cache or will the guest not pick up
>> the new memory until the old gets evicted from the shadow page cache?
>>
> If I understand you correctly, madvise() won't harm us, because we
> protect all our shadow memory by removing the writable bit from it;
> therefore the guest can't change anything without us knowing.
>
The host userspace can, though, and we need to protect the kernel from that.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH 0/4] Swapping
[not found] ` <4713B97F.7090403-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-15 19:16 ` Izik Eidus
@ 2007-10-16 7:38 ` Avi Kivity
1 sibling, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2007-10-16 7:38 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> Izik Eidus wrote:
>
>> Anthony Liguori wrote:
>>
>>> I've been playing around with these patches. If I do an
>>> madvise(MADV_DONTNEED) in userspace, when I close the VM, I get the
>>> following bug. My knowledge of the mm is limited but since
>>> madvise(MADV_DONTNEED) effectively does a zap_page_range() I wonder
>>> if we're lacking the necessary callback to also remove any potential
>>> GPA covered by that range from shadow page cache.
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>>
>> It is probably because of the changes in the first patch (which makes
>> all present shadow pages rmapped).
>> Anthony, can you please check what happens if you run the latest kvm
>> with the rmap patch and then without it?
>>
>
> It looks like it's my patch for doing an in kernel mmap() to support
> older userspaces. I'll figure out what the problem is.
>
> But at any rate, would madvise() be able to evict the current contents
> of something in the shadow page cache or will the guest not pick up the
> new memory until the old gets evicted from the shadow page cache?
>
>
The latter. With pte notifiers, the former.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH 0/4] Swapping
[not found] ` <4713C7A3.4050805-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-15 20:15 ` Izik Eidus
@ 2007-10-16 9:35 ` Avi Kivity
[not found] ` <471485E2.8090301-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
1 sibling, 1 reply; 24+ messages in thread
From: Avi Kivity @ 2007-10-16 9:35 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
> Izik Eidus wrote:
>
>>> That's not quite what I was wondering.
>>>
>>> When you do an madvise() in userspace, the result is that when that
>>> memory is accessed again, linux will demand-fault in a zero page and
>>> COW it appropriately. If we do madvise() on the VA representing
>>> guest physical memory, what I'm curious about is whether the guest
>>> will actually see this change. If the guest happens to have the page
>>> mapped before we do the madvise(), what triggers KVM to kick any
>>> shadow page table entries out of its cache?
>>>
>>> IIUC, today, after the madvise, the guest will have access to the old
>>> page until that entry gets evicted and reloaded from the shadow page
>>> table cache.
>>>
>> OK, I am not familiar with madvise(), so I might be talking nonsense,
>> but if the guest has the page mapped before the madvise(), that means
>> we hold an elevated reference to it; that is our only protection, and
>> as far as I understand it should be enough.
>>
>
> Right, we will have a reference to the page. But we want to propagate
> this change to the guest. So madvise() may be a bad example.
>
> What if you wanted to do shared memory for multiple guests. You start
> out with an anonymous mmap(), and you now want to mmap() a file in
> /dev/shm to be shared among multiple guests so you mmap(MAP_FIXED) to
> phys_ram_base + guest_pa in each guest.
>
> So what if guest_pa was in the guest's shadow page cache? In order for
> the guest to see the right hpa (the new shared memory), we have to be
> able to evict guest_pa. We could do this with something like
> mmu_unshadow().
>
> What I don't understand, is how we can have something like
> mmu_unshadow() called automatically when an mmap() is initiated from
> userspace. We could just add an ioctl() to do it from userspace but I
> think it would be nicer if it Just Worked.
>
Behold the magic of pte notifiers! Every time the host touches a host
page table entry, it calls kvm which zaps the corresponding shadow pte
entries and invalidates any tlb entries in running vcpus.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 0/4] Swapping
[not found] ` <471485E2.8090301-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-10-16 18:29 ` Anthony Liguori
[not found] ` <47150325.3070009-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
0 siblings, 1 reply; 24+ messages in thread
From: Anthony Liguori @ 2007-10-16 18:29 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Izik Eidus wrote:
>>
>>>> That's not quite what I was wondering.
>>>>
>>>> When you do an madvise() in userspace, the result is that when that
>>>> memory is accessed again, linux will demand-fault in a zero page
>>>> and COW it appropriately. If we do madvise() on the VA
>>>> representing guest physical memory, what I'm curious about is
>>>> whether the guest will actually see this change. If the guest
>>>> happens to have the page mapped before we do the madvise(), what
>>>> triggers KVM to kick any shadow page table entries out of its cache?
>>>>
>>>> IIUC, today, after the madvise, the guest will have access to the
>>>> old page until that entry gets evicted and reloaded from the shadow
>>>> page table cache.
>>>>
>>> OK, I am not familiar with madvise(), so I might be talking nonsense,
>>> but if the guest has the page mapped before the madvise(), that means
>>> we hold an elevated reference to it; that is our only protection, and
>>> as far as I understand it should be enough.
>>>
>>
>> Right, we will have a reference to the page. But we want to
>> propagate this change to the guest. So madvise() may be a bad example.
>>
>> What if you wanted to do shared memory for multiple guests. You
>> start out with an anonymous mmap(), and you now want to mmap() a file
>> in /dev/shm to be shared among multiple guests so you mmap(MAP_FIXED)
>> to phys_ram_base + guest_pa in each guest.
>>
>> So what if guest_pa was in the guest's shadow page cache? In order
>> for the guest to see the right hpa (the new shared memory), we have
>> to be able to evict guest_pa. We could do this with something like
>> mmu_unshadow().
>>
>> What I don't understand, is how we can have something like
>> mmu_unshadow() called automatically when an mmap() is initiated from
>> userspace. We could just add an ioctl() to do it from userspace but
>> I think it would be nicer if it Just Worked.
>>
>
> Behold the magic of pte notifiers! Every time the host touches a host
> page table entry, it calls kvm which zaps the corresponding shadow pte
> entries and invalidates any tlb entries in running vcpus.
/me bows down to the greatness of pte notifiers
So yeah, that would solve the problem nicely. Are you planning on
resubmitting those patches or did they end up in Linus' tree?
Regards,
Anthony Liguori
* Re: [PATCH 0/4] Swapping
[not found] ` <47150325.3070009-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
@ 2007-10-16 20:01 ` Avi Kivity
0 siblings, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2007-10-16 20:01 UTC (permalink / raw)
To: Anthony Liguori; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Anthony Liguori wrote:
>>> What I don't understand, is how we can have something like
>>> mmu_unshadow() called automatically when an mmap() is initiated from
>>> userspace. We could just add an ioctl() to do it from userspace but
>>> I think it would be nicer if it Just Worked.
>>>
>>
>> Behold the magic of pte notifiers! Every time the host touches a
>> host page table entry, it calls kvm which zaps the corresponding
>> shadow pte entries and invalidates any tlb entries in running vcpus.
>
> /me bows down to the greatness of pte notifiers
>
> So yeah, that would solve the problem nicely. Are you planning on
> resubmitting those patches or did they end up in Linus' tree?
There's a lot of work before pte notifiers are linusable.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
end of thread, other threads:[~2007-10-16 20:01 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-10-13 2:06 [PATCH 0/4] Swapping Izik Eidus
[not found] ` <47102823.2000600-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-13 19:48 ` Anthony Liguori
[not found] ` <4711210F.40802-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-13 20:06 ` Izik Eidus
[not found] ` <4711252F.7020505-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-13 20:21 ` Izik Eidus
[not found] ` <471128B5.5090104-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-13 23:17 ` Anthony Liguori
2007-10-14 0:10 ` Anthony Liguori
[not found] ` <47115E75.1040203-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-14 0:14 ` Anthony Liguori
[not found] ` <47115F6A.7080800-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-14 6:10 ` Izik Eidus
2007-10-15 9:13 ` Carsten Otte
[not found] ` <47132F57.3040703-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>
2007-10-15 12:18 ` Izik Eidus
2007-10-15 18:10 ` Anthony Liguori
[not found] ` <4713ACF8.6010809-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-15 18:21 ` Izik Eidus
[not found] ` <4713AF9C.8000609-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-15 19:03 ` Anthony Liguori
[not found] ` <4713B97F.7090403-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-15 19:16 ` Izik Eidus
[not found] ` <4713BCA4.3080103-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-15 19:29 ` Anthony Liguori
[not found] ` <4713BFB3.8060701-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-15 19:50 ` Izik Eidus
[not found] ` <4713C46E.9020107-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-15 20:03 ` Anthony Liguori
[not found] ` <4713C7A3.4050805-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-15 20:15 ` Izik Eidus
2007-10-16 9:35 ` Avi Kivity
[not found] ` <471485E2.8090301-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-10-16 18:29 ` Anthony Liguori
[not found] ` <47150325.3070009-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
2007-10-16 20:01 ` Avi Kivity
2007-10-16 7:37 ` Avi Kivity
2007-10-16 7:38 ` Avi Kivity
2007-10-16 7:36 ` Avi Kivity