* Why is 'emulate' as good as writable PT's?
@ 2006-06-05 21:45 Andrew Theurer
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Theurer @ 2006-06-05 21:45 UTC
  To: Xen development list

We have been doing some scalability work, and we noticed that forcing 
'emulate' in arch/x86/mm.c achieves the same performance as writable 
page tables on a 1-way dom0.  For example:

xen-unstable, changeset 10200, i386 with PAE, 1-way

benchmark       xen0   xen0+emulate
-------------   ----   ------------
reaim_fserver   4421   4426
reaim_compute   2555   2531
SDET            4759   4810

The reaim benchmarks probably don't have much fork(), where I'd expect 
writable page tables to help, but SDET has a ton of fork+exec.
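
To put a number on "the same performance", the relative difference 
between the two columns can be computed directly.  This is only a small 
arithmetic sketch using the scores from the table above; the function 
name relative_change is illustrative.

```python
# Scores copied from the table above: (xen0, xen0+emulate).
results = {
    "reaim_fserver": (4421, 4426),
    "reaim_compute": (2555, 2531),
    "SDET": (4759, 4810),
}


def relative_change(baseline, candidate):
    """Percentage change of candidate relative to baseline."""
    return 100.0 * (candidate - baseline) / baseline


changes = {
    name: relative_change(writable_pt, emulate)
    for name, (writable_pt, emulate) in results.items()
}

for name, pct in changes.items():
    print(f"{name:15s} {pct:+.2f}%")
```

All three differences come out within roughly 1.1% in either direction.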

Could there be situations where we are inadvertently triggering a 
writable page table, where we should just be doing an update_va_mapping()?


-Andrew


* RE: Why is 'emulate' as good as writable PT's?
@ 2006-06-05 22:17 Ian Pratt
  2006-06-05 22:29 ` Andrew Theurer
  2006-06-06 20:28 ` Andrew Theurer
  0 siblings, 2 replies; 9+ messages in thread
From: Ian Pratt @ 2006-06-05 22:17 UTC
  To: Andrew Theurer, Xen development list; +Cc: Rolf Neugebauer

> Could there be situations where we are inadvertently triggering a
> writable page table, where we should just be doing an
> update_va_mapping()?

Almost certainly. Singleton (or small batch) updates should not be using
writeable pagetables, and should use update_va_mapping (or mmu_update if
the VA isn't known or may not be mapped).

~18 months ago Rolf wrote and checked in profile code to collect a
histogram of the number of entries found to be modified when writeable
pagetables are flushed.
At the time there was a big spike at '1' which was fixed, but with all
the various linux version upgrades it likely needs revisiting. 

The profile code also records the EIP that caused the writeable
pagetables operation, so if you print out the value a few times you'll
quickly find the culprit.
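
The kind of profiling described above can be modelled roughly as 
follows.  This is a sketch of the idea only, not Xen's profiling code: 
the names BUCKETS, count_modified, bucket_for and record_flush are 
made up for illustration, and the bucket bounds are an assumption.

```python
from collections import defaultdict

# Upper bounds of the histogram buckets (assumed for illustration).
BUCKETS = (1, 10, 50, 100, 512)

# A PAE page table page holds 512 eight-byte entries.
ENTRIES_PER_PAGE = 512


def count_modified(snapshot, current):
    """Number of entries that differ between the snapshot and the page."""
    return sum(1 for old, new in zip(snapshot, current) if old != new)


def bucket_for(modified):
    """Smallest bucket bound that covers the modified-entry count."""
    for bound in BUCKETS:
        if modified <= bound:
            return bound
    return BUCKETS[-1]


# histogram[eip][bucket] -> number of flushes seen for that faulting EIP.
histogram = defaultdict(lambda: dict.fromkeys(BUCKETS, 0))


def record_flush(eip, snapshot, current):
    """Record one flush, keyed by the EIP that made the page writable."""
    modified = count_modified(snapshot, current)
    histogram[eip][bucket_for(modified)] += 1
    return modified


# Example: a flush where only one entry changed lands in the '<=1' bucket.
page = [0] * ENTRIES_PER_PAGE
edited = list(page)
edited[7] = 0x1234
record_flush(0xC01522AB, page, edited)
```

A large count in the first bucket for a given EIP is the "spike at '1'" 
case: the page was made writable and flushed for a single entry.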

Thanks,
Ian


* Re: Why is 'emulate' as good as writable PT's?
  2006-06-05 22:17 Why is 'emulate' as good as writable PT's? Ian Pratt
@ 2006-06-05 22:29 ` Andrew Theurer
  2006-06-06 20:28 ` Andrew Theurer
  1 sibling, 0 replies; 9+ messages in thread
From: Andrew Theurer @ 2006-06-05 22:29 UTC
  To: Ian Pratt; +Cc: Xen development list, Rolf Neugebauer

Ian Pratt wrote:
>> Could there be situations where we are inadvertently triggering a
>> writable page table, where we should just be doing an
>> update_va_mapping()?
>
> Almost certainly. Singleton (or small batch) updates should not be using
> writeable pagetables, and should use update_va_mapping (or mmu_update if
> the VA isn't known or may not be mapped).
>
> ~18 months ago Rolf wrote and checked in profile code to collect a
> histogram of the number of entries found to be modified when writeable
> pagetables are flushed.
> At the time there was a big spike at '1' which was fixed, but with all
> the various linux version upgrades it likely needs revisiting.
>
> The profile code also records the EIP that caused the writeable
> pagetables operation, so if you print out the value a few times you'll
> quickly find the culprit.
Thanks!  It looks like the histogram and EIP logs in ptwr_flush are 
still there, so we'll run again with perfc=y and see if we can pinpoint 
the culprit.

-Andrew


* Re: Why is 'emulate' as good as writable PT's?
  2006-06-05 22:17 Why is 'emulate' as good as writable PT's? Ian Pratt
  2006-06-05 22:29 ` Andrew Theurer
@ 2006-06-06 20:28 ` Andrew Theurer
  2006-06-06 21:14   ` Keir Fraser
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Theurer @ 2006-06-06 20:28 UTC
  To: Ian Pratt; +Cc: Xen development list, Rolf Neugebauer

Ian Pratt wrote:
>> Could there be situations where we are inadvertently triggering a
>> writable page table, where we should just be doing an
>> update_va_mapping()?
>
> Almost certainly. Singleton (or small batch) updates should not be using
> writeable pagetables, and should use update_va_mapping (or mmu_update if
> the VA isn't known or may not be mapped).
>
> ~18 months ago Rolf wrote and checked in profile code to collect a
> histogram of the number of entries found to be modified when writeable
> pagetables are flushed.
> At the time there was a big spike at '1' which was fixed, but with all
> the various linux version upgrades it likely needs revisiting. 
>
> The profile code also records the EIP that caused the writeable
> pagetables operation, so if you print out the value a few times you'll
> quickly find the culprit.
>
> Thanks,
> Ian
>   
Yes, we definitely have a problem here.  Tons of flushes with 
modified=1, and lots with <=10.  The three benchmarks all seem to hit 
the same areas.  Here is the output from running SDET, with snippets 
from System.map mixed in:

Out of a total of 19601 writable PT updates:

c01522b0 <=1   40    <=10    0       <=50    0       <=100    0      <=512    0
--------
c0151e90 T sys_mprotect
c01524d3 t .text.lock.mprotect


c014ed77 <=1 3418    <=10 4853       <=50 1674       <=100   70      <=512    0
--------
c014e84e T copy_page_range
c014efc6 T free_pgtables


c01522ab <=1 3728    <=10    0       <=50    0       <=100    0      <=512    0
--------
c0151e90 T sys_mprotect
c01524d3 t .text.lock.mprotect


c014b809 <=1 3752    <=10 1654       <=50  302       <=100   10      <=512    3
--------
c014b300 T unmap_vmas
c014b9ba T zap_page_range


c014b80b <=1   32    <=10   30       <=50   30       <=100    1      <=512    0
--------
c014b300 T unmap_vmas
c014b9ba T zap_page_range
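
The EIP-to-function matching done by hand above (finding the 
System.map entry with the highest address not above the EIP) can be 
sketched like this.  Only the System.map lines quoted in this message 
are used, so the real enclosing symbol could differ if the full map 
has other entries in between; load_symbols and resolve are 
illustrative names.

```python
import bisect

# System.map snippets quoted above, in "address type name" form.
SYSTEM_MAP = """\
c014b300 T unmap_vmas
c014b9ba T zap_page_range
c014e84e T copy_page_range
c014efc6 T free_pgtables
c0151e90 T sys_mprotect
c01524d3 t .text.lock.mprotect
"""


def load_symbols(text):
    """Parse System.map text into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        addr, _type, name = line.split(maxsplit=2)
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, eip):
    """Return (name, offset) of the last symbol at or below eip, or None."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, eip) - 1
    if i < 0:
        return None
    addr, name = symbols[i]
    return name, eip - addr


symbols = load_symbols(SYSTEM_MAP)
for eip in (0xC01522B0, 0xC014ED77, 0xC01522AB, 0xC014B809, 0xC014B80B):
    name, offset = resolve(symbols, eip)
    print(f"{eip:08x} -> {name}+{offset:#x}")
```

Note that static functions inlined into their caller show up under the 
caller's symbol, so the name found this way is the enclosing exported 
function rather than necessarily the code that did the write.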


-Andrew


* Re: Why is 'emulate' as good as writable PT's?
  2006-06-06 20:28 ` Andrew Theurer
@ 2006-06-06 21:14   ` Keir Fraser
  2006-06-06 22:02     ` Andrew Theurer
  2006-06-08 16:05     ` Andrew Theurer
  0 siblings, 2 replies; 9+ messages in thread
From: Keir Fraser @ 2006-06-06 21:14 UTC
  To: Andrew Theurer; +Cc: Ian Pratt, Xen development list, Rolf Neugebauer


On 6 Jun 2006, at 21:28, Andrew Theurer wrote:

> Yes, we definitely have a problem here.  Tons of flushes with 
> modified=1, and lots with <=10.  The three benchmarks all seem to hit 
> the same areas.  Here is the output from running SDET, with snippets 
> from System.map mixed in:

Is this PAE? SMP guest?

Do you know much about the SDET benchmark? For example, do you know how 
big the mprotect() calls it makes are likely to be? If vma's are small 
and fairly sparse then the writable pagetable batching won't be a win.
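
The trade-off can be illustrated with a toy cost model.  Everything 
here is an assumption for illustration: the model (one fault plus one 
flush per writable page, versus one fault plus one emulation per 
write) is a simplification, and the cost values are arbitrary units, 
not measurements.

```python
def writable_pt_cost(n_writes, fault_cost, flush_cost):
    """One fault to make the page writable, free writes, then one flush."""
    return fault_cost + flush_cost


def emulate_cost(n_writes, fault_cost, emulate_per_write):
    """Every write faults and is emulated individually."""
    return n_writes * (fault_cost + emulate_per_write)


def break_even(fault_cost, flush_cost, emulate_per_write):
    """Smallest number of writes per page at which batching stops losing."""
    n = 1
    while (writable_pt_cost(n, fault_cost, flush_cost)
           > emulate_cost(n, fault_cost, emulate_per_write)):
        n += 1
    return n


# Hypothetical units: fault = 2, flush = 8, emulation = 1 per write.
print(break_even(fault_cost=2, flush_cost=8, emulate_per_write=1))
```

Under a model of this shape, a small vma that touches only one or two 
entries per page table page never reaches the break-even point, which 
is the situation where batching is not a win.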

  -- Keir


* Re: Why is 'emulate' as good as writable PT's?
  2006-06-06 21:14   ` Keir Fraser
@ 2006-06-06 22:02     ` Andrew Theurer
  2006-06-08 16:05     ` Andrew Theurer
  1 sibling, 0 replies; 9+ messages in thread
From: Andrew Theurer @ 2006-06-06 22:02 UTC
  To: Keir Fraser; +Cc: Ian Pratt, Xen development list, Rolf Neugebauer

Keir Fraser wrote:
>
> On 6 Jun 2006, at 21:28, Andrew Theurer wrote:
>
>> Yes, we definitely have a problem here.  Tons of flushes with 
>> modified=1, and lots with <=10.  The three benchmarks all seem to hit 
>> the same areas.  Here is the output from running SDET, with snippets 
>> from System.map mixed in:
>
> Is this PAE? SMP guest?
>
> Do you know much about the SDET benchmark? For example, do you know 
> how big the mprotect() calls it makes are likely to be? If vma's are 
> small and fairly sparse then the writable pagetable batching won't be 
> a win.
1-way SMP kernel, PAE.  Not sure about the mprotect() calls.  SDET 
basically calls a lot of utilities like ps, gcc, ispell, etc.  Is it 
feasible to "xen-ify" unmap_vmas() and copy_page_range(), such that we 
use explicit hypercalls instead of faulting on the writes?

-Andrew


* Re: Why is 'emulate' as good as writable PT's?
  2006-06-06 21:14   ` Keir Fraser
  2006-06-06 22:02     ` Andrew Theurer
@ 2006-06-08 16:05     ` Andrew Theurer
  1 sibling, 0 replies; 9+ messages in thread
From: Andrew Theurer @ 2006-06-08 16:05 UTC
  To: Keir Fraser; +Cc: Ian Pratt, Xen development list, Rolf Neugebauer

Keir Fraser wrote:
>
> On 6 Jun 2006, at 21:28, Andrew Theurer wrote:
>
>> Yes, we definitely have a problem here.  Tons of flushes with 
>> modified=1, and lots with <=10.  The three benchmarks all seem to hit 
>> the same areas.  Here is the output from running SDET, with snippets 
>> from System.map mixed in:
>
> Is this PAE? SMP guest?
>
> Do you know much about the SDET benchmark? For example, do you know 
> how big the mprotect() calls it makes are likely to be? If vma's are 
> small and fairly sparse then the writable pagetable batching won't be 
> a win.
>
>  -- Keir
>
I was wondering, perhaps we are not just triggering writable pagetables 
when we shouldn't, but maybe we are flushing them back too early.  I 
added some xen perf counters to get an idea of why we are flushing back 
wtpt's (run on SDET again):

                        modified:  0 <=10  <=20  <=30  <=40  <=50
 1 writable pt updates    T=1086   0  612   194   111    49    85
 2 ptwr_flush: called from ptwr_emulated_update because wtpt exists  T=0
 3 ptwr_flush: called from ptwr_do_page_fault because wtpt is already used T=338
 4 ptwr_flush: called from spurious_page_fault  T=0
 5 ptwr_flush: called from fixup_page_fault  T=0
 6 ptwr_flush: called from cleanup_wpt, do_mmuext_op (active) T=467
 7 ptwr_flush: called from cleanup_wpt, do_mmuext_op (inactive) T=0
 8 ptwr_flush: called from cleanup_wpt, update_va_mapping (active) T=280
 9 ptwr_flush: called from cleanup_wpt, update_va_mapping (inactive) T=0
10 ptwr_flush: called from cleanup_wpt, do_mmu_update (active) T=1
11 ptwr_flush: called from cleanup_wpt, do_mmu_update (inactive) T=0

line 2:  I don't think we have a choice here, right?  Not a big deal, as 
it's not happening anyway.

line 3:  I think we can just goto emulate instead of flushing back the 
wtpt here, right?  I've tried this, but no real difference in 
performance.  Could we increase the number of wtpt's we keep track of, 
so we don't have to flush back or emulate?

line 6: We seem to call cleanup_writable_pagetables unconditionally 
here, and if either of the active or inactive pages are used, they get 
flushed back.  Do we always need to do this?

line 8: Also call cleanup_writable_pagetables unconditionally here.  Do 
the wtpt's always need this to happen?  Is it possible the 
update_va_mapping call is for an address space which does not affect the 
wtpt?

line 10: Not seeing many flushes here, so I guess it's not an issue.
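
As a cross-check on the counters above, the share of each flush reason 
can be worked out from the T= values.  This is just arithmetic on the 
numbers already listed; the dictionary labels are shorthand chosen 
here.

```python
# Non-zero flush reasons copied from the counter listing above (SDET run).
flush_reasons = {
    "ptwr_do_page_fault, wtpt already used (line 3)": 338,
    "cleanup_wpt via do_mmuext_op, active (line 6)": 467,
    "cleanup_wpt via update_va_mapping, active (line 8)": 280,
    "cleanup_wpt via do_mmu_update, active (line 10)": 1,
}
total_updates = 1086  # T= value on line 1

# Histogram columns as printed on line 1: upper bound -> count.
modified_hist = {0: 0, 10: 612, 20: 194, 30: 111, 40: 49, 50: 85}

shares = {k: 100.0 * v / total_updates for k, v in flush_reasons.items()}
accounted = sum(flush_reasons.values())
outside_printed_buckets = total_updates - sum(modified_hist.values())

for reason, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{pct:5.1f}%  {reason}")
print("flushes accounted for:", accounted)
print("flushes outside the printed buckets:", outside_printed_buckets)
```

The four non-zero reasons add up to exactly the 1086 total, and the 
two hypercall-driven cleanup paths (lines 6 and 8) together make up 
747 of those flushes.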

Sorry if these questions seem odd.  There's a good chance I am not 
"getting it" :)

Thanks,

-Andrew


* RE: Why is 'emulate' as good as writable PT's?
@ 2006-06-12  9:15 Ian Pratt
  2006-06-13 14:47 ` Andrew Theurer
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Pratt @ 2006-06-12  9:15 UTC
  To: Andrew Theurer, Keir Fraser; +Cc: Xen development list, Rolf Neugebauer

> I was wondering, perhaps we are not just triggering writable pagetables
> when we shouldn't, but maybe we are flushing them back too early.  I
> added some xen perf counters to get an idea of why we are flushing back
> wtpt's (run on SDET again):

Are these numbers taken on a uniprocessor guest (or dom0?)

>                         modified:  0 <=10  <=20  <=30  <=40  <=50
>  1 writable pt updates    T=1086   0  612   194   111    49    85
>  2 ptwr_flush: called from ptwr_emulated_update because wtpt exists T=0
>  3 ptwr_flush: called from ptwr_do_page_fault because wtpt is already used T=338
>  4 ptwr_flush: called from spurious_page_fault  T=0
>  5 ptwr_flush: called from fixup_page_fault  T=0
>  6 ptwr_flush: called from cleanup_wpt, do_mmuext_op (active) T=467
>  7 ptwr_flush: called from cleanup_wpt, do_mmuext_op (inactive) T=0
>  8 ptwr_flush: called from cleanup_wpt, update_va_mapping (active) T=280
>  9 ptwr_flush: called from cleanup_wpt, update_va_mapping (inactive) T=0
> 10 ptwr_flush: called from cleanup_wpt, do_mmu_update (active) T=1
> 11 ptwr_flush: called from cleanup_wpt, do_mmu_update (inactive) T=0


> line 3:  I think we can just goto emulate instead of flushing back the
> wtpt here, right?  I've tried this, but no real difference in
> performance.  Could we increase the number of wtpt's we keep track of,
> so we don't have to flush back or emulate?

This will happen as part of a fork when we move on to the next page in
the PT. It should be harmless unless we're flopping back and forth.
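
The difference between walking forward through page table pages and 
flopping back and forth can be shown with a small simulation.  This is 
a simplified model, not Xen's implementation: count_forced_flushes and 
its slots parameter are hypothetical, and the model only counts 
flushes forced by running out of writable-page slots.

```python
def count_forced_flushes(write_sequence, slots=1):
    """Count flushes forced because no writable-page slot was free.

    write_sequence is an iterable of page table page ids, one per write.
    """
    active = []   # pages currently writable, oldest first
    flushes = 0
    for page in write_sequence:
        if page in active:
            continue              # page already writable: direct write
        if len(active) == slots:
            active.pop(0)         # flush the oldest writable page back
            flushes += 1
        active.append(page)
    return flushes


# Sequential pattern: fill page 0, then page 1, then page 2.
sequential = [0] * 512 + [1] * 512 + [2] * 512

# Flip-flop pattern: alternate between two pages on every write.
flip_flop = [0, 1] * 512

print(count_forced_flushes(sequential))          # one flush per page change
print(count_forced_flushes(flip_flop))           # a flush on nearly every write
print(count_forced_flushes(flip_flop, slots=2))  # both pages stay writable
```

In the sequential case each flush is amortised over a full page of 
writes, while in the flip-flop case with a single slot almost every 
write pays for a flush.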

> line 6: We seem to call cleanup_writable_pagetables unconditionally
> here, and if either of the active or inactive pages are used, they get
> flushed back.  Do we always need to do this?

What's the op? Is it a TLB flush, invlpg, or cr3 load?
 
> line 8: Also call cleanup_writable_pagetables unconditionally here.  Do
> the wtpt's always need this to happen?  Is it possible the
> update_va_mapping call is for an address space which does not affect the
> wtpt?

It's interesting to understand what the interaction is here. I'd like to
know 

> line 10: Not seeing many flushes here, so I guess it's not an issue.
> 
> Sorry if these questions seem odd.  There's a good chance I am not
> "getting it" :)

This is useful work. It's been on our todo list to re-profile this on
newer kernels. Once upon a time we had it quite nicely tuned...

Could you find out all the kernel EIPs that are triggering writeable
pagetables with any frequency and list them for us? It might be good to
convert everything to use update mmuop and then turn on direct writes
just for the fork case, which is where we know we need it.

Thanks,
Ian


* Re: Why is 'emulate' as good as writable PT's?
  2006-06-12  9:15 Ian Pratt
@ 2006-06-13 14:47 ` Andrew Theurer
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Theurer @ 2006-06-13 14:47 UTC
  To: Ian Pratt; +Cc: Xen development list, Rolf Neugebauer

Ian Pratt wrote:
>> I was wondering, perhaps we are not just triggering writable pagetables
>> when we shouldn't, but maybe we are flushing them back too early.  I
>> added some xen perf counters to get an idea of why we are flushing back
>> wtpt's (run on SDET again):
>
> Are these numbers taken on a uniprocessor guest (or dom0?)
Yes.
>
>>                         modified:  0 <=10  <=20  <=30  <=40  <=50
>> 1 writable pt updates    T=1086   0  612   194   111    49    85
>> 2 ptwr_flush: called from ptwr_emulated_update because wtpt exists T=0
>> 3 ptwr_flush: called from ptwr_do_page_fault because wtpt is already used T=338
>> 4 ptwr_flush: called from spurious_page_fault  T=0
>> 5 ptwr_flush: called from fixup_page_fault  T=0
>> 6 ptwr_flush: called from cleanup_wpt, do_mmuext_op (active) T=467
>> 7 ptwr_flush: called from cleanup_wpt, do_mmuext_op (inactive) T=0
>> 8 ptwr_flush: called from cleanup_wpt, update_va_mapping (active) T=280
>> 9 ptwr_flush: called from cleanup_wpt, update_va_mapping (inactive) T=0
>> 10 ptwr_flush: called from cleanup_wpt, do_mmu_update (active) T=1
>> 11 ptwr_flush: called from cleanup_wpt, do_mmu_update (inactive) T=0
>>
>> line 3:  I think we can just goto emulate instead of flushing back the
>> wtpt here, right?  I've tried this, but no real difference in
>> performance.  Could we increase the number of wtpt's we keep track of,
>> so we don't have to flush back or emulate?
>
> This will happen as part of a fork when we move on to the next page in
> the PT. It should be harmless unless we're flopping back and forth.
OK
>
>> line 6: We seem to call cleanup_writable_pagetables unconditionally
>> here, and if either of the active or inactive pages are used, they get
>> flushed back.  Do we always need to do this?
>
> What's the op? Is it a TLB flush, invlpg, or cr3 load?
I don't know, but I'll find out.
>  
>> line 8: Also call cleanup_writable_pagetables unconditionally here.
>> Do the wtpt's always need this to happen?  Is it possible the
>> update_va_mapping call is for an address space which does not affect
>> the wtpt?
>
> It's interesting to understand what the interaction is here. I'd like to
> know 
>
>> line 10: Not seeing many flushes here, so I guess it's not an issue.
>>
>> Sorry if these questions seem odd.  There's a good chance I am not
>> "getting it" :)
>
> This is useful work. It's been on our todo list to re-profile this on
> newer kernels. Once upon a time we had it quite nicely tuned...
>
> Could you find out all the kernel EIPs that are triggering writeable
> pagetables with any frequency and list them for us? It might be good to
> convert everything to use update mmuop and then turn on direct writes
> just for the fork case, which is where we know we need it.
I think I can do that.  I'll just use something similar to the EIP 
logging Xen has for finding out what triggered the wtpt flushes.
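
Once EIPs have been matched to their enclosing functions, the list can 
be ranked by how often each function triggers a writable pagetable.  
The sketch below does that for the SDET histogram posted earlier in 
this thread; the symbol names are the System.map matches from that 
message, and the variable names are illustrative.

```python
from collections import Counter

# (enclosing symbol, counts for <=1, <=10, <=50, <=100, <=512), one row
# per EIP, copied from the SDET histogram earlier in the thread.
rows = [
    ("sys_mprotect", [40, 0, 0, 0, 0]),
    ("copy_page_range", [3418, 4853, 1674, 70, 0]),
    ("sys_mprotect", [3728, 0, 0, 0, 0]),
    ("unmap_vmas", [3752, 1654, 302, 10, 3]),
    ("unmap_vmas", [32, 30, 30, 1, 0]),
]

per_symbol = Counter()      # all flushes attributed to the symbol
single_entry = Counter()    # flushes that fell in the '<=1' bucket

for symbol, buckets in rows:
    per_symbol[symbol] += sum(buckets)
    single_entry[symbol] += buckets[0]

ranking = per_symbol.most_common()

for symbol, total in ranking:
    pct = 100.0 * single_entry[symbol] / total
    print(f"{symbol:16s} {total:6d} flushes, {pct:5.1f}% in the <=1 bucket")
```

Every sys_mprotect flush in that data falls in the '<=1' bucket, which 
makes it the clearest candidate for an explicit hypercall instead of a 
writable pagetable.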

Thanks,

Andrew
