performance counters

* performance counters
@ 2007-03-15 11:27 Jan Beulich
  2007-03-15 11:58 ` Keir Fraser
  2007-03-23  1:56 ` question about machine-to-physic table and phy-to-machine table tgh
  0 siblings, 2 replies; 13+ messages in thread
From: Jan Beulich @ 2007-03-15 11:27 UTC (permalink / raw)
  To: xen-devel

In order to be meaningful and usable together with other measuring methods,
their use in my opinion should impose as little overhead as possible. With that,
I wonder why per-cpu counters use atomic operations.

Thanks, Jan

* Re: performance counters
  2007-03-15 11:27 performance counters Jan Beulich
@ 2007-03-15 11:58 ` Keir Fraser
  2007-03-15 13:57   ` Jan Beulich
  2007-03-23  1:56 ` question about machine-to-physic table and phy-to-machine table tgh
  1 sibling, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2007-03-15 11:58 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 15/3/07 11:27, "Jan Beulich" <jbeulich@novell.com> wrote:

> In order to be meaningful and usable together with other measuring methods,
> their use in my opinion should impose as little overhead as possible. With
> that,
> I wonder why per-cpu counters use atomic operations.

Well, they shouldn't be. Nearly all (apart from the array/histogram ones)
are per-cpu anyway. And even if they weren't, a few lost increments wouldn't
matter (assuming the read and write parts of the increment are each
themselves atomic -- otherwise you could get worse write-conflict problems
like word tearing).
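
A minimal sketch of the two kinds of increment being compared, assuming
C11 <stdatomic.h> rather than Xen's own atomic_t; the names NR_CPUS,
percpu_hits, shared_hits, count_percpu and count_shared are hypothetical
and not taken from the Xen source:

```c
#include <stdatomic.h>

#define NR_CPUS 4  /* hypothetical CPU count for this sketch */

/* Per-CPU slots: each CPU only ever writes its own entry, so a plain
 * (non-atomic) read-modify-write cannot race with another CPU. */
static unsigned int percpu_hits[NR_CPUS];

/* Shared counter: every CPU writes the same word, so an atomic
 * read-modify-write is needed if no increment may be lost. */
static atomic_uint shared_hits;

/* Plain load, add, store on the calling CPU's own slot. */
static inline void count_percpu(unsigned int cpu)
{
    percpu_hits[cpu]++;
}

/* Atomic add on the shared word; relaxed ordering is enough for a
 * statistics counter that orders nothing else. */
static inline void count_shared(void)
{
    atomic_fetch_add_explicit(&shared_hits, 1, memory_order_relaxed);
}
```

If count_shared() were replaced by a plain increment, concurrent callers
could overwrite each other's stores and lose counts, which is the "few
lost increments" case described above.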

 -- Keir

* Re: performance counters
  2007-03-15 11:58 ` Keir Fraser
@ 2007-03-15 13:57   ` Jan Beulich
  2007-03-15 14:01     ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2007-03-15 13:57 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

>>> Keir Fraser <keir@xensource.com> 15.03.07 12:58 >>>
>On 15/3/07 11:27, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>> In order to be meaningful and usable together with other measuring methods,
>> their use in my opinion should impose as little overhead as possible. With
>> that,
>> I wonder why per-cpu counters use atomic operations.
>
>Well, they shouldn't be. Nearly all (apart from the array/histogram ones)
>are per-cpu anyway. And even if they weren't, a few lost increments wouldn't
>matter (assuming the read and write parts of the increment are each
>themselves atomic -- otherwise you could get worse write-conflict problems
>like word tearing).

Hmm, I wouldn't want to do away with the atomicity here altogether. That,
however, would imply adding knowledge about the field name of the atomic_t
to include/xen/perfc.h (and hence imply that all architectures use the same
name here). Would you consider this acceptable?

Jan

* Re: performance counters
  2007-03-15 13:57   ` Jan Beulich
@ 2007-03-15 14:01     ` Keir Fraser
  2007-03-15 14:37       ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2007-03-15 14:01 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel

On 15/3/07 13:57, "Jan Beulich" <jbeulich@novell.com> wrote:

>> Well, they shouldn't be. Nearly all (apart from the array/histogram ones)
>> are per-cpu anyway. And even if they weren't, a few lost increments wouldn't
>> matter (assuming the read and write parts of the increment are each
>> themselves atomic -- otherwise you could get worse write-conflict problems
>> like word tearing).
> 
> Hmm, I wouldn't want to do away with the atomicity here altogether. That,
> however, would imply adding knowledge about the field name of the atomic_t
> to include/xen/perfc.h (and hence imply that all architectures use the same
> name here). Would you consider this acceptable?

Why is that? Every type of perfcounter (per-cpu, per-array, etc) has its own
declaration macro. You could change just the ones you want to be non-atomic
to 'unsigned int'. I'd be very much for getting rid of atomicity altogether,
at least on architectures where we know the resulting incorrectness is not
'too bad' (that includes x86). Some of the shared counters are on hot paths.
Alternatively we could make *all* counters per-cpu, even histogram counter
arrays.
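
A rough sketch of what "all counters per-cpu, even histogram counter
arrays" could look like, assuming one private copy of the array per CPU
and a reader that sums the copies. The names and sizes (NR_CPUS,
NR_BUCKETS, histo, histo_incr, histo_read) are hypothetical and are not
the perfc macros:

```c
#include <stddef.h>

#define NR_CPUS    4  /* hypothetical CPU count */
#define NR_BUCKETS 8  /* hypothetical histogram size */

/* One private copy of the histogram per CPU. */
static unsigned int histo[NR_CPUS][NR_BUCKETS];

/* Hot path: plain increment of the calling CPU's own copy. */
static inline void histo_incr(unsigned int cpu, size_t bucket)
{
    if (bucket >= NR_BUCKETS)
        bucket = NR_BUCKETS - 1;  /* clamp out-of-range samples */
    histo[cpu][bucket]++;
}

/* Slow path: the reader sums one bucket over all CPUs. */
static unsigned long histo_read(size_t bucket)
{
    unsigned long sum = 0;

    if (bucket >= NR_BUCKETS)
        return 0;
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        sum += histo[cpu][bucket];
    return sum;
}
```

The cost moves from the increment to the read: updates touch only
CPU-local memory, while reading a value walks every CPU's copy.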

 -- Keir

* Re: performance counters
  2007-03-15 14:01     ` Keir Fraser
@ 2007-03-15 14:37       ` Jan Beulich
  2007-03-15 15:03         ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2007-03-15 14:37 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

>>> Keir Fraser <keir@xensource.com> 15.03.07 15:01 >>>
>On 15/3/07 13:57, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>>> Well, they shouldn't be. Nearly all (apart from the array/histogram ones)
>>> are per-cpu anyway. And even if they weren't, a few lost increments wouldn't
>>> matter (assuming the read and write parts of the increment are each
>>> themselves atomic -- otherwise you could get worse write-conflict problems
>>> like word tearing).
>> 
>> Hmm, I wouldn't want to do away with the atomicity here altogether. That,
>> however, would imply adding knowledge about the field name of the atomic_t
>> to include/xen/perfc.h (and hence imply that all architectures use the same
>> name here). Would you consider this acceptable?
>
>Why is that? Every type of perfcounter (per-cpu, per-array, etc) has its own
>declaration macro. You could change just the ones you want to be non-atomic
>to 'unsigned int'. I'd be very much for getting rid of atomicity altogether,

Because that would require re-writing xen/common/perfc.c, which currently
assumes all 'struct perfcounter' members are of type atomic_t. Of course one
could also use ugly __typeof__ trickery to obtain the type of the field of atomic_t.

>at least on architectures where we know the resulting incorrectness is not
>'too bad' (that includes x86). Some of the shared counters are on hot paths.
>Alternatively we could make *all* counters per-cpu, even histogram counter
>arrays.

To me that would seem like the better alternative than dropping atomicity.

Jan

* Re: performance counters
  2007-03-15 14:37       ` Jan Beulich
@ 2007-03-15 15:03         ` Keir Fraser
  2007-03-16  8:17           ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2007-03-15 15:03 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel




On 15/3/07 14:37, "Jan Beulich" <jbeulich@novell.com> wrote:

>> at least on architectures where we know the resulting incorrectness is not
>> 'too bad' (that includes x86). Some of the shared counters are on hot paths.
>> Alternatively we could make *all* counters per-cpu, even histogram counter
>> arrays.
> 
> To me that would seem like the better alternative than dropping atomicity.

It has the added benefit of avoiding cacheline bouncing.

It just needs someone to implement this improvement. ;-)

 -- Keir

* Re: performance counters
  2007-03-15 15:03         ` Keir Fraser
@ 2007-03-16  8:17           ` Jan Beulich
  2007-03-16  8:21             ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2007-03-16  8:17 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

>>> Keir Fraser <keir@xensource.com> 15.03.07 16:03 >>>
>On 15/3/07 14:37, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>>> at least on architectures where we know the resulting incorrectness is not
>>> 'too bad' (that includes x86). Some of the shared counters are on hot paths.
>>> Alternatively we could make *all* counters per-cpu, even histogram counter
>>> arrays.
>> 
>> To me that would seem like the better alternative than dropping atomicity.
>
>It has the added benefit of avoiding cacheline bouncing.
>
>It just needs someone to implement this improvement. ;-)

I'll take care of this, but it may take me a few days to get to it. But I
also think there's no need for rush here.

Jan

* Re: performance counters
  2007-03-16  8:17           ` Jan Beulich
@ 2007-03-16  8:21             ` Keir Fraser
  0 siblings, 0 replies; 13+ messages in thread
From: Keir Fraser @ 2007-03-16  8:21 UTC (permalink / raw)
  To: Jan Beulich, Keir Fraser; +Cc: xen-devel




On 16/3/07 08:17, "Jan Beulich" <jbeulich@novell.com> wrote:

>> It has the added benefit of avoiding cacheline bouncing.
>> 
>> It just needs someone to implement this improvement. ;-)
> 
> I'll take care of this, but it may take me a few days to get to it. But I
> also think there's no need for rush here.

Indeed. The counters aren't even compiled by default. It may be, however,
that per-cpu non-atomic counters are cheap enough to enable all the time.
We'll have to see.

 -- Keir

* question about machine-to-physic table and phy-to-machine table
  2007-03-15 11:27 performance counters Jan Beulich
  2007-03-15 11:58 ` Keir Fraser
@ 2007-03-23  1:56 ` tgh
  2007-03-23 11:44   ` Daniel Stodden
  2007-03-27  4:14   ` Mark Williamson
  1 sibling, 2 replies; 13+ messages in thread
From: tgh @ 2007-03-23  1:56 UTC (permalink / raw)
  To: xen-devel

Hi,

I read the code; there is a machine-to-physical table and a
physical-to-machine table. There are machine addresses for hardware
addresses, physical addresses for the guest OS's view of the hardware,
and virtual addresses. Is that right?

The phys-to-machine table is a mapping from the guest OS's view of the
hardware to the real hardware. Is that right?
I am confused about the meaning and function of the machine-to-physical
mapping.

Could you help me?
Thanks in advance

* Re: question about machine-to-physic table and phy-to-machine table
  2007-03-23  1:56 ` question about machine-to-physic table and phy-to-machine table tgh
@ 2007-03-23 11:44   ` Daniel Stodden
  2007-03-26  1:39     ` tgh
  2007-03-27  4:14   ` Mark Williamson
  1 sibling, 1 reply; 13+ messages in thread
From: Daniel Stodden @ 2007-03-23 11:44 UTC (permalink / raw)
  To: tgh; +Cc: xen-devel

On Fri, 2007-03-23 at 09:56 +0800, tgh wrote:
> Hi,
>
> I read the code; there is a machine-to-physical table and a
> physical-to-machine table. There are machine addresses for hardware
> addresses, physical addresses for the guest OS's view of the hardware,
> and virtual addresses. Is that right?
>
> The phys-to-machine table is a mapping from the guest OS's view of the
> hardware to the real hardware. Is that right?

right. a paravirtual guest os will recognize its existence, and perform
the table lookup itself. shadowed memory removes that need. in that
case, xen would perform the translation. that is where 'gmfn' comes into
play. so a 'gmfn' is the pfn of a guest os which doesn't account for its
page table entries being bogus ones.

it might become clearer from the following macro:

/* translated (shadowed) guests see their own frame numbers; others see mfns */
#define mfn_to_gmfn(_d, mfn)                            \
    ( (shadow_mode_translate(_d))                       \
      ? get_gpfn_from_mfn(mfn)                          \
      : (mfn) )

you'll see those distinctions quite regularly on the xen side.

> I am confused about the meaning and function of the machine-to-physical
> mapping.

it *is* confusing, admittedly. in my understanding, one reason for
'm2p'/'p2m' being used is that guest operating systems, most prominently
linux, have always been using 'pfn' for 'page frame number' and the like
when referring to 'physical' memory. now you need some kind of
distinction in the paravirtual guest case, because those oses will deal
with both.

that host memory becoming a non-contiguous, non-physical one clearly
doesn't justify substituting the names all across the kernel codebase.
equally, you could not name it virtual or similar in the vmm, because
the term 'virtual' has obviously been allocated elsewhere.

so host memory became 'machine' memory. in a different universe, it
might have rather been the actual 'physical' one. or 'host' memory.
virtual machine memory got a 'p' like in both 'pseudo-physical' and/or
'pfn' and i suppose turned for a significant number of people into
'physical' at some point. which is largely misleading.

regards,
daniel

-- 
Daniel Stodden
LRR     -      Lehrstuhl für Rechnertechnik und Rechnerorganisation
Institut für Informatik der TU München             D-85748 Garching
http://www.lrr.in.tum.de/~stodden         mailto:stodden@cs.tum.edu
PGP Fingerprint: F5A4 1575 4C56 E26A 0B33  3D80 457E 82AE B0D8 735B

* Re: question about machine-to-physic table and phy-to-machine table
  2007-03-23 11:44   ` Daniel Stodden
@ 2007-03-26  1:39     ` tgh
  0 siblings, 0 replies; 13+ messages in thread
From: tgh @ 2007-03-26  1:39 UTC (permalink / raw)
  To: Daniel Stodden; +Cc: xen-devel

Thank you for your reply

>> I am confused about the meaning and function of the machine-to-physical
>> mapping.
>
> it *is* confusing, admittedly. in my understanding, one reason for
> 'm2p'/'p2m' being used is that guest operating systems, most prominently
> linux, have always been using 'pfn' for 'page frame number' and the like
> when referring to 'physical' memory. now you need some kind of
> distinction in the paravirtual guest case, because those oses will deal
> with both.
>   
In the paravirt case, the guest OS maintains its own mfns, which needs
the m2p and p2m tables. Is that right?
I am confused about how the guest OS maintains its virt-to-phys and
phys-to-machine mappings. In Linux there is only a v2p mapping, so how
does the guest OS maintain its p2m mapping? And when a virtual address
is put into the MMU, does the CPU hardware convert it into a machine
address or into the guest's physical address?

I am confused about it.

Could you help me?
Thanks in advance
> that host memory becoming a non-contiguous, non-physical one clearly
> doesn't justify substituting the names all across the kernel codebase.
> equally, you could not name it virtual or similar in the vmm, because
> the term 'virtual' has obviously been allocated elsewhere.
>
> so host memory became 'machine' memory. in a different universe, it
> might have rather been the actual 'physical' one. or 'host' memory.
> virtual machine memory got a 'p' like in both 'pseudo-physical' and/or
> 'pfn' and i suppose turned for a significant number of people into
> 'physical' at some point. which is largely misleading.
>
> regards,
> daniel
>
>   

* Re: question about machine-to-physic table and phy-to-machine table
  2007-03-23  1:56 ` question about machine-to-physic table and phy-to-machine table tgh
  2007-03-23 11:44   ` Daniel Stodden
@ 2007-03-27  4:14   ` Mark Williamson
  2007-03-27  6:14     ` tgh
  1 sibling, 1 reply; 13+ messages in thread
From: Mark Williamson @ 2007-03-27  4:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Daniel Stodden, tgh

> I read the code; there is a machine-to-physical table and a
> physical-to-machine table. There are machine addresses for hardware
> addresses, physical addresses for the guest OS's view of the hardware,
> and virtual addresses. Is that right?
>
> The phys-to-machine table is a mapping from the guest OS's view of the
> hardware to the real hardware. Is that right?
> I am confused about the meaning and function of the machine-to-physical
> mapping.

* Machine addresses represent real RAM in the host.  The memory a guest owns 
will certainly not start at 0 and will not necessarily be contiguous - it 
might be in a number of chunks with big gaps between.

* (pseudo)physical addresses represent the memory the guest owns.  This 
address space starts at 0 and is contiguous.

* Virtual addresses are used by software running in the guest, and by the 
guest kernel.  They're translated by the host CPU into machine addresses so 
that it can access the correct RAM.

Guests use physical addresses as an abstraction: most operating system memory 
management code assumes that the RAM owned by the OS starts at 0 and is 
contiguous.  Because this is not the case for Machine addresses under Xen, 
most of the guest's code is "tricked" by giving it pseudophysical addresses 
that look like it expects memory to look.

The P2M and M2P tables record the relationship between pseudophysical page 
frames (which the core OS code uses) and machine page frames (which the host 
really uses).  The Xen "architecture" code within the guest OS uses these 
tables to manage the translation between pseudophysical and machine page 
frames so that the guest's page tables can be handled correctly.  For 
paravirtualised guests, page tables must contain machine addresses - these 
must be translated from the pseudophysical addresses used by core OS code.
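
As a rough illustration of the two tables described above, here is a
toy model with fixed-size arrays. It is a sketch under assumptions, not
Xen's or Linux's actual code: the sizes, INVALID_PFN, tables_init,
map_page and make_pte are hypothetical, and pfn_to_mfn/mfn_to_pfn are
named only after the usual convention.

```c
#include <stddef.h>

#define NR_GUEST_PAGES   4       /* hypothetical guest size, in pages */
#define NR_MACHINE_PAGES 16      /* hypothetical host size, in pages */
#define PAGE_SHIFT       12      /* 4 KiB pages */
#define INVALID_PFN      (~0UL)  /* "no mapping" marker in this sketch */

/* P2M: indexed by pseudophysical frame number (contiguous from 0). */
static unsigned long p2m[NR_GUEST_PAGES];
/* M2P: indexed by machine frame number (scattered across the host). */
static unsigned long m2p[NR_MACHINE_PAGES];

static void tables_init(void)
{
    for (size_t i = 0; i < NR_GUEST_PAGES; i++)
        p2m[i] = INVALID_PFN;
    for (size_t i = 0; i < NR_MACHINE_PAGES; i++)
        m2p[i] = INVALID_PFN;
}

/* Record that guest frame pfn is backed by machine frame mfn. */
static int map_page(unsigned long pfn, unsigned long mfn)
{
    if (pfn >= NR_GUEST_PAGES || mfn >= NR_MACHINE_PAGES)
        return -1;
    p2m[pfn] = mfn;
    m2p[mfn] = pfn;
    return 0;
}

static unsigned long pfn_to_mfn(unsigned long pfn)
{
    return pfn < NR_GUEST_PAGES ? p2m[pfn] : INVALID_PFN;
}

static unsigned long mfn_to_pfn(unsigned long mfn)
{
    return mfn < NR_MACHINE_PAGES ? m2p[mfn] : INVALID_PFN;
}

/* A paravirtualised guest's page-table entry holds the machine frame,
 * so the pseudophysical frame is translated before it is written.
 * Returns 0 (a not-present entry) if the frame has no mapping. */
static unsigned long make_pte(unsigned long pfn, unsigned long flags)
{
    unsigned long mfn = pfn_to_mfn(pfn);

    return mfn == INVALID_PFN ? 0 : (mfn << PAGE_SHIFT) | flags;
}
```

The P2M direction is what the guest uses when building page tables; the
M2P direction is what it uses when it reads a page-table entry back and
needs the pseudophysical frame again.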

Hope that helps clarify how this all fits together, tgh.

Cheers,
Mark

-- 
Dave: Just a question. What use is a unicyle with no seat?  And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!

* Re: question about machine-to-physic table and phy-to-machine table
  2007-03-27  4:14   ` Mark Williamson
@ 2007-03-27  6:14     ` tgh
  0 siblings, 0 replies; 13+ messages in thread
From: tgh @ 2007-03-27  6:14 UTC (permalink / raw)
  To: Mark Williamson; +Cc: xen-devel, Daniel Stodden

Thank you for your detailed reply.
Your explanation is really helpful.

Also, are the p2m and m2p entries all lazily allocated, that is, filled
in when the guest OS handles a page fault for using (say, writing) a
page which is virtually mapped but has no machine page allocated yet?
Is that right?



Mark Williamson wrote:
>> I read the code; there is a machine-to-physical table and a
>> physical-to-machine table. There are machine addresses for hardware
>> addresses, physical addresses for the guest OS's view of the hardware,
>> and virtual addresses. Is that right?
>>
>> The phys-to-machine table is a mapping from the guest OS's view of the
>> hardware to the real hardware. Is that right?
>> I am confused about the meaning and function of the machine-to-physical
>> mapping.
>>     
>
> * Machine addresses represent real RAM in the host.  The memory a guest owns 
> will certainly not start at 0 and will not necessarily be contiguous - it 
> might be in a number of chunks with big gaps between.
>
> * (pseudo)physical addresses represent the memory the guest owns.  This 
> address space starts at 0 and is contiguous.
>
> * Virtual addresses are used by software running in the guest, and by the 
> guest kernel.  They're translated by the host CPU into machine addresses so 
> that it can access the correct RAM.
>
> Guests use physical addresses as an abstraction: most operating system memory 
> management code assumes that the RAM owned by the OS starts at 0 and is 
> contiguous.  Because this is not the case for Machine addresses under Xen, 
> most of the guest's code is "tricked" by giving it pseudophysical addresses 
> that look like it expects memory to look.
>
> The P2M and M2P tables record the relationship between pseudophysical page 
> frames (which the core OS code uses) and machine page frames (which the host 
> really uses).  The Xen "architecture" code within the guest OS uses these 
> tables to manage the translation between pseudophysical and machine page 
> frames so that the guest's page tables can be handled correctly.  For 
> paravirtualised guests, page tables must contain machine addresses - these 
> must be translated from the pseudophysical addresses used by core OS code.
>
> Hope that helps clarify how this all fits together, tgh.
>
> Cheers,
> Mark
>
>   
