All of lore.kernel.org
 help / color / mirror / Atom feed
* xenpm: opensolaris HVM domU stops getting timer interrupts when C3 used
@ 2010-01-05 20:55 Frank Van Der Linden
  2010-01-05 21:02 ` Keir Fraser
  2010-01-06  1:55 ` Wei, Gang
  0 siblings, 2 replies; 5+ messages in thread
From: Frank Van Der Linden @ 2010-01-05 20:55 UTC (permalink / raw)
  To: xen-devel

I've been looking at Xen (3.4.2) power management, and have been trying 
out different settings. One machine I tried was a Toshiba M30 laptop, 
which has an Intel Core 2 Duo CPU:

CPU0: Intel(R) Core(TM)2 Duo CPU     P8400  @ 2.26GHz stepping 06
CPU1: Intel(R) Core(TM)2 Duo CPU     P8400  @ 2.26GHz stepping 06

When dom0 initialized power management, the messages printed are as follows:

Set CPU acpi_id(0) cpuid(0) Px State info:
  _PCT: descriptor=0, length=0, space_id=127, bit_width=0, bit_offset=0, 
reserved=0, address=0
  _PCT: descriptor=0, length=0, space_id=127, bit_width=0, bit_offset=0, 
reserved=0, address=0
  _PSS: state_count=4
  State0: 2261MHz 35000mW 10us 10us 0x927 0x927
  State1: 2260MHz 35000mW 10us 10us 0x4822 0x4822
  State2: 1600MHz 16000mW 10us 10us 0x617 0x617
  State3: 800MHz 14300mW 10us 10us 0x8611 0x8611
  _PSD: num_entries=5 rev=0 domain=0 coord_type=252 num_processors=1
  _PPC: 0
xen_pminfo: @acpi_cpufreq_cpu_init,HARDWARE addr space
CPU 0 initialization completed
Set CPU acpi_id(1) cpuid(1) Px State info:
  _PCT: descriptor=0, length=0, space_id=127, bit_width=0, bit_offset=0, 
reserved=0, address=0
  _PCT: descriptor=0, length=0, space_id=127, bit_width=0, bit_offset=0, 
reserved=0, address=0
  _PSS: state_count=4
  State0: 2261MHz 35000mW 10us 10us 0x927 0x927
  State1: 2260MHz 35000mW 10us 10us 0x4822 0x4822
  State2: 1600MHz 16000mW 10us 10us 0x617 0x617
  State3: 800MHz 14300mW 10us 10us 0x8611 0x8611
  _PSD: num_entries=5 rev=0 domain=1 coord_type=252 num_processors=1
  _PPC: 0
xen_pminfo: @acpi_cpufreq_cpu_init,HARDWARE addr space
CPU 1 initialization completed
cpuid.MWAIT[.eax=40, .ebx=40, .ecx=3, .edx=3122220]
Monitor-Mwait will be used to enter C-1 state
cpuid.MWAIT[.eax=40, .ebx=40, .ecx=3, .edx=3122220]
Monitor-Mwait will be used to enter C-1 state

It seems to mostly work ok, except that an OpenSolaris HVM domU slows 
down to a crawl. When I examined it, I noticed that it wasn't getting 
any timer interrupts. Or, almost none; occasionally the interrupt count 
for the timer jumps, but mostly it stays frozen.

If I limit the maximum C state to C2, things seem ok, but when C3 is 
used, the problems pops up pretty quickly, though not immediately: if I 
initialize the maximum C state to 1, but then change it to C3 when the 
domU is already running, it won't immediately slow down, it might take 
about a minute, during which time the xenpm utility already is showing 
that C3 is being used. So a switch to C3 isn't guaranteed to cause the 
problem, but it eventually will.

I tried other HVM domains, such as Windows, and they don't seem to have 
the problem. But, they might be more "tickless" than OpenSolaris is.

I haven't looked at the PM code much yet, and I haven't tried -unstable, 
but there don't seem to be any changes in -unstable that would fix this.

Is this a known problem? Any hints for diagnosing the problem?

- Frank

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: xenpm: opensolaris HVM domU stops getting timer interrupts when C3 used
  2010-01-05 20:55 xenpm: opensolaris HVM domU stops getting timer interrupts when C3 used Frank Van Der Linden
@ 2010-01-05 21:02 ` Keir Fraser
  2010-01-06  1:55 ` Wei, Gang
  1 sibling, 0 replies; 5+ messages in thread
From: Keir Fraser @ 2010-01-05 21:02 UTC (permalink / raw)
  To: Frank Van Der Linden, xen-devel@lists.xensource.com

On 05/01/2010 20:55, "Frank Van Der Linden" <Frank.Vanderlinden@Sun.COM>
wrote:

> I tried other HVM domains, such as Windows, and they don't seem to have
> the problem. But, they might be more "tickless" than OpenSolaris is.
> 
> I haven't looked at the PM code much yet, and I haven't tried -unstable,
> but there don't seem to be any changes in -unstable that would fix this.
> 
> Is this a known problem? Any hints for diagnosing the problem?

Perhaps CPUs are not being woken from C3 in a timely manner? Could check how
late timers are fired in timer_softirq_action(), perhaps printing largest
missed deadline every few seconds or soemthing like that.

 -- Keir

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: xenpm: opensolaris HVM domU stops getting timer interrupts when C3 used
  2010-01-05 20:55 xenpm: opensolaris HVM domU stops getting timer interrupts when C3 used Frank Van Der Linden
  2010-01-05 21:02 ` Keir Fraser
@ 2010-01-06  1:55 ` Wei, Gang
  2010-01-06  2:29   ` Frank Van Der Linden
  1 sibling, 1 reply; 5+ messages in thread
From: Wei, Gang @ 2010-01-06  1:55 UTC (permalink / raw)
  To: Frank Van Der Linden, xen-devel@lists.xensource.com

Frank Van Der Linden wrote:
> I haven't looked at the PM code much yet, and I haven't tried
> -unstable, but there don't seem to be any changes in -unstable that
> would fix this. 
> 
> Is this a known problem? Any hints for diagnosing the problem?
> 
> - Frank

To make sure whether it is problem of the xen time sub-system, can you try some "while true; do date; sleep 1; done" loop from dom0 directly, w/o creating HVM domain?

BTW, could you also do "xm debug c; xm dmesg > xmdmesg.log" in dom0 only case and attach the entire xmdmesg.log? More platform related information can be got from it.

Jimmy

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: xenpm: opensolaris HVM domU stops getting timer interrupts when C3 used
  2010-01-06  1:55 ` Wei, Gang
@ 2010-01-06  2:29   ` Frank Van Der Linden
  2010-01-06  4:01     ` Wei, Gang
  0 siblings, 1 reply; 5+ messages in thread
From: Frank Van Der Linden @ 2010-01-06  2:29 UTC (permalink / raw)
  To: Wei, Gang; +Cc: xen-devel@lists.xensource.com

Wei, Gang wrote:
> Frank Van Der Linden wrote:
>   
>> I haven't looked at the PM code much yet, and I haven't tried
>> -unstable, but there don't seem to be any changes in -unstable that
>> would fix this. 
>>
>> Is this a known problem? Any hints for diagnosing the problem?
>>
>> - Frank
>>     
>
> To make sure whether it is problem of the xen time sub-system, can you try some "while true; do date; sleep 1; done" loop from dom0 directly, w/o creating HVM domain?
>
> BTW, could you also do "xm debug c; xm dmesg > xmdmesg.log" in dom0 only case and attach the entire xmdmesg.log? More platform related information can be got from it.
>
> Jimmy
>   
I tested dom0, and time on dom0 is working just fine. This is the 
Solaris dom0, but that shouldn't matter in this case. Here is the xm 
dmesg output (plus the c debug key output), with dom0 having been up for 
a while, but no domUs had ever been run:

(xVM)  xVM version 3.4.2-xvm-debug
(xVM) Command line: console=com1 com1=115200,8n1 dom0_mem=1536M
(xVM) Video information:
(xVM)  VGA is text mode 80x25, font 8x16
(xVM)  VBE/DDC methods: none; EDID transfer time: 0 seconds
(xVM)  EDID info not retrieved because no DDC retrieval method detected
(xVM) Disc information:
(xVM)  Found 1 MBR signatures
(xVM)  Found 1 EDD information structures
(xVM) Xen-e820 RAM map:
(xVM)  0000000000000000 - 000000000009bc00 (usable)
(xVM)  000000000009bc00 - 00000000000a0000 (reserved)
(xVM)  00000000000e8000 - 00000000000eee00 (reserved)
(xVM)  00000000000eee00 - 00000000000ef000 (ACPI NVS)
(xVM)  00000000000ef000 - 0000000000100000 (reserved)
(xVM)  0000000000100000 - 00000000bfba0000 (usable)
(xVM)  00000000bfba0000 - 00000000c0000000 (reserved)
(xVM)  00000000fec00000 - 00000000fec28000 (reserved)
(xVM)  00000000fed00000 - 00000000fed00500 (reserved)
(xVM)  00000000fed10000 - 00000000fed1a000 (reserved)
(xVM)  00000000fed1c000 - 00000000fed94000 (reserved)
(xVM)  00000000fee00000 - 00000000fee01000 (reserved)
(xVM)  00000000ffa00000 - 00000000ffc00000 (reserved)
(xVM)  00000000ffd00000 - 0000000100000000 (reserved)
(xVM)  0000000100000000 - 000000013c000000 (usable)
(xVM) System RAM: 4027MB (4123884kB)
(xVM) ACPI: RSDP 000F00B0, 0014 (r0 TOSHIB)
(xVM) ACPI: RSDT BFBA0000, 0058 (r1 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: FACP BFBA0084, 0084 (r2 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: DSDT BFBA0108, 82BF (r2 TOSHIB A0066    20080603 MSFT  3000000)
(xVM) ACPI: FACS 000EEE00, 0040
(xVM) ACPI: SSDT BFBA83C7, 0506 (r2 TOSHIB A0066    20070720 MSFT  3000000)
(xVM) ACPI: BOOT BFBA005C, 0028 (r1 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: APIC BFBA8EA1, 0068 (r1 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: MCFG BFBA8F09, 003C (r1 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: HPET BFBA8F45, 0038 (r1 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: TCPA BFBA924A, 0032 (r2 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: SLIC BFBA927C, 0176 (r1 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: ASF! BFBA8FB1, 0075 (r16 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: DMAR BFBA9026, 00C8 (r1 TOSHIB A0066    20080603 TASM  4010000)
(xVM) ACPI: SSDT BFBA9E91, 01A3 (r2 TOSHIB A0066    20080317 MSFT  3000000)
(xVM) ACPI: SSDT BFBA93F2, 0076 (r2 TOSHIB A0066    20080310 MSFT  3000000)
(xVM) ACPI: SSDT BFBA9468, 0A29 (r2 TOSHIB A0066    20080411 MSFT  3000000)
(xVM) NUMA turned off
(xVM) Faking a node at 0000000000000000-000000013c000000
(xVM) Domain heap initialised
(xVM) DMI 2.5 present.
(xVM) Using APIC driver default
(xVM) ACPI: PM-Timer IO Port: 0xd808
(xVM) ACPI: ACPI SLEEP INFO: pm1x_cnt[d804,0], pm1x_evt[d800,0]
(xVM) ACPI:                  wakeup_vec[eee0c], vec_size[20]
(xVM) ACPI: Local APIC address 0xfee00000
(xVM) ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
(xVM) Processor #0 7:7 APIC version 20
(xVM) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
(xVM) Processor #1 7:7 APIC version 20
(xVM) ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
(xVM) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(xVM) ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
(xVM) IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
(xVM) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(xVM) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(xVM) ACPI: IRQ0 used by override.
(xVM) ACPI: IRQ2 used by override.
(xVM) ACPI: IRQ9 used by override.
(xVM) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(xVM) ACPI: HPET id: 0x8086a301 base: 0xfed00000
(xVM) Using ACPI (MADT) for SMP configuration information
(xVM) Using scheduler: SMP Credit Scheduler (credit)
(xVM) Initializing CPU#0
(xVM) Detected 2261.053 MHz processor.
(xVM) CPU: L1 I cache: 32K, L1 D cache: 32K
(xVM) CPU: L2 cache: 3072K
(xVM) CPU: Physical Processor ID: 0
(xVM) CPU: Processor Core ID: 0
(xVM) VMX: Supported advanced features:
(xVM)  - APIC MMIO access virtualisation
(xVM)  - APIC TPR shadow
(xVM)  - Virtual NMI
(xVM)  - MSR direct-access bitmap
(xVM) HVM: VMX enabled
(xVM) Intel machine check reporting enabled on CPU#0.
(xVM) mce_init: init bank1
(xVM) mce_init: init bank2
(xVM) mce_init: init bank3
(xVM) mce_init: init bank4
(xVM) mce_init: init bank5
(xVM) CPU0: Thermal monitoring handled by SMI
(xVM) CMCI: CPU0 has no CMCI support
(xVM) CPU0: Intel(R) Core(TM)2 Duo CPU     P8400  @ 2.26GHz stepping 06
(xVM) Booting processor 1/1 eip 8c000
(xVM) Initializing CPU#1
(xVM) CPU: L1 I cache: 32K, L1 D cache: 32K
(xVM) CPU: L2 cache: 3072K
(xVM) CPU: Physical Processor ID: 0
(xVM) CPU: Processor Core ID: 1
(xVM) Intel machine check reporting enabled on CPU#1.
(xVM) mce_init: init bank1
(xVM) mce_init: init bank2
(xVM) mce_init: init bank4
(xVM) mce_init: init bank5
(xVM) CPU1: Thermal monitoring enabled (TM2)
(xVM) CMCI: CPU1 has no CMCI support
(xVM) CPU1: Intel(R) Core(TM)2 Duo CPU     P8400  @ 2.26GHz stepping 06
(xVM) Total of 2 processors activated.
(xVM) ENABLING IO-APIC IRQs
(xVM)  -> Using new ACK method
(xVM) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(xVM) checking TSC synchronization across 2 CPUs: passed.
(xVM) Platform timer is 14.318MHz HPET
(xVM) Brought up 2 CPUs
(xVM) microcode.c:73:d32767 microcode: CPU1 resumed
(xVM) I/O virtualisation disabled
(xVM) HPET: 4 timers in total, 0 timers will be used for broadcast
(xVM) ACPI sleep modes: S3
(xVM) mcheck_poll: Machine check polling timer started.
(xVM) *** LOADING DOMAIN 0 ***
(xVM) elf_parse_binary: phdr: paddr=0x40800000 memsz=0x273e8
(xVM) elf_parse_binary: phdr: paddr=0x40000000 memsz=0x1a916c
(xVM) elf_parse_binary: phdr: paddr=0x40400000 memsz=0xd97f0
(xVM) elf_parse_binary: memory: 0x40000000 -> 0x408273e8
(xVM) elf_xen_parse_note: LOADER = "generic"
(xVM) elf_xen_parse_note: XEN_VERSION = "xen-3.0"
(xVM) elf_xen_parse_note: GUEST_OS = "Solaris"
(xVM) elf_xen_parse_note: VIRT_BASE = 0x40000000
(xVM) elf_xen_parse_note: PADDR_OFFSET = 0x40000000
(xVM) elf_xen_addr_calc_check: addresses:
(xVM)     virt_base        = 0x40000000
(xVM)     elf_paddr_offset = 0x40000000
(xVM)     virt_offset      = 0x0
(xVM)     virt_kstart      = 0x40000000
(xVM)     virt_kend        = 0x408273e8
(xVM)     virt_entry       = 0x40800000
(xVM)     p2m_base         = 0xffffffffffffffff
(xVM)  Xen  kernel: 64-bit, lsb, compat32
(xVM)  Dom0 kernel: 64-bit, lsb, paddr 0x40000000 -> 0x408273e8
(xVM) PHYSICAL MEMORY ARRANGEMENT:
(xVM)  Dom0 alloc.:   0000000134000000->0000000138000000 (376832 pages 
to be allocated)
(xVM) VIRTUAL MEMORY ARRANGEMENT:
(xVM)  Loaded kernel: 0000000040000000->00000000408273e8
(xVM)  Init. ramdisk: 0000000040828000->0000000043548000
(xVM)  Phys-Mach map: 0000000043548000->0000000043848000
(xVM)  Start info:    0000000043848000->00000000438484b4
(xVM)  Page tables:   0000000043849000->000000004386a000
(xVM)  Boot stack:    000000004386a000->000000004386b000
(xVM)  TOTAL:         0000000040000000->0000000043c00000
(xVM)  ENTRY ADDRESS: 0000000040800000
(xVM) Dom0 has maximum 2 VCPUs
(xVM) elf_load_binary: phdr 1 at 0x0000000040800000 -> 0x00000000408273e8
(xVM) elf_load_binary: phdr 2 at 0x0000000040000000 -> 0x00000000401a916c
(xVM) elf_load_binary: phdr 3 at 0x0000000040400000 -> 0x00000000404136ff
(xVM) Scrubbing Free RAM: ........................done.
(xVM) Xen trace buffers: disabled
(xVM) Std. Loglevel: All
(xVM) Guest Loglevel: All
(xVM) *** Serial input -> DOM0 (type \047CTRL-a\047 three times to 
switch input to Xen)
(xVM) Freed 156kB init memory.
(xVM) grant_table.c:808:d0 Expanding dom (0) grant table from (4) to 
(32) frames.
(xVM) PCI add device 00:1f.2
(xVM) Set CPU acpi_id(0) cpuid(0) Px State info:
(xVM)   _PCT: descriptor=0, length=0, space_id=127, bit_width=0, 
bit_offset=0, reserved=0, address=0
(xVM)   _PCT: descriptor=0, length=0, space_id=127, bit_width=0, 
bit_offset=0, reserved=0, address=0
(xVM)   _PSS: state_count=4
(xVM)   State0: 2261MHz 35000mW 10us 10us 0x927 0x927
(xVM)   State1: 2260MHz 35000mW 10us 10us 0x4822 0x4822
(xVM)   State2: 1600MHz 16000mW 10us 10us 0x617 0x617
(xVM)   State3: 800MHz 14300mW 10us 10us 0x8611 0x8611
(xVM)   _PSD: num_entries=5 rev=0 domain=0 coord_type=252 num_processors=1
(xVM)   _PPC: 0
(xVM) xen_pminfo: @acpi_cpufreq_cpu_init,HARDWARE addr space
(xVM) CPU 0 initialization completed
(xVM) Set CPU acpi_id(1) cpuid(1) Px State info:
(xVM)   _PCT: descriptor=0, length=0, space_id=127, bit_width=0, 
bit_offset=0, reserved=0, address=0
(xVM)   _PCT: descriptor=0, length=0, space_id=127, bit_width=0, 
bit_offset=0, reserved=0, address=0
(xVM)   _PSS: state_count=4
(xVM)   State0: 2261MHz 35000mW 10us 10us 0x927 0x927
(xVM)   State1: 2260MHz 35000mW 10us 10us 0x4822 0x4822
(xVM)   State2: 1600MHz 16000mW 10us 10us 0x617 0x617
(xVM)   State3: 800MHz 14300mW 10us 10us 0x8611 0x8611
(xVM)   _PSD: num_entries=5 rev=0 domain=1 coord_type=252 num_processors=1
(xVM)   _PPC: 0
(xVM) xen_pminfo: @acpi_cpufreq_cpu_init,HARDWARE addr space
(xVM) CPU 1 initialization completed
(xVM) cpuid.MWAIT[.eax=40, .ebx=40, .ecx=3, .edx=3122220]
(xVM) Monitor-Mwait will be used to enter C-1 state
(xVM) cpuid.MWAIT[.eax=40, .ebx=40, .ecx=3, .edx=3122220]
(xVM) Monitor-Mwait will be used to enter C-1 state
(xVM) PCI add device 00:19.0
(xVM) PCI add device 00:1b.0
(xVM) Xen trace buffers: initialized
(xVM) *** Serial input -> Xen (type \047CTRL-a\047 three times to switch 
input to DOM0)
(xVM) ==cpu0==
(xVM) active state:             C3
(xVM) max_cstate:               C7
(xVM) states:
(xVM)     C1:   type[C1] latency[001] usage[00005351] duration[16709386]
(xVM)     C2:   type[C2] latency[001] usage[00072721] duration[48331319757]
(xVM)    *C3:   type[C3] latency[057] usage[05261685] 
duration[32698354952269]
(xVM)     C0:   usage[05339757] duration[337842300997]
(xVM) ==cpu1==
(xVM) active state:             C3
(xVM) max_cstate:               C7
(xVM) states:
(xVM)     C1:   type[C1] latency[001] usage[00000050] duration[1916444]
(xVM)     C2:   type[C2] latency[001] usage[00039050] duration[77772952140]
(xVM)    *C3:   type[C3] latency[057] usage[05592302] 
duration[32757362666651]
(xVM)     C0:   usage[05631402] duration[249407777117]

- Frank

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: xenpm: opensolaris HVM domU stops getting timer interrupts when C3 used
  2010-01-06  2:29   ` Frank Van Der Linden
@ 2010-01-06  4:01     ` Wei, Gang
  0 siblings, 0 replies; 5+ messages in thread
From: Wei, Gang @ 2010-01-06  4:01 UTC (permalink / raw)
  To: Frank Van Der Linden; +Cc: xen-devel@lists.xensource.com

[-- Attachment #1: Type: text/plain, Size: 678 bytes --]

Frank Van Der Linden wrote:
> I tested dom0, and time on dom0 is working just fine. This is the
> Solaris dom0, but that shouldn't matter in this case. Here is the xm
> dmesg output (plus the c debug key output), with dom0 having been up
> for 
> a while, but no domUs had ever been run:

Can't see any problem from the serial output. Let's take a further step. Attached patch is a __draft__ patch to collect statistics related with timer interrupts & timer fns. Can you apply it, rebuild the xen hypervisor code, then do "xm debug 9; sleep 20; xm debug 9; xm dmesg > xmdmesg.log" in both dom0 only & 1 HVM cases. Let's have a look on the timer statistics first.

Jimmy

[-- Attachment #2: timer_fn_stat_v1.patch --]
[-- Type: application/octet-stream, Size: 11847 bytes --]

diff -r a7b2d4e03a1f xen/arch/x86/apic.c
--- a/xen/arch/x86/apic.c	Mon Jan 04 09:09:27 2010 +0000
+++ b/xen/arch/x86/apic.c	Wed Jan 06 11:37:28 2010 +0800
@@ -1225,6 +1225,9 @@ int reprogram_timer(s_time_t timeout)
 
 fastcall void smp_apic_timer_interrupt(struct cpu_user_regs * regs)
 {
+    extern void timer_stat_apic_inc( void );
+    timer_stat_apic_inc();
+
     ack_APIC_irq();
     perfc_incr(apic_timer);
     raise_softirq(TIMER_SOFTIRQ);
diff -r a7b2d4e03a1f xen/arch/x86/hpet.c
--- a/xen/arch/x86/hpet.c	Mon Jan 04 09:09:27 2010 +0000
+++ b/xen/arch/x86/hpet.c	Wed Jan 06 11:37:28 2010 +0800
@@ -181,6 +181,11 @@ static void handle_hpet_broadcast(struct
     int cpu;
 
     spin_lock_irq(&ch->lock);
+
+    {
+        extern void timer_stat_hpet_inc (void);
+        timer_stat_hpet_inc();
+    }
 
 again:
     ch->next_event = STIME_MAX;
diff -r a7b2d4e03a1f xen/arch/x86/time.c
--- a/xen/arch/x86/time.c	Mon Jan 04 09:09:27 2010 +0000
+++ b/xen/arch/x86/time.c	Wed Jan 06 11:37:28 2010 +0800
@@ -186,6 +186,11 @@ static void smp_send_timer_broadcast_ipi
 {
     int cpu = smp_processor_id();
     cpumask_t mask;
+
+    {
+        extern void timer_stat_pit_inc (void);
+        timer_stat_pit_inc();
+    }
 
     cpus_and(mask, cpu_online_map, pit_broadcast_mask);
 
diff -r a7b2d4e03a1f xen/common/timer.c
--- a/xen/common/timer.c	Mon Jan 04 09:09:27 2010 +0000
+++ b/xen/common/timer.c	Wed Jan 06 11:37:28 2010 +0800
@@ -18,6 +18,7 @@
 #include <xen/timer.h>
 #include <xen/keyhandler.h>
 #include <xen/percpu.h>
+#include <xen/guest_access.h>
 #include <asm/system.h>
 #include <asm/desc.h>
 
@@ -37,6 +38,136 @@ struct timers {
 } __cacheline_aligned;
 
 static DEFINE_PER_CPU(struct timers, timers);
+
+struct timer_stat{
+    u64 nr_executed_timer;
+    u64 nr_timer_softirq;
+    u64 nr_timer_softirq_add_timer;
+    u64 nr_timer_softirq_stop_timer;
+    u64 nr_local_apic_intr;
+    u64 nr_hpet_intr;     /* received HEPT interrupt */
+    u64 nr_pit_intr;      /* received PIT interrupt */
+    u64 nr_broadcast_IPI; /* received broadcast EVENT_CHECK IPI
+                             Issued by cpumask_raise_softirq()*/
+};
+
+static DEFINE_PER_CPU(struct timer_stat, timer_stat)={0UL};
+static DEFINE_PER_CPU(struct timer_stat, saved_timer_stat) = {0UL};
+
+static s_time_t timer_stat_now = 0;
+
+#define TIMER_STAT_INC(_field) do { \
+            this_cpu(timer_stat)._field++; \
+            }while(0)
+
+#define TIMER_STAT_INC_CPU(_field, _cpu) do {\
+            per_cpu(timer_stat, _cpu)._field++;\
+            }while(0)
+
+void timer_stat_apic_inc( void )
+{
+    TIMER_STAT_INC(nr_local_apic_intr);
+}
+
+void timer_stat_hpet_inc (void)
+{
+    TIMER_STAT_INC(nr_hpet_intr);
+}
+
+void timer_stat_pit_inc (void)
+{
+    TIMER_STAT_INC(nr_pit_intr);
+}
+
+void timer_stat_broadcast_IPI_inc(cpumask_t mask)
+{
+    int cpu;
+    for_each_cpu_mask(cpu, mask)
+    {
+        TIMER_STAT_INC_CPU(nr_broadcast_IPI, cpu);
+    }
+}
+
+static void timer_stat_dump_per_cpu( int cpu, s_time_t now)
+{
+#define S_DIFF(_saved,_current,_field) (_current->_field - _saved->_field)
+    struct timer_stat *t, *saved_t;
+    u64    nr_all_intr;
+    u64    HZ_all_intr;
+
+    t       = &per_cpu( timer_stat, cpu );
+    saved_t = &per_cpu( saved_timer_stat, cpu );
+
+    nr_all_intr = S_DIFF( saved_t, t, nr_local_apic_intr ) +
+                  S_DIFF( saved_t, t, nr_hpet_intr ) +
+                  S_DIFF( saved_t, t, nr_pit_intr ) +
+                  S_DIFF( saved_t, t, nr_broadcast_IPI );
+
+    if (now > timer_stat_now){
+	    HZ_all_intr = nr_all_intr * 10000 / ((now-timer_stat_now)/100000);
+    }else{
+	    HZ_all_intr = 0;
+    }
+
+    printk("%d", cpu);
+    printk("\t");
+    printk("\t%"PRIu64, S_DIFF( saved_t, t, nr_executed_timer ));
+    printk("\t%"PRIu64, S_DIFF( saved_t, t, nr_local_apic_intr ));
+    printk("\t%"PRIu64, S_DIFF( saved_t, t, nr_hpet_intr ));
+    printk("\t%"PRIu64, S_DIFF( saved_t, t, nr_pit_intr ));
+    printk("\t%"PRIu64, S_DIFF( saved_t, t, nr_broadcast_IPI ));
+    printk("\t%"PRIu64, HZ_all_intr );
+    if (nr_all_intr != 0) {
+	    printk("\t%"PRIu64, 100 * S_DIFF( saved_t, t, nr_executed_timer ) / nr_all_intr );
+    }
+    else {
+	    printk("\t0");
+    }
+
+    printk("\n");
+
+    memcpy(saved_t, t, sizeof(struct timer_stat));
+}
+
+static void timer_stat_dump(unsigned char key)
+{
+    int i;
+    s_time_t now;
+
+    printk("Dump timer statistic\n");
+    printk("CPU");
+    /* number of executed timer, i.e. calling number of execute_timer() */
+    printk("\ttimer");
+
+    printk("\tlapic");
+    printk("\thpet");
+    printk("\tpit");
+
+    /* number of received broadcast IPI */
+    printk("\tIPI");
+
+    /* interrupt frequency : (lapic_intr + hpet_intr + bc_IPI) / period */
+    printk("\tHZ");
+
+    /* number of executed timer per interrupt */
+    printk("\t100*timer/all_intr");
+    printk("\n");
+
+    now = NOW();
+    for_each_cpu ( i )
+    {
+        timer_stat_dump_per_cpu(i, now);
+    }
+
+    printk("Period: %"PRId64" seconds\n", (now - timer_stat_now)/1000000000);
+    timer_stat_now = now;
+
+    {
+        void timer_stat_dump_1(void);
+        timer_stat_dump_1();
+    }
+}
+
 
 DEFINE_PER_CPU(s_time_t, timer_deadline);
 
@@ -211,14 +342,20 @@ static inline void __add_timer(struct ti
 {
     int cpu = timer->cpu;
     if ( add_entry(&per_cpu(timers, cpu), timer) )
+    {
+        TIMER_STAT_INC_CPU(nr_timer_softirq_add_timer, cpu);
         cpu_raise_softirq(cpu, TIMER_SOFTIRQ);
+    }
 }
 
 static inline void __stop_timer(struct timer *timer)
 {
     int cpu = timer->cpu;
     if ( remove_entry(&per_cpu(timers, cpu), timer) )
+    {
+        TIMER_STAT_INC_CPU(nr_timer_softirq_stop_timer, cpu);
         cpu_raise_softirq(cpu, TIMER_SOFTIRQ);
+    }
 }
 
 static inline void timer_lock(struct timer *timer)
@@ -353,7 +490,10 @@ static void execute_timer(struct timers 
     void (*fn)(void *) = t->function;
     void *data = t->data;
 
+    TIMER_STAT_INC(nr_executed_timer);
+
     ts->running = t;
+    timer_stat_inc(t);
     spin_unlock_irq(&ts->lock);
     (*fn)(data);
     spin_lock_irq(&ts->lock);
@@ -366,6 +506,8 @@ static void timer_softirq_action(void)
     struct timer  *t, **heap, *next;
     struct timers *ts;
     s_time_t       now;
+
+    TIMER_STAT_INC(nr_timer_softirq);
 
     ts = &this_cpu(timers);
     heap = ts->heap;
@@ -505,7 +647,6 @@ static void dump_timerq(unsigned char ke
     }
 }
 
-
 void __init timer_init(void)
 {
     static struct timer *dummy_heap;
@@ -527,7 +668,86 @@ void __init timer_init(void)
     }
 
     register_keyhandler('a', dump_timerq, "dump timer queues");
+    register_keyhandler('9', timer_stat_dump, "dump timer statistics");
 }
+
+#ifdef CONFIG_TIMER_STAT
+
+static struct timers_stat stat_data;
+static struct timers_stat stat_data_saved;
+static s_time_t now_saved;
+spinlock_t timers_stat_lock = SPIN_LOCK_UNLOCKED;
+
+void timer_stat_dump_1(void)
+{
+    int i, cpu;
+    s_time_t ms,now;
+
+    now = NOW();
+    ms = (now - now_saved)/1000000;
+    if (ms == 0) return;
+
+    printk("%20s", "timer_fn");
+    for_each_online_cpu (cpu)
+        printk("\tCPU%d", cpu);
+    printk("\n");
+
+    for (i=0; i<stat_data.nr_timer; i++)
+    {
+        printk("%20s",stat_data.data[i].fn_name);
+        for_each_online_cpu (cpu)
+        {
+            printk("\t%ld", (stat_data.data[i].count[cpu] -
+                        stat_data_saved.data[i].count[cpu]) * 1000 / ms);
+        }
+        printk("\n");
+    }
+    now_saved = now;
+    memcpy(&stat_data_saved, &stat_data, sizeof(stat_data));
+}
+
+void timer_stat_inc(struct timer* t)
+{
+    if ( t->stat_index > 0 && t->stat_index <= MAX_NR_TIMER ){
+        stat_data.data[t->stat_index-1].count[t->cpu]++;
+    }
+}
+
+void timer_stat_register(struct timer* t, char* fn_name)
+{
+    int i;
+
+    spin_lock( &timers_stat_lock );
+
+    if ( stat_data.nr_timer < MAX_NR_TIMER )
+    {
+        for ( i=0; i < stat_data.nr_timer; i++ )
+        {
+            if ( stat_data.data[i].fn_addr == (u64)t->function )
+            {
+                t->stat_index = i+1;
+                goto out; /* already registered */
+            }
+        }
+
+        stat_data.nr_timer++;
+        t->stat_index = stat_data.nr_timer;
+        strlcpy( stat_data.data[t->stat_index-1].fn_name , fn_name,
+                XEN_TIMER_NAME_MAX_LEN);
+        stat_data.data[t->stat_index-1].fn_addr = (u64)t->function;
+    }
+    else
+    {
+        printk( KERN_WARNING "Warning: Timer statistics array overflow,"
+                "please enlarge MAX_NR_TIMER\n" );
+        t->stat_index = 0;
+    }
+
+out:
+    spin_unlock( &timers_stat_lock );
+}
+
+#endif /* CONFIG_TIMER_STAT */
 
 /*
  * Local variables:
diff -r a7b2d4e03a1f xen/include/xen/softirq.h
--- a/xen/include/xen/softirq.h	Mon Jan 04 09:09:27 2010 +0000
+++ b/xen/include/xen/softirq.h	Wed Jan 06 11:37:28 2010 +0800
@@ -37,6 +37,12 @@ static inline void cpumask_raise_softirq
     {
         if ( test_and_set_bit(nr, &softirq_pending(cpu)) )
             cpu_clear(cpu, mask);
+    }
+
+    if (nr == TIMER_SOFTIRQ)
+    {
+        extern void timer_stat_broadcast_IPI_inc(cpumask_t mask);
+        timer_stat_broadcast_IPI_inc(mask);
     }
 
     smp_send_event_check_mask(mask);
diff -r a7b2d4e03a1f xen/include/xen/timer.h
--- a/xen/include/xen/timer.h	Mon Jan 04 09:09:27 2010 +0000
+++ b/xen/include/xen/timer.h	Wed Jan 06 11:37:28 2010 +0800
@@ -11,6 +11,9 @@
 #include <xen/spinlock.h>
 #include <xen/time.h>
 #include <xen/string.h>
+#include <public/sysctl.h>
+
+#define CONFIG_TIMER_STAT /* timer statistics, default is off */
 
 struct timer {
     /* System time expiry value (nanoseconds since boot). */
@@ -38,6 +41,10 @@ struct timer {
 #define TIMER_STATUS_in_heap  2 /* In use; on timer heap.           */
 #define TIMER_STATUS_in_list  3 /* In use; on overflow linked list. */
     uint8_t status;
+
+#ifdef CONFIG_TIMER_STAT
+    uint16_t stat_index;        /* index in timer statistics array */
+#endif
 };
 
 /*
@@ -60,7 +67,7 @@ static inline int active_timer(struct ti
  * time (and multiple times) on an inactive timer. It must *never* execute
  * concurrently with any other operation on the same timer.
  */
-static inline void init_timer(
+static inline void raw_init_timer(
     struct timer *timer,
     void           (*function)(void *),
     void            *data,
@@ -71,6 +78,20 @@ static inline void init_timer(
     timer->data     = data;
     timer->cpu      = cpu;
 }
+
+#ifdef CONFIG_TIMER_STAT
+
+#define init_timer(x1,x2,x3,x4) do{\
+                raw_init_timer(x1,x2,x3,x4);\
+                timer_stat_register(x1, STR(x2));\
+        }while(0)
+
+#else
+
+#define init_timer(x1,x2,x3,x4) raw_init_timer(x1,x2,x3,x4)
+
+#endif /* CONFIG_TIMER_STAT */
+
 
 /*
  * Set the expiry time and activate a timer. The timer must *previously* have
@@ -119,6 +140,37 @@ extern int reprogram_timer(s_time_t time
 /* calculate the aligned first tick time for a given periodic timer */ 
 extern s_time_t align_timer(s_time_t firsttick, uint64_t period);
 
+#ifdef CONFIG_TIMER_STAT /* timer execution count */
+#define XEN_TIMER_NAME_MAX_LEN 20
+#define MAX_NR_TIMER 64
+
+struct timer_stat_v2{
+    char fn_name[XEN_TIMER_NAME_MAX_LEN];
+    u64  fn_addr;
+    u64  count[NR_CPUS];
+};
+
+struct timers_stat{
+    u32  nr_timer;
+    struct timer_stat_v2 data[MAX_NR_TIMER];
+};
+
+void timer_stat_register(struct timer* t,char* fn_name);
+void timer_stat_inc(struct timer* t);
+int timer_stat_get(struct xen_sysctl_pm_op *op);
+
+#else
+
+#define timer_stat_register(x,y) ((void)0)
+#define timer_stat_inc(x) ((void)0)
+
+static inline int timer_stat_get(truct xen_sysctl_pm_op *op)
+{
+    return  -ENODATA;
+}
+
+#endif /* CONFIG_TIMER_STAT */
+
 #endif /* _TIMER_H_ */
 
 /*

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-01-06  4:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-05 20:55 xenpm: opensolaris HVM domU stops getting timer interrupts when C3 used Frank Van Der Linden
2010-01-05 21:02 ` Keir Fraser
2010-01-06  1:55 ` Wei, Gang
2010-01-06  2:29   ` Frank Van Der Linden
2010-01-06  4:01     ` Wei, Gang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.