From: "zhenzhong.duan" <zhenzhong.duan@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Feng Jin <joe.jin@oracle.com>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: kernel bootup slow issue on ovm3.1.1
Date: Fri, 10 Aug 2012 12:40:07 +0800
Message-ID: <502490A7.7020603@oracle.com>
In-Reply-To: <5023AE960200007800093DE8@nat28.tlf.novell.com>

On 2012-08-09 18:35, Jan Beulich wrote:
>>>> On 09.08.12 at 11:42, "zhenzhong.duan"<zhenzhong.duan@oracle.com>  wrote:
>> On 2012-08-08 23:01, Jan Beulich wrote:
>>>>>> On 08.08.12 at 11:48, "zhenzhong.duan"<zhenzhong.duan@oracle.com>   wrote:
>>>> On 2012-08-07 16:37, Jan Beulich wrote:
>>>> Some spin in stop_machine after finishing their job.
>>> And here you'd need to find out what they're waiting for,
>>> and what those CPUs are doing.
>> They are waiting for the vCPU that is calling generic_set_all, and
>> those spin on set_atomicity_lock.
>> In fact, all of them are waiting for generic_set_all.
> I think we're moving in circles - what is the vCPU currently in
> generic_set_all() then doing?
I added some debug prints. In generic_set_all -> prepare_set, the
write_cr0 call takes a long time; everything else is quick.
set_atomicity_lock serializes this step across CPUs, which makes it
worse.
One iteration:
MTRR: CPU 2
prepare_set: before read_cr0
prepare_set: before write_cr0   <-- blocks here
prepare_set: before wbinvd
prepare_set: before read_cr4
prepare_set: before write_cr4
prepare_set: before __flush_tlb
prepare_set: before rdmsr
prepare_set: before wrmsr
generic_set_all: before set_mtrr_state
generic_set_all: before pat_init
post_set: before wbinvd
post_set: before wrmsr
post_set: before write_cr0
post_set: before write_cr4
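
To make the serialization point concrete, here is a rough cost model, not a measurement. It assumes the only expensive step is the write_cr0 in prepare_set, that set_atomicity_lock lets one vCPU through at a time, and that every vCPU stays busy (working or spinning) until the last one finishes. The function name and the 0.5 s step cost are hypothetical, picked only for illustration.

```python
def serialized_cost(n_vcpus, slow_step_s):
    """Estimate wall-clock and consumed CPU time for one MTRR update.

    Assumption: the slow step runs under a lock, so the vCPUs go through
    it one after another, and all vCPUs spin until the last one is done.
    """
    wall_s = n_vcpus * slow_step_s   # one vCPU at a time through the lock
    cpu_s = n_vcpus * wall_s         # every vCPU is busy for the whole time
    return wall_s, cpu_s


# Hypothetical numbers: 24 vCPUs, 0.5 s per write_cr0.
wall_s, cpu_s = serialized_cost(24, 0.5)
print(f"wall: {wall_s:.1f}s, CPU time burned: {cpu_s:.1f}s")
```

Under these assumptions the wall time grows linearly with the vCPU count and the burned CPU time grows quadratically, which would fit a 24-vCPU guest booting far more slowly than a small one. The model ignores hypervisor scheduling entirely.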

>
>>> There's not that much being done in generic_set_all(), so the
>>> code should finish reasonably quickly. Are you perhaps having
>>> more vCPU-s in the guest than pCPU-s they can run on?
>> The system is an Exalogic node with 24 logical CPUs + 100 GB memory
>> (2 sockets, 6 cores per socket, 2 HT threads per core).
>> We boot a PVHVM guest with 12 vCPUs (or 24) + 90 GB + a PCI passthrough device.
> So you're indeed over-committing the system. How many vCPU-s
> does your Dom0 have? Are there any other VMs? Is there any
> vCPU pinning in effect?
Dom0 boots with 24 vCPUs (same result with dom0_max_vcpus=4). No other
VMs are running besides dom0 and this guest. According to xentop, all 24
guest vCPUs are spinning. Below is a xentop excerpt.

      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
  Domain-0 -----r      43072  158.8    2050560    2.0   no limit       n/a    24    0        0        0    0        0        0        0          0          0    0
VCPUs(sec):   0:      13649s  1:       6197s  2:       4254s  3:       2006s  4:       1409s
           5:        930s  6:        698s  7:        630s  8:        612s  9:       2038s
          10:        544s 11:        940s 12:        556s 13:        510s 14:        456s
          15:        591s 16:        438s 17:        508s 18:       3350s 19:        512s
          20:        544s 21:        529s 22:        547s 23:        610s
zduan_test -----r      13140 2234.4   92327920   91.7   92327936      91.7    24    1        0        0    1        0        0        0          0          0    0
VCPUs(sec):   0:        556s  1:        551s  2:        549s  3:        544s  4:        549s
           5:        545s  6:        545s  7:        547s  8:        545s  9:        548s
          10:        545s 11:        546s 12:        545s 13:        548s 14:        543s
          15:        544s 16:        551s 17:        545s 18:        547s 19:        551s
          20:        544s 21:        549s 22:        546s 23:        545s
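
As a quick cross-check of the xentop numbers, the sketch below parses the guest's VCPUs(sec) lines and summarizes them. The helper names are made up for illustration. The 24 pCPUs and the 24 + 24 vCPUs are taken from the figures in this mail, and the regex assumes the "N: <seconds>s" layout shown above.

```python
import re

# Guest per-vCPU CPU seconds, copied from the xentop excerpt (zduan_test).
GUEST_VCPUS = """
VCPUs(sec):   0:        556s  1:        551s  2:        549s  3:        544s  4:        549s
           5:        545s  6:        545s  7:        547s  8:        545s  9:        548s
          10:        545s 11:        546s 12:        545s 13:        548s 14:        543s
          15:        544s 16:        551s 17:        545s 18:        547s 19:        551s
          20:        544s 21:        549s 22:        546s 23:        545s
"""


def parse_vcpu_seconds(text):
    """Return {vcpu_index: cpu_seconds} from xentop 'VCPUs(sec):' lines."""
    return {int(i): int(s) for i, s in re.findall(r"(\d+):\s+(\d+)s", text)}


def summarize(vcpu_seconds, cpu_percent, pcpus, total_vcpus):
    """Summarize per-vCPU usage and the vCPU:pCPU ratio on the host."""
    values = list(vcpu_seconds.values())
    return {
        "vcpus": len(values),
        "total_s": sum(values),
        "spread_s": max(values) - min(values),
        "avg_percent_per_vcpu": cpu_percent / len(values),
        "vcpus_per_pcpu": total_vcpus / pcpus,
    }


guest = parse_vcpu_seconds(GUEST_VCPUS)
# CPU(%) for the guest is 2234.4; dom0 and the guest have 24 vCPUs each.
summary = summarize(guest, cpu_percent=2234.4, pcpus=24, total_vcpus=24 + 24)
print(summary)
```

Every guest vCPU has accumulated almost the same CPU time, and the average load per vCPU is above 90%. That is what one would expect if all of them are spinning rather than doing distinct work.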
>>>    Does
>>> your hardware support Pause-Loop-Exiting (or the AMD
>>> equivalent, don't recall their term right now)?
>> I have no access to a serial line. Can I get the info with a command?
> "xl dmesg" run early enough (i.e. before the log buffer wraps).
Below is the xl dmesg output for your reference. Thanks.
[root@scae02cn01 zduan]# xl dmesg
  __  __            _  _    ___   ____      _____     ____  __
  \ \/ /___ _ __   | || |  / _ \ |___ \    / _ \ \   / /  \/  |
   \  // _ \ '_ \  | || |_| | | |  __) |__| | | \ \ / /| |\/| |
   /  \  __/ | | | |__   _| |_| | / __/|__| |_| |\ V / | |  | |
  /_/\_\___|_| |_|    |_|(_)___(_)_____|   \___/  \_/  |_|  |_|

(XEN) Xen version 4.0.2-OVM (mockbuild@(none)) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) Fri Dec 23 17:00:16 EST 2011
(XEN) Latest ChangeSet: unavailable
(XEN) Bootloader: GNU GRUB 0.97
(XEN) Command line: dom0_mem=2G
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: none; EDID transfer time: 1 seconds
(XEN)  EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN)  Found 1 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000099400 (usable)
(XEN)  0000000000099400 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 000000007f780000 (usable)
(XEN)  000000007f78e000 - 000000007f790000 type 9
(XEN)  000000007f790000 - 000000007f79e000 (ACPI data)
(XEN)  000000007f79e000 - 000000007f7d0000 (ACPI NVS)
(XEN)  000000007f7d0000 - 000000007f7e0000 (reserved)
(XEN)  000000007f7ec000 - 0000000080000000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ffc00000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000001880000000 (usable)
(XEN) ACPI: RSDP 000FAA40, 0024 (r2 SUN   )
(XEN) ACPI: XSDT 7F790100, 0094 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: FACP 7F790290, 00F4 (r4 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: DSDT 7F7905C0, 5ECF (r2 SUN    Xxx70           1 INTL 20051117)
(XEN) ACPI: FACS 7F79E000, 0040
(XEN) ACPI: APIC 7F790390, 011E (r2 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: MCFG 7F790500, 003C (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: SLIT 7F790540, 0030 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: SPMI 7F790570, 0041 (r5 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: OEMB 7F79E040, 00BE (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: HPET 7F79A5C0, 0038 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: DMAR 7F79E100, 0130 (r1 SUN    Xxx70           1 MSFT       97)
(XEN) ACPI: SRAT 7F79A600, 0250 (r1 SUN    Xxx70           1 INTC        1)
(XEN) ACPI: SSDT 7F79EF60, 0363 (r1  SUN   Xxx70          12 INTL 20051117)
(XEN) ACPI: EINJ 7F79A850, 0130 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: BERT 7F79A9E0, 0030 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: ERST 7F79AA10, 01B0 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: HEST 7F79ABC0, 00A8 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) System RAM: 98295MB (100654180kB)
(XEN) Domain heap initialised DMA width 32 bits
(XEN) Processor #0 6:12 APIC version 21
(XEN) Processor #2 6:12 APIC version 21
(XEN) Processor #4 6:12 APIC version 21
(XEN) Processor #16 6:12 APIC version 21
(XEN) Processor #18 6:12 APIC version 21
(XEN) Processor #20 6:12 APIC version 21
(XEN) Processor #32 6:12 APIC version 21
(XEN) Processor #34 6:12 APIC version 21
(XEN) Processor #36 6:12 APIC version 21
(XEN) Processor #48 6:12 APIC version 21
(XEN) Processor #50 6:12 APIC version 21
(XEN) Processor #52 6:12 APIC version 21
(XEN) Processor #1 6:12 APIC version 21
(XEN) Processor #3 6:12 APIC version 21
(XEN) Processor #5 6:12 APIC version 21
(XEN) Processor #17 6:12 APIC version 21
(XEN) Processor #19 6:12 APIC version 21
(XEN) Processor #21 6:12 APIC version 21
(XEN) Processor #33 6:12 APIC version 21
(XEN) Processor #35 6:12 APIC version 21
(XEN) Processor #37 6:12 APIC version 21
(XEN) Processor #49 6:12 APIC version 21
(XEN) Processor #51 6:12 APIC version 21
(XEN) Processor #53 6:12 APIC version 21
(XEN) IOAPIC[0]: apic_id 6, version 32, address 0xfec00000, GSI 0-23
(XEN) IOAPIC[1]: apic_id 7, version 32, address 0xfec8a000, GSI 24-47
(XEN) Enabling APIC mode:  Phys.  Using 2 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2926.029 MHz processor.
(XEN) Initing memory sharing.
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) EPT supports 2MB super page.
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging detected.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) Total of 24 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) TSC is reliable, synchronization unnecessary
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 64 KiB.
(XEN) Brought up 24 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, lsb, paddr 0x2000 -> 0x6d5000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000835000000->0000000836000000 (520192 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff80002000->ffffffff806d5000
(XEN)  Init. ramdisk: ffffffff806d5000->ffffffff80ed7400
(XEN)  Phys-Mach map: ffffea0000000000->ffffea0000400000
(XEN)  Start info:    ffffffff80ed8000->ffffffff80ed84b4
(XEN)  Page tables:   ffffffff80ed9000->ffffffff80ee4000
(XEN)  Boot stack:    ffffffff80ee4000->ffffffff80ee5000
(XEN)  TOTAL:         ffffffff80000000->ffffffff81000000
(XEN)  ENTRY ADDRESS: ffffffff80002000
(XEN) Dom0 has maximum 24 VCPUs
(XEN) Scrubbing Free RAM: done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 168kB init memory.
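
For the Pause-Loop-Exiting question, the feature list can be pulled out of the log mechanically. This is a minimal sketch: `vmx_features` is a hypothetical helper, and it assumes the feature lines follow the "VMX: Supported advanced features:" header as " - <name>" entries, as in the output above. Whether this Xen build prints a PLE entry at all is an assumption the log cannot confirm, so a missing entry is not proof that the hardware lacks the feature.

```python
# Excerpt of the xl dmesg output above (VMX section only).
XL_DMESG = """\
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) EPT supports 2MB super page.
"""


def vmx_features(log):
    """Collect the ' - <feature>' lines that follow the VMX header."""
    features, in_block = [], False
    for line in log.splitlines():
        body = line[len("(XEN)"):] if line.startswith("(XEN)") else line
        if "VMX: Supported advanced features:" in body:
            in_block = True
        elif in_block and body.lstrip().startswith("- "):
            features.append(body.lstrip()[2:].strip())
        elif in_block:
            break  # first non-feature line ends the list
    return features


features = vmx_features(XL_DMESG)
has_ple_line = any("pause" in f.lower() for f in features)
print(features)
print("PLE line present:", has_ple_line)
```

On this log the list has seven entries and none of them mentions pause-loop exiting.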