From: "zhenzhong.duan" <zhenzhong.duan@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Feng Jin <joe.jin@oracle.com>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: kernel bootup slow issue on ovm3.1.1
Date: Fri, 10 Aug 2012 12:40:07 +0800
Message-ID: <502490A7.7020603@oracle.com>
In-Reply-To: <5023AE960200007800093DE8@nat28.tlf.novell.com>


On 2012-08-09 18:35, Jan Beulich wrote:
>>>> On 09.08.12 at 11:42, "zhenzhong.duan"<zhenzhong.duan@oracle.com>  wrote:
>> On 2012-08-08 23:01, Jan Beulich wrote:
>>>>>> On 08.08.12 at 11:48, "zhenzhong.duan"<zhenzhong.duan@oracle.com>   wrote:
>>>> On 2012-08-07 16:37, Jan Beulich wrote:
>>>> Some spin in stop_machine after finishing their job.
>>> And here you'd need to find out what they're waiting for,
>>> and what those CPUs are doing.
>> They are waiting for the vCPUs calling generic_set_all, and those spin on
>> set_atomicity_lock.
>> In fact, all are waiting for generic_set_all.
> I think we're moving in circles - what is the vCPU currently
> in generic_set_all() then doing?
I added some debug prints. generic_set_all->prepare_set->write_cr0 takes
a long time; everything else is quick. set_atomicity_lock serializes this
step across CPUs, which makes it worse.
One iteration:
MTRR: CPU 2
prepare_set: before read_cr0
prepare_set: before write_cr0   <-- blocks here
prepare_set: before wbinvd
prepare_set: before read_cr4
prepare_set: before write_cr4
prepare_set: before __flush_tlb
prepare_set: before rdmsr
prepare_set: before wrmsr
generic_set_all: before set_mtrr_state
generic_set_all: before pat_init
post_set: before wbinvd
post_set: before wrmsr
post_set: before write_cr0
post_set: before write_cr4
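
As a back-of-the-envelope illustration of why the serialization hurts (this sketch is not from the thread, and the 0.5 s per-call cost is a made-up placeholder, not a measurement): if every CPU has to hold set_atomicity_lock while its slow write_cr0 runs, the per-CPU delays add up instead of overlapping.

```python
def mtrr_update_time(n_cpus: int, slow_step_s: float, serialized: bool) -> float:
    """Rough wall-clock estimate for all CPUs to get through the slow step.

    serialized=True models the slow step running under a single lock
    (one CPU at a time); serialized=False models the ideal case where
    every CPU runs it concurrently.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return n_cpus * slow_step_s if serialized else slow_step_s


if __name__ == "__main__":
    # 24 vCPUs as in the guest; 0.5 s per write_cr0 is a hypothetical figure.
    for serialized in (True, False):
        print("serialized" if serialized else "concurrent",
              mtrr_update_time(24, 0.5, serialized))
```

The model ignores everything except the one slow step, so it only shows the scaling: under the lock the delay grows linearly with the number of vCPUs.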

>
>>> There's not that much being done in generic_set_all(), so the
>>> code should finish reasonably quickly. Are you perhaps having
>>> more vCPU-s in the guest than pCPU-s they can run on?
>> The system is an Exalogic node with 24 logical CPUs + 100 GB of memory
>> (2 sockets, 6 cores per socket, 2 HT threads per core).
>> I boot a PVHVM guest with 12 (or 24) vCPUs + 90 GB + a PCI passthrough device.
> So you're indeed over-committing the system. How many vCPU-s
> does your Dom0 have? Are there any other VMs? Is there any
> vCPU pinning in effect?
Dom0 boots with 24 vCPUs (same result with dom0_max_vcpus=4). There are no
other VMs besides Dom0. According to xentop, all 24 vCPUs are spinning.
Below is an xentop excerpt.

      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
  Domain-0 -----r      43072  158.8    2050560    2.0   no limit       n/a    24    0        0        0    0        0        0        0          0          0    0
VCPUs(sec):   0:      13649s  1:       6197s  2:       4254s  3:       2006s  4:       1409s
           5:        930s  6:        698s  7:        630s  8:        612s  9:       2038s
          10:        544s 11:        940s 12:        556s 13:        510s 14:        456s
          15:        591s 16:        438s 17:        508s 18:       3350s 19:        512s
          20:        544s 21:        529s 22:        547s 23:        610s
zduan_test -----r      13140 2234.4   92327920   91.7   92327936      91.7    24    1        0        0    1        0        0        0          0          0    0
VCPUs(sec):   0:        556s  1:        551s  2:        549s  3:        544s  4:        549s
           5:        545s  6:        545s  7:        547s  8:        545s  9:        548s
          10:        545s 11:        546s 12:        545s 13:        548s 14:        543s
          15:        544s 16:        551s 17:        545s 18:        547s 19:        551s
          20:        544s 21:        549s 22:        546s 23:        545s
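
A small sketch (not part of the original mail) of how the xentop numbers above could be checked mechanically. It parses the guest's per-vCPU seconds, copied from the excerpt, and computes the vCPU-to-pCPU ratio. The helper names are hypothetical, and the figure of 24 pCPUs is taken from the hardware description in this thread.

```python
import re

# Per-vCPU CPU seconds for the guest (zduan_test), copied from the xentop excerpt.
GUEST_VCPUS = """
VCPUs(sec):   0:        556s  1:        551s  2:        549s  3:        544s  4:        549s
           5:        545s  6:        545s  7:        547s  8:        545s  9:        548s
          10:        545s 11:        546s 12:        545s 13:        548s 14:        543s
          15:        544s 16:        551s 17:        545s 18:        547s 19:        551s
          20:        544s 21:        549s 22:        546s 23:        545s
"""


def parse_vcpu_seconds(text):
    """Map vCPU index -> CPU seconds; the whitespace match also tolerates wrapped lines."""
    return {int(i): int(s) for i, s in re.findall(r"(\d+):\s+(\d+)s", text)}


def overcommit_ratio(vcpus_per_domain, pcpus):
    """Total vCPUs across all domains divided by physical CPUs."""
    return sum(vcpus_per_domain) / pcpus


if __name__ == "__main__":
    secs = parse_vcpu_seconds(GUEST_VCPUS)
    print("vCPUs parsed:", len(secs))
    print("min/max seconds:", min(secs.values()), max(secs.values()))
    # Dom0 has 24 vCPUs and the guest has 24 vCPUs, on 24 pCPUs.
    print("overcommit ratio:", overcommit_ratio([24, 24], 24))
```

The narrow spread across the guest's vCPUs (543 s to 556 s) is consistent with all of them being busy for roughly the same time. It does not by itself say what they are busy with.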
>>>    Does
>>> your hardware support Pause-Loop-Exiting (or the AMD
>>> equivalent, don't recall their term right now)?
>> I have no access to serial line, could I get the info by a command?
> "xl dmesg" run early enough (i.e. before the log buffer wraps).
Below is the xl dmesg output for your reference. Thanks.
[root@scae02cn01 zduan]# xl dmesg
  __  __            _  _    ___   ____      _____     ____  __
  \ \/ /___ _ __   | || |  / _ \ |___ \    / _ \ \   / /  \/  |
   \  // _ \ '_ \  | || |_| | | |  __) |__| | | \ \ / /| |\/| |
   /  \  __/ | | | |__   _| |_| | / __/|__| |_| |\ V / | |  | |
  /_/\_\___|_| |_|    |_|(_)___(_)_____|   \___/  \_/  |_|  |_|

(XEN) Xen version 4.0.2-OVM (mockbuild@(none)) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) Fri Dec 23 17:00:16 EST 2011
(XEN) Latest ChangeSet: unavailable
(XEN) Bootloader: GNU GRUB 0.97
(XEN) Command line: dom0_mem=2G
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: none; EDID transfer time: 1 seconds
(XEN)  EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN)  Found 1 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000099400 (usable)
(XEN)  0000000000099400 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 000000007f780000 (usable)
(XEN)  000000007f78e000 - 000000007f790000 type 9
(XEN)  000000007f790000 - 000000007f79e000 (ACPI data)
(XEN)  000000007f79e000 - 000000007f7d0000 (ACPI NVS)
(XEN)  000000007f7d0000 - 000000007f7e0000 (reserved)
(XEN)  000000007f7ec000 - 0000000080000000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ffc00000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000001880000000 (usable)
(XEN) ACPI: RSDP 000FAA40, 0024 (r2 SUN   )
(XEN) ACPI: XSDT 7F790100, 0094 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: FACP 7F790290, 00F4 (r4 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: DSDT 7F7905C0, 5ECF (r2 SUN    Xxx70           1 INTL 20051117)
(XEN) ACPI: FACS 7F79E000, 0040
(XEN) ACPI: APIC 7F790390, 011E (r2 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: MCFG 7F790500, 003C (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: SLIT 7F790540, 0030 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: SPMI 7F790570, 0041 (r5 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: OEMB 7F79E040, 00BE (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: HPET 7F79A5C0, 0038 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: DMAR 7F79E100, 0130 (r1 SUN    Xxx70           1 MSFT       97)
(XEN) ACPI: SRAT 7F79A600, 0250 (r1 SUN    Xxx70           1 INTC        1)
(XEN) ACPI: SSDT 7F79EF60, 0363 (r1  SUN   Xxx70          12 INTL 20051117)
(XEN) ACPI: EINJ 7F79A850, 0130 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: BERT 7F79A9E0, 0030 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: ERST 7F79AA10, 01B0 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) ACPI: HEST 7F79ABC0, 00A8 (r1 SUN    Xxx70    20111011 MSFT       97)
(XEN) System RAM: 98295MB (100654180kB)
(XEN) Domain heap initialised DMA width 32 bits
(XEN) Processor #0 6:12 APIC version 21
(XEN) Processor #2 6:12 APIC version 21
(XEN) Processor #4 6:12 APIC version 21
(XEN) Processor #16 6:12 APIC version 21
(XEN) Processor #18 6:12 APIC version 21
(XEN) Processor #20 6:12 APIC version 21
(XEN) Processor #32 6:12 APIC version 21
(XEN) Processor #34 6:12 APIC version 21
(XEN) Processor #36 6:12 APIC version 21
(XEN) Processor #48 6:12 APIC version 21
(XEN) Processor #50 6:12 APIC version 21
(XEN) Processor #52 6:12 APIC version 21
(XEN) Processor #1 6:12 APIC version 21
(XEN) Processor #3 6:12 APIC version 21
(XEN) Processor #5 6:12 APIC version 21
(XEN) Processor #17 6:12 APIC version 21
(XEN) Processor #19 6:12 APIC version 21
(XEN) Processor #21 6:12 APIC version 21
(XEN) Processor #33 6:12 APIC version 21
(XEN) Processor #35 6:12 APIC version 21
(XEN) Processor #37 6:12 APIC version 21
(XEN) Processor #49 6:12 APIC version 21
(XEN) Processor #51 6:12 APIC version 21
(XEN) Processor #53 6:12 APIC version 21
(XEN) IOAPIC[0]: apic_id 6, version 32, address 0xfec00000, GSI 0-23
(XEN) IOAPIC[1]: apic_id 7, version 32, address 0xfec8a000, GSI 24-47
(XEN) Enabling APIC mode:  Phys.  Using 2 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2926.029 MHz processor.
(XEN) Initing memory sharing.
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) EPT supports 2MB super page.
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging detected.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) Total of 24 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) TSC is reliable, synchronization unnecessary
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 64 KiB.
(XEN) Brought up 24 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, lsb, paddr 0x2000 -> 0x6d5000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000835000000->0000000836000000 (520192 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff80002000->ffffffff806d5000
(XEN)  Init. ramdisk: ffffffff806d5000->ffffffff80ed7400
(XEN)  Phys-Mach map: ffffea0000000000->ffffea0000400000
(XEN)  Start info:    ffffffff80ed8000->ffffffff80ed84b4
(XEN)  Page tables:   ffffffff80ed9000->ffffffff80ee4000
(XEN)  Boot stack:    ffffffff80ee4000->ffffffff80ee5000
(XEN)  TOTAL:         ffffffff80000000->ffffffff81000000
(XEN)  ENTRY ADDRESS: ffffffff80002000
(XEN) Dom0 has maximum 24 VCPUs
(XEN) Scrubbing Free RAM: .....done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 168kB init memory.
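
For anyone who wants to scan such a log for a particular VMX feature, here is a minimal sketch (not from the thread). It extracts the entries printed under "VMX: Supported advanced features:" and does a case-insensitive keyword search. The function names are hypothetical, and the exact wording Xen would use for Pause-Loop Exiting is an assumption, which is why the search takes loose keywords rather than a fixed string.

```python
# Excerpt copied from the xl dmesg output above.
SAMPLE = """\
(XEN) Initing memory sharing.
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN) EPT supports 2MB super page.
"""

HEADER = "(XEN) VMX: Supported advanced features"
BULLET = "(XEN)  - "


def vmx_features(dmesg):
    """Return the entries listed directly under the VMX feature header."""
    features = []
    in_list = False
    for line in dmesg.splitlines():
        line = line.strip()
        if line.startswith(HEADER):
            in_list = True
            continue
        if in_list:
            if line.startswith(BULLET):
                features.append(line[len(BULLET):].strip())
            else:
                break  # first non-bullet line ends the list
    return features


def mentions(features, *keywords):
    """True if some feature line contains every keyword (case-insensitive)."""
    return any(all(k.lower() in f.lower() for k in keywords) for f in features)


if __name__ == "__main__":
    feats = vmx_features(SAMPLE)
    print(feats)
    print("pause/loop entry present:", mentions(feats, "pause", "loop"))
```

On the excerpt above, no entry contains both "pause" and "loop". Whether this Xen build would print such an entry at all is not something the log itself settles, so a negative result here is only a hint. On a live host the text could come from `xl dmesg`, for example via `subprocess.run(["xl", "dmesg"], capture_output=True, text=True).stdout`.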


