From: "zhenzhong.duan" <zhenzhong.duan@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Feng Jin <joe.jin@oracle.com>,
xen-devel <xen-devel@lists.xen.org>
Subject: Re: kernel bootup slow issue on ovm3.1.1
Date: Fri, 10 Aug 2012 12:40:07 +0800
Message-ID: <502490A7.7020603@oracle.com>
In-Reply-To: <5023AE960200007800093DE8@nat28.tlf.novell.com>
On 2012-08-09 18:35, Jan Beulich wrote:
>>>> On 09.08.12 at 11:42, "zhenzhong.duan"<zhenzhong.duan@oracle.com> wrote:
>> On 2012-08-08 23:01, Jan Beulich wrote:
>>>>>> On 08.08.12 at 11:48, "zhenzhong.duan"<zhenzhong.duan@oracle.com> wrote:
>>>> On 2012-08-07 16:37, Jan Beulich wrote:
>>>> Some spin in stop_machine after finishing their job.
>>> And here you'd need to find out what they're waiting for,
>>> and what those CPUs are doing.
>> They are waiting for the vCPUs calling generic_set_all and for those
>> spinning on set_atomicity_lock.
>> In fact, all of them are waiting for generic_set_all.
> I think we're moving in circles - what is the vCPU currently
> generic_set_all() then doing?
I added some debug prints. In generic_set_all -> prepare_set, write_cr0
takes a long time; everything else is quick. set_atomicity_lock
serializes this sequence across CPUs, which makes it worse.
One iteration:
MTRR: CPU 2
prepare_set: before read_cr0
prepare_set: before write_cr0 ------*block here*
prepare_set: before wbinvd
prepare_set: before read_cr4
prepare_set: before write_cr4
prepare_set: before __flush_tlb
prepare_set: before rdmsr
prepare_set: before wrmsr
generic_set_all: before set_mtrr_state
generic_set_all: before pat_init
post_set: before wbinvd
post_set: before wrmsr
post_set: before write_cr0
post_set: before write_cr4
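As a rough illustration of why the serialization hurts: prepare_set takes set_atomicity_lock and post_set releases it, so the whole sequence above runs on one CPU at a time. The sketch below is a back-of-the-envelope model only. The function name and the per-step times are hypothetical placeholders, not measurements from this system.

```python
def serialized_mtrr_time(n_cpus, slow_step_s, other_steps_s=0.0):
    """Total wall time when every CPU runs the locked sequence one at a time.

    Assumption: each CPU holds the lock for slow_step_s + other_steps_s,
    and no two critical sections overlap.
    """
    return n_cpus * (slow_step_s + other_steps_s)


# Hypothetical numbers: 24 vCPUs, 0.5 s spent in the slow write_cr0 step.
print(serialized_mtrr_time(24, 0.5))
```

Under this model the cost of one slow write_cr0 is multiplied by the number of vCPUs for each pass through generic_set_all.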
>
>>> There's not that much being done in generic_set_all(), so the
>>> code should finish reasonably quickly. Are you perhaps having
>>> more vCPU-s in the guest than pCPU-s they can run on?
>> The system is an Exalogic node with 24 cores + 100G mem (2 sockets, 6
>> cores per socket, 2 HT threads per core).
>> Booting a PVHVM guest with 12 vCPUs (or 24) + 90 GB + a PCI passthrough device.
> So you're indeed over-committing the system. How many vCPU-s
> does your Dom0 have? Are there any other VMs? Is there any
> vCPU pinning in effect?
dom0 boots with 24 vCPUs (same result with dom0_max_vcpus=4). There is no
other VM besides dom0. According to xentop, all 24 vCPUs of the guest
(zduan_test) are spinning. An excerpt of the xentop output is below.
NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID
Domain-0 -----r 43072 158.8 2050560 2.0 no limit n/a 24 0 0 0 0 0 0 0 0 0 0
VCPUs(sec): 0: 13649s 1: 6197s 2: 4254s 3: 2006s 4: 1409s
5: 930s 6: 698s 7: 630s 8: 612s 9: 2038s
10: 544s 11: 940s 12: 556s 13: 510s 14: 456s
15: 591s 16: 438s 17: 508s 18: 3350s 19: 512s
20: 544s 21: 529s 22: 547s 23: 610s
zduan_test -----r 13140 2234.4 92327920 91.7 92327936 91.7 24 1 0 0 1 0 0 0 0 0 0
VCPUs(sec): 0: 556s 1: 551s 2: 549s 3: 544s 4: 549s
5: 545s 6: 545s 7: 547s 8: 545s 9: 548s
10: 545s 11: 546s 12: 545s 13: 548s 14: 543s
15: 544s 16: 551s 17: 545s 18: 547s 19: 551s
20: 544s 21: 549s 22: 546s 23: 545s
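The per-vCPU times above can be compared with a small script. A near-uniform distribution across the guest's vCPUs is consistent with all of them spinning. This is a minimal sketch: parse_vcpu_seconds is a hypothetical helper, and the sample string is the first row of the zduan_test VCPUs(sec) output.

```python
import re


def parse_vcpu_seconds(text):
    """Return {vcpu_index: seconds} parsed from xentop 'VCPUs(sec):' lines."""
    # Each entry looks like "<index>: <seconds>s"; \s* also tolerates
    # entries that were wrapped across lines.
    return {int(i): int(s) for i, s in re.findall(r"(\d+):\s*(\d+)s", text)}


sample = "VCPUs(sec): 0: 556s 1: 551s 2: 549s 3: 544s 4: 549s"
usage = parse_vcpu_seconds(sample)
# Difference between the busiest and the least busy vCPU in the sample.
spread = max(usage.values()) - min(usage.values())
print(usage, spread)
```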
>>> Does
>>> your hardware support Pause-Loop-Exiting (or the AMD
>>> equivalent, don't recall their term right now)?
>> I have no access to the serial line. Can I get the info with a command?
> "xl dmesg" run early enough (i.e. before the log buffer wraps).
Below is the xl dmesg output for your reference. Thanks.
[root@scae02cn01 zduan]# xl dmesg
__ __ _ _ ___ ____ _____ ____ __
\ \/ /___ _ __ | || | / _ \ |___ \ / _ \ \ / / \/ |
\ // _ \ '_ \ | || |_| | | | __) |__| | | \ \ / /| |\/| |
/ \ __/ | | | |__ _| |_| | / __/|__| |_| |\ V / | | | |
/_/\_\___|_| |_| |_|(_)___(_)_____| \___/ \_/ |_| |_|
(XEN) Xen version 4.0.2-OVM (mockbuild@(none)) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) Fri Dec 23 17:00:16 EST 2011
(XEN) Latest ChangeSet: unavailable
(XEN) Bootloader: GNU GRUB 0.97
(XEN) Command line: dom0_mem=2G
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: none; EDID transfer time: 1 seconds
(XEN) EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN) Found 1 MBR signatures
(XEN) Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 0000000000099400 (usable)
(XEN) 0000000000099400 - 00000000000a0000 (reserved)
(XEN) 00000000000e0000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 000000007f780000 (usable)
(XEN) 000000007f78e000 - 000000007f790000 type 9
(XEN) 000000007f790000 - 000000007f79e000 (ACPI data)
(XEN) 000000007f79e000 - 000000007f7d0000 (ACPI NVS)
(XEN) 000000007f7d0000 - 000000007f7e0000 (reserved)
(XEN) 000000007f7ec000 - 0000000080000000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fee00000 - 00000000fee01000 (reserved)
(XEN) 00000000ffc00000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 0000001880000000 (usable)
(XEN) ACPI: RSDP 000FAA40, 0024 (r2 SUN )
(XEN) ACPI: XSDT 7F790100, 0094 (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: FACP 7F790290, 00F4 (r4 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: DSDT 7F7905C0, 5ECF (r2 SUN Xxx70 1 INTL 20051117)
(XEN) ACPI: FACS 7F79E000, 0040
(XEN) ACPI: APIC 7F790390, 011E (r2 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: MCFG 7F790500, 003C (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: SLIT 7F790540, 0030 (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: SPMI 7F790570, 0041 (r5 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: OEMB 7F79E040, 00BE (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: HPET 7F79A5C0, 0038 (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: DMAR 7F79E100, 0130 (r1 SUN Xxx70 1 MSFT 97)
(XEN) ACPI: SRAT 7F79A600, 0250 (r1 SUN Xxx70 1 INTC 1)
(XEN) ACPI: SSDT 7F79EF60, 0363 (r1 SUN Xxx70 12 INTL 20051117)
(XEN) ACPI: EINJ 7F79A850, 0130 (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: BERT 7F79A9E0, 0030 (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: ERST 7F79AA10, 01B0 (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) ACPI: HEST 7F79ABC0, 00A8 (r1 SUN Xxx70 20111011 MSFT 97)
(XEN) System RAM: 98295MB (100654180kB)
(XEN) Domain heap initialised DMA width 32 bits
(XEN) Processor #0 6:12 APIC version 21
(XEN) Processor #2 6:12 APIC version 21
(XEN) Processor #4 6:12 APIC version 21
(XEN) Processor #16 6:12 APIC version 21
(XEN) Processor #18 6:12 APIC version 21
(XEN) Processor #20 6:12 APIC version 21
(XEN) Processor #32 6:12 APIC version 21
(XEN) Processor #34 6:12 APIC version 21
(XEN) Processor #36 6:12 APIC version 21
(XEN) Processor #48 6:12 APIC version 21
(XEN) Processor #50 6:12 APIC version 21
(XEN) Processor #52 6:12 APIC version 21
(XEN) Processor #1 6:12 APIC version 21
(XEN) Processor #3 6:12 APIC version 21
(XEN) Processor #5 6:12 APIC version 21
(XEN) Processor #17 6:12 APIC version 21
(XEN) Processor #19 6:12 APIC version 21
(XEN) Processor #21 6:12 APIC version 21
(XEN) Processor #33 6:12 APIC version 21
(XEN) Processor #35 6:12 APIC version 21
(XEN) Processor #37 6:12 APIC version 21
(XEN) Processor #49 6:12 APIC version 21
(XEN) Processor #51 6:12 APIC version 21
(XEN) Processor #53 6:12 APIC version 21
(XEN) IOAPIC[0]: apic_id 6, version 32, address 0xfec00000, GSI 0-23
(XEN) IOAPIC[1]: apic_id 7, version 32, address 0xfec8a000, GSI 24-47
(XEN) Enabling APIC mode: Phys. Using 2 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2926.029 MHz processor.
(XEN) Initing memory sharing.
(XEN) VMX: Supported advanced features:
(XEN) - APIC MMIO access virtualisation
(XEN) - APIC TPR shadow
(XEN) - Extended Page Tables (EPT)
(XEN) - Virtual-Processor Identifiers (VPID)
(XEN) - Virtual NMI
(XEN) - MSR direct-access bitmap
(XEN) - Unrestricted Guest
(XEN) EPT supports 2MB super page.
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging detected.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) I/O virtualisation enabled
(XEN) - Dom0 mode: Relaxed
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) Total of 24 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN) -> Using old ACK method
(XEN) TSC is reliable, synchronization unnecessary
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 64 KiB.
(XEN) Brought up 24 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 64-bit, lsb, paddr 0x2000 -> 0x6d5000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 0000000835000000->0000000836000000 (520192 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: ffffffff80002000->ffffffff806d5000
(XEN) Init. ramdisk: ffffffff806d5000->ffffffff80ed7400
(XEN) Phys-Mach map: ffffea0000000000->ffffea0000400000
(XEN) Start info: ffffffff80ed8000->ffffffff80ed84b4
(XEN) Page tables: ffffffff80ed9000->ffffffff80ee4000
(XEN) Boot stack: ffffffff80ee4000->ffffffff80ee5000
(XEN) TOTAL: ffffffff80000000->ffffffff81000000
(XEN) ENTRY ADDRESS: ffffffff80002000
(XEN) Dom0 has maximum 24 VCPUs
(XEN) Scrubbing Free RAM:
done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 168kB init memory.