* [GIT PULL] x86/mm changes for v3.9-rc1
@ 2013-02-22 0:34 H. Peter Anvin
2013-02-22 16:22 ` Linus Torvalds
2013-02-22 16:55 ` Konrad Rzeszutek Wilk
0 siblings, 2 replies; 16+ messages in thread
From: H. Peter Anvin @ 2013-02-22 0:34 UTC (permalink / raw)
To: Linus Torvalds
Cc: David S. Miller, H. Peter Anvin, Rafael J. Wysocki, stable,
Alexander Duyck, Andrea Arcangeli, Andrew Morton,
Andrzej Pietrasiewicz, Arnd Bergmann, Borislav Petkov,
Borislav Petkov, Christoph Lameter, Daniel J Blueman, Dave Hansen,
Eric Biederman
Hi Linus,
This is a huge set of several partly interrelated (and concurrently
developed) changes, which is why the branch history is messier than
one would like.
The *really* big items are two humongous patchsets mostly developed
by Yinghai Lu at my request, which completely revamp the way we
create initial page tables. In particular, rather than estimating how
much memory we will need for page tables and then building them into that
memory -- a calculation that has been shown to be incredibly fragile -- we
now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
a #PF handler which creates temporary page tables on demand.
This has several advantages:
1. It makes it much easier to support things that need access to
data very early (a follow-on patchset uses this to load microcode
way early in the kernel startup).
2. It allows the kernel and all the kernel data objects to be invoked
from above the 4 GB limit. This allows kdump to work on very large
systems.
3. It greatly reduces the difference between Xen and native (Xen's
equivalent of the #PF handler is the temporary page tables created
by the domain builder), eliminating a bunch of fragile hooks.
The patch series also gets us a bit closer to W^X.
Additional work in this pull is the 64-bit get_user() work which you
were also involved with, and a bunch of cleanups/speedups to
__phys_addr()/__pa().
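
To make the "map on demand from a #PF handler" idea concrete, here is a minimal user-space sketch. It is not the kernel's code: the names (toy_table, toy_page_fault, toy_translate), the fixed pool of 8 slots and the flat virtual-to-physical offset are all hypothetical, chosen only for illustration. The 2 MiB granularity follows the commit title "#PF handler set page to cover only 2M per #PF" in the shortlog.

```c
#include <stdint.h>

#define TOY_PMD_SHIFT 21                          /* 2 MiB granularity */
#define TOY_PMD_SIZE  (1ULL << TOY_PMD_SHIFT)
#define TOY_PMD_MASK  (~(TOY_PMD_SIZE - 1))
#define TOY_SLOTS     8                           /* hypothetical pool size */

/* One lazily created 2 MiB mapping (a stand-in for a real page-table entry). */
struct toy_map {
	uint64_t virt_base;   /* 2 MiB-aligned virtual address */
	uint64_t phys_base;   /* matching physical address */
	int      valid;
};

static struct toy_map toy_table[TOY_SLOTS];
static unsigned int toy_fault_count;

/* Forget every mapping, as if the temporary tables were thrown away. */
static void toy_reset(void)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		toy_table[i].valid = 0;
	toy_fault_count = 0;
}

/* Stand-in for the #PF handler: map the 2 MiB region that contains virt. */
static int toy_page_fault(uint64_t virt, uint64_t phys_offset)
{
	toy_fault_count++;
	for (int i = 0; i < TOY_SLOTS; i++) {
		if (!toy_table[i].valid) {
			toy_table[i].virt_base = virt & TOY_PMD_MASK;
			toy_table[i].phys_base = (virt & TOY_PMD_MASK) - phys_offset;
			toy_table[i].valid = 1;
			return 0;
		}
	}
	return -1;  /* pool exhausted; this toy simply gives up */
}

/* Translate virt to phys, "faulting" a mapping in on first touch. */
static int toy_translate(uint64_t virt, uint64_t phys_offset, uint64_t *phys)
{
	for (int attempt = 0; attempt < 2; attempt++) {
		for (int i = 0; i < TOY_SLOTS; i++) {
			if (toy_table[i].valid &&
			    toy_table[i].virt_base == (virt & TOY_PMD_MASK)) {
				*phys = toy_table[i].phys_base + (virt & ~TOY_PMD_MASK);
				return 0;
			}
		}
		/* Miss: let the "handler" build the mapping, then retry once. */
		if (toy_page_fault(virt, phys_offset) != 0)
			return -1;
	}
	return -1;
}
```

The point of the sketch is that nothing has to be sized in advance: the first access to a 2 MiB region costs one "fault", and later accesses to the same region hit the mapping that fault created.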
----------------------------------------------------------------
The following changes since commit 5dcd14ecd41ea2b3ae3295a9b30d98769d52165f:
x86, boot: Sanitize boot_params if not zeroed on creation (2013-01-29 01:22:17 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-mm-for-linus
for you to fetch changes up to 0da3e7f526fde7a6522a3038b7ce609fc50f6707:
Merge branch 'x86/mm2' into x86/mm (2013-02-15 09:25:08 -0800)
----------------------------------------------------------------
Alexander Duyck (9):
x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h
x86: Improve __phys_addr performance by making use of carry flags and inlining
x86: Make it so that __pa_symbol can only process kernel symbols on x86_64
x86: Drop 4 unnecessary calls to __pa_symbol
x86: Use __pa_symbol instead of __pa on C visible symbols
x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols
x86/acpi: Use __pa_symbol instead of __pa on C visible symbols
x86/lguest: Use __pa_symbol instead of __pa on C visible symbols
x86: Fix warning about cast from pointer to integer of different size
Borislav Petkov (1):
x86/numa: Use __pa_nodebug() instead
Dave Hansen (6):
x86, mm: Make DEBUG_VIRTUAL work earlier in boot
x86, mm: Pagetable level size/shift/mask helpers
x86, mm: Use new pagetable helpers in try_preserve_large_page()
x86, mm: Create slow_virt_to_phys()
x86, kvm: Fix kvm's use of __pa() on percpu areas
x86-32, mm: Rip out x86_32 NUMA remapping code
H. Peter Anvin (13):
Merge branch 'x86/mm' of ssh://ra.kernel.org/.../tip/tip into x86/mm
Merge tag 'v3.8-rc5' into x86/mm
Merge remote-tracking branch 'origin/x86/boot' into x86/mm2
x86, 64bit: Use a #PF handler to materialize early mappings on demand
x86-32, mm: Remove reference to resume_map_numa_kva()
x86-32, mm: Remove reference to alloc_remap()
Merge remote-tracking branch 'origin/x86/mm' into x86/mm2
x86, mm: Use a bitfield to mask nuisance get_user() warnings
x86: Be consistent with data size in getuser.S
x86, mm: Redesign get_user with a __builtin_choose_expr hack
x86, doc: Clarify the use of asm("%edx") in uaccess.h
x86, mm: Move reserving low memory later in initialization
Merge branch 'x86/mm2' into x86/mm
Ingo Molnar (1):
x86/mm: Don't flush the TLB on #WP pmd fixups
Jacob Shin (3):
x86, mm: if kernel .text .data .bss are not marked as E820_RAM, complain and fix
x86, mm: Fixup code testing if a pfn is direct mapped
x86, mm: Only direct map addresses that are marked as E820_RAM
Shuah Khan (1):
x86/kvm: Fix compile warning in kvm_register_steal_time()
Stefano Stabellini (1):
x86, mm: Add pointer about Xen mmu requirement for alloc_low_pages
Ville Syrjälä (1):
x86-32: Add support for 64bit get_user()
Yinghai Lu (74):
x86, mm: Add global page_size_mask and probe one time only
x86, mm: Split out split_mem_range from init_memory_mapping
x86, mm: Move down find_early_table_space()
x86, mm: Move init_memory_mapping calling out of setup.c
x86, mm: Revert back good_end setting for 64bit
x86, mm: Change find_early_table_space() paramters
x86, mm: Find early page table buffer together
x86, mm: Separate out calculate_table_space_size()
x86, mm: Set memblock initial limit to 1M
x86, mm: use pfn_range_is_mapped() with CPA
x86, mm: use pfn_range_is_mapped() with gart
x86, mm: use pfn_range_is_mapped() with reserve_initrd
x86, mm: relocate initrd under all mem for 64bit
x86, mm: Align start address to correct big page size
x86, mm: Use big page size for small memory range
x86, mm: Don't clear page table if range is ram
x86, mm: Break down init_all_memory_mapping
x86, mm: setup page table in top-down
x86, mm: Remove early_memremap workaround for page table accessing on 64bit
x86, mm: Remove parameter in alloc_low_page for 64bit
x86, mm: Merge alloc_low_page between 64bit and 32bit
x86, mm: Move min_pfn_mapped back to mm/init.c
x86, mm, Xen: Remove mapping_pagetable_reserve()
x86, mm: Add alloc_low_pages(num)
x86, mm: only call early_ioremap_page_table_range_init() once
x86, mm: Move back pgt_buf_* to mm/init.c
x86, mm: Move init_gbpages() out of setup.c
x86, mm: change low/hignmem_pfn_init to static on 32bit
x86, mm: Move function declaration into mm_internal.h
x86, mm: Add check before clear pte above max_low_pfn on 32bit
x86, mm: use round_up/down in split_mem_range()
x86, mm: use PFN_DOWN in split_mem_range()
x86, mm: use pfn instead of pos in split_mem_range
x86, mm: use limit_pfn for end pfn
x86, mm: Unifying after_bootmem for 32bit and 64bit
x86, mm: Move after_bootmem to mm_internel.h
x86, mm: Use clamp_t() in init_range_memory_mapping
x86, mm: kill numa_free_all_bootmem()
x86, mm: kill numa_64.h
sparc, mm: Remove calling of free_all_bootmem_node()
mm: Kill NO_BOOTMEM version free_all_bootmem_node()
x86, mm: Let "memmap=" take more entries one time
x86, mm: Fix page table early allocation offset checking
x86: Factor out e820_add_kernel_range()
x86, 64bit, mm: Make pgd next calculation consistent with pud/pmd
x86, realmode: Set real_mode permissions early
x86, 64bit, mm: Add generic kernel/ident mapping helper
x86, 64bit: Copy struct boot_params early
x86, 64bit, realmode: Use init_level4_pgt to set trampoline_pgd directly
x86, realmode: Separate real_mode reserve and setup
x86, 64bit: #PF handler set page to cover only 2M per #PF
x86, 64bit: Don't set max_pfn_mapped wrong value early on native path
x86: Merge early_reserve_initrd for 32bit and 64bit
x86: Add get_ramdisk_image/size()
x86, boot: Add get_cmd_line_ptr()
x86, boot: Move checking of cmd_line_ptr out of common path
x86, boot: Pass cmd_line_ptr with unsigned long instead
x86, boot: Move verify_cpu.S and no_longmode down
x86, boot: Move lldt/ltr out of 64bit code section
x86, kexec: Remove 1024G limitation for kexec buffer on 64bit
x86, kexec: Set ident mapping for kernel that is above max_pfn
x86, kexec: Replace ident_mapping_init and init_level4_page
x86, kexec, 64bit: Only set ident mapping for ram.
x86, boot: Support loading bzImage, boot_params and ramdisk above 4G
x86, boot: Update comments about entries for 64bit image
x86, boot: Not need to check setup_header version for setup_data
memblock: Add memblock_mem_size()
x86, kdump: Remove crashkernel range find limit for 64bit
x86: Add Crash kernel low reservation
x86: Merge early kernel reserve for 32bit and 64bit
x86, 64bit, mm: Mark data/bss/brk to nx
x86, 64bit, mm: hibernate use generic mapping_init
mm: Add alloc_bootmem_low_pages_nopanic()
x86: Don't panic if can not alloc buffer for swiotlb
Documentation/kernel-parameters.txt | 3 +
Documentation/x86/boot.txt | 38 +++
arch/mips/cavium-octeon/dma-octeon.c | 3 +-
arch/sparc/mm/init_64.c | 24 +-
arch/x86/Kconfig | 4 -
arch/x86/boot/boot.h | 18 +-
arch/x86/boot/cmdline.c | 12 +-
arch/x86/boot/compressed/cmdline.c | 12 +-
arch/x86/boot/compressed/head_64.S | 48 ++--
arch/x86/boot/header.S | 10 +-
arch/x86/include/asm/init.h | 28 +-
arch/x86/include/asm/kexec.h | 6 +-
arch/x86/include/asm/mmzone_32.h | 6 -
arch/x86/include/asm/numa.h | 2 -
arch/x86/include/asm/numa_64.h | 6 -
arch/x86/include/asm/page.h | 7 +-
arch/x86/include/asm/page_32.h | 1 +
arch/x86/include/asm/page_64.h | 36 +++
arch/x86/include/asm/page_64_types.h | 22 --
arch/x86/include/asm/page_types.h | 2 +
arch/x86/include/asm/pgtable.h | 16 ++
arch/x86/include/asm/pgtable_64.h | 5 +
arch/x86/include/asm/pgtable_64_types.h | 4 +
arch/x86/include/asm/pgtable_types.h | 4 +-
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/realmode.h | 3 +-
arch/x86/include/asm/uaccess.h | 55 ++--
arch/x86/include/asm/x86_init.h | 12 -
arch/x86/kernel/acpi/boot.c | 1 -
arch/x86/kernel/acpi/sleep.c | 2 +-
arch/x86/kernel/amd_gart_64.c | 5 +-
arch/x86/kernel/apic/apic_numachip.c | 1 +
arch/x86/kernel/cpu/amd.c | 9 +-
arch/x86/kernel/cpu/intel.c | 3 +-
arch/x86/kernel/e820.c | 16 +-
arch/x86/kernel/ftrace.c | 4 +-
arch/x86/kernel/head32.c | 20 --
arch/x86/kernel/head64.c | 131 ++++++---
arch/x86/kernel/head_64.S | 210 +++++++++------
arch/x86/kernel/i386_ksyms_32.c | 1 +
arch/x86/kernel/kvm.c | 11 +-
arch/x86/kernel/kvmclock.c | 4 +-
arch/x86/kernel/machine_kexec_64.c | 171 ++++--------
arch/x86/kernel/setup.c | 260 +++++++++++-------
arch/x86/kernel/traps.c | 9 +
arch/x86/kernel/x8664_ksyms_64.c | 3 +
arch/x86/kernel/x86_init.c | 4 -
arch/x86/lguest/boot.c | 3 +-
arch/x86/lib/getuser.S | 43 ++-
arch/x86/mm/init.c | 459 +++++++++++++++++++++-----------
arch/x86/mm/init_32.c | 106 +++++---
arch/x86/mm/init_64.c | 255 ++++++++++--------
arch/x86/mm/mm_internal.h | 19 ++
arch/x86/mm/numa.c | 32 +--
arch/x86/mm/numa_32.c | 161 -----------
arch/x86/mm/numa_64.c | 13 -
arch/x86/mm/numa_internal.h | 6 -
arch/x86/mm/pageattr.c | 66 +++--
arch/x86/mm/pat.c | 4 +-
arch/x86/mm/pgtable.c | 7 +-
arch/x86/mm/physaddr.c | 60 +++--
arch/x86/platform/efi/efi.c | 11 +-
arch/x86/power/hibernate_32.c | 2 -
arch/x86/power/hibernate_64.c | 66 ++---
arch/x86/realmode/init.c | 49 ++--
arch/x86/xen/mmu.c | 28 --
drivers/xen/swiotlb-xen.c | 4 +-
include/linux/bootmem.h | 5 +
include/linux/kexec.h | 3 +
include/linux/memblock.h | 1 +
include/linux/mm.h | 1 -
include/linux/swiotlb.h | 2 +-
kernel/kexec.c | 34 ++-
lib/swiotlb.c | 47 ++--
mm/bootmem.c | 8 +
mm/memblock.c | 17 ++
mm/nobootmem.c | 23 +-
77 files changed, 1541 insertions(+), 1247 deletions(-)
[Skipping the full diff due to size]
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 0:34 [GIT PULL] x86/mm changes for v3.9-rc1 H. Peter Anvin
@ 2013-02-22 16:22 ` Linus Torvalds
2013-02-22 17:31 ` H. Peter Anvin
2013-02-22 16:55 ` Konrad Rzeszutek Wilk
1 sibling, 1 reply; 16+ messages in thread
From: Linus Torvalds @ 2013-02-22 16:22 UTC (permalink / raw)
To: H. Peter Anvin
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, H. Peter Anvin, sparclinux, Christoph Lameter,
Ingo Molnar, Ville Syrjälä, Marek Szyprowski,
Andrea Arcangeli, Lee Schermerhorn, Xen-devel@lists.xensource.com,
Russell King, Len Brown, Joerg Roedel
On Thu, Feb 21, 2013 at 4:34 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>
> This is a huge set of several partly interrelated (and concurrently
> developed) changes, which is why the branch history is messier than
> one would like.
>
> The *really* big items are two humongous patchsets mostly developed
> by Yinghai Lu at my request, which completely revamp the way we
> create initial page tables.
Ugh. So I've tried to walk through this, and it's painful. If this
results in problems, we're going to be *so* screwed. Is it bisectable?
I also don't understand how "early_idt_handler" could *possibly* work.
In particular, it seems to rely on the trap number being set up in the
stack frame:
	cmpl $14,72(%rsp)	# Page fault?
but that's not even *true*. Why? Because we export both the
early_idt_handlers[] array (that sets up the trap number and makes the
stack frame be reliable) and the single early_idt_handler function
(that relies on the trap number and the reliable stack frame), AND
AFAIK WE USE THE LATTER!
See x86_64_start_kernel():
	for (i = 0; i < NUM_EXCEPTION_VECTORS; i++) {
#ifdef CONFIG_EARLY_PRINTK
		set_intr_gate(i, &early_idt_handlers[i]);
#else
		set_intr_gate(i, early_idt_handler);
#endif
	}
so unless you have CONFIG_EARLY_PRINTK, the interrupt gate will point
to that raw early_idt_handler function that doesn't *work* on its own,
afaik.
Btw, it's not just the page fault index testing that is wrong. The whole
	cmpl $__KERNEL_CS,96(%rsp)
	jne 11f
also relies on the stack frame being set up the same way for all
exceptions - which again is only true if we ran through the
early_idt_handlers[] prologue that added the extra stack entry.
How does this even work for me? I don't have EARLY_PRINTK enabled.
What am I missing?
Linus
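
The offsets in question can be sketched as C structs. This is an illustration under stated assumptions, not the layout taken from head_64.S: the count of nine saved registers is inferred only from the 72-byte offset quoted above (9 * 8 = 72), and the idea that the early_idt_handlers[] stub leaves both a vector number and a (real or dummy) error code on the stack is an assumption consistent with "the extra stack entry". Struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SAVED_REGS 9  /* assumption: 9 * 8 = 72 bytes of saved registers */

/* Frame as the handler would see it after an early_idt_handlers[] stub ran. */
struct frame_via_stub {
	uint64_t saved[SAVED_REGS]; /* 0..64(%rsp) */
	uint64_t vector;            /* 72(%rsp), pushed by the stub */
	uint64_t error_code;        /* 80(%rsp), real or dummy (assumed) */
	uint64_t rip;               /* 88(%rsp) */
	uint64_t cs;                /* 96(%rsp) */
	uint64_t rflags;            /* 104(%rsp) */
};

/*
 * Frame if the gate pointed straight at early_idt_handler for a #PF:
 * the hardware pushes an error code, but nobody pushes a vector number.
 */
struct frame_direct_pf {
	uint64_t saved[SAVED_REGS]; /* 0..64(%rsp) */
	uint64_t error_code;        /* 72(%rsp): where the vector was expected */
	uint64_t rip;               /* 80(%rsp) */
	uint64_t cs;                /* 88(%rsp) */
	uint64_t rflags;            /* 96(%rsp): where CS was expected */
};

static size_t via_stub_vector_off(void)
{
	return offsetof(struct frame_via_stub, vector);
}

static size_t via_stub_cs_off(void)
{
	return offsetof(struct frame_via_stub, cs);
}

static size_t direct_pf_error_code_off(void)
{
	return offsetof(struct frame_direct_pf, error_code);
}

static size_t direct_pf_rflags_off(void)
{
	return offsetof(struct frame_direct_pf, rflags);
}
```

Under these assumptions, 72(%rsp) and 96(%rsp) name the vector and CS only when the stub ran; entering the handler directly would make the same offsets read the error code and RFLAGS instead, which is the mismatch described above.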
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 0:34 [GIT PULL] x86/mm changes for v3.9-rc1 H. Peter Anvin
2013-02-22 16:22 ` Linus Torvalds
@ 2013-02-22 16:55 ` Konrad Rzeszutek Wilk
2013-02-22 17:12 ` H. Peter Anvin
` (2 more replies)
1 sibling, 3 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-22 16:55 UTC (permalink / raw)
To: H. Peter Anvin
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, H. Peter Anvin, sparclinux, Christoph Lameter,
Ingo Molnar, Ville Syrjälä, Marek Szyprowski,
Andrea Arcangeli, Lee Schermerhorn, xen-devel, Russell King,
Len Brown, Joerg Roedel, linux-pm, Hugh Dickins,
Yasuaki Ishimatsu
On Thu, Feb 21, 2013 at 04:34:06PM -0800, H. Peter Anvin wrote:
> Hi Linus,
>
> This is a huge set of several partly interrelated (and concurrently
> developed) changes, which is why the branch history is messier than
> one would like.
>
> The *really* big items are two humongous patchsets mostly developed
> by Yinghai Lu at my request, which completely revamp the way we
> create initial page tables. In particular, rather than estimating how
> much memory we will need for page tables and then building them into that
> memory -- a calculation that has been shown to be incredibly fragile -- we
> now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
> a #PF handler which creates temporary page tables on demand.
>
> This has several advantages:
>
> 1. It makes it much easier to support things that need access to
> data very early (a follow-on patchset uses this to load microcode
> way early in the kernel startup).
>
> 2. It allows the kernel and all the kernel data objects to be invoked
> from above the 4 GB limit. This allows kdump to work on very large
> systems.
>
> 3. It greatly reduces the difference between Xen and native (Xen's
> equivalent of the #PF handler is the temporary page tables created
> by the domain builder), eliminating a bunch of fragile hooks.
>
> The patch series also gets us a bit closer to W^X.
>
> Additional work in this pull is the 64-bit get_user() work which you
> were also involved with, and a bunch of cleanups/speedups to
> __phys_addr()/__pa().
I am still working out which of the patches in the branch did this, but
with this merge I am getting a crash with a very simple PV guest (booted with
one 1G):
Call Trace:
[<ffffffff8103feba>] xen_get_user_pgd+0x5a <--
[<ffffffff8103feba>] xen_get_user_pgd+0x5a
[<ffffffff81042d27>] xen_write_cr3+0x77
[<ffffffff81ad2d21>] init_mem_mapping+0x1f9
[<ffffffff81ac293f>] setup_arch+0x742
[<ffffffff81666d71>] printk+0x48
[<ffffffff81abcd62>] start_kernel+0x90
[<ffffffff8109416b>] __add_preferred_console.clone.1+0x9b
[<ffffffff81abc5f7>] x86_64_start_reservations+0x2a
[<ffffffff81abf0c7>] xen_start_kernel+0x564
And the hypervisor says:
(XEN) d7:v0: unhandled page fault (ec=0000)
(XEN) Pagetable walk from ffffea000005b2d0:
(XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 7 (vcpu#0) crashed on cpu#3:
(XEN) ----[ Xen-4.2.0 x86_64 debug=n Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e033:[<ffffffff8103feba>]
(XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
(XEN) rax: ffffea0000000000 rbx: 0000000001a0c000 rcx: 0000000080000000
(XEN) rdx: 000000000005b2a0 rsi: 0000000001a0c000 rdi: 0000000000000000
(XEN) rbp: ffffffff81a01dd8 rsp: ffffffff81a01d90 r8: 0000000000000000
(XEN) r9: 0000000010000001 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000001000000000 r14: 0000000000000000
(XEN) r15: 0000000000100000 cr0: 000000008005003b cr4: 00000000000406f0
(XEN) cr3: 0000000411165000 cr2: ffffea000005b2d0
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff81a01d90:
(XEN) 0000000080000000 0000000000000000 0000000000000000 ffffffff8103feba
(XEN) 000000010000e030 0000000000010006 ffffffff81a01dd8 000000000000e02b
What is bizarre is that I do recall testing this (and Stefano also did it).
So I am not sure what has altered.
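
For readers following the hypervisor output, the L4[0x1d4] slot can be reproduced from the faulting address. The sketch below assumes a standard 4-level x86-64 walk with 4 KiB pages and 9 index bits per level; the helper name pt_index is hypothetical.

```c
#include <stdint.h>

#define PT_INDEX_BITS 9                              /* 512 entries per table */
#define PT_INDEX_MASK ((1u << PT_INDEX_BITS) - 1)
#define PAGE_SHIFT_4K 12                             /* 4 KiB pages */

/*
 * Index into the page table at `level` for a 4-level walk,
 * where level 1 is the lowest table and level 4 is the top (L4).
 */
static unsigned int pt_index(uint64_t vaddr, int level)
{
	unsigned int shift = PAGE_SHIFT_4K + PT_INDEX_BITS * (unsigned int)(level - 1);

	return (unsigned int)(vaddr >> shift) & PT_INDEX_MASK;
}
```

With cr2 = ffffea000005b2d0, pt_index(cr2, 4) gives 0x1d4, matching the "L4[0x1d4] = 0000000000000000" line: the top-level entry covering that address is empty. From the register dump, rax + rdx + 0x30 also equals cr2 (ffffea0000000000 + 5b2a0 + 30); what that offset represents is not stated in this message.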
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 16:55 ` Konrad Rzeszutek Wilk
@ 2013-02-22 17:12 ` H. Peter Anvin
2013-02-22 17:38 ` Konrad Rzeszutek Wilk
2013-02-22 17:24 ` Konrad Rzeszutek Wilk
2013-02-22 17:30 ` Dave Hansen
2 siblings, 1 reply; 16+ messages in thread
From: H. Peter Anvin @ 2013-02-22 17:12 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, sparclinux, Christoph Lameter, Ingo Molnar,
Ville Syrjälä, Marek Szyprowski, Andrea Arcangeli,
Lee Schermerhorn, xen-devel, Russell King, Len Brown,
Joerg Roedel, linux-pm, Hugh Dickins, Yasuaki Ishimatsu
On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
>
> What is bizarre is that I do recall testing this (and Stefano also did it).
> So I am not sure what has altered.
>
Yes, there was a very specific reason why I wanted you guys to test it...
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 16:55 ` Konrad Rzeszutek Wilk
2013-02-22 17:12 ` H. Peter Anvin
@ 2013-02-22 17:24 ` Konrad Rzeszutek Wilk
2013-02-22 17:30 ` H. Peter Anvin
2013-02-22 17:53 ` Yinghai Lu
2013-02-22 17:30 ` Dave Hansen
2 siblings, 2 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-22 17:24 UTC (permalink / raw)
To: H. Peter Anvin, yinghai
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, H. Peter Anvin, sparclinux, Christoph Lameter,
Ingo Molnar, Ville Syrjälä, Marek Szyprowski,
Andrea Arcangeli, Lee Schermerhorn, xen-devel, Russell King,
Len Brown, Joerg Roedel, linux-pm, Hugh Dickins,
Yasuaki Ishimatsu
On Fri, Feb 22, 2013 at 11:55:31AM -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 21, 2013 at 04:34:06PM -0800, H. Peter Anvin wrote:
> > Hi Linus,
> >
> > This is a huge set of several partly interrelated (and concurrently
> > developed) changes, which is why the branch history is messier than
> > one would like.
> >
> > The *really* big items are two humongous patchsets mostly developed
> > by Yinghai Lu at my request, which completely revamp the way we
> > create initial page tables. In particular, rather than estimating how
> > much memory we will need for page tables and then building them into that
> > memory -- a calculation that has been shown to be incredibly fragile -- we
> > now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
> > a #PF handler which creates temporary page tables on demand.
> >
> > This has several advantages:
> >
> > 1. It makes it much easier to support things that need access to
> > data very early (a follow-on patchset uses this to load microcode
> > way early in the kernel startup).
> >
> > 2. It allows the kernel and all the kernel data objects to be invoked
> > from above the 4 GB limit. This allows kdump to work on very large
> > systems.
> >
> > 3. It greatly reduces the difference between Xen and native (Xen's
> > equivalent of the #PF handler is the temporary page tables created
> > by the domain builder), eliminating a bunch of fragile hooks.
> >
> > The patch series also gets us a bit closer to W^X.
> >
> > Additional work in this pull is the 64-bit get_user() work which you
> > were also involved with, and a bunch of cleanups/speedups to
> > __phys_addr()/__pa().
>
> I am still working out which of the patches in the branch did this, but
> with this merge I am getting a crash with a very simple PV guest (booted with
> one 1G):
>
> Call Trace:
> [<ffffffff8103feba>] xen_get_user_pgd+0x5a <--
> [<ffffffff8103feba>] xen_get_user_pgd+0x5a
> [<ffffffff81042d27>] xen_write_cr3+0x77
> [<ffffffff81ad2d21>] init_mem_mapping+0x1f9
> [<ffffffff81ac293f>] setup_arch+0x742
> [<ffffffff81666d71>] printk+0x48
> [<ffffffff81abcd62>] start_kernel+0x90
> [<ffffffff8109416b>] __add_preferred_console.clone.1+0x9b
> [<ffffffff81abc5f7>] x86_64_start_reservations+0x2a
> [<ffffffff81abf0c7>] xen_start_kernel+0x564
>
> And the hypervisor says:
> (XEN) d7:v0: unhandled page fault (ec=0000)
> (XEN) Pagetable walk from ffffea000005b2d0:
> (XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
> (XEN) domain_crash_sync called from entry.S
> (XEN) Domain 7 (vcpu#0) crashed on cpu#3:
> (XEN) ----[ Xen-4.2.0 x86_64 debug=n Not tainted ]----
> (XEN) CPU: 3
> (XEN) RIP: e033:[<ffffffff8103feba>]
> (XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
> (XEN) rax: ffffea0000000000 rbx: 0000000001a0c000 rcx: 0000000080000000
> (XEN) rdx: 000000000005b2a0 rsi: 0000000001a0c000 rdi: 0000000000000000
> (XEN) rbp: ffffffff81a01dd8 rsp: ffffffff81a01d90 r8: 0000000000000000
> (XEN) r9: 0000000010000001 r10: 0000000000000000 r11: 0000000000000000
> (XEN) r12: 0000000000000000 r13: 0000001000000000 r14: 0000000000000000
> (XEN) r15: 0000000000100000 cr0: 000000008005003b cr4: 00000000000406f0
> (XEN) cr3: 0000000411165000 cr2: ffffea000005b2d0
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
> (XEN) Guest stack trace from rsp=ffffffff81a01d90:
> (XEN) 0000000080000000 0000000000000000 0000000000000000 ffffffff8103feba
> (XEN) 000000010000e030 0000000000010006 ffffffff81a01dd8 000000000000e02b
Here is a better serial log of the crash (just booting a normal Xen 4.1 + initial
kernel with 8GB):
PXELINUX 3.82 2009-06-09 Copyright (C) 1994-2009 H. Peter Anvin et al
boot:
Loading xen.gz... ok
Loading vmlinuz... ok
Loading initramfs.cpio.gz... ok
(XEN) Xen version 4.1.5-pre (konrad@dumpdata.com) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) Fri Feb 22 11:37:00 EST 2013
(XEN) Latest ChangeSet: Fri Feb 15 15:31:55 2013 +0100 23459:9f12bdd6b7f0
(XEN) Console output is synchronous.
(XEN) Bootloader: unknown
(XEN) Command line: cpuinfo conring_size=1048576 sync_console cpufreq=verbose com1=115200,8n1 console=com1,vga loglvl=all guest_loglvl=all
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN) EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN) Found 1 MBR signatures
(XEN) Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 000000000009ec00 (usable)
(XEN) 000000000009ec00 - 00000000000a0000 (reserved)
(XEN) 00000000000e0000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 0000000020000000 (usable)
(XEN) 0000000020000000 - 0000000020200000 (reserved)
(XEN) 0000000020200000 - 0000000040000000 (usable)
(XEN) 0000000040000000 - 0000000040200000 (reserved)
(XEN) 0000000040200000 - 00000000bad80000 (usable)
(XEN) 00000000bad80000 - 00000000badc9000 (ACPI NVS)
(XEN) 00000000badc9000 - 00000000badd1000 (ACPI data)
(XEN) 00000000badd1000 - 00000000badf4000 (reserved)
(XEN) 00000000badf4000 - 00000000badf6000 (usable)
(XEN) 00000000badf6000 - 00000000bae06000 (reserved)
(XEN) 00000000bae06000 - 00000000bae14000 (ACPI NVS)
(XEN) 00000000bae14000 - 00000000bae3c000 (reserved)
(XEN) 00000000bae3c000 - 00000000bae7f000 (ACPI NVS)
(XEN) 00000000bae7f000 - 00000000bb000000 (usable)
(XEN) 00000000bb800000 - 00000000bfa00000 (reserved)
(XEN) 00000000fed1c000 - 00000000fed40000 (reserved)
(XEN) 00000000ff000000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 000000023fe00000 (usable)
(XEN) ACPI: RSDP 000F0450, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT BADC9068, 0054 (r1 ALASKA A M I 1072009 AMI 10013)
(XEN) PROC 1 MSFT 3000001)
(XEN) ACPI: MCFG BADD0580, 003C (r1 ALASKA A M I 1072009 MSFT 97)
(XEN) ACPI: HPET BADD05C0, 0038 (r1 ALASKA A M I 1072009 AMI. 4)
(XEN) ACPI: ASF! BADD05F8, 00A0 (r32 INTEL HCG 1 TFSM F4240)
(XEN) System RAM: 8104MB (8299140kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000023fe00000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 000fcde0
(XEN) DMI 2.7 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x408
(XEN) ACPI: ACPI SLEEP INFO: pm1x_cnt[404,0], pm1x_evt[400,0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - bae0bf80/0000000000000000, using 32
(XEN) ACPI: wakeup_vec[bae0bf8c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) Processor #0 6:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) Processor #2 6:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
(XEN) Processor #1 6:10 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
(XEN) Processor #3 6:10 APIC version 21
(XEN) ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
(XEN) ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode: Flat. Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
(XEN) PCI: Not using MMCONFIG.
(XEN) Table is not found!
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) IRQ limits: 24 GSI, 760 MSI/MSI-X
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Initializing CPU#0
(XEN) Detected 3093.067 MHz processor.
(XEN) Initing memory sharing.
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 0
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 3072K
(XEN) mce_intel.c:1162: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
(XEN) CPU0: Thermal monitoring enabled (TM1)
(XEN) Intel machine check reporting enabled
(XEN) I/O virtualisation disabled
(XEN) CPU0: Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz stepping 07
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN) -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 1048576 KiB.
(XEN) VMX: Supported advanced features:
(isation
(XEN) - APIC TPR shadow
(XEN) - Extended Page Tables (EPT)
(XEN) - Virtual-Processor Identifiers (VPID)
(XEN) - Virtual NMI
(XEN) - MSR direct-access bitmap
(XEN) - Unrestricted Guest
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB
(XEN) Booting processor 1/1 eip 7c000
(XEN) Initializing CPU#1
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 0
(XEN) CPU: L1 I cache: 32K, L1 D
(XEN) Initializing CPU#2
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 1
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 3072K
(XEN) CPU2: Thermal monitoring enabled (TM1)
(XEN) CPU2: Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz stepping 07
(XEN) Booting processor 3/3 eip 7c000
(XEN) Initializing CPU#3
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 1
(XEN) CPU: L1 I cache: 32K, L1 D100 CPU @ 3.10GHz stepping 07
(XEN) Brought up 4 CPUs
(XEN) ACPI sleep modes: S3
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) *** LOADING DOMAIN 0 ***
(XEN) elf_parse_binary: phdr: paddr=0x1000000 memsz=0x9e0000
(XEN) elf_parse_binary: phdr: paddr=0x1a00000 memsz=0xa60f0
(XEN): paddr=0x1abc000 memsz=0x61b000
(XEN) elf_parse_binary: memory: 0x1000000 -> 0x20d7000
(XEN) elf_xen_parse_note: GUEST_OS = "linux"
(XEN) elf_xen_parse_note: GUEST_VERSION = "2.6"
(XEN) elf_xen_parse_note: XEN_VERSION = "xen-3.0"
(XEN) elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
(XEN) elf_xen_parse_note: ENTRY = 0xffffffff81abc1e0
(XEN) elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81001000
(XEN) elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
(XEN) elf_xen_parse_note: PAE_MODE = "yes"
(XEN) elf_xen_parse_note: LOADER = "generic"
(XEN) elf_xen_parse_note: unknown xen elf note (0xd)
(XEN) elf_xen_parse_note: SUSPEND_CANCEL = 0x1
(XEN) elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
(XEN) elf_xen_parse_note: PADDR_OFFSET = 0x0
(XEN) elf_xen_addr_calc_check: addresses:
(XEN) virt_base = 0xffffffff80000000
(XEN) elf_paddr_offset = 0x0
(XEN) virt_offset = 0xffffffff80000000
(XEN) virt_kstart = 0xffffffff81000000
(XEN) virt_kend = 0xffffffff820d7000
(XEN) virt_entry = 0xffffffff81abc1e0
(XEN) p2m_base = 0xffffffffffffffff
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x20d7000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 0000000220000000->0000000224000000 (1661249 pages to be allocated)
(XEN) Init. ramdisk: 000000022cbdc000->000000023fe00000
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: ffffffff81000000->ffffffff820d7000
(XEN) Init. ramdisk: ffffffff820d7000->ffffffff952fb000
(XEN) Phys-Mach map: ffffffff952fb000->ffffffff96060b28
(XEN) Start info: ffffffff96061000->ffffffff960614b4
(XEN) Page tables: ffffffff96062000->ffffffff96117000
(XEN) Boot stack: ffffffff96117000->ffffffff96118000
(XEN) TOTAL: ffffffff80000000->ffffffff96400000
(XEN) ENTRY ADDRESS: ffffffff81abc1e0
(XEN) Dom0 has maximum 4 VCPUs
(XEN) elf_load_binary: phdr 0 at 0xffffffff81000000 -> 0xffffffff819e0000
(XEN) elf_load_binary: phdr 1 at 0xffffffff81a00000 -> 0xffffffff81aa60f0
(XEN) elf_load_binary: phdr 2 at 0xffffffff81aa7000 -> 0xffffffff81abbbc0
(XEN) elf_load_binary: phdr 3 at 0xffffffff81abc000 -> 0xffffffff81baf000
(XEN) Scrubbing Free RAM: .done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************intended to aid debugging of Xen by ensuring
(XEN) ******* that all output is synchronously delivered on the serial line.
(XEN) ******* However it can introduce SIGNIFICANT latencies and affect
(XEN) ******* timekeeping. It is NOT recommended for production use!
(XEN) **********************************************
(XEN) 3... 2... 1...
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 224kB init memory.
mapping kernel into physical memory
about to get started...
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.8.0upstream-06471-g2ef14f4-dirty (konrad@build.dumpdata.com) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) #1 SMP Fri Feb 22 11:36:48 EST 2013
[ 0.000000] Command line: initcall_debug debug console=hvc0 loglevel=10 xen-pciback.hide=(01:00.0) earlyprintk=xen
[ 0.000000] Freeing 9e-100 pfn range: 98 pages freed
[ 0.000000] 1-1 mapping on 9e->100
[ 0.000000] Freeing 20000-20200 pfn range: 512 pages freed
[ 0.000000] 1-1 mapping on 20000->20200
[ 0.000000] Freeing 40000-40200 pfn range: 512 pages freed
[ 0.000000] 1-1 mapping on 40000->40200
[ 0.000000] Freeing bad80-badf4 pfn range: 116 pages freed
[ 0.000000] 1-1 mapping on bad80->badf4
[ 0.000000] Freeing badf6-bae7f pfn range: 137 pages freed
[ 0.000000] 1-1 mapping on badf6->bae7f
[ 0.000000] Freeing bb000-100000 pfn range: 282624 pages freed
[ 0.000000] 1-1 mapping on bb000->100000
[ 0.000000] Released 283999 pages of unused memory
[ 0.000000] Set 283999 page(s) to 1-1 mapping
[ 0.000000] Populating 1acb65-1f20c4 pfn range: 283999 pages added
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] Xen: [mem 0x0000000000000000-0x000000000009dfff] usable
[ 0.000000] Xen: [mem 0x000000000009ec00-0x00000000000fffff] reserved
[ 0.000000] Xen: [mem 0x0000000000100000-0x000000001fffffff] usable
[ 0.000000] Xen: [mem 0x0000000020000000-0x00000000201fffff] reserved
[ 0.000000] Xen: [mem 0x0000000020200000-0x000000003fffffff] usable
[ 0.000000] Xen: [mem 0x0000000040000000-0x00000000401fffff] reserved
[ 0.000000] Xen: [mem 0x0000000040200000-0x00000000bad7ffff] usable
[ 0.000000] Xen: [mem 0x00000000bad80000-0x00000000badc8fff] ACPI NVS
[ 0.000000] Xen: [mem 0x00000000badc9000-0x00000000badd0fff] ACPI data
[ 0.000000] Xen: [mem 0x00000000badd1000-0x00000000badf3fff] reserved
[ 0.000000] Xen: [mem 0x00000000badf4000-0x00000000badf5fff] usable
[ 0.000000] Xen: [mem 0x00000000badf6000-0x00000000bae05fff] reserved
[ 0.000000] Xen: [mem 0x00000000bae06000-0x00000000bae13fff] ACPI NVS
[ 0.000000] Xen: [mem 0x00000000bae14000-0x00000000bae3bfff] reserved
[ 0.000000] Xen: [mem 0x00000000bae3c000-0x00000000bae7efff] ACPI NVS
[ 0.000000] Xen: [mem 0x00000000bae7f000-0x00000000baffffff] usable
[ 0.000000] Xen: [mem 0x00000000bb800000-0x00000000bf9fffff] reserved
[ 0.000000] Xen: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[ 0.000000] Xen: [mem 0x00000000fed1c000-0x00000000fed3ffff] reserved
[ 0.000000] Xen: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[ 0.000000] Xen: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[ 0.000000] Xen: [mem 0x0000000100000000-0x000000023fdfffff] usable
[ 0.000000] bootconsole [xenboot0] enabled
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.7 present.
[ 0.000000] DMI: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 03/14/2011
[ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[ 0.000000] No AGP bridge found
[ 0.000000] e820: last_pfn = 0x23fe00 max_arch_pfn = 0x400000000
[ 0.000000] e820: lacanning 1 areas for low memory corruption
[ 0.000000] Base memory trampoline at [ffff880000098000] 98000 size 24576
[ 0.000000] reserving inaccessible SNB gfx pages
[ 0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
[ 0.000000] [mem 0x00000000-0x000fffff] page 4k
[ 0.000000] init_memory_mapping: [mem 0x1f2000000-0x1f20c3fff]
[ 0.000000] [mem 0x1f2000000-0x1f20c3fff] page 4k
[ 0.000000] BRK [0x01cd2000, 0x01cd2fff] PGTABLE
[ 0.000000] BRK [0x01cd3000, 0x01cd3fff] PGTABLE
[ 0.000000] init_memory_mapping: [mem 0x1f0000000-0x1f1ffffff]
[ 0.000000] [mem 0x1f0000000-0x1f1ffffff] page 4k
[ 0.000000] BRK [0x01cd4000, 0x01cd4fff] PGTABLE
[ 0.000000] BRK [0x01cd5000, 0x01cd5fff] PGTABLE
[ 0.000000] BRK [0x01cd6000, 0x01cd6fff] PGTABLE
[ 0.000000] init_memory_mapping: [mem 0x180000000-0x1efffffff]
[ 0.000000] [mem 0x180000000-0x1efffffff] page 4k
[ 0.000000] init_memory_mapping: [mem 0x00100000-0x1fffffff]
[ 0.000000] [mem 0x00100000-0x1fffffff] page 4k
[ 0.000000] init_memory_mapping: [mem 0x20200000-0x3fffffff]
[ 0.000000] [mem 0x20200000-0x3fffffff] page 4k
[ 0.000000] init_memory_mapping: [mem 0x40200000-0xbad7ffff]
[ 0.000000] [mem 0x40200000-0xbad7ffff] page 4k
[ 0.000000] init_memory_mapping: [mem 0xbadf4000-0xbadf5fff]
[ 0.000000] [mem 0xbadf4000-0xbadf5fff] page 4k
[ 0.000000] init_memory_mapping: [mem 0xbae7f000-0xbaffffff]
[ 0.000000] [mem 0xbae7f000-0xbaffffff] page 4k
[ 0.000000] init_memory_mapping: [mem 0x100000000-0x17fffffff]
[ 0.000000] [mem 0x100000000-0x17fffffff] page 4k
[ 0.000000] init_memory_mapping: [mem 0x1f20c4000-0x23fdfffff]
[ 0.000000] [mem 0x1f20c4000-0x23fdfffff] page 4k
(XEN) d0:v0: unhandled page fault (ec=0000)
(XEN) Pagetable walk from ffffea000005b2d0:
(XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1.5-pre x86_64 debug=y Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff8103feba>]
(XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
(XEN) rax: ffffea0000000000 rbx: 0000000001a0c000 rcx: 0000000080000000
(XEN) rdx: 000000000005b2a0 rsi: 0000000001a0c000 rdi: 0000000000000000
(XEN) rbp: ffffffff81a01dd8 rsp: ffffffff81a01d90 r8: 0000000000000000
(XEN) r9: 0000000010000001 r10: 0000000000000005 r11: 0000000000100000
(XEN) r12: 0000000000000000 r13: 0000020000000000 r14: 0000000000000000
(XEN) r15: 0000000000100000 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000221a0c000 cr2: ffffea000005b2d0
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff81a01d90:
(XEN) 0000000080000000 0000000000100000 0000000000000000 ffffffff8103feba
(XEN) 000000010000e030 0000000000010006 ffffffff81a01dd8 000000000000e02b
(XEN) 0000000000000000 ffffffff81a01e08 ffffffff81042d27 000000023fe00000
(XEN) 00000001f20c4000 0000020000000000 00000001acac7000 ffffffff81a01e48
(XEN) ffffffff81ad2d21 0000000000000000 0000000000000028 0000000040004000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01ed8
(XEN) ffffffff81ac293f ffffffff81b46900 0000000000000000 0000000000000000
(XEN) 0000000000000000 ffffffff81a01f00 ffffffff8165fbd1 ffffffff00000010
(XEN) ffffffff81a01ee8 ffffffff81a01ea8 0000000000000000 ffffffff81a01ec8
(XEN) ffffffffffffffff ffffffff81b46900 0000000000000000 0000000000000000
(XEN) 0000000000000000 ffffffff81a01f28 ffffffff81abcd62 ffffffff96062000
(XEN) ffffffff81cc6000 ffffffff81ccd000 ffffffff81b4f2e0 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f38
(XEN) ffffffff81abc5f7 ffffffff81a01ff8 ffffffff81abf0c7 0300000100000032
(XEN) 0000000000000005 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 819822831fc9cbf5 000206a700100800 0000000000000001
(XEN) 0000000000000000 0000000000000000 0f00000060c0c748 ccccccccccccc305
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
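As a cross-check on the hypervisor output above, the faulting address can be decoded by hand. The sketch below is only an illustration: the helper names are hypothetical, and the 56-byte struct page size is inferred from the register dump (rdx = 0x5b2a0 for the page frame at rbx = 0x1a0c000), not taken from the kernel configuration.

```c
#include <stdint.h>

/* Top-level (L4) page-table index: bits 47..39 of a virtual address. */
static unsigned int l4_index(uint64_t vaddr)
{
    return (unsigned int)((vaddr >> 39) & 0x1ff);
}

/*
 * Address of the struct page entry for a page frame in a vmemmap-style
 * array. ASSUMPTION: the array base and the per-entry size are passed
 * in by the caller; the values used in the note below are read or
 * inferred from the register dump, not from kernel headers.
 */
static uint64_t struct_page_vaddr(uint64_t vmemmap_base, uint64_t pfn,
                                  uint64_t entry_size)
{
    return vmemmap_base + pfn * entry_size;
}
```

With rax = 0xffffea0000000000 as the array base and pfn 0x1a0c, struct_page_vaddr() gives 0xffffea000005b2a0, which is 0x30 bytes below cr2 (0xffffea000005b2d0), and l4_index(cr2) gives 0x1d4, the empty slot reported in the pagetable walk. One reading consistent with this is a load from a struct page field in a region that has no page tables yet.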
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 17:24 ` Konrad Rzeszutek Wilk
@ 2013-02-22 17:30 ` H. Peter Anvin
2013-02-22 17:53 ` Yinghai Lu
1 sibling, 0 replies; 16+ messages in thread
From: H. Peter Anvin @ 2013-02-22 17:30 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk
Cc: H. Peter Anvin, yinghai, Linus Torvalds, David S. Miller,
Rafael J. Wysocki, stable, Alexander Duyck, Andrea Arcangeli,
Andrew Morton, Andrzej Pietrasiewicz, Arnd Bergmann,
Borislav Petkov, Borislav Petkov, Christoph Lameter,
Daniel J Blueman, Dave Hansen, Eric Biederman, Fenghua Yu,
Frederic Weisbecker, Gleb Natapov, Gokul Caushik <caushik>
On 02/22/2013 09:24 AM, Konrad Rzeszutek Wilk wrote:
>
> Here is a better serial log of the crash (just booting a normal Xen 4.1 + initial
> kernel with 8GB):
>
Configuration, please, especially: is early_printk compiled in? Also,
since this is Xen-related we really need your help on this. A lot of
this is not going to be meaningful to non-Xen people.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 16:55 ` Konrad Rzeszutek Wilk
2013-02-22 17:12 ` H. Peter Anvin
2013-02-22 17:24 ` Konrad Rzeszutek Wilk
@ 2013-02-22 17:30 ` Dave Hansen
2013-02-22 17:33 ` H. Peter Anvin
2 siblings, 1 reply; 16+ messages in thread
From: Dave Hansen @ 2013-02-22 17:30 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, H. Peter Anvin, sparclinux, Christoph Lameter,
Ingo Molnar, Ville Syrjälä, Marek Szyprowski,
Andrea Arcangeli, Lee Schermerhorn, xen-devel, Russell King,
Len Brown, Joerg Roedel, linux-pm, Hugh Dickins,
Yasuaki Ishimatsu
On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 21, 2013 at 04:34:06PM -0800, H. Peter Anvin wrote:
>> Hi Linus,
>>
>> This is a huge set of several partly interrelated (and concurrently
>> developed) changes, which is why the branch history is messier than
>> one would like.
>>
>> The *really* big items are two humonguous patchsets mostly developed
>> by Yinghai Lu at my request, which completely revamps the way we
>> create initial page tables. In particular, rather than estimating how
>> much memory we will need for page tables and then build them into that
>> memory -- a calculation that has shown to be incredibly fragile -- we
>> now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
>> a #PF handler which creates temporary page tables on demand.
>>
>> This has several advantages:
>>
>> 1. It makes it much easier to support things that need access to
>> data very early (a followon patchset uses this to load microcode
>> way early in the kernel startup).
>>
>> 2. It allows the kernel and all the kernel data objects to be invoked
>> from above the 4 GB limit. This allows kdump to work on very large
>> systems.
>>
>> 3. It greatly reduces the difference between Xen and native (Xen's
>> equivalent of the #PF handler are the temporary page tables created
>> by the domain builder), eliminating a bunch of fragile hooks.
>>
>> The patch series also gets us a bit closer to W^X.
>>
>> Additional work in this pull is the 64-bit get_user() work which you
>> were also involved with, and a bunch of cleanups/speedups to
>> __phys_addr()/__pa().
>
> Looking at figuring out which of the patches in the branch did this, but
> with this merge I am getting a crash with a very simple PV guest (booted with
> one 1G):
>
> Call Trace:
> [<ffffffff8103feba>] xen_get_user_pgd+0x5a <--
> [<ffffffff8103feba>] xen_get_user_pgd+0x5a
> [<ffffffff81042d27>] xen_write_cr3+0x77
> [<ffffffff81ad2d21>] init_mem_mapping+0x1f9
> [<ffffffff81ac293f>] setup_arch+0x742
> [<ffffffff81666d71>] printk+0x48
> [<ffffffff81abcd62>] start_kernel+0x90
> [<ffffffff8109416b>] __add_preferred_console.clone.1+0x9b
> [<ffffffff81abc5f7>] x86_64_start_reservations+0x2a
> [<ffffffff81abf0c7>] xen_start_kernel+0x564
Do you have CONFIG_DEBUG_VIRTUAL on?
You're probably hitting the new BUG_ON() in __phys_addr(). It's
intended to detect places where someone is doing a __pa()/__phys_addr()
on an address that's outside the kernel's identity mapping.
There are a lot of __pa() calls around there, but from the looks of it,
it's this code:
static pgd_t *xen_get_user_pgd(pgd_t *pgd)
{
...
        if (offset < pgd_index(USER_LIMIT)) {
                struct page *page = virt_to_page(pgd_page);
I'm a bit fuzzy on exactly what the code is trying to do here. It could
mean either that the identity mapping isn't set up enough yet, or that
__pa() is getting called on a bogus address.
I'm especially fuzzy on why we'd be calling anything that's looking at
userspace pagetables (xen_get_user_pgd() ??) this early in boot.
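The kind of range check described above can be modelled in userspace roughly as follows. This is a simplified sketch, not the kernel's __phys_addr(): the function name is hypothetical, and the layout constants are assumptions (the two base addresses match the virt_base and trampoline addresses printed in the boot log; the 512 MB image size and 46 physical address bits are assumed values).

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED layout constants for an x86-64 kernel of this period. */
#define MODEL_START_KERNEL_MAP  0xffffffff80000000ULL /* kernel text mapping */
#define MODEL_PAGE_OFFSET       0xffff880000000000ULL /* direct mapping base */
#define MODEL_KERNEL_IMAGE_SIZE (512ULL << 20)        /* assumed: 512 MB */
#define MODEL_MAX_PHYS_BITS     46                    /* assumed limit */

/*
 * Translate a kernel virtual address to a physical address.
 * Returns false where a debug build would be expected to complain,
 * i.e. when the address lies in neither mapping.
 */
static bool model_phys_addr(uint64_t vaddr, uint64_t phys_base,
                            uint64_t *phys)
{
    if (vaddr >= MODEL_START_KERNEL_MAP) {
        uint64_t off = vaddr - MODEL_START_KERNEL_MAP;

        if (off >= MODEL_KERNEL_IMAGE_SIZE)
            return false;               /* beyond the kernel image */
        *phys = off + phys_base;
        return true;
    }
    if (vaddr >= MODEL_PAGE_OFFSET) {
        uint64_t p = vaddr - MODEL_PAGE_OFFSET;

        if (p >> MODEL_MAX_PHYS_BITS)
            return false;               /* not a plausible physical address */
        *phys = p;
        return true;
    }
    return false;                       /* user or non-canonical address */
}
```

Under these assumptions a kernel-image address such as 0xffffffff81a0c000 translates to 0x1a0c000, while an address in the 0xffffea0000000000 range is rejected, because it belongs to neither the kernel image mapping nor the direct mapping.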
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 16:22 ` Linus Torvalds
@ 2013-02-22 17:31 ` H. Peter Anvin
0 siblings, 0 replies; 16+ messages in thread
From: H. Peter Anvin @ 2013-02-22 17:31 UTC (permalink / raw)
To: Linus Torvalds
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, sparclinux, Christoph Lameter, Ingo Molnar,
Ville Syrjälä, Marek Szyprowski, Andrea Arcangeli,
Lee Schermerhorn, Xen-devel@lists.xensource.com, Russell King,
Len Brown, Joerg Roedel, Stefano Stabellini
On 02/22/2013 08:22 AM, Linus Torvalds wrote:
>
> Ugh. So I've tried to walk through this, and it's painful. If this
> results in problems, we're going to be *so* screwed. Is it bisectable?
>
I can't tell you for sure that it is bisectable at every point. There
are definite bisection points in there, though, as this is several
pieces of work from two kernel cycles that were independently tested.
> I also don't understand how "early_idt_handler" could *possibly* work.
> In particular, it seems to rely on the trap number being set up in the
> stack frame:
>
> cmpl $14,72(%rsp) # Page fault?
>
> but that's not even *true*. Why? Because we export both the
> early_idt_handlers[] array (that sets up the trap number and makes the
> stack frame be reliable) and the single early_idt_handler function
> (that relies on the trap number and the reliable stack frame), AND
> AFAIK WE USE THE LATTER!
>
> See x86_64_start_kernel():
>
>         for (i = 0; i < NUM_EXCEPTION_VECTORS; i++) {
> #ifdef CONFIG_EARLY_PRINTK
>                 set_intr_gate(i, &early_idt_handlers[i]);
> #else
>                 set_intr_gate(i, early_idt_handler);
> #endif
>         }
>
> so unless you have CONFIG_EARLY_PRINTK, the interrupt gate will point
> to that raw early_idt_handler function that doesn't *work* on its own,
> afaik.
>
This is a (pre-existing!) bug that absolutely needs to be fixed, and it
ought to break other things too (early use of *msr_safe, for example, or
anything else that relies on an early exception entry, of which there
aren't many so far). The fix is simple and obvious.
But you're right... what the heck is going on here?
My own testing would probably not have caught this, as I consider
EARLY_PRINTK a must-have, but Ingo's test machines definitely would have.
> Btw, it's not just the page fault index testing that is wrong. The whole
>
> cmpl $__KERNEL_CS,96(%rsp)
> jne 11f
>
> also relies on the stack frame being set up the same way for all
> exceptions - which again is only true if we ran through the
> early_idt_handlers[] prologue that added the extra stack entry.
>
> How does this even work for me? I don't have EARLY_PRINTK enabled.
>
> What am I missing?
I just ran a simulation without EARLY_PRINTK. Presumably depending on
the memory layout, we can apparently get through the entire bootup
sequence without ever actually taking an early trap. It is a bug,
though, and it is a bug even without this patchset. I will submit a
fix. However, the Xen "we tested this, this worked, now it doesn't"
report worries me a lot.
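The stack-frame dependency discussed above can be illustrated with a small layout model. It is a sketch under stated assumptions: the struct names are hypothetical, and the nine saved registers are inferred from the quoted offsets (9 * 8 = 72), not copied from the handler source. The second struct models an exception without a hardware error code entered without the per-vector stub.

```c
#include <stddef.h>
#include <stdint.h>

/* Frame as seen after entry through an early_idt_handlers[i] stub. */
struct early_frame_via_stub {
    uint64_t saved[9];    /* ASSUMED: nine pushed registers, 0..64(%rsp) */
    uint64_t vector;      /* pushed by the stub */
    uint64_t error_code;  /* real or dummy error code */
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* Frame if the common handler is entered directly (no error code). */
struct early_frame_direct {
    uint64_t saved[9];    /* same ASSUMED register saves */
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* With the stub, the quoted offsets land on the intended fields. */
_Static_assert(offsetof(struct early_frame_via_stub, vector) == 72,
               "72(%rsp) is the vector number");
_Static_assert(offsetof(struct early_frame_via_stub, cs) == 96,
               "96(%rsp) is the saved CS");

/* Without it, the same offsets land on unrelated fields. */
_Static_assert(offsetof(struct early_frame_direct, rip) == 72,
               "72(%rsp) would be the saved RIP");
_Static_assert(offsetof(struct early_frame_direct, rsp) == 96,
               "96(%rsp) would be the saved RSP");
```

In this model the comparisons against 72(%rsp) and 96(%rsp) are only meaningful when every vector is routed through its stub, which is the point of the objection above.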
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 17:30 ` Dave Hansen
@ 2013-02-22 17:33 ` H. Peter Anvin
0 siblings, 0 replies; 16+ messages in thread
From: H. Peter Anvin @ 2013-02-22 17:33 UTC (permalink / raw)
To: Dave Hansen
Cc: linux-mips, Jeremy Fitzhardinge, Fenghua Yu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, sparclinux, Christoph Lameter, Ingo Molnar,
Ville Syrjälä, Marek Szyprowski, Andrea Arcangeli,
Lee Schermerhorn, xen-devel, Russell King, Len Brown,
Joerg Roedel, Stefano Stabellini, Hugh Dickins, Yasuaki Ishimatsu
On 02/22/2013 09:30 AM, Dave Hansen wrote:
>
> Do you have CONFIG_DEBUG_VIRTUAL on?
>
> You're probably hitting the new BUG_ON() in __phys_addr(). It's
> intended to detect places where someone is doing a __pa()/__phys_addr()
> on an address that's outside the kernel's identity mapping.
>
> There are a lot of __pa() calls around there, but from the looks of it,
> it's this code:
>
> static pgd_t *xen_get_user_pgd(pgd_t *pgd)
> {
> ...
>         if (offset < pgd_index(USER_LIMIT)) {
>                 struct page *page = virt_to_page(pgd_page);
>
> I'm a bit fuzzy on exactly what the code is trying to do here. It could
> mean either that the identity mapping isn't set up enough yet, or that
> __pa() is getting called on a bogus address.
>
> I'm especially fuzzy on why we'd be calling anything that's looking at
> userspace pagetables (xen_get_user_pgd() ??) this early in boot.
>
Ah yes, of course.
This is unrelated to the early page table setups, which is why it didn't
trip in Konrad's earlier testing.
These debugging bits have already found real bugs in the kernel, and this
might be another.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 17:12 ` H. Peter Anvin
@ 2013-02-22 17:38 ` Konrad Rzeszutek Wilk
2013-02-22 18:06 ` Stefano Stabellini
2013-02-22 18:08 ` Yinghai Lu
0 siblings, 2 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-22 17:38 UTC (permalink / raw)
To: H. Peter Anvin
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, sparclinux, Christoph Lameter, Ingo Molnar,
Ville Syrjälä, Marek Szyprowski, Andrea Arcangeli,
Lee Schermerhorn, xen-devel, Russell King, Len Brown,
Joerg Roedel, linux-pm, Hugh Dickins, Yasuaki Ishimatsu
On Fri, Feb 22, 2013 at 09:12:57AM -0800, H. Peter Anvin wrote:
> On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
> >
> >What is bizzare is that I do recall testing this (and Stefano also did it).
> >So I am not sure what has altered.
> >
>
> Yes, there was a very specific reason why I wanted you guys to test it...
Exactly. And I re-ran the same test, but with a new kernel. This is what
git reflog tells me:
473cd24 HEAD@{75}: checkout: moving from 08f321ed97353cf3b3fafa6b1c1971d6a8970830 to linux-next
08f321e HEAD@{76}: checkout: moving from linux-next to yinghai/for-x86-mm
eb827a7 HEAD@{77}: checkout: moving from 1b66ccf15ff4bd0200567e8d70446a8763f96ee7 to linux-next
[konrad@build linux]$ git show 08f321e
commit 08f321ed97353cf3b3fafa6b1c1971d6a8970830
Author: Yinghai Lu <yinghai@kernel.org>
Date: Thu Nov 8 00:00:19 2012 -0800
mm: Kill NO_BOOTMEM version free_all_bootmem_node()
And I recall Stefano testing it later on (I was at a conference and did not
have the opportunity to test it). Not sure what he ran with.
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 17:24 ` Konrad Rzeszutek Wilk
2013-02-22 17:30 ` H. Peter Anvin
@ 2013-02-22 17:53 ` Yinghai Lu
2013-02-22 18:23 ` Konrad Rzeszutek Wilk
2013-02-22 18:25 ` [Xen-devel] " Andrew Cooper
1 sibling, 2 replies; 16+ messages in thread
From: Yinghai Lu @ 2013-02-22 17:53 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, H. Peter Anvin, sparclinux, Christoph Lameter,
Ingo Molnar, Ville Syrjälä, Marek Szyprowski,
Andrea Arcangeli, Lee Schermerhorn, xen-devel, Russell King,
Len Brown, Joerg Roedel, linux-pm, Hugh Dickins,
Yasuaki Ishimatsu
On Fri, Feb 22, 2013 at 9:24 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Fri, Feb 22, 2013 at 11:55:31AM -0500, Konrad Rzeszutek Wilk wrote:
>> On Thu, Feb 21, 2013 at 04:34:06PM -0800, H. Peter Anvin wrote:
>> > Hi Linus,
>> >
>> > This is a huge set of several partly interrelated (and concurrently
>> > developed) changes, which is why the branch history is messier than
>> > one would like.
>> >
>> > The *really* big items are two humonguous patchsets mostly developed
>> > by Yinghai Lu at my request, which completely revamps the way we
>> > create initial page tables. In particular, rather than estimating how
>> > much memory we will need for page tables and then build them into that
>> > memory -- a calculation that has shown to be incredibly fragile -- we
>> > now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
>> > a #PF handler which creates temporary page tables on demand.
>> >
>> > This has several advantages:
>> >
>> > 1. It makes it much easier to support things that need access to
>> > data very early (a followon patchset uses this to load microcode
>> > way early in the kernel startup).
>> >
>> > 2. It allows the kernel and all the kernel data objects to be invoked
>> > from above the 4 GB limit. This allows kdump to work on very large
>> > systems.
>> >
>> > 3. It greatly reduces the difference between Xen and native (Xen's
>> > equivalent of the #PF handler are the temporary page tables created
>> > by the domain builder), eliminating a bunch of fragile hooks.
>> >
>> > The patch series also gets us a bit closer to W^X.
>> >
>> > Additional work in this pull is the 64-bit get_user() work which you
>> > were also involved with, and a bunch of cleanups/speedups to
>> > __phys_addr()/__pa().
>>
>> Looking at figuring out which of the patches in the branch did this, but
>> with this merge I am getting a crash with a very simple PV guest (booted with
>> one 1G):
>>
>> Call Trace:
>> [<ffffffff8103feba>] xen_get_user_pgd+0x5a <--
>> [<ffffffff8103feba>] xen_get_user_pgd+0x5a
>> [<ffffffff81042d27>] xen_write_cr3+0x77
>> [<ffffffff81ad2d21>] init_mem_mapping+0x1f9
>> [<ffffffff81ac293f>] setup_arch+0x742
>> [<ffffffff81666d71>] printk+0x48
>> [<ffffffff81abcd62>] start_kernel+0x90
>> [<ffffffff8109416b>] __add_preferred_console.clone.1+0x9b
>> [<ffffffff81abc5f7>] x86_64_start_reservations+0x2a
>> [<ffffffff81abf0c7>] xen_start_kernel+0x564
>>
>> And the hypervisor says:
>> (XEN) d7:v0: unhandled page fault (ec=0000)
>> (XEN) Pagetable walk from ffffea000005b2d0:
>> (XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
>> (XEN) domain_crash_sync called from entry.S
>> (XEN) Domain 7 (vcpu#0) crashed on cpu#3:
>> (XEN) ----[ Xen-4.2.0 x86_64 debug=n Not tainted ]----
>> (XEN) CPU: 3
>> (XEN) RIP: e033:[<ffffffff8103feba>]
>> (XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
>> (XEN) rax: ffffea0000000000 rbx: 0000000001a0c000 rcx: 0000000080000000
>> (XEN) rdx: 000000000005b2a0 rsi: 0000000001a0c000 rdi: 0000000000000000
>> (XEN) rbp: ffffffff81a01dd8 rsp: ffffffff81a01d90 r8: 0000000000000000
>> (XEN) r9: 0000000010000001 r10: 0000000000000000 r11: 0000000000000000
>> (XEN) r12: 0000000000000000 r13: 0000001000000000 r14: 0000000000000000
>> (XEN) r15: 0000000000100000 cr0: 000000008005003b cr4: 00000000000406f0
>> (XEN) cr3: 0000000411165000 cr2: ffffea000005b2d0
>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
>> (XEN) Guest stack trace from rsp=ffffffff81a01d90:
>> (XEN) 0000000080000000 0000000000000000 0000000000000000 ffffffff8103feba
>> (XEN) 000000010000e030 0000000000010006 ffffffff81a01dd8 000000000000e02b
>
> Here is a better serial log of the crash (just booting a normal Xen 4.1 + initial
> kernel with 8GB):
>
> PXELINUX 3.82 2009-06-09 Copyright (C) 1994-2009 H. Peter Anvin et al
> boot:
> Loading xen.gz... ok
> Loading vmlinuz... ok
> Loading initramfs.cpio.gz... ok
> __ __ _ _ _ ____
> \ \/ /___ _ __ | || | / | | ___| _ __ _ __ ___
> \ // _ \ '_(_)_(_)____/ | .__/|_| \___|
> |_|
> (XEN) Xen version 4.1.5-pre (konrad@dumpdata.com) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) Fri Feb 22 11:37:00 EST 2013
> (XEN) Latest ChangeSet: Fri Feb 15 15:31:55 2013 +0100 23459:9f12bdd6b7f0
> (XEN) Console output is synchronous.
> (XEN) Bootloader: unknown
> (XEN) Command line: cpuinfo conring_size=1048576 sync_console cpufreq=verbose com1=115200,8n1 console=com1,vga loglvl=all guest_loglvl=all
> (XEN) Video information:
> (XEN) VGA is text mode 80x25, font 8x16
> (XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
> (XEN) EDID info not retrieved because no DDC retrieval method detected
> (XEN) Disc information:
> (XEN) Found 1 MBR signatures
> (XEN) Found 1 EDD information structures
> (XEN) Xen-e820 RAM map:
> (XEN) 0000000000000000 - 000000000009ec00 (usable)
> (XEN) 000000000009ec00 - 00000000000a0000 (reserved)
> (XEN) 00000000000e0000 - 0000000000100000 (reserved)
> (XEN) 0000000000100000 - 0000000020000000 (usable)
> (XEN) 0000000020000000 - 0000000020200000 (reserved)
> (XEN) 0000000020200000 - 0000000040000000 (usable)
> (XEN) 0000000040000000 - 0000000040200000 (reserved)
> (XEN) 0000000040200000 - 00000000bad80000 (usable)
> (XEN) 00000000bad80000 - 00000000badc9000 (ACPI NVS)
> (XEN) 00000000badc9000 - 00000000badd1000 (ACPI data)
> (XEN) 00000000badd1000 - 00000000badf4000 (reserved)
> (XEN) 00000000badf4000 - 00000000badf6000 (usable)
> (XEN) 00000000badf6000 - 00000000bae06000 (reserved)
> (XEN) 00000000bae06000 - 00000000bae14000 (ACPI NVS)
> (XEN) 00000000bae14000 - 00000000bae3c000 (reserved)
> (XEN) 00000000bae3c000 - 00000000bae7f000 (ACPI NVS)
> (XEN) 00000000bae7f000 - 00000000bb000000 (usable)
> (XEN) 00000000bb800000 - 00000000bfa00000 (reserved)
> (XEN) 00000000fed1c000 - 00000000fed40000 (reserved)
> (XEN) 00000000ff000000 - 0000000100000000 (reserved)
> (XEN) 0000000100000000 - 000000023fe00000 (usable)
> (XEN) ACPI: RSDP 000F0450, 0024 (r2 ALASKA)
> (XEN) ACPI: XSDT BADC9068, 0054 (r1 ALASKA A M I 1072009 AMI 10013)
> (XEN) PROC 1 MSFT 3000001)
> (XEN) ACPI: MCFG BADD0580, 003C (r1 ALASKA A M I 1072009 MSFT 97)
> (XEN) ACPI: HPET BADD05C0, 0038 (r1 ALASKA A M I 1072009 AMI. 4)
> (XEN) ACPI: ASF! BADD05F8, 00A0 (r32 INTEL HCG 1 TFSM F4240)
> (XEN) System RAM: 8104MB (8299140kB)
> (XEN) No NUMA configuration found
> (XEN) Faking a node at 0000000000000000-000000023fe00000
> (XEN) Domain heap initialised
> (XEN) found SMP MP-table at 000fcde0
> (XEN) DMI 2.7 present.
> (XEN) Using APIC driver default
> (XEN) ACPI: PM-Timer IO Port: 0x408
> (XEN) ACPI: ACPI SLEEP INFO: pm1x_cnt[404,0], pm1x_evt[400,0]
> (XEN) ACPI: 32/64X FACS address mismatch in FADT - bae0bf80/0000000000000000, using 32
> (XEN) ACPI: wakeup_vec[bae0bf8c], vec_size[20]
> (XEN) ACPI: Local APIC address 0xfee00000
> (XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> (XEN) Processor #0 6:10 APIC version 21
> (XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
> (XEN) Processor #2 6:10 APIC version 21
> (XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
> (XEN) Processor #1 6:10 APIC version 21
> (XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
> (XEN) Processor #3 6:10 APIC version 21
> (XEN) ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
> (XEN) ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0])
> (XEN) IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23
> (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> (XEN) ACPI: IRQ0 used by override.
> (XEN) ACPI: IRQ2 used by override.
> (XEN) ACPI: IRQ9 used by override.
> (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs
> (XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
> (XEN) PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
> (XEN) PCI: Not using MMCONFIG.
> (XEN) Table is not found!
> (XEN) Using ACPI (MADT) for SMP configuration information
> (XEN) IRQ limits: 24 GSI, 760 MSI/MSI-X
> (XEN) Using scheduler: SMP Credit Scheduler (credit)
> (XEN) Initializing CPU#0
> (XEN) Detected 3093.067 MHz processor.
> (XEN) Initing memory sharing.
> (XEN) CPU: Physical Processor ID: 0
> (XEN) CPU: Processor Core ID: 0
> (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
> (XEN) CPU: L2 cache: 256K
> (XEN) CPU: L3 cache: 3072K
> (XEN) mce_intel.c:1162: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
> (XEN) CPU0: Thermal monitoring enabled (TM1)
> (XEN) Intel machine check reporting enabled
> (XEN) I/O virtualisation disabled
> (XEN) CPU0: Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz stepping 07
> (XEN) Enabled directed EOI with ioapic_ack_old on!
> (XEN) ENABLING IO-APIC IRQs
> (XEN) -> Using old ACK method
> (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
> (XEN) TSC deadline timer enabled
> (XEN) Platform timer is 14.318MHz HPET
> (XEN) Allocated console ring of 1048576 KiB.
> (XEN) VMX: Supported advanced features:
> (isation
> (XEN) - APIC TPR shadow
> (XEN) - Extended Page Tables (EPT)
> (XEN) - Virtual-Processor Identifiers (VPID)
> (XEN) - Virtual NMI
> (XEN) - MSR direct-access bitmap
> (XEN) - Unrestricted Guest
> (XEN) HVM: ASIDs enabled.
> (XEN) HVM: VMX enabled
> (XEN) HVM: Hardware Assisted Paging (HAP) detected
> (XEN) HVM: HAP page sizes: 4kB, 2MB
> (XEN) Booting processor 1/1 eip 7c000
> (XEN) Initializing CPU#1
> (XEN) CPU: Physical Processor ID: 0
> (XEN) CPU: Processor Core ID: 0
> (XEN) CPU: L1 I cache: 32K, L1 D
> (XEN) Initializing CPU#2
> (XEN) CPU: Physical Processor ID: 0
> (XEN) CPU: Processor Core ID: 1
> (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
> (XEN) CPU: L2 cache: 256K
> (XEN) CPU: L3 cache: 3072K
> (XEN) CPU2: Thermal monitoring enabled (TM1)
> (XEN) CPU2: Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz stepping 07
> (XEN) Booting processor 3/3 eip 7c000
> (XEN) Initializing CPU#3
> (XEN) CPU: Physical Processor ID: 0
> (XEN) CPU: Processor Core ID: 1
> (XEN) CPU: L1 I cache: 32K, L1 D100 CPU @ 3.10GHz stepping 07
> (XEN) Brought up 4 CPUs
> (XEN) ACPI sleep modes: S3
> (XEN) mcheck_poll: Machine check polling timer started.
> (XEN) *** LOADING DOMAIN 0 ***
> (XEN) elf_parse_binary: phdr: paddr=0x1000000 memsz=0x9e0000
> (XEN) elf_parse_binary: phdr: paddr=0x1a00000 memsz=0xa60f0
> (XEN): paddr=0x1abc000 memsz=0x61b000
> (XEN) elf_parse_binary: memory: 0x1000000 -> 0x20d7000
> (XEN) elf_xen_parse_note: GUEST_OS = "linux"
> (XEN) elf_xen_parse_note: GUEST_VERSION = "2.6"
> (XEN) elf_xen_parse_note: XEN_VERSION = "xen-3.0"
> (XEN) elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
> (XEN) elf_xen_parse_note: ENTRY = 0xffffffff81abc1e0
> (XEN) elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81001000
> (XEN) elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
> (XEN) elf_xen_parse_note: PAE_MODE = "yes"
> (XEN) elf_xen_parse_note: LOADER = "generic"
> (XEN) elf_xen_parse_note: unknown xen elf note (0xd)
> (XEN) elf_xen_parse_note: SUSPEND_CANCEL = 0x1
> (XEN) elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
> (XEN) elf_xen_parse_note: PADDR_OFFSET = 0x0
> (XEN) elf_xen_addr_calc_check: addresses:
> (XEN) virt_base = 0xffffffff80000000
> (XEN) elf_paddr_offset = 0x0
> (XEN) virt_offset = 0xffffffff80000000
> (XEN) virt_kstart = 0xffffffff81000000
> (XEN) virt_kend = 0xffffffff820d7000
> (XEN) virt_entry = 0xffffffff81abc1e0
> (XEN) p2m_base = 0xffffffffffffffff
> (XEN) Xen kernel: 64-bit, lsb, compat32
> (XEN) Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x20d7000
> (XEN) PHYSICAL MEMORY ARRANGEMENT:
> (XEN) Dom0 alloc.: 0000000220000000->0000000224000000 (1661249 pages to be allocated)
> (XEN) Init. ramdisk: 000000022cbdc000->000000023fe00000
> (XEN) VIRTUAL MEMORY ARRANGEMENT:
> (XEN) Loaded kernel: ffffffff81000000->ffffffff820d7000
> (XEN) Init. ramdisk: ffffffff820d7000->ffffffff952fb000
> (XEN) Phys-Mach map: ffffffff952fb000->ffffffff96060b28
> (XEN) Start info: ffffffff96061000->ffffffff960614b4
> (XEN) Page tables: ffffffff96062000->ffffffff96117000
> (XEN) Boot stack: ffffffff96117000->ffffffff96118000
> (XEN) TOTAL: ffffffff80000000->ffffffff96400000
> (XEN) ENTRY ADDRESS: ffffffff81abc1e0
> (XEN) Dom0 has maximum 4 VCPUs
> (XEN) elf_load_binary: phdr 0 at 0xffffffff81000000 -> 0xffffffff819e0000
> (XEN) elf_load_binary: phdr 1 at 0xffffffff81a00000 -> 0xffffffff81aa60f0
> (XEN) elf_load_binary: phdr 2 at 0xffffffff81aa7000 -> 0xffffffff81abbbc0
> (XEN) elf_load_binary: phdr 3 at 0xffffffff81abc000 -> 0xffffffff81baf000
> (XEN) Scrubbing Free RAM: .done.
> (XEN) Xen trace buffers: disabled
> (XEN) Std. Loglevel: All
> (XEN) Guest Loglevel: All
> (XEN) ***************************intended to aid debugging of Xen by ensuring
> (XEN) ******* that all output is synchronously delivered on the serial line.
> (XEN) ******* However it can introduce SIGNIFICANT latencies and affect
> (XEN) ******* timekeeping. It is NOT recommended for production use!
> (XEN) **********************************************
> (XEN) 3... 2... 1...
> (XEN) Xen is relinquishing VGA console.
> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
> (XEN) Freed 224kB init memory.
> mapping kernel into physical memory
> about to get started...
> [ 0.000000] Initializing cgroup subsys cpuset
> [ 0.000000] Initializing cgroup subsys cpu
> [ 0.000000] Linux version 3.8.0upstream-06471-g2ef14f4-dirty (konrad@build.dumpdata.com) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) #1 SMP Fri Feb 22 11:36:48 EST 2013
> [ 0.000000] Command line: initcall_debug debug console=hvc0 loglevel=10 xen-pciback.hide=(01:00.0) earlyprintk=xen
> [ 0.000000] Freeing 9e-100 pfn range: 98 pages freed
> [ 0.000000] 1-1 mapping on 9e->100
> [ 0.000000] Freeing 20000-20200 pfn range: 512 pages freed
> [ 0.000000] 1-1 mapping on 20000->20200
> [ 0.000000] Freeing 40000-40200 pfn range: 512 pages freed
> [ 0.000000] 1-1 mapping on 40000->40200
> [ 0.000000] Freeing bad80-badf4 pfn range: 116 pages freed
> [ 0.000000] 1-1 mapping on bad80->badf4
> [ 0.000000] Freeing badf6-bae7f pfn range: 137 pages freed
> [ 0.000000] 1-1 mapping on badf6->bae7f
> [ 0.000000] Freeing bb000-100000 pfn range: 282624 pages freed
> [ 0.000000] 1-1 mapping on bb000->100000
> [ 0.000000] Released 283999 pages of unused memory
> [ 0.000000] Set 283999 page(s) to 1-1 mapping
> [ 0.000000] Populating 1acb65-1f20c4 pfn range: 283999 pages added
> [ 0.000000] e820: BIOS-provided physical RAM map:
> [ 0.000000] Xen: [mem 0x0000000000000000-0x000000000009dfff] usable
> [ 0.000000] Xen: [mem 0x000000000009ec00-0x00000000000fffff] reserved
> [ 0.000000] Xen: [mem 0x0000000000100000-0x000000001fffffff] usable
> [ 0.000000] Xen: [mem 0x0000000020000000-0x00000000201fffff] reserved
> [ 0.000000] Xen: [mem 0x0000000020200000-0x000000003fffffff] usable
> [ 0.000000] Xen: [mem 0x0000000040000000-0x00000000401fffff] reserved
> [ 0.000000] Xen: [mem 0x0000000040200000-0x00000000bad7ffff] usable
> [ 0.000000] Xen: [mem 0x00000000bad80000-0x00000000badc8fff] ACPI NVS
> [ 0.000000] Xen: [mem 0x00000000badc9000-0x00000000badd0fff] ACPI data
> [ 0.000000] Xen: [mem 0x00000000badd1000-0x00000000badf3fff] reserved
> [ 0.000000] Xen: [mem 0x00000000badf4000-0x00000000badf5fff] usable
> [ 0.000000] Xen: [mem 0x00000000badf6000-0x00000000bae05fff] reserved
> [ 0.000000] Xen: [mem 0x00000000bae06000-0x00000000bae13fff] ACPI NVS
> [ 0.000000] Xen: [mem 0x00000000bae14000-0x00000000bae3bfff] reserved
> [ 0.000000] Xen: [mem 0x00000000bae3c000-0x00000000bae7efff] ACPI NVS
> [ 0.000000] Xen: [mem 0x00000000bae7f000-0x00000000baffffff] usable
> [ 0.000000] Xen: [mem 0x00000000bb800000-0x00000000bf9fffff] reserved
> [ 0.000000] Xen: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
> [ 0.000000] Xen: [mem 0x00000000fed1c000-0x00000000fed3ffff] reserved
> [ 0.000000] Xen: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
> [ 0.000000] Xen: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
> [ 0.000000] Xen: [mem 0x0000000100000000-0x000000023fdfffff] usable
> [ 0.000000] bootconsole [xenboot0] enabled
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] SMBIOS 2.7 present.
> [ 0.000000] DMI: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 03/14/2011
> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
> [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
> [ 0.000000] No AGP bridge found
> [ 0.000000] e820: last_pfn = 0x23fe00 max_arch_pfn = 0x400000000
> [ 0.000000] e820: lacanning 1 areas for low memory corruption
> [ 0.000000] Base memory trampoline at [ffff880000098000] 98000 size 24576
> [ 0.000000] reserving inaccessible SNB gfx pages
> [ 0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> [ 0.000000] [mem 0x00000000-0x000fffff] page 4k
> [ 0.000000] init_memory_mapping: [mem 0x1f2000000-0x1f20c3fff]
> [ 0.000000] [mem 0x1f2000000-0x1f20c3fff] page 4k
> [ 0.000000] BRK [0x01cd2000, 0x01cd2fff] PGTABLE
> [ 0.000000] BRK [0x01cd3000, 0x01cd3fff] PGTABLE
> [ 0.000000] init_memory_mapping: [mem 0x1f0000000-0x1f1ffffff]
> [ 0.000000] [mem 0x1f0000000-0x1f1ffffff] page 4k
> [ 0.000000] BRK [0x01cd4000, 0x01cd4fff] PGTABLE
> [ 0.000000] BRK [0x01cd5000, 0x01cd5fff] PGTABLE
> [ 0.000000] BRK [0x01cd6000, 0x01cd6fff] PGTABLE
> [ 0.000000] init_memory_mapping: [mem 0x180000000-0x1efffffff]
> [ 0.000000] [mem 0x180000000-0x1efffffff] page 4k
> [ 0.000000] init_memory_mapping: [mem 0x00100000-0x1fffffff]
> [ 0.000000] [mem 0x00100000-0x1fffffff] page 4k
> [ 0.000000] init_memory_mapping: [mem 0x20200000-0x3fffffff]
> [ 0.000000] [mem 0x20200000-0x3fffffff] page 4k
> [ 0.000000] init_memory_mapping: [mem 0x40200000-0xbad7ffff]
> [ 0.000000] [mem 0x40200000-0xbad7ffff] page 4k
> [ 0.000000] init_memory_mapping: [mem 0xbadf4000-0xbadf5fff]
> [ 0.000000] [mem 0xbadf4000-0xbadf5fff] page 4k
> [ 0.000000] init_memory_mapping: [mem 0xbae7f000-0xbaffffff]
> [ 0.000000] [mem 0xbae7f000-0xbaffffff] page 4k
> [ 0.000000] init_memory_mapping: [mem 0x100000000-0x17fffffff]
> [ 0.000000] [mem 0x100000000-0x17fffffff] page 4k
> [ 0.000000] init_memory_mapping: [mem 0x1f20c4000-0x23fdfffff]
> [ 0.000000] [mem 0x1f20c4000-0x23fdfffff] page 4k
So the init_memory_mapping calls are all done.
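One way to check that against the log above is to take the usable regions from the e820 map and the ranges printed by init_memory_mapping, and confirm that every usable region falls inside a mapped range. The Python sketch below does that. The range lists are copied from the log; the helper names `merge` and `covered` are made up for illustration. It only shows that all ranges were printed, not that init_mem_mapping() itself had returned.

```python
# Usable regions from the Xen-provided e820 map (inclusive end addresses).
USABLE = [
    (0x000000000, 0x00009dfff),
    (0x000100000, 0x01fffffff),
    (0x020200000, 0x03fffffff),
    (0x040200000, 0x0bad7ffff),
    (0x0badf4000, 0x0badf5fff),
    (0x0bae7f000, 0x0baffffff),
    (0x100000000, 0x23fdfffff),
]

# Ranges printed by init_memory_mapping, in the order they appear in the log.
MAPPED = [
    (0x000000000, 0x0000fffff),
    (0x1f2000000, 0x1f20c3fff),
    (0x1f0000000, 0x1f1ffffff),
    (0x180000000, 0x1efffffff),
    (0x000100000, 0x01fffffff),
    (0x020200000, 0x03fffffff),
    (0x040200000, 0x0bad7ffff),
    (0x0badf4000, 0x0badf5fff),
    (0x0bae7f000, 0x0baffffff),
    (0x100000000, 0x17fffffff),
    (0x1f20c4000, 0x23fdfffff),
]


def merge(ranges):
    """Merge inclusive (start, end) ranges that overlap or touch."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]


def covered(region, merged):
    """Return True if region lies entirely inside one merged range."""
    start, end = region
    return any(lo <= start and end <= hi for lo, hi in merged)


merged = merge(MAPPED)
for region in USABLE:
    status = "mapped" if covered(region, merged) else "NOT mapped"
    print(f"{region[0]:#013x}-{region[1]:#013x}: {status}")
```

With these inputs every usable region is covered, and the five ranges above 4 GB merge into a single run from 0x100000000 to 0x23fdfffff.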
> (XEN) d0:v0: unhandled page fault (ec=0000)
> (XEN) Pagetable walk from ffffea000005b2d0:
> (XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
> (XEN) domain_crash_sync called from entry.S
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-4.1.5-pre x86_64 debug=y Tainted: C ]----
> (XEN) CPU: 0
> (XEN) RIP: e033:[<ffffffff8103feba>]
> (XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
> (XEN) rax: ffffea0000000000 rbx: 0000000001a0c000 rcx: 0000000080000000
> (XEN) rdx: 000000000005b2a0 rsi: 0000000001a0c000 rdi: 0000000000000000
> (XEN) rbp: ffffffff81a01dd8 rsp: ffffffff81a01d90 r8: 0000000000000000
> (XEN) r9: 0000000010000001 r10: 0000000000000005 r11: 0000000000100000
> (XEN) r12: 0000000000000000 r13: 0000020000000000 r14: 0000000000000000
> (XEN) r15: 0000000000100000 cr0: 000000008005003b cr4: 00000000000026f0
> (XEN) cr3: 0000000221a0c000 cr2: ffffea000005b2d0
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
> (XEN) Guest stack trace from rsp=ffffffff81a01d90:
> (XEN) 0000000080000000 0000000000100000 0000000000000000 ffffffff8103feba
> (XEN) 000000010000e030 0000000000010006 ffffffff81a01dd8 000000000000e02b
> (XEN) 0000000000000000 ffffffff81a01e08 ffffffff81042d27 000000023fe00000
> (XEN) 00000001f20c4000 0000020000000000 00000001acac7000 ffffffff81a01e48
> (XEN) ffffffff81ad2d21 0000000000000000 0000000000000028 0000000040004000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01ed8
> (XEN) ffffffff81ac293f ffffffff81b46900 0000000000000000 0000000000000000
> (XEN) 0000000000000000 ffffffff81a01f00 ffffffff8165fbd1 ffffffff00000010
> (XEN) ffffffff81a01ee8 ffffffff81a01ea8 0000000000000000 ffffffff81a01ec8
> (XEN) ffffffffffffffff ffffffff81b46900 0000000000000000 0000000000000000
> (XEN) 0000000000000000 ffffffff81a01f28 ffffffff81abcd62 ffffffff96062000
> (XEN) ffffffff81cc6000 ffffffff81ccd000 ffffffff81b4f2e0 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f38
> (XEN) ffffffff81abc5f7 ffffffff81a01ff8 ffffffff81abf0c7 0300000100000032
> (XEN) 0000000000000005 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 819822831fc9cbf5 000206a700100800 0000000000000001
> (XEN) 0000000000000000 0000000000000000 0f00000060c0c748 ccccccccccccc305
> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
Can we get a kernel trace instead?
>
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 17:38 ` Konrad Rzeszutek Wilk
@ 2013-02-22 18:06 ` Stefano Stabellini
2013-02-22 18:22 ` Yinghai Lu
2013-02-22 18:08 ` Yinghai Lu
1 sibling, 1 reply; 16+ messages in thread
From: Stefano Stabellini @ 2013-02-22 18:06 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk
Cc: H. Peter Anvin, H. Peter Anvin, Linus Torvalds, David S. Miller,
Rafael J. Wysocki, stable@vger.kernel.org, Alexander Duyck,
Andrea Arcangeli, Andrew Morton, Andrzej Pietrasiewicz,
Arnd Bergmann, Borislav Petkov, Borislav Petkov,
Christoph Lameter, Daniel J Blueman, Dave Hansen, Eric Biederman,
Fenghua Yu, Frederic Weisbecker, Gleb
On Fri, 22 Feb 2013, Konrad Rzeszutek Wilk wrote:
> On Fri, Feb 22, 2013 at 09:12:57AM -0800, H. Peter Anvin wrote:
> > On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
> > >
> > >What is bizarre is that I do recall testing this (and Stefano also did it).
> > >So I am not sure what has altered.
> > >
> >
> > Yes, there was a very specific reason why I wanted you guys to test it...
>
> Exactly. And I re-ran the same test, but with a new kernel. This is what
> git reflog tells me:
>
> 473cd24 HEAD@{75}: checkout: moving from 08f321ed97353cf3b3fafa6b1c1971d6a8970830 to linux-next
> 08f321e HEAD@{76}: checkout: moving from linux-next to yinghai/for-x86-mm
> eb827a7 HEAD@{77}: checkout: moving from 1b66ccf15ff4bd0200567e8d70446a8763f96ee7 to linux-next
> [konrad@build linux]$ git show 08f321e
> commit 08f321ed97353cf3b3fafa6b1c1971d6a8970830
> Author: Yinghai Lu <yinghai@kernel.org>
> Date: Thu Nov 8 00:00:19 2012 -0800
>
> mm: Kill NO_BOOTMEM version free_all_bootmem_node()
>
> And I recall Stefano later on testing (I was in a conference and did not have
> the opportunity to test it). Not sure what he ran with.
>
FYI the last patch series I tested was Yinghai's "x86, boot, 64bit: Add
support for loading ramdisk and bzImage above 4G" v7u1.
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 17:38 ` Konrad Rzeszutek Wilk
2013-02-22 18:06 ` Stefano Stabellini
@ 2013-02-22 18:08 ` Yinghai Lu
1 sibling, 0 replies; 16+ messages in thread
From: Yinghai Lu @ 2013-02-22 18:08 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, H. Peter Anvin, sparclinux, Christoph Lameter,
Ingo Molnar, Ville Syrjälä, Marek Szyprowski,
Andrea Arcangeli, Lee Schermerhorn, xen-devel, Russell King,
Len Brown, Joerg Roedel, linux-pm, Hugh Dickins,
Yasuaki Ishimatsu
On Fri, Feb 22, 2013 at 9:38 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Fri, Feb 22, 2013 at 09:12:57AM -0800, H. Peter Anvin wrote:
>> On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
>> >
>> >What is bizarre is that I do recall testing this (and Stefano also did it).
>> >So I am not sure what has altered.
>> >
>>
>> Yes, there was a very specific reason why I wanted you guys to test it...
>
> Exactly. And I re-ran the same test, but with a new kernel. This is what
> git reflog tells me:
>
> 473cd24 HEAD@{75}: checkout: moving from 08f321ed97353cf3b3fafa6b1c1971d6a8970830 to linux-next
> 08f321e HEAD@{76}: checkout: moving from linux-next to yinghai/for-x86-mm
> eb827a7 HEAD@{77}: checkout: moving from 1b66ccf15ff4bd0200567e8d70446a8763f96ee7 to linux-next
> [konrad@build linux]$ git show 08f321e
> commit 08f321ed97353cf3b3fafa6b1c1971d6a8970830
> Author: Yinghai Lu <yinghai@kernel.org>
> Date: Thu Nov 8 00:00:19 2012 -0800
>
> mm: Kill NO_BOOTMEM version free_all_bootmem_node()
>
> And I recall Stefano later on testing (I was in a conference and did not have
> the opportunity to test it). Not sure what he ran with.
The commit in the tip and Linus trees has a different hash:
commit 600cc5b7f6371706679490d7ee108015ae57ac2f
Author: Yinghai Lu <yinghai@kernel.org>
Date: Fri Nov 16 19:39:22 2012 -0800
mm: Kill NO_BOOTMEM version free_all_bootmem_node()
Now NO_BOOTMEM version free_all_bootmem_node() does not really
do free_bootmem at all, and it only call register_page_bootmem_info_node
for online nodes instead.
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 18:06 ` Stefano Stabellini
@ 2013-02-22 18:22 ` Yinghai Lu
0 siblings, 0 replies; 16+ messages in thread
From: Yinghai Lu @ 2013-02-22 18:22 UTC (permalink / raw)
To: Stefano Stabellini
Cc: linux-mips@linux-mips.org, Jeremy Fitzhardinge, H. J. Lu,
Frederic Weisbecker, Joe Millenbach,
virtualization@lists.linux-foundation.org, Gokul Caushik,
Ralf Baechle, Pavel Machek, H. Peter Anvin,
sparclinux@vger.kernel.org, Christoph Lameter, Ingo Molnar,
Ville Syrjälä, Marek Szyprowski, Andrea Arcangeli,
Lee Schermerhorn, xen-devel@lists.xensource.com, Russell King
On Fri, Feb 22, 2013 at 10:06 AM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> On Fri, 22 Feb 2013, Konrad Rzeszutek Wilk wrote:
>> On Fri, Feb 22, 2013 at 09:12:57AM -0800, H. Peter Anvin wrote:
>> > On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
>> > >
>> > >What is bizarre is that I do recall testing this (and Stefano also did it).
>> > >So I am not sure what has altered.
>> > >
>> >
>> > Yes, there was a very specific reason why I wanted you guys to test it...
>>
>> Exactly. And I re-ran the same test, but with a new kernel. This is what
>> git reflog tells me:
>>
>> 473cd24 HEAD@{75}: checkout: moving from 08f321ed97353cf3b3fafa6b1c1971d6a8970830 to linux-next
>> 08f321e HEAD@{76}: checkout: moving from linux-next to yinghai/for-x86-mm
>> eb827a7 HEAD@{77}: checkout: moving from 1b66ccf15ff4bd0200567e8d70446a8763f96ee7 to linux-next
>> [konrad@build linux]$ git show 08f321e
>> commit 08f321ed97353cf3b3fafa6b1c1971d6a8970830
>> Author: Yinghai Lu <yinghai@kernel.org>
>> Date: Thu Nov 8 00:00:19 2012 -0800
>>
>> mm: Kill NO_BOOTMEM version free_all_bootmem_node()
>>
>> And I recall Stefano later on testing (I was in a conference and did not have
>> the opportunity to test it). Not sure what he ran with.
>>
>
> FYI the last patch series I tested was Yinghai's "x86, boot, 64bit: Add
> support for loading ramdisk and bzImage above 4G" v7u1.
The one in tip and Linus's tree is v7u2:
---
-v7u2: update changelog and comments, and clear more fields for sentinel.
Update swiotlb autoswitch off patch.
Fix crash with xen PV guest with 2G.
---
It fixes the Xen crash that you reported with v7u1. You tested the
add-on patch, fix_xen_2g.patch, on top of v7u1, and I folded that
add-on patch into the offending patch in v7u2.
Thanks
Yinghai
* Re: [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 17:53 ` Yinghai Lu
@ 2013-02-22 18:23 ` Konrad Rzeszutek Wilk
2013-02-22 18:25 ` [Xen-devel] " Andrew Cooper
1 sibling, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-22 18:23 UTC (permalink / raw)
To: Yinghai Lu
Cc: linux-mips, Jeremy Fitzhardinge, H. J. Lu, Frederic Weisbecker,
Joe Millenbach, virtualization, Gokul Caushik, Ralf Baechle,
Pavel Machek, H. Peter Anvin, sparclinux, Christoph Lameter,
Ingo Molnar, Ville Syrjälä, Marek Szyprowski,
Andrea Arcangeli, Lee Schermerhorn, xen-devel, Russell King,
Len Brown, Joerg Roedel, linux-pm, Hugh Dickins,
Yasuaki Ishimatsu
> > [ 0.000000] DMI: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 03/14/2011
> > [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
> > [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
> > [ 0.000000] No AGP bridge found
> > [ 0.000000] e820: last_pfn = 0x23fe00 max_arch_pfn = 0x400000000
> > [ 0.000000] e820: lacanning 1 areas for low memory corruption
> > [ 0.000000] Base memory trampoline at [ffff880000098000] 98000 size 24576
> > [ 0.000000] reserving inaccessible SNB gfx pages
> > [ 0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> > [ 0.000000] [mem 0x00000000-0x000fffff] page 4k
> > [ 0.000000] init_memory_mapping: [mem 0x1f2000000-0x1f20c3fff]
> > [ 0.000000] [mem 0x1f2000000-0x1f20c3fff] page 4k
> > [ 0.000000] BRK [0x01cd2000, 0x01cd2fff] PGTABLE
> > [ 0.000000] BRK [0x01cd3000, 0x01cd3fff] PGTABLE
> > [ 0.000000] init_memory_mapping: [mem 0x1f0000000-0x1f1ffffff]
> > [ 0.000000] [mem 0x1f0000000-0x1f1ffffff] page 4k
> > [ 0.000000] BRK [0x01cd4000, 0x01cd4fff] PGTABLE
> > [ 0.000000] BRK [0x01cd5000, 0x01cd5fff] PGTABLE
> > [ 0.000000] BRK [0x01cd6000, 0x01cd6fff] PGTABLE
> > [ 0.000000] init_memory_mapping: [mem 0x180000000-0x1efffffff]
> > [ 0.000000] [mem 0x180000000-0x1efffffff] page 4k
> > [ 0.000000] init_memory_mapping: [mem 0x00100000-0x1fffffff]
> > [ 0.000000] [mem 0x00100000-0x1fffffff] page 4k
> > [ 0.000000] init_memory_mapping: [mem 0x20200000-0x3fffffff]
> > [ 0.000000] [mem 0x20200000-0x3fffffff] page 4k
> > [ 0.000000] init_memory_mapping: [mem 0x40200000-0xbad7ffff]
> > [ 0.000000] [mem 0x40200000-0xbad7ffff] page 4k
> > [ 0.000000] init_memory_mapping: [mem 0xbadf4000-0xbadf5fff]
> > [ 0.000000] [mem 0xbadf4000-0xbadf5fff] page 4k
> > [ 0.000000] init_memory_mapping: [mem 0xbae7f000-0xbaffffff]
> > [ 0.000000] [mem 0xbae7f000-0xbaffffff] page 4k
> > [ 0.000000] init_memory_mapping: [mem 0x100000000-0x17fffffff]
> > [ 0.000000] [mem 0x100000000-0x17fffffff] page 4k
> > [ 0.000000] init_memory_mapping: [mem 0x1f20c4000-0x23fdfffff]
> > [ 0.000000] [mem 0x1f20c4000-0x23fdfffff] page 4k
>
> So the init_memory_mapping calls are all done.
Not so.
>
> > (XEN) d0:v0: unhandled page fault (ec=0000)
> > (XEN) Pagetable walk from ffffea000005b2d0:
> > (XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
> > (XEN) domain_crash_sync called from entry.S
> > (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> > (XEN) ----[ Xen-4.1.5-pre x86_64 debug=y Tainted: C ]----
> > (XEN) CPU: 0
> > (XEN) RIP: e033:[<ffffffff8103feba>]
> > (XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
> > (XEN) rax: ffffea0000000000 rbx: 0000000001a0c000 rcx: 0000000080000000
> > (XEN) rdx: 000000000005b2a0 rsi: 0000000001a0c000 rdi: 0000000000000000
> > (XEN) rbp: ffffffff81a01dd8 rsp: ffffffff81a01d90 r8: 0000000000000000
> > (XEN) r9: 0000000010000001 r10: 0000000000000005 r11: 0000000000100000
> > (XEN) r12: 0000000000000000 r13: 0000020000000000 r14: 0000000000000000
> > (XEN) r15: 0000000000100000 cr0: 000000008005003b cr4: 00000000000026f0
> > (XEN) cr3: 0000000221a0c000 cr2: ffffea000005b2d0
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
> > (XEN) Guest stack trace from rsp=ffffffff81a01d90:
> > (XEN) 0000000080000000 0000000000100000 0000000000000000 ffffffff8103feba
> > (XEN) 000000010000e030 0000000000010006 ffffffff81a01dd8 000000000000e02b
> > (XEN) 0000000000000000 ffffffff81a01e08 ffffffff81042d27 000000023fe00000
> > (XEN) 00000001f20c4000 0000020000000000 00000001acac7000 ffffffff81a01e48
> > (XEN) ffffffff81ad2d21 0000000000000000 0000000000000028 0000000040004000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01ed8
> > (XEN) ffffffff81ac293f ffffffff81b46900 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 ffffffff81a01f00 ffffffff8165fbd1 ffffffff00000010
> > (XEN) ffffffff81a01ee8 ffffffff81a01ea8 0000000000000000 ffffffff81a01ec8
> > (XEN) ffffffffffffffff ffffffff81b46900 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 ffffffff81a01f28 ffffffff81abcd62 ffffffff96062000
> > (XEN) ffffffff81cc6000 ffffffff81ccd000 ffffffff81b4f2e0 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f38
> > (XEN) ffffffff81abc5f7 ffffffff81a01ff8 ffffffff81abf0c7 0300000100000032
> > (XEN) 0000000000000005 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 819822831fc9cbf5 000206a700100800 0000000000000001
> > (XEN) 0000000000000000 0000000000000000 0f00000060c0c748 ccccccccccccc305
> > (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
> > (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>
> Can we get a kernel trace instead?
If you look at the initial one I had posted:
> >> Call Trace:
> >> [<ffffffff8103feba>] xen_get_user_pgd+0x5a <--
> >> [<ffffffff8103feba>] xen_get_user_pgd+0x5a
> >> [<ffffffff81042d27>] xen_write_cr3+0x77
> >> [<ffffffff81ad2d21>] init_mem_mapping+0x1f9
> >> [<ffffffff81ac293f>] setup_arch+0x742
> >> [<ffffffff81666d71>] printk+0x48
> >> [<ffffffff81abcd62>] start_kernel+0x90
> >> [<ffffffff8109416b>] __add_preferred_console.clone.1+0x9b
> >> [<ffffffff81abc5f7>] x86_64_start_reservations+0x2a
> >> [<ffffffff81abf0c7>] xen_start_kernel+0x564
The RIP matches what this stack trace has, so we are still in
init_mem_mapping.
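The hypervisor dump can also be cross-checked with a little arithmetic. The Python sketch below recomputes the L4 index of the faulting address and relates cr2 to the registers in the dump. It assumes the standard x86-64 4-level paging layout (9 index bits per level, L4 index in bits 39-47). Reading rax as the base of the struct page array and rbx as the physical address of the pgd is an interpretation of the register values, not something the log states.

```python
# Register values copied from the Xen crash dump.
CR2 = 0xffffea000005b2d0   # faulting address
RAX = 0xffffea0000000000   # assumed: base of the struct page array
RDX = 0x000000000005b2a0   # assumed: byte offset of one struct page
RBX = 0x0000000001a0c000   # assumed: physical address of the pgd

PAGE_SHIFT = 12            # 4 KiB pages


def l4_index(vaddr: int) -> int:
    """Top-level page-table index: bits 39..47 of the virtual address."""
    return (vaddr >> 39) & 0x1ff


pfn = RBX >> PAGE_SHIFT              # page frame number of the pgd
field_offset = CR2 - RAX - RDX       # offset inside the accessed object
stride, remainder = divmod(RDX, pfn) # bytes per pfn, if RDX = pfn * stride

print(f"L4 index     : {l4_index(CR2):#x}")
print(f"pfn of rbx   : {pfn:#x}")
print(f"field offset : {field_offset:#x}")
print(f"stride       : {stride} bytes, remainder {remainder}")
```

The L4 index comes out as 0x1d4, matching the "L4[0x1d4]" line, and rdx divides evenly by the pfn derived from rbx (56 bytes per pfn, offset 0x30 into the entry). That would fit an access to the struct page of the pgd page through a part of the address space that has no page tables yet, but this reading is an inference from the numbers, not a confirmed diagnosis.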
* Re: [Xen-devel] [GIT PULL] x86/mm changes for v3.9-rc1
2013-02-22 17:53 ` Yinghai Lu
2013-02-22 18:23 ` Konrad Rzeszutek Wilk
@ 2013-02-22 18:25 ` Andrew Cooper
1 sibling, 0 replies; 16+ messages in thread
From: Andrew Cooper @ 2013-02-22 18:25 UTC (permalink / raw)
To: Yinghai Lu
Cc: linux-mips@linux-mips.org, Jeremy Fitzhardinge, Len Brown,
Frederic Weisbecker, Joe Millenbach,
virtualization@lists.linux-foundation.org, Gokul Caushik,
stable@vger.kernel.org, Pavel Machek, H. Peter Anvin,
sparclinux@vger.kernel.org, Christoph Lameter, Ingo Molnar,
Ville Syrjälä, Marek Szyprowski, Andrea Arcangeli,
Eric Biederman, xen-devel@lists.xensource.com, Russell King
On 22/02/13 17:53, Yinghai Lu wrote:
> On Fri, Feb 22, 2013 at 9:24 AM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>> On Fri, Feb 22, 2013 at 11:55:31AM -0500, Konrad Rzeszutek Wilk wrote:
>>> On Thu, Feb 21, 2013 at 04:34:06PM -0800, H. Peter Anvin wrote:
>>>> Hi Linus,
>>>>
>>>> This is a huge set of several partly interrelated (and concurrently
>>>> developed) changes, which is why the branch history is messier than
>>>> one would like.
>>>>
>>>> The *really* big items are two humonguous patchsets mostly developed
>>>> by Yinghai Lu at my request, which completely revamps the way we
>>>> create initial page tables. In particular, rather than estimating how
>>>> much memory we will need for page tables and then build them into that
>>>> memory -- a calculation that has shown to be incredibly fragile -- we
>>>> now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
>>>> a #PF handler which creates temporary page tables on demand.
>>>>
>>>> This has several advantages:
>>>>
>>>> 1. It makes it much easier to support things that need access to
>>>> data very early (a followon patchset uses this to load microcode
>>>> way early in the kernel startup).
>>>>
>>>> 2. It allows the kernel and all the kernel data objects to be invoked
>>>> from above the 4 GB limit. This allows kdump to work on very large
>>>> systems.
>>>>
>>>> 3. It greatly reduces the difference between Xen and native (Xen's
>>>> equivalent of the #PF handler are the temporary page tables created
>>>> by the domain builder), eliminating a bunch of fragile hooks.
>>>>
>>>> The patch series also gets us a bit closer to W^X.
>>>>
>>>> Additional work in this pull is the 64-bit get_user() work which you
>>>> were also involved with, and a bunch of cleanups/speedups to
>>>> __phys_addr()/__pa().
>>> Looking at figuring out which of the patches in the branch did this, but
>>> with this merge I am getting a crash with a very simple PV guest (booted with
>>> one 1G):
>>>
>>> Call Trace:
>>> [<ffffffff8103feba>] xen_get_user_pgd+0x5a <--
>>> [<ffffffff8103feba>] xen_get_user_pgd+0x5a
>>> [<ffffffff81042d27>] xen_write_cr3+0x77
>>> [<ffffffff81ad2d21>] init_mem_mapping+0x1f9
>>> [<ffffffff81ac293f>] setup_arch+0x742
>>> [<ffffffff81666d71>] printk+0x48
>>> [<ffffffff81abcd62>] start_kernel+0x90
>>> [<ffffffff8109416b>] __add_preferred_console.clone.1+0x9b
>>> [<ffffffff81abc5f7>] x86_64_start_reservations+0x2a
>>> [<ffffffff81abf0c7>] xen_start_kernel+0x564
>>>
>>> And the hypervisor says:
>>> (XEN) d7:v0: unhandled page fault (ec=0000)
>>> (XEN) Pagetable walk from ffffea000005b2d0:
>>> (XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
>>> (XEN) domain_crash_sync called from entry.S
>>> (XEN) Domain 7 (vcpu#0) crashed on cpu#3:
>>> (XEN) ----[ Xen-4.2.0 x86_64 debug=n Not tainted ]----
>>> (XEN) CPU: 3
>>> (XEN) RIP: e033:[<ffffffff8103feba>]
>>> (XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
>>> (XEN) rax: ffffea0000000000 rbx: 0000000001a0c000 rcx: 0000000080000000
>>> (XEN) rdx: 000000000005b2a0 rsi: 0000000001a0c000 rdi: 0000000000000000
>>> (XEN) rbp: ffffffff81a01dd8 rsp: ffffffff81a01d90 r8: 0000000000000000
>>> (XEN) r9: 0000000010000001 r10: 0000000000000000 r11: 0000000000000000
>>> (XEN) r12: 0000000000000000 r13: 0000001000000000 r14: 0000000000000000
>>> (XEN) r15: 0000000000100000 cr0: 000000008005003b cr4: 00000000000406f0
>>> (XEN) cr3: 0000000411165000 cr2: ffffea000005b2d0
>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
>>> (XEN) Guest stack trace from rsp=ffffffff81a01d90:
>>> (XEN) 0000000080000000 0000000000000000 0000000000000000 ffffffff8103feba
>>> (XEN) 000000010000e030 0000000000010006 ffffffff81a01dd8 000000000000e02b
>> Here is a better serial log of the crash (just booting a normal Xen 4.1 + initial
>> kernel with 8GB):
>>
>> PXELINUX 3.82 2009-06-09 Copyright (C) 1994-2009 H. Peter Anvin et al
>> boot:
>> Loading xen.gz... ok
>> Loading vmlinuz... ok
>> Loading initramfs.cpio.gz... ok
>> __ __ _ _ _ ____
>> \ \/ /___ _ __ | || | / | | ___| _ __ _ __ ___
>> \ // _ \ '_(_)_(_)____/ | .__/|_| \___|
>> |_|
>> (XEN) Xen version 4.1.5-pre (konrad@dumpdata.com) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) Fri Feb 22 11:37:00 EST 2013
>> (XEN) Latest ChangeSet: Fri Feb 15 15:31:55 2013 +0100 23459:9f12bdd6b7f0
>> (XEN) Console output is synchronous.
>> (XEN) Bootloader: unknown
>> (XEN) Command line: cpuinfo conring_size=1048576 sync_console cpufreq=verbose com1=115200,8n1 console=com1,vga loglvl=all guest_loglvl=all
>> (XEN) Video information:
>> (XEN) VGA is text mode 80x25, font 8x16
>> (XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
>> (XEN) EDID info not retrieved because no DDC retrieval method detected
>> (XEN) Disc information:
>> (XEN) Found 1 MBR signatures
>> (XEN) Found 1 EDD information structures
>> (XEN) Xen-e820 RAM map:
>> (XEN) 0000000000000000 - 000000000009ec00 (usable)
>> (XEN) 000000000009ec00 - 00000000000a0000 (reserved)
>> (XEN) 00000000000e0000 - 0000000000100000 (reserved)
>> (XEN) 0000000000100000 - 0000000020000000 (usable)
>> (XEN) 0000000020000000 - 0000000020200000 (reserved)
>> (XEN) 0000000020200000 - 0000000040000000 (usable)
>> (XEN) 0000000040000000 - 0000000040200000 (reserved)
>> (XEN) 0000000040200000 - 00000000bad80000 (usable)
>> (XEN) 00000000bad80000 - 00000000badc9000 (ACPI NVS)
>> (XEN) 00000000badc9000 - 00000000badd1000 (ACPI data)
>> (XEN) 00000000badd1000 - 00000000badf4000 (reserved)
>> (XEN) 00000000badf4000 - 00000000badf6000 (usable)
>> (XEN) 00000000badf6000 - 00000000bae06000 (reserved)
>> (XEN) 00000000bae06000 - 00000000bae14000 (ACPI NVS)
>> (XEN) 00000000bae14000 - 00000000bae3c000 (reserved)
>> (XEN) 00000000bae3c000 - 00000000bae7f000 (ACPI NVS)
>> (XEN) 00000000bae7f000 - 00000000bb000000 (usable)
>> (XEN) 00000000bb800000 - 00000000bfa00000 (reserved)
>> (XEN) 00000000fed1c000 - 00000000fed40000 (reserved)
>> (XEN) 00000000ff000000 - 0000000100000000 (reserved)
>> (XEN) 0000000100000000 - 000000023fe00000 (usable)
>> (XEN) ACPI: RSDP 000F0450, 0024 (r2 ALASKA)
>> (XEN) ACPI: XSDT BADC9068, 0054 (r1 ALASKA A M I 1072009 AMI 10013)
>> (XEN) PROC 1 MSFT 3000001)
>> (XEN) ACPI: MCFG BADD0580, 003C (r1 ALASKA A M I 1072009 MSFT 97)
>> (XEN) ACPI: HPET BADD05C0, 0038 (r1 ALASKA A M I 1072009 AMI. 4)
>> (XEN) ACPI: ASF! BADD05F8, 00A0 (r32 INTEL HCG 1 TFSM F4240)
>> (XEN) System RAM: 8104MB (8299140kB)
>> (XEN) No NUMA configuration found
>> (XEN) Faking a node at 0000000000000000-000000023fe00000
>> (XEN) Domain heap initialised
>> (XEN) found SMP MP-table at 000fcde0
>> (XEN) DMI 2.7 present.
>> (XEN) Using APIC driver default
>> (XEN) ACPI: PM-Timer IO Port: 0x408
>> (XEN) ACPI: ACPI SLEEP INFO: pm1x_cnt[404,0], pm1x_evt[400,0]
>> (XEN) ACPI: 32/64X FACS address mismatch in FADT - bae0bf80/0000000000000000, using 32
>> (XEN) ACPI: wakeup_vec[bae0bf8c], vec_size[20]
>> (XEN) ACPI: Local APIC address 0xfee00000
>> (XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
>> (XEN) Processor #0 6:10 APIC version 21
>> (XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
>> (XEN) Processor #2 6:10 APIC version 21
>> (XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
>> (XEN) Processor #1 6:10 APIC version 21
>> (XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
>> (XEN) Processor #3 6:10 APIC version 21
>> (XEN) ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
>> (XEN) ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0])
>> (XEN) IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23
>> (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>> (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>> (XEN) ACPI: IRQ0 used by override.
>> (XEN) ACPI: IRQ2 used by override.
>> (XEN) ACPI: IRQ9 used by override.
>> (XEN) Enabling APIC mode: Flat. Using 1 I/O APICs
>> (XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
>> (XEN) PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
>> (XEN) PCI: Not using MMCONFIG.
>> (XEN) Table is not found!
>> (XEN) Using ACPI (MADT) for SMP configuration information
>> (XEN) IRQ limits: 24 GSI, 760 MSI/MSI-X
>> (XEN) Using scheduler: SMP Credit Scheduler (credit)
>> (XEN) Initializing CPU#0
>> (XEN) Detected 3093.067 MHz processor.
>> (XEN) Initing memory sharing.
>> (XEN) CPU: Physical Processor ID: 0
>> (XEN) CPU: Processor Core ID: 0
>> (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
>> (XEN) CPU: L2 cache: 256K
>> (XEN) CPU: L3 cache: 3072K
>> (XEN) mce_intel.c:1162: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
>> (XEN) CPU0: Thermal monitoring enabled (TM1)
>> (XEN) Intel machine check reporting enabled
>> (XEN) I/O virtualisation disabled
>> (XEN) CPU0: Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz stepping 07
>> (XEN) Enabled directed EOI with ioapic_ack_old on!
>> (XEN) ENABLING IO-APIC IRQs
>> (XEN) -> Using old ACK method
>> (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
>> (XEN) TSC deadline timer enabled
>> (XEN) Platform timer is 14.318MHz HPET
>> (XEN) Allocated console ring of 1048576 KiB.
>> (XEN) VMX: Supported advanced features:
>> (isation
>> (XEN) - APIC TPR shadow
>> (XEN) - Extended Page Tables (EPT)
>> (XEN) - Virtual-Processor Identifiers (VPID)
>> (XEN) - Virtual NMI
>> (XEN) - MSR direct-access bitmap
>> (XEN) - Unrestricted Guest
>> (XEN) HVM: ASIDs enabled.
>> (XEN) HVM: VMX enabled
>> (XEN) HVM: Hardware Assisted Paging (HAP) detected
>> (XEN) HVM: HAP page sizes: 4kB, 2MB
>> (XEN) Booting processor 1/1 eip 7c000
>> (XEN) Initializing CPU#1
>> (XEN) CPU: Physical Processor ID: 0
>> (XEN) CPU: Processor Core ID: 0
>> (XEN) CPU: L1 I cache: 32K, L1 D
>> (XEN) Initializing CPU#2
>> (XEN) CPU: Physical Processor ID: 0
>> (XEN) CPU: Processor Core ID: 1
>> (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
>> (XEN) CPU: L2 cache: 256K
>> (XEN) CPU: L3 cache: 3072K
>> (XEN) CPU2: Thermal monitoring enabled (TM1)
>> (XEN) CPU2: Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz stepping 07
>> (XEN) Booting processor 3/3 eip 7c000
>> (XEN) Initializing CPU#3
>> (XEN) CPU: Physical Processor ID: 0
>> (XEN) CPU: Processor Core ID: 1
>> (XEN) CPU: L1 I cache: 32K, L1 D100 CPU @ 3.10GHz stepping 07
>> (XEN) Brought up 4 CPUs
>> (XEN) ACPI sleep modes: S3
>> (XEN) mcheck_poll: Machine check polling timer started.
>> (XEN) *** LOADING DOMAIN 0 ***
>> (XEN) elf_parse_binary: phdr: paddr=0x1000000 memsz=0x9e0000
>> (XEN) elf_parse_binary: phdr: paddr=0x1a00000 memsz=0xa60f0
>> (XEN): paddr=0x1abc000 memsz=0x61b000
>> (XEN) elf_parse_binary: memory: 0x1000000 -> 0x20d7000
>> (XEN) elf_xen_parse_note: GUEST_OS = "linux"
>> (XEN) elf_xen_parse_note: GUEST_VERSION = "2.6"
>> (XEN) elf_xen_parse_note: XEN_VERSION = "xen-3.0"
>> (XEN) elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
>> (XEN) elf_xen_parse_note: ENTRY = 0xffffffff81abc1e0
>> (XEN) elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81001000
>> (XEN) elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
>> (XEN) elf_xen_parse_note: PAE_MODE = "yes"
>> (XEN) elf_xen_parse_note: LOADER = "generic"
>> (XEN) elf_xen_parse_note: unknown xen elf note (0xd)
>> (XEN) elf_xen_parse_note: SUSPEND_CANCEL = 0x1
>> (XEN) elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
>> (XEN) elf_xen_parse_note: PADDR_OFFSET = 0x0
>> (XEN) elf_xen_addr_calc_check: addresses:
>> (XEN) virt_base = 0xffffffff80000000
>> (XEN) elf_paddr_offset = 0x0
>> (XEN) virt_offset = 0xffffffff80000000
>> (XEN) virt_kstart = 0xffffffff81000000
>> (XEN) virt_kend = 0xffffffff820d7000
>> (XEN) virt_entry = 0xffffffff81abc1e0
>> (XEN) p2m_base = 0xffffffffffffffff
>> (XEN) Xen kernel: 64-bit, lsb, compat32
>> (XEN) Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x20d7000
>> (XEN) PHYSICAL MEMORY ARRANGEMENT:
>> (XEN) Dom0 alloc.: 0000000220000000->0000000224000000 (1661249 pages to be allocated)
>> (XEN) Init. ramdisk: 000000022cbdc000->000000023fe00000
>> (XEN) VIRTUAL MEMORY ARRANGEMENT:
>> (XEN) Loaded kernel: ffffffff81000000->ffffffff820d7000
>> (XEN) Init. ramdisk: ffffffff820d7000->ffffffff952fb000
>> (XEN) Phys-Mach map: ffffffff952fb000->ffffffff96060b28
>> (XEN) Start info: ffffffff96061000->ffffffff960614b4
>> (XEN) Page tables: ffffffff96062000->ffffffff96117000
>> (XEN) Boot stack: ffffffff96117000->ffffffff96118000
>> (XEN) TOTAL: ffffffff80000000->ffffffff96400000
>> (XEN) ENTRY ADDRESS: ffffffff81abc1e0
>> (XEN) Dom0 has maximum 4 VCPUs
>> (XEN) elf_load_binary: phdr 0 at 0xffffffff81000000 -> 0xffffffff819e0000
>> (XEN) elf_load_binary: phdr 1 at 0xffffffff81a00000 -> 0xffffffff81aa60f0
>> (XEN) elf_load_binary: phdr 2 at 0xffffffff81aa7000 -> 0xffffffff81abbbc0
>> (XEN) elf_load_binary: phdr 3 at 0xffffffff81abc000 -> 0xffffffff81baf000
>> (XEN) Scrubbing Free RAM: .done.
>> (XEN) Xen trace buffers: disabled
>> (XEN) Std. Loglevel: All
>> (XEN) Guest Loglevel: All
>> (XEN) ***************************intended to aid debugging of Xen by ensuring
>> (XEN) ******* that all output is synchronously delivered on the serial line.
>> (XEN) ******* However it can introduce SIGNIFICANT latencies and affect
>> (XEN) ******* timekeeping. It is NOT recommended for production use!
>> (XEN) **********************************************
>> (XEN) 3... 2... 1...
>> (XEN) Xen is relinquishing VGA console.
>> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
>> (XEN) Freed 224kB init memory.
>> mapping kernel into physical memory
>> about to get started...
>> [ 0.000000] Initializing cgroup subsys cpuset
>> [ 0.000000] Initializing cgroup subsys cpu
>> [ 0.000000] Linux version 3.8.0upstream-06471-g2ef14f4-dirty (konrad@build.dumpdata.com) (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC) ) #1 SMP Fri Feb 22 11:36:48 EST 2013
>> [ 0.000000] Command line: initcall_debug debug console=hvc0 loglevel=10 xen-pciback.hide=(01:00.0) earlyprintk=xen
>> [ 0.000000] Freeing 9e-100 pfn range: 98 pages freed
>> [ 0.000000] 1-1 mapping on 9e->100
>> [ 0.000000] Freeing 20000-20200 pfn range: 512 pages freed
>> [ 0.000000] 1-1 mapping on 20000->20200
>> [ 0.000000] Freeing 40000-40200 pfn range: 512 pages freed
>> [ 0.000000] 1-1 mapping on 40000->40200
>> [ 0.000000] Freeing bad80-badf4 pfn range: 116 pages freed
>> [ 0.000000] 1-1 mapping on bad80->badf4
>> [ 0.000000] Freeing badf6-bae7f pfn range: 137 pages freed
>> [ 0.000000] 1-1 mapping on badf6->bae7f
>> [ 0.000000] Freeing bb000-100000 pfn range: 282624 pages freed
>> [ 0.000000] 1-1 mapping on bb000->100000
>> [ 0.000000] Released 283999 pages of unused memory
>> [ 0.000000] Set 283999 page(s) to 1-1 mapping
>> [ 0.000000] Populating 1acb65-1f20c4 pfn range: 283999 pages added
>> [ 0.000000] e820: BIOS-provided physical RAM map:
>> [ 0.000000] Xen: [mem 0x0000000000000000-0x000000000009dfff] usable
>> [ 0.000000] Xen: [mem 0x000000000009ec00-0x00000000000fffff] reserved
>> [ 0.000000] Xen: [mem 0x0000000000100000-0x000000001fffffff] usable
>> [ 0.000000] Xen: [mem 0x0000000020000000-0x00000000201fffff] reserved
>> [ 0.000000] Xen: [mem 0x0000000020200000-0x000000003fffffff] usable
>> [ 0.000000] Xen: [mem 0x0000000040000000-0x00000000401fffff] reserved
>> [ 0.000000] Xen: [mem 0x0000000040200000-0x00000000bad7ffff] usable
>> [ 0.000000] Xen: [mem 0x00000000bad80000-0x00000000badc8fff] ACPI NVS
>> [ 0.000000] Xen: [mem 0x00000000badc9000-0x00000000badd0fff] ACPI data
>> [ 0.000000] Xen: [mem 0x00000000badd1000-0x00000000badf3fff] reserved
>> [ 0.000000] Xen: [mem 0x00000000badf4000-0x00000000badf5fff] usable
>> [ 0.000000] Xen: [mem 0x00000000badf6000-0x00000000bae05fff] reserved
>> [ 0.000000] Xen: [mem 0x00000000bae06000-0x00000000bae13fff] ACPI NVS
>> [ 0.000000] Xen: [mem 0x00000000bae14000-0x00000000bae3bfff] reserved
>> [ 0.000000] Xen: [mem 0x00000000bae3c000-0x00000000bae7efff] ACPI NVS
>> [ 0.000000] Xen: [mem 0x00000000bae7f000-0x00000000baffffff] usable
>> [ 0.000000] Xen: [mem 0x00000000bb800000-0x00000000bf9fffff] reserved
>> [ 0.000000] Xen: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
>> [ 0.000000] Xen: [mem 0x00000000fed1c000-0x00000000fed3ffff] reserved
>> [ 0.000000] Xen: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
>> [ 0.000000] Xen: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
>> [ 0.000000] Xen: [mem 0x0000000100000000-0x000000023fdfffff] usable
>> [ 0.000000] bootconsole [xenboot0] enabled
>> [ 0.000000] NX (Execute Disable) protection: active
>> [ 0.000000] SMBIOS 2.7 present.
>> [ 0.000000] DMI: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 03/14/2011
>> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
>> [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
>> [ 0.000000] No AGP bridge found
>> [ 0.000000] e820: last_pfn = 0x23fe00 max_arch_pfn = 0x400000000
>> [ 0.000000] e820: lacanning 1 areas for low memory corruption
>> [ 0.000000] Base memory trampoline at [ffff880000098000] 98000 size 24576
>> [ 0.000000] reserving inaccessible SNB gfx pages
>> [ 0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
>> [ 0.000000] [mem 0x00000000-0x000fffff] page 4k
>> [ 0.000000] init_memory_mapping: [mem 0x1f2000000-0x1f20c3fff]
>> [ 0.000000] [mem 0x1f2000000-0x1f20c3fff] page 4k
>> [ 0.000000] BRK [0x01cd2000, 0x01cd2fff] PGTABLE
>> [ 0.000000] BRK [0x01cd3000, 0x01cd3fff] PGTABLE
>> [ 0.000000] init_memory_mapping: [mem 0x1f0000000-0x1f1ffffff]
>> [ 0.000000] [mem 0x1f0000000-0x1f1ffffff] page 4k
>> [ 0.000000] BRK [0x01cd4000, 0x01cd4fff] PGTABLE
>> [ 0.000000] BRK [0x01cd5000, 0x01cd5fff] PGTABLE
>> [ 0.000000] BRK [0x01cd6000, 0x01cd6fff] PGTABLE
>> [ 0.000000] init_memory_mapping: [mem 0x180000000-0x1efffffff]
>> [ 0.000000] [mem 0x180000000-0x1efffffff] page 4k
>> [ 0.000000] init_memory_mapping: [mem 0x00100000-0x1fffffff]
>> [ 0.000000] [mem 0x00100000-0x1fffffff] page 4k
>> [ 0.000000] init_memory_mapping: [mem 0x20200000-0x3fffffff]
>> [ 0.000000] [mem 0x20200000-0x3fffffff] page 4k
>> [ 0.000000] init_memory_mapping: [mem 0x40200000-0xbad7ffff]
>> [ 0.000000] [mem 0x40200000-0xbad7ffff] page 4k
>> [ 0.000000] init_memory_mapping: [mem 0xbadf4000-0xbadf5fff]
>> [ 0.000000] [mem 0xbadf4000-0xbadf5fff] page 4k
>> [ 0.000000] init_memory_mapping: [mem 0xbae7f000-0xbaffffff]
>> [ 0.000000] [mem 0xbae7f000-0xbaffffff] page 4k
>> [ 0.000000] init_memory_mapping: [mem 0x100000000-0x17fffffff]
>> [ 0.000000] [mem 0x100000000-0x17fffffff] page 4k
>> [ 0.000000] init_memory_mapping: [mem 0x1f20c4000-0x23fdfffff]
>> [ 0.000000] [mem 0x1f20c4000-0x23fdfffff] page 4k
> so init_memory_mapping are all done.
>
>> (XEN) d0:v0: unhandled page fault (ec=0000)
>> (XEN) Pagetable walk from ffffea000005b2d0:
>> (XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
>> (XEN) domain_crash_sync called from entry.S
>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>> (XEN) ----[ Xen-4.1.5-pre x86_64 debug=y Tainted: C ]----
>> (XEN) CPU: 0
>> (XEN) RIP: e033:[<ffffffff8103feba>]
>> (XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest
>> (XEN) rax: ffffea0000000000 rbx: 0000000001a0c000 rcx: 0000000080000000
>> (XEN) rdx: 000000000005b2a0 rsi: 0000000001a0c000 rdi: 0000000000000000
>> (XEN) rbp: ffffffff81a01dd8 rsp: ffffffff81a01d90 r8: 0000000000000000
>> (XEN) r9: 0000000010000001 r10: 0000000000000005 r11: 0000000000100000
>> (XEN) r12: 0000000000000000 r13: 0000020000000000 r14: 0000000000000000
>> (XEN) r15: 0000000000100000 cr0: 000000008005003b cr4: 00000000000026f0
>> (XEN) cr3: 0000000221a0c000 cr2: ffffea000005b2d0
>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
>> (XEN) Guest stack trace from rsp=ffffffff81a01d90:
>> (XEN) 0000000080000000 0000000000100000 0000000000000000 ffffffff8103feba
>> (XEN) 000000010000e030 0000000000010006 ffffffff81a01dd8 000000000000e02b
>> (XEN) 0000000000000000 ffffffff81a01e08 ffffffff81042d27 000000023fe00000
>> (XEN) 00000001f20c4000 0000020000000000 00000001acac7000 ffffffff81a01e48
>> (XEN) ffffffff81ad2d21 0000000000000000 0000000000000028 0000000040004000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01ed8
>> (XEN) ffffffff81ac293f ffffffff81b46900 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 ffffffff81a01f00 ffffffff8165fbd1 ffffffff00000010
>> (XEN) ffffffff81a01ee8 ffffffff81a01ea8 0000000000000000 ffffffff81a01ec8
>> (XEN) ffffffffffffffff ffffffff81b46900 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 ffffffff81a01f28 ffffffff81abcd62 ffffffff96062000
>> (XEN) ffffffff81cc6000 ffffffff81ccd000 ffffffff81b4f2e0 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81a01f38
>> (XEN) ffffffff81abc5f7 ffffffff81a01ff8 ffffffff81abf0c7 0300000100000032
>> (XEN) 0000000000000005 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) 0000000000000000 819822831fc9cbf5 000206a700100800 0000000000000001
>> (XEN) 0000000000000000 0000000000000000 0f00000060c0c748 ccccccccccccc305
>> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
>> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
> can we get kernel trace instead?
(XEN) d0:v0: unhandled page fault (ec=0000)
(XEN) Pagetable walk from ffffea000005b2d0:
(XEN) L4[0x1d4] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
This means that Xen was unable to context switch back to dom0.
The value in rax looks suspiciously related to the faulting address in cr2.
Konrad: is the instruction at ffffffff8103feba possibly a rax-relative
dereference?
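
The relationship between rax, rdx and cr2 can be checked with a little arithmetic. The sketch below is only an illustration: the three values are copied from the register dump quoted above, the helper name l4_index is made up for this example, and the index calculation assumes ordinary 4-level x86-64 paging (bits 47..39 select the L4 entry). It does not disassemble anything, so it cannot confirm which instruction sits at the faulting RIP.

```python
# Values copied from the Xen register dump quoted above.
RAX = 0xffffea0000000000
RDX = 0x000000000005b2a0
CR2 = 0xffffea000005b2d0  # faulting address


def l4_index(vaddr: int) -> int:
    """Top-level page-table index, assuming 4-level x86-64 paging (bits 47..39)."""
    return (vaddr >> 39) & 0x1ff


offset = CR2 - RAX

print(hex(l4_index(CR2)))  # 0x1d4, the slot named in "L4[0x1d4]"
print(hex(offset))         # 0x5b2d0, distance of the fault from rax
print(hex(offset - RDX))   # 0x30, what is left after also subtracting rdx
```

So the faulting address is rax + 0x5b2d0, which also equals rax + rdx + 0x30. That is consistent with a rax-relative access using a small displacement, but it remains a guess until the instruction at the RIP is disassembled.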
~Andrew
end of thread, other threads:[~2013-02-22 18:25 UTC | newest]
Thread overview: 16+ messages
2013-02-22 0:34 [GIT PULL] x86/mm changes for v3.9-rc1 H. Peter Anvin
2013-02-22 16:22 ` Linus Torvalds
2013-02-22 17:31 ` H. Peter Anvin
2013-02-22 16:55 ` Konrad Rzeszutek Wilk
2013-02-22 17:12 ` H. Peter Anvin
2013-02-22 17:38 ` Konrad Rzeszutek Wilk
2013-02-22 18:06 ` Stefano Stabellini
2013-02-22 18:22 ` Yinghai Lu
2013-02-22 18:08 ` Yinghai Lu
2013-02-22 17:24 ` Konrad Rzeszutek Wilk
2013-02-22 17:30 ` H. Peter Anvin
2013-02-22 17:53 ` Yinghai Lu
2013-02-22 18:23 ` Konrad Rzeszutek Wilk
2013-02-22 18:25 ` [Xen-devel] " Andrew Cooper
2013-02-22 17:30 ` Dave Hansen
2013-02-22 17:33 ` H. Peter Anvin