* [PATCH 0/2] x86: cleanup highmap after brk is concluded
@ 2011-02-28 18:24 Stefano Stabellini
2011-02-28 18:25 ` [PATCH 1/2] x86: Cleanup " stefano.stabellini
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Stefano Stabellini @ 2011-02-28 18:24 UTC
To: linux-kernel
Cc: xen-devel, Jeremy Fitzhardinge, H. Peter Anvin,
Konrad Rzeszutek Wilk, Yinghai Lu, Stefano Stabellini
[-- Attachment #1: Type: text/plain, Size: 3226 bytes --]
Hi all,
a little while ago I sent a patch titled "x86/mm/init: respect memblock
reserved regions when destroying mappings"
(https://lkml.org/lkml/2011/1/31/232) to fix a serious boot crash
problem on Xen (full logs attached):
Pid: 0, comm: swapper Not tainted 2.6.38-rc6+ #1270 Hewlett-Packard HP xw8600 Workstation/0A98h
RIP: e030:[<ffffffff81008314>] [<ffffffff81008314>] get_phys_to_machine+0x44/0x50
RSP: e02b:ffffffff82001ca0 EFLAGS: 00010002
RAX: ffffffff824ce000 RBX: 0000000126004067 RCX: 0000000000000010
RDX: 0000000000000000 RSI: 00000001cfdc2000 RDI: 0000000000000004
RBP: ffffffff82001ca0 R08: 0000000000000020 R09: 0000000000000000
R10: 0000000000000007 R11: 00000000ffffffff R12: 0000000000126004
R13: 0000000000002004 R14: ffff880100000000 R15: ffff8801cfdc2000
FS: 0000000000000000(0000) GS:ffffffff82162000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000002003000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
Process swapper (pid: 0, threadinfo ffffffff82000000, task ffffffff8200b020)
Stack:
ffffffff82001cd0 ffffffff8100582c ffffffff81dce1bc ffffffff82001e10
00000001cfdc2000 ffffffff82003880 ffffffff82001ce0 ffffffff8100587e
ffffffff82001d98 ffffffff8100498f 00000000ffffffff 0000000000000007
Call Trace:
[<ffffffff8100582c>] pte_mfn_to_pfn+0x8c/0xb0
[<ffffffff8100587e>] xen_pgd_val+0xe/0x10
[<ffffffff8100498f>] __raw_callee_save_xen_pgd_val+0x11/0x1e
[<ffffffff813ba570>] ? xenboot_write_console+0x0/0xd0
[<ffffffff821c24b8>] ? kernel_physical_mapping_init+0x83/0x1db
[<ffffffff8195469f>] init_memory_mapping+0x31f/0x6d0
[<ffffffff821989fd>] ? memblock_reserve+0x1b/0x21
[<ffffffff8217de95>] setup_arch+0xa59/0xd89
[<ffffffff819b9c90>] ? _raw_spin_unlock_irqrestore+0x20/0x30
[<ffffffff810074bd>] ? __raw_callee_save_xen_irq_disable+0x11/0x1e
[<ffffffff82177b35>] start_kernel+0xc6/0x4df
[<ffffffff821772c5>] x86_64_start_reservations+0xa5/0xc9
[<ffffffff8217b6fa>] xen_start_kernel+0x5d3/0x6a9
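A note for decoding the oops above, offered as a hedged sketch rather than the kernel's actual source: the "Code:" bytes in the attached full log shift the pfn right by 18 and by 9 and mask with 0x1ff before three dependent loads, which suggests a three-level lookup with 512 entries per level. Under that assumption (the struct and function names below are hypothetical), the register values RDX=0, RCX=0x10 and RDI=4 correspond to pfn 0x2004, which matches R13:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 512 entries per level, i.e. a 4096-byte page of 8-byte entries. */
#define P2M_ENTRIES_PER_LEVEL 512u

/* Hypothetical helper type: the three indices of a three-level p2m lookup. */
struct p2m_index {
    uint64_t top;
    uint64_t mid;
    uint64_t idx;
};

/* Split a pfn the way the shifts and masks in the oops' Code: bytes suggest. */
static struct p2m_index p2m_split(uint64_t pfn)
{
    struct p2m_index i;

    i.top = pfn >> 18;
    i.mid = (pfn >> 9) & (P2M_ENTRIES_PER_LEVEL - 1);
    i.idx = pfn & (P2M_ENTRIES_PER_LEVEL - 1);
    return i;
}

/* Inverse of p2m_split(): rebuild the pfn from the three indices. */
static uint64_t p2m_join(struct p2m_index i)
{
    assert(i.mid < P2M_ENTRIES_PER_LEVEL && i.idx < P2M_ENTRIES_PER_LEVEL);
    return (i.top << 18) | (i.mid << 9) | i.idx;
}
```

Calling p2m_split(0x2004) gives top 0, mid 0x10 and idx 4, the values seen in RDX, RCX and RDI at the time of the fault.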
Even though a clear solution wasn't reached in the discussion that followed,
Yinghai Lu sent a patch to move cleanup_highmap() after reserve_brk() so
that we don't have to clear the initial mappings in two steps.
The patch is a nice cleanup and, with a few small changes to honour the
variable max_pfn_mapped, can be used to fix the boot issue on Xen: all we
have to do is set max_pfn_mapped to the last valid pfn mapped on Xen,
that is, the page backing _end.
The list of patches with diffstat follows, comments and suggestions are
very welcome:
Stefano Stabellini (1):
xen: set max_pfn_mapped to the last pfn mapped
Yinghai Lu (1):
x86: Cleanup highmap after brk is concluded
arch/x86/kernel/head64.c | 3 ---
arch/x86/kernel/setup.c | 6 ++++++
arch/x86/mm/init.c | 19 -------------------
arch/x86/mm/init_64.c | 11 ++++++-----
arch/x86/xen/mmu.c | 13 +++++++------
5 files changed, 19 insertions(+), 33 deletions(-)
A git branch based on 2.6.38-rc6 is available here:
git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git 2.6.38-rc6-mm-fix
Cheers,
Stefano
[-- Attachment #2: Type: text/plain, Size: 16800 bytes --]
__ __ _ _ _ ___ _ _
\ \/ /___ _ __ | || | / | / _ \ _ __ ___| || | _ __ _ __ ___
\ // _ \ '_ \ | || |_ | || | | |__| '__/ __| || |_ __| '_ \| '__/ _ \
/ \ __/ | | | |__ _|| || |_| |__| | | (__|__ _|__| |_) | | | __/
/_/\_\___|_| |_| |_|(_)_(_)___/ |_| \___| |_| | .__/|_| \___|
|_|
(XEN) Xen version 4.1.0-rc4-pre (sstabellini@localdomain) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) Thu Feb 10 18:45:30 GMT 2011
(XEN) Latest ChangeSet: Thu Feb 10 09:02:50 2011 +0000 22895:19b2424be183
(XEN) Bootloader: GNU GRUB 0.97
(XEN) Command line: iommu=1 com1=115200,8n1 console=com1,tty dom0_mem=770M
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN) EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN) Found 2 MBR signatures
(XEN) Found 2 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 0000000000096400 (usable)
(XEN) 0000000000096400 - 00000000000a0000 (reserved)
(XEN) 00000000000e8000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 00000000cffc2840 (usable)
(XEN) 00000000cffc2840 - 00000000d0000000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fec00000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 0000000130000000 (usable)
(XEN) ACPI: RSDP 000E9810, 0024 (r2 HPQOEM)
(XEN) ACPI: XSDT CFFC52EC, 006C (r1 HPQOEM SLIC-WKS 20080808 0)
(XEN) ACPI: FACP CFFC5494, 00F4 (r3 HPQOEM SEABURG 1 0)
(XEN) ACPI Error (tbfadt-0455): 32/64X address mismatch in "Gpe0Block": [0000F828] [000000000001F030], using 64X [20070126]
(XEN) ACPI: DSDT CFFC5A7A, 26A4 (r1 HPQOEM DSDT 1 MSFT 100000E)
(XEN) ACPI: FACS CFFC5200, 0040
(XEN) ACPI: SSDT CFFC811E, 6DC0 (r1 HPQOEM PROJECT 1 MSFT 100000E)
(XEN) ACPI: APIC CFFC5588, 00D4 (r1 HPQOEM SEABURG 1 0)
(XEN) ACPI: ASF! CFFC565C, 006A (r32 HPQOEM SEABURG 1 0)
(XEN) ACPI: MCFG CFFC585E, 003C (r1 HPQOEM SEABURG 1 0)
(XEN) ACPI: SLIC CFFC589A, 0176 (r1 HPQOEM SLIC-WKS 1 0)
(XEN) ACPI: HPET CFFC5A10, 0038 (r1 HPQOEM SEABURG 1 0)
(XEN) ACPI: TCPA CFFC5A48, 0032 (r1 HPQOEM SEABURG 1 0)
(XEN) ACPI: DMAR CFFC56C6, 0120 (r1 HPQOEM SEABURG 1 0)
(XEN) System RAM: 4095MB (4193632kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-0000000130000000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 000fe700
(XEN) DMI 2.5 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0xf808
(XEN) ACPI: ACPI SLEEP INFO: pm1x_cnt[f804,460], pm1x_evt[f800,0]
(XEN) ACPI: wakeup_vec[cffc520c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) Processor #0 7:7 APIC version 20
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) Processor #2 7:7 APIC version 20
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
(XEN) Processor #1 7:7 APIC version 20
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
(XEN) Processor #3 7:7 APIC version 20
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x08] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x09] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x0a] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x0b] disabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
(XEN) ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec89000] gsi_base[24])
(XEN) IOAPIC[1]: apic_id 2, version 32, address 0xfec89000, GSI 24-47
(XEN) ACPI: IOAPIC (id[0x03] address[0xfec88000] gsi_base[48])
(XEN) IOAPIC[2]: apic_id 3, version 32, address 0xfec88000, GSI 48-71
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode: Flat. Using 3 I/O APICs
(XEN) ACPI: HPET id: 0x8086a201 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) Table is not found!
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) IRQ limits: 72 GSI, 712 MSI/MSI-X
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2000.131 MHz processor.
(XEN) Initing memory sharing.
(XEN) mce_intel.c:1162: MCA Capability: BCAST 1 SER 0 CMCI 0 firstbank 1 extended MCE MSR 0
(XEN) Intel machine check reporting enabled
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation not enabled.
(XEN) Intel VT-d Interrupt Remapping not enabled.
(XEN) I/O virtualisation enabled
(XEN) - Dom0 mode: Relaxed
(XEN) ENABLING IO-APIC IRQs
(XEN) -> Using new ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) Platform timer is 14.318MHz HPET
(XEN) Allocated console ring of 32 KiB.
(XEN) VMX: Supported advanced features:
(XEN) - APIC MMIO access virtualisation
(XEN) - APIC TPR shadow
(XEN) - Virtual NMI
(XEN) - MSR direct-access bitmap
(XEN) HVM: ASIDs disabled.
(XEN) HVM: VMX enabled
(XEN) Brought up 4 CPUs
(XEN) HPET: 3 timers in total, 0 timers will be used for broadcast
(XEN) ACPI sleep modes: S3
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) *** LOADING DOMAIN 0 ***
(XEN) elf_parse_binary: phdr: paddr=0x1000000 memsz=0xecf000
(XEN) elf_parse_binary: phdr: paddr=0x2000000 memsz=0x1606c0
(XEN) elf_parse_binary: phdr: paddr=0x2161000 memsz=0x8c8
(XEN) elf_parse_binary: phdr: paddr=0x2162000 memsz=0x14580
(XEN) elf_parse_binary: phdr: paddr=0x2177000 memsz=0x347000
(XEN) elf_parse_binary: memory: 0x1000000 -> 0x24be000
(XEN) elf_xen_parse_note: GUEST_OS = "linux"
(XEN) elf_xen_parse_note: GUEST_VERSION = "2.6"
(XEN) elf_xen_parse_note: XEN_VERSION = "xen-3.0"
(XEN) elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
(XEN) elf_xen_parse_note: ENTRY = 0xffffffff82177200
(XEN) elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81001000
(XEN) elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
(XEN) elf_xen_parse_note: PAE_MODE = "yes"
(XEN) elf_xen_parse_note: LOADER = "generic"
(XEN) elf_xen_parse_note: unknown xen elf note (0xd)
(XEN) elf_xen_parse_note: SUSPEND_CANCEL = 0x1
(XEN) elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
(XEN) elf_xen_parse_note: PADDR_OFFSET = 0x0
(XEN) elf_xen_addr_calc_check: addresses:
(XEN) virt_base = 0xffffffff80000000
(XEN) elf_paddr_offset = 0x0
(XEN) virt_offset = 0xffffffff80000000
(XEN) virt_kstart = 0xffffffff81000000
(XEN) virt_kend = 0xffffffff824be000
(XEN) virt_entry = 0xffffffff82177200
(XEN) p2m_base = 0xffffffffffffffff
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x24be000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 0000000124000000->0000000128000000 (180736 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: ffffffff81000000->ffffffff824be000
(XEN) Init. ramdisk: ffffffff824be000->ffffffff824be000
(XEN) Phys-Mach map: ffffffff824be000->ffffffff8263f000
(XEN) Start info: ffffffff8263f000->ffffffff8263f4b4
(XEN) Page tables: ffffffff82640000->ffffffff82657000
(XEN) Boot stack: ffffffff82657000->ffffffff82658000
(XEN) TOTAL: ffffffff80000000->ffffffff82800000
(XEN) ENTRY ADDRESS: ffffffff82177200
(XEN) Dom0 has maximum 4 VCPUs
(XEN) elf_load_binary: phdr 0 at 0xffffffff81000000 -> 0xffffffff81ecf000
(XEN) elf_load_binary: phdr 1 at 0xffffffff82000000 -> 0xffffffff821606c0
(XEN) elf_load_binary: phdr 2 at 0xffffffff82161000 -> 0xffffffff821618c8
(XEN) elf_load_binary: phdr 3 at 0xffffffff82162000 -> 0xffffffff82176580
(XEN) elf_load_binary: phdr 4 at 0xffffffff82177000 -> 0xffffffff82252000
(XEN) Scrubbing Free RAM: ................................done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 216kB init memory.
mapping kernel into physical memory
Xen: setup ISA identity maps
about to get started...
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Linux version 2.6.38-rc6+ (sstabellini@cosworth) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #1270 SMP Mon Feb 28 18:04:13 GMT 2011
[ 0.000000] Command line: earlyprintk=xenboot debug root=/dev/sda1 console=hvc0 loglevel=9
[ 0.000000] released 0 pages of unused memory
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] Xen: 0000000000000000 - 0000000000096000 (usable)
[ 0.000000] Xen: 0000000000096400 - 0000000000100000 (reserved)
[ 0.000000] Xen: 0000000000100000 - 0000000030200000 (usable)
[ 0.000000] Xen: 00000000cffc2840 - 00000000d0000000 (reserved)
[ 0.000000] Xen: 00000000e0000000 - 00000000f0000000 (reserved)
[ 0.000000] Xen: 00000000fec00000 - 0000000100000000 (reserved)
[ 0.000000] Xen: 0000000100000000 - 00000001cfdc2000 (usable)
[ 0.000000] bootconsole [xenboot0] enabled
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI 2.5 present.
[ 0.000000] DMI: Hewlett-Packard HP xw8600 Workstation/0A98h, BIOS 786F5 v01.27 08/08/2008
[ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
[ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
[ 0.000000] No AGP bridge found
[ 0.000000] last_pfn = 0x1cfdc2 max_arch_pfn = 0x400000000
[ 0.000000] last_pfn = 0x30200 max_arch_pfn = 0x400000000
[ 0.000000] found SMP MP-table at [ffff8800000fe700] fe700
[ 0.000000] Scanning 0 areas for low memory corruption
[ 0.000000] initial memory mapped : 0 - 02fff000
[ 0.000000] init_memory_mapping: 0000000000000000-0000000030200000
[ 0.000000] 0000000000 - 0030200000 page 4k
[ 0.000000] kernel direct mapping tables up to 30200000 @ 2e7c000-2fff000
[ 0.000000] init_memory_mapping: 0000000100000000-00000001cfdc2000
[ 0.000000] 0100000000 - 01cfdc2000 page 4k
[ 0.000000] kernel direct mapping tables up to 1cfdc2000 @ 2f378000-30200000
[ 0.000000] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 0.000000] IP: [<ffffffff81008314>] get_phys_to_machine+0x44/0x50
[ 0.000000] PGD 0
[ 0.000000] Oops: 0000 [#1] SMP
[ 0.000000] last sysfs file:
[ 0.000000] CPU 0
[ 0.000000] Modules linked in:
[ 0.000000]
[ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.38-rc6+ #1270 Hewlett-Packard HP xw8600 Workstation/0A98h
[ 0.000000] RIP: e030:[<ffffffff81008314>] [<ffffffff81008314>] get_phys_to_machine+0x44/0x50
[ 0.000000] RSP: e02b:ffffffff82001ca0 EFLAGS: 00010002
[ 0.000000] RAX: ffffffff824ce000 RBX: 0000000126004067 RCX: 0000000000000010
[ 0.000000] RDX: 0000000000000000 RSI: 00000001cfdc2000 RDI: 0000000000000004
[ 0.000000] RBP: ffffffff82001ca0 R08: 0000000000000020 R09: 0000000000000000
[ 0.000000] R10: 0000000000000007 R11: 00000000ffffffff R12: 0000000000126004
[ 0.000000] R13: 0000000000002004 R14: ffff880100000000 R15: ffff8801cfdc2000
[ 0.000000] FS: 0000000000000000(0000) GS:ffffffff82162000(0000) knlGS:0000000000000000
[ 0.000000] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.000000] CR2: 0000000000000000 CR3: 0000000002003000 CR4: 0000000000002660
[ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.000000] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
[ 0.000000] Process swapper (pid: 0, threadinfo ffffffff82000000, task ffffffff8200b020)
[ 0.000000] Stack:
[ 0.000000] ffffffff82001cd0 ffffffff8100582c ffffffff81dce1bc ffffffff82001e10
[ 0.000000] 00000001cfdc2000 ffffffff82003880 ffffffff82001ce0 ffffffff8100587e
[ 0.000000] ffffffff82001d98 ffffffff8100498f 00000000ffffffff 0000000000000007
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff8100582c>] pte_mfn_to_pfn+0x8c/0xb0
[ 0.000000] [<ffffffff8100587e>] xen_pgd_val+0xe/0x10
[ 0.000000] [<ffffffff8100498f>] __raw_callee_save_xen_pgd_val+0x11/0x1e
[ 0.000000] [<ffffffff813ba570>] ? xenboot_write_console+0x0/0xd0
[ 0.000000] [<ffffffff821c24b8>] ? kernel_physical_mapping_init+0x83/0x1db
[ 0.000000] [<ffffffff8195469f>] init_memory_mapping+0x31f/0x6d0
[ 0.000000] [<ffffffff821989fd>] ? memblock_reserve+0x1b/0x21
[ 0.000000] [<ffffffff8217de95>] setup_arch+0xa59/0xd89
[ 0.000000] [<ffffffff819b9c90>] ? _raw_spin_unlock_irqrestore+0x20/0x30
[ 0.000000] [<ffffffff810074bd>] ? __raw_callee_save_xen_irq_disable+0x11/0x1e
[ 0.000000] [<ffffffff82177b35>] start_kernel+0xc6/0x4df
[ 0.000000] [<ffffffff821772c5>] x86_64_start_reservations+0xa5/0xc9
[ 0.000000] [<ffffffff8217b6fa>] xen_start_kernel+0x5d3/0x6a9
[ 0.000000] Code: 48 89 fa 48 8b 05 ed 1a 25 01 48 89 f9 48 c1 ea 12 48 c1 e9 09 81 e7 ff 01 00 00 89 d2 81 e1 ff 01 00 00 48 8b 04 d0 48 8b 04 c8 <48> 8b 04 f8 c9 c3 66 0f 1f 44 00 00 55 48 89 e5 e8 97 3a 00 00
[ 0.000000] RIP [<ffffffff81008314>] get_phys_to_machine+0x44/0x50
[ 0.000000] RSP <ffffffff82001ca0>
[ 0.000000] CR2: 0000000000000000
[ 0.000000] ---[ end trace a7919e7f17c0a725 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] Pid: 0, comm: swapper Tainted: G D 2.6.38-rc6+ #1270
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff819b68ca>] ? panic+0xbf/0x1f0
[ 0.000000] [<ffffffff8107ff21>] ? __blocking_notifier_call_chain+0x21/0x90
[ 0.000000] [<ffffffff8105cdd1>] ? do_exit+0x791/0x860
[ 0.000000] [<ffffffff81058990>] ? kmsg_dump+0x50/0x110
[ 0.000000] [<ffffffff819bb0c4>] ? oops_end+0xe4/0xf0
[ 0.000000] [<ffffffff810383a9>] ? no_context+0xf9/0x270
[ 0.000000] [<ffffffff81038675>] ? __bad_area_nosemaphore+0x155/0x200
[ 0.000000] [<ffffffff810082d9>] ? get_phys_to_machine+0x9/0x50
[ 0.000000] [<ffffffff8100749f>] ? __raw_callee_save_xen_restore_fl+0x11/0x1e
[ 0.000000] [<ffffffff81038733>] ? bad_area_nosemaphore+0x13/0x20
[ 0.000000] [<ffffffff819bd97e>] ? do_page_fault+0x3ae/0x4d0
[ 0.000000] [<ffffffff813ba570>] ? xenboot_write_console+0x0/0xd0
[ 0.000000] [<ffffffff8100749f>] ? __raw_callee_save_xen_restore_fl+0x11/0x1e
[ 0.000000] [<ffffffff8100749f>] ? __raw_callee_save_xen_restore_fl+0x11/0x1e
[ 0.000000] [<ffffffff813ba570>] ? xenboot_write_console+0x0/0xd0
[ 0.000000] [<ffffffff819b9c90>] ? _raw_spin_unlock_irqrestore+0x20/0x30
[ 0.000000] [<ffffffff819ba3d5>] ? page_fault+0x25/0x30
[ 0.000000] [<ffffffff81008314>] ? get_phys_to_machine+0x44/0x50
[ 0.000000] [<ffffffff8100582c>] ? pte_mfn_to_pfn+0x8c/0xb0
[ 0.000000] [<ffffffff8100587e>] ? xen_pgd_val+0xe/0x10
[ 0.000000] [<ffffffff8100498f>] ? __raw_callee_save_xen_pgd_val+0x11/0x1e
[ 0.000000] [<ffffffff813ba570>] ? xenboot_write_console+0x0/0xd0
[ 0.000000] [<ffffffff821c24b8>] ? kernel_physical_mapping_init+0x83/0x1db
[ 0.000000] [<ffffffff8195469f>] ? init_memory_mapping+0x31f/0x6d0
[ 0.000000] [<ffffffff821989fd>] ? memblock_reserve+0x1b/0x21
[ 0.000000] [<ffffffff8217de95>] ? setup_arch+0xa59/0xd89
[ 0.000000] [<ffffffff819b9c90>] ? _raw_spin_unlock_irqrestore+0x20/0x30
[ 0.000000] [<ffffffff810074bd>] ? __raw_callee_save_xen_irq_disable+0x11/0x1e
[ 0.000000] [<ffffffff82177b35>] ? start_kernel+0xc6/0x4df
[ 0.000000] [<ffffffff821772c5>] ? x86_64_start_reservations+0xa5/0xc9
[ 0.000000] [<ffffffff8217b6fa>] ? xen_start_kernel+0x5d3/0x6a9
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
* [PATCH 1/2] x86: Cleanup highmap after brk is concluded
2011-02-28 18:24 [PATCH 0/2] x86: cleanup highmap after brk is concluded Stefano Stabellini
@ 2011-02-28 18:25 ` stefano.stabellini
2011-02-28 18:25 ` [PATCH 2/2] xen: set max_pfn_mapped to the last pfn mapped stefano.stabellini
2011-02-28 18:42 ` [PATCH 0/2] x86: cleanup highmap after brk is concluded Yinghai Lu
2 siblings, 0 replies; 6+ messages in thread
From: stefano.stabellini @ 2011-02-28 18:25 UTC
To: linux-kernel
Cc: xen-devel, hpa, konrad.wilk, yinghai, jeremy, Stefano.Stabellini,
Stefano Stabellini
From: Yinghai Lu <yinghai@kernel.org>
Currently cleanup_highmap() effectively runs in two steps: the first is early
in head64.c and only clears above _end; the second is in
init_memory_mapping() and tries to clean from _brk_end to _end.
The latter should check whether those boundaries are PMD_SIZE aligned, but
currently does not.
Also, init_memory_mapping() is called several times for NUMA or memory
hotplug, so we really should not handle the initial kernel mappings there.
This patch moves cleanup_highmap() down, after _brk_end is settled, so
we can do everything in one step.
The implementation of cleanup_highmap() now also honors max_pfn_mapped.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
arch/x86/kernel/head64.c | 3 ---
arch/x86/kernel/setup.c | 6 ++++++
arch/x86/mm/init.c | 19 -------------------
arch/x86/mm/init_64.c | 11 ++++++-----
4 files changed, 12 insertions(+), 27 deletions(-)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 2d2673c..5655c22 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -77,9 +77,6 @@ void __init x86_64_start_kernel(char * real_mode_data)
/* Make NULL pointers segfault */
zap_identity_mappings();
- /* Cleanup the over mapped high alias */
- cleanup_highmap();
-
max_pfn_mapped = KERNEL_IMAGE_SIZE >> PAGE_SHIFT;
for (i = 0; i < NUM_EXCEPTION_VECTORS; i++) {
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index d3cfe26..f03e6e0 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -297,6 +297,9 @@ static void __init init_gbpages(void)
static inline void init_gbpages(void)
{
}
+static void __init cleanup_highmap(void)
+{
+}
#endif
static void __init reserve_brk(void)
@@ -922,6 +925,9 @@ void __init setup_arch(char **cmdline_p)
*/
reserve_brk();
+ /* Cleanup the over mapped high alias after _brk_end*/
+ cleanup_highmap();
+
memblock.current_limit = get_max_mapped();
memblock_x86_fill();
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 947f42a..f13ff3a 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -279,25 +279,6 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
load_cr3(swapper_pg_dir);
#endif
-#ifdef CONFIG_X86_64
- if (!after_bootmem && !start) {
- pud_t *pud;
- pmd_t *pmd;
-
- mmu_cr4_features = read_cr4();
-
- /*
- * _brk_end cannot change anymore, but it and _end may be
- * located on different 2M pages. cleanup_highmap(), however,
- * can only consider _end when it runs, so destroy any
- * mappings beyond _brk_end here.
- */
- pud = pud_offset(pgd_offset_k(_brk_end), _brk_end);
- pmd = pmd_offset(pud, _brk_end - 1);
- while (++pmd <= pmd_offset(pud, (unsigned long)_end - 1))
- pmd_clear(pmd);
- }
-#endif
__flush_tlb_all();
if (!after_bootmem && e820_table_end > e820_table_start)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 71a5929..a8d08c2 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -51,6 +51,7 @@
#include <asm/numa.h>
#include <asm/cacheflush.h>
#include <asm/init.h>
+#include <asm/setup.h>
static int __init parse_direct_gbpages_off(char *arg)
{
@@ -293,18 +294,18 @@ void __init init_extra_mapping_uc(unsigned long phys, unsigned long size)
* to the compile time generated pmds. This results in invalid pmds up
* to the point where we hit the physaddr 0 mapping.
*
- * We limit the mappings to the region from _text to _end. _end is
- * rounded up to the 2MB boundary. This catches the invalid pmds as
+ * We limit the mappings to the region from _text to _brk_end. _brk_end
+ * is rounded up to the 2MB boundary. This catches the invalid pmds as
* well, as they are located before _text:
*/
void __init cleanup_highmap(void)
{
unsigned long vaddr = __START_KERNEL_map;
- unsigned long end = roundup((unsigned long)_end, PMD_SIZE) - 1;
+ unsigned long vaddr_end = __START_KERNEL_map + (max_pfn_mapped << PAGE_SHIFT);
+ unsigned long end = roundup((unsigned long)_brk_end, PMD_SIZE) - 1;
pmd_t *pmd = level2_kernel_pgt;
- pmd_t *last_pmd = pmd + PTRS_PER_PMD;
- for (; pmd < last_pmd; pmd++, vaddr += PMD_SIZE) {
+ for (; vaddr + PMD_SIZE - 1 < vaddr_end; pmd++, vaddr += PMD_SIZE) {
if (pmd_none(*pmd))
continue;
if (vaddr < (unsigned long) _text || vaddr > end)
--
1.5.6.5
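To see how the new loop bound behaves, here is a small user-space model of the patched cleanup_highmap() loop. It is only a sketch: the array stands in for level2_kernel_pgt, a zero entry plays the role of pmd_none(), and the extra index bound is added for the model's safety and is not in the patch. The sample values are assumptions taken from the attached boot log (virt_kstart used as _text, virt_kend used as _brk_end), with max_pfn_mapped either 0x24be (the pfn of the "Phys-Mach map" start) or 0x20000 (assuming KERNEL_IMAGE_SIZE is 512MB).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT        12
#define PMD_SIZE          (1ULL << 21)           /* 2MB per pmd entry */
#define PTRS_PER_PMD      512
#define START_KERNEL_MAP  0xffffffff80000000ULL  /* stands in for __START_KERNEL_map */

/* Round x up to the next PMD_SIZE boundary. */
static uint64_t roundup_pmd(uint64_t x)
{
    return (x + PMD_SIZE - 1) & ~(PMD_SIZE - 1);
}

/*
 * Model of the patched loop: clear entries that lie before text or after
 * the 2MB-rounded brk_end, but never look past max_pfn_mapped.
 * Returns the number of entries cleared.
 */
static int model_cleanup_highmap(uint64_t pmd[PTRS_PER_PMD], uint64_t text,
                                 uint64_t brk_end, uint64_t max_pfn_mapped)
{
    uint64_t vaddr = START_KERNEL_MAP;
    uint64_t vaddr_end = START_KERNEL_MAP + (max_pfn_mapped << PAGE_SHIFT);
    uint64_t end = roundup_pmd(brk_end) - 1;
    size_t i = 0;
    int cleared = 0;

    assert(text >= START_KERNEL_MAP && brk_end >= text);

    /* "i < PTRS_PER_PMD" is a model-only guard, not part of the patch. */
    for (; i < PTRS_PER_PMD && vaddr + PMD_SIZE - 1 < vaddr_end;
         i++, vaddr += PMD_SIZE) {
        if (pmd[i] == 0)
            continue;
        if (vaddr < text || vaddr > end) {
            pmd[i] = 0;
            cleared++;
        }
    }
    return cleared;
}

/* Hypothetical helper: mark every entry as present. */
static void fill_all(uint64_t pmd[PTRS_PER_PMD])
{
    for (size_t i = 0; i < PTRS_PER_PMD; i++)
        pmd[i] = 1;
}
```

With max_pfn_mapped = 0x24be the loop stops before entry 18, so only the eight entries below _text are cleared and everything at or above the 2MB slot containing _end is left alone. With max_pfn_mapped = 0x20000 the loop walks 256 entries and also clears entries 19 to 255.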
* [PATCH 2/2] xen: set max_pfn_mapped to the last pfn mapped
2011-02-28 18:24 [PATCH 0/2] x86: cleanup highmap after brk is concluded Stefano Stabellini
2011-02-28 18:25 ` [PATCH 1/2] x86: Cleanup " stefano.stabellini
@ 2011-02-28 18:25 ` stefano.stabellini
2011-02-28 18:42 ` [PATCH 0/2] x86: cleanup highmap after brk is concluded Yinghai Lu
2 siblings, 0 replies; 6+ messages in thread
From: stefano.stabellini @ 2011-02-28 18:25 UTC
To: linux-kernel
Cc: xen-devel, hpa, konrad.wilk, yinghai, jeremy, Stefano.Stabellini,
Stefano Stabellini
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Do not set max_pfn_mapped to the end of the initial memory mappings,
which also contain pages that don't belong in pfn space (like the mfn
list).
Set max_pfn_mapped to the last real pfn mapped in the initial memory
mappings, that is, the pfn backing _end.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
arch/x86/xen/mmu.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 5e92b61..6092f73 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1653,9 +1653,6 @@ static __init void xen_map_identity_early(pmd_t *pmd, unsigned long max_pfn)
for (pteidx = 0; pteidx < PTRS_PER_PTE; pteidx++, pfn++) {
pte_t pte;
- if (pfn > max_pfn_mapped)
- max_pfn_mapped = pfn;
-
if (!pte_none(pte_page[pteidx]))
continue;
@@ -1713,6 +1710,12 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
pud_t *l3;
pmd_t *l2;
+ /* max_pfn_mapped is the last pfn mapped in the initial memory
+ * mappings. Considering that on Xen after the kernel mappings we
+ * have the mappings of some pages that don't exist in pfn space, we
+ * set max_pfn_mapped to the last real pfn mapped. */
+ max_pfn_mapped = PFN_DOWN(__pa(xen_start_info->mfn_list));
+
/* Zap identity mapping */
init_level4_pgt[0] = __pgd(0);
@@ -1817,9 +1820,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
initial_kernel_pmd =
extend_brk(sizeof(pmd_t) * PTRS_PER_PMD, PAGE_SIZE);
- max_pfn_mapped = PFN_DOWN(__pa(xen_start_info->pt_base) +
- xen_start_info->nr_pt_frames * PAGE_SIZE +
- 512*1024);
+ max_pfn_mapped = PFN_DOWN(__pa(xen_start_info->mfn_list));
kernel_pmd = m2v(pgd[KERNEL_PGD_BOUNDARY].pgd);
memcpy(initial_kernel_pmd, kernel_pmd, sizeof(pmd_t) * PTRS_PER_PMD);
--
1.5.6.5
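The arithmetic behind the two max_pfn_mapped expressions can be checked in user space. This is a sketch under stated assumptions: model_pa() imitates __pa() for kernel-image addresses with a zero phys_base, the function names are hypothetical, and the inputs are read off the attached boot log ("Phys-Mach map" start as mfn_list, "Page tables" start as pt_base, 23 frames as nr_pt_frames). The removed expression is evaluated on those values for illustration only; that hunk may not be the code path this particular boot took.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT        12
#define PAGE_SIZE         (1ULL << PAGE_SHIFT)
#define START_KERNEL_MAP  0xffffffff80000000ULL  /* stands in for __START_KERNEL_map */

/* Model of __pa() for kernel-image addresses, assuming phys_base is 0. */
static uint64_t model_pa(uint64_t vaddr)
{
    assert(vaddr >= START_KERNEL_MAP);
    return vaddr - START_KERNEL_MAP;
}

/* Same rounding as PFN_DOWN(): drop the offset within the page. */
static uint64_t pfn_down(uint64_t paddr)
{
    return paddr >> PAGE_SHIFT;
}

/* The expression removed by the patch: end of page tables plus 512KB. */
static uint64_t old_max_pfn_mapped(uint64_t pt_base, uint64_t nr_pt_frames)
{
    return pfn_down(model_pa(pt_base) + nr_pt_frames * PAGE_SIZE + 512 * 1024);
}

/* The expression added by the patch: the pfn where the mfn list starts. */
static uint64_t new_max_pfn_mapped(uint64_t mfn_list)
{
    return pfn_down(model_pa(mfn_list));
}
```

On the logged layout, new_max_pfn_mapped(0xffffffff824be000) is 0x24be, while old_max_pfn_mapped(0xffffffff82640000, 23) is 0x26d7, so the old value reaches past the mfn list, start info and page tables that Xen places after the kernel image.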
* Re: [PATCH 0/2] x86: cleanup highmap after brk is concluded
2011-02-28 18:24 [PATCH 0/2] x86: cleanup highmap after brk is concluded Stefano Stabellini
2011-02-28 18:25 ` [PATCH 1/2] x86: Cleanup " stefano.stabellini
2011-02-28 18:25 ` [PATCH 2/2] xen: set max_pfn_mapped to the last pfn mapped stefano.stabellini
@ 2011-02-28 18:42 ` Yinghai Lu
2011-03-01 15:13 ` Stefano Stabellini
2 siblings, 1 reply; 6+ messages in thread
From: Yinghai Lu @ 2011-02-28 18:42 UTC
To: Stefano Stabellini
Cc: Jeremy Fitzhardinge, xen-devel, linux-kernel,
Konrad Rzeszutek Wilk, H. Peter Anvin
On 02/28/2011 10:24 AM, Stefano Stabellini wrote:
> Hi all,
> a little while ago I sent a patch titled "x86/mm/init: respect memblock
> reserved regions when destroying mappings"
> (https://lkml.org/lkml/2011/1/31/232) to fix a serious boot crash
> problem on Xen (full logs attached):
>
> Pid: 0, comm: swapper Not tainted 2.6.38-rc6+ #1270 Hewlett-Packard HP xw8600 Workstation/0A98h
> RIP: e030:[<ffffffff81008314>] [<ffffffff81008314>] get_phys_to_machine+0x44/0x50
> RSP: e02b:ffffffff82001ca0 EFLAGS: 00010002
> RAX: ffffffff824ce000 RBX: 0000000126004067 RCX: 0000000000000010
> RDX: 0000000000000000 RSI: 00000001cfdc2000 RDI: 0000000000000004
> RBP: ffffffff82001ca0 R08: 0000000000000020 R09: 0000000000000000
> R10: 0000000000000007 R11: 00000000ffffffff R12: 0000000000126004
> R13: 0000000000002004 R14: ffff880100000000 R15: ffff8801cfdc2000
> FS: 0000000000000000(0000) GS:ffffffff82162000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 0000000002003000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
> Process swapper (pid: 0, threadinfo ffffffff82000000, task ffffffff8200b020)
> Stack:
> ffffffff82001cd0 ffffffff8100582c ffffffff81dce1bc ffffffff82001e10
> 00000001cfdc2000 ffffffff82003880 ffffffff82001ce0 ffffffff8100587e
> ffffffff82001d98 ffffffff8100498f 00000000ffffffff 0000000000000007
> Call Trace:
> [<ffffffff8100582c>] pte_mfn_to_pfn+0x8c/0xb0
> [<ffffffff8100587e>] xen_pgd_val+0xe/0x10
> [<ffffffff8100498f>] __raw_callee_save_xen_pgd_val+0x11/0x1e
> [<ffffffff813ba570>] ? xenboot_write_console+0x0/0xd0
> [<ffffffff821c24b8>] ? kernel_physical_mapping_init+0x83/0x1db
> [<ffffffff8195469f>] init_memory_mapping+0x31f/0x6d0
> [<ffffffff821989fd>] ? memblock_reserve+0x1b/0x21
> [<ffffffff8217de95>] setup_arch+0xa59/0xd89
> [<ffffffff819b9c90>] ? _raw_spin_unlock_irqrestore+0x20/0x30
> [<ffffffff810074bd>] ? __raw_callee_save_xen_irq_disable+0x11/0x1e
> [<ffffffff82177b35>] start_kernel+0xc6/0x4df
> [<ffffffff821772c5>] x86_64_start_reservations+0xa5/0xc9
> [<ffffffff8217b6fa>] xen_start_kernel+0x5d3/0x6a9
>
>
> Even though a clear solution wasn't reached in the discussion that followed,
> Yinghai Lu sent a patch to move cleanup_highmap() after reserve_brk() so
> that we don't have to clear the initial mappings in two steps.
> The patch is a nice cleanup and, with a few small changes to honour the
> variable max_pfn_mapped, can be used to fix the boot issue on Xen: all we
> have to do is set max_pfn_mapped to the last valid pfn mapped on Xen,
> that is, the page backing _end.
>
>
> The list of patches with diffstat follows, comments and suggestions are
> very welcome:
>
> Stefano Stabellini (1):
> xen: set max_pfn_mapped to the last pfn mapped
>
> Yinghai Lu (1):
> x86: Cleanup highmap after brk is concluded
>
> arch/x86/kernel/head64.c | 3 ---
> arch/x86/kernel/setup.c | 6 ++++++
> arch/x86/mm/init.c | 19 -------------------
> arch/x86/mm/init_64.c | 11 ++++++-----
> arch/x86/xen/mmu.c | 13 +++++++------
> 5 files changed, 19 insertions(+), 33 deletions(-)
>
>
> A git branch based on 2.6.38-rc6 is available here:
>
Can you please rebase them on top of tip/x86/mm?
http://people.redhat.com/mingo/tip.git/readme.txt
Thanks
Yinghai Lu
* Re: [PATCH 0/2] x86: cleanup highmap after brk is concluded
2011-02-28 18:42 ` [PATCH 0/2] x86: cleanup highmap after brk is concluded Yinghai Lu
@ 2011-03-01 15:13 ` Stefano Stabellini
2011-03-08 14:27 ` Stefano Stabellini
0 siblings, 1 reply; 6+ messages in thread
From: Stefano Stabellini @ 2011-03-01 15:13 UTC
To: Yinghai Lu
Cc: Stefano Stabellini, linux-kernel@vger.kernel.org,
xen-devel@lists.xensource.com, Jeremy Fitzhardinge,
H. Peter Anvin, Konrad Rzeszutek Wilk
On Mon, 28 Feb 2011, Yinghai Lu wrote:
> Can you please rebase them on top of tip/x86/mm?
>
> http://people.redhat.com/mingo/tip.git/readme.txt
>
Sure, I rebased the two patches on the very latest tip/x86/mm:
git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git 2.6.38-tip-mm-fix
* Re: [PATCH 0/2] x86: cleanup highmap after brk is concluded
2011-03-01 15:13 ` Stefano Stabellini
@ 2011-03-08 14:27 ` Stefano Stabellini
0 siblings, 0 replies; 6+ messages in thread
From: Stefano Stabellini @ 2011-03-08 14:27 UTC
To: Stefano Stabellini
Cc: Yinghai Lu, linux-kernel@vger.kernel.org,
xen-devel@lists.xensource.com, Jeremy Fitzhardinge,
H. Peter Anvin, Konrad Rzeszutek Wilk
On Tue, 1 Mar 2011, Stefano Stabellini wrote:
> On Mon, 28 Feb 2011, Yinghai Lu wrote:
> > Can you please rebase them on top of tip/x86/mm?
> >
> > http://people.redhat.com/mingo/tip.git/readme.txt
> >
>
> Sure, I rebased the two patches on the very latest tip/x86/mm:
>
> git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git 2.6.38-tip-mm-fix
>
ping?