From: Dave Young <dyoung@redhat.com>
To: tim@edgecast.com
Cc: tj@kernel.org, WANG Cong <xiyou.wangcong@gmail.com>,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: Crash during vmcore_init
Date: Wed, 16 Nov 2011 10:22:13 +0800
Message-ID: <4EC31E55.6040809@redhat.com>
In-Reply-To: <1321396371.4198.5.camel@boudreau>

[-- Attachment #1: Type: text/plain, Size: 11333 bytes --]

On 11/16/2011 06:32 AM, Tim Hartrick wrote:

> 
> Dave,
> 
> I tested with
> 
> linux-image-3.1.1-030101-generic_3.1.1-030101.201111111651_amd64.deb
> 
> which, as far as I know, is the Ubuntu build of the latest stable.
> Below are the results.
> 
> [    1.427457] ioremap: invalid physical address 5800000000000


Hi, thanks for testing.

Can you apply the attached debug patch to see whether this is a per-cpu
problem?

There is no need to test kdump; just run:
cd /sys/devices/system/cpu
cat cpu[x]/crash_notes

Reading crash_notes for a CPU number other than 0 will probably return
the invalid address.
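
The manual check above can also be scripted. The following is a minimal
user-space sketch in C, not part of the attached patch. It reads each
cpuN/crash_notes file, parses the hex value, and flags any address with
bits set above an assumed physical address width, which as far as I can
tell is the kind of address the ioremap warning complains about. The
helper names are hypothetical, the width must be supplied by the caller
(see "address sizes" in /proc/cpuinfo), and the loop assumes CPU numbers
are contiguous starting at 0.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse the hex text read from a crash_notes file. */
static int parse_crash_notes(const char *text, unsigned long long *addr)
{
	char *end;

	errno = 0;
	*addr = strtoull(text, &end, 16);
	if (errno != 0 || end == text)
		return -1;	/* overflow or no hex digits at all */
	return 0;
}

/* Returns 1 if addr has no bits set at or above bit phys_bits. */
static int fits_phys_bits(unsigned long long addr, unsigned int phys_bits)
{
	if (phys_bits >= 64)
		return 1;	/* avoid an undefined full-width shift */
	return (addr >> phys_bits) == 0;
}

/*
 * Walk cpu0, cpu1, ... until a crash_notes file is missing.
 * Assumption: CPU numbers are contiguous; phys_bits comes from the caller.
 * Returns the number of CPUs whose address looks invalid.
 */
static int scan_crash_notes(unsigned int phys_bits)
{
	char path[64], buf[64];
	unsigned int cpu;
	int bad = 0;

	for (cpu = 0; ; cpu++) {
		unsigned long long addr;
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%u/crash_notes", cpu);
		f = fopen(path, "r");
		if (!f)
			break;
		if (fgets(buf, sizeof(buf), f) &&
		    parse_crash_notes(buf, &addr) == 0) {
			int ok = fits_phys_bits(addr, phys_bits);

			printf("cpu%u: %llx %s\n", cpu, addr,
			       ok ? "ok" : "INVALID");
			bad += !ok;
		}
		fclose(f);
	}
	return bad;
}
```

Calling scan_crash_notes() from a main() with, say, 46 as the assumed
width prints one line per CPU. Both addresses reported in this thread
(5800000000000 and db74000000000000) would be flagged with that width,
while an address inside the System RAM ranges listed in /proc/iomem
would pass.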

> [    1.433017] ------------[ cut here ]------------
> [    1.437632] WARNING: at /home/apw/COD/linux/arch/x86/mm/ioremap.c:83
> __ioremap_caller+0x35e/0x3a0()
> [    1.446656] Hardware name: PowerEdge R710
> [    1.450655] Modules linked in:
> [    1.453712] Pid: 1, comm: swapper Not tainted 3.1.1-030101-generic
> #201111111651
> [    1.461092] Call Trace:
> [    1.463539]  [<ffffffff81065aef>] warn_slowpath_common+0x7f/0xc0
> [    1.469532]  [<ffffffff81065b4a>] warn_slowpath_null+0x1a/0x20
> [    1.475352]  [<ffffffff810412be>] __ioremap_caller+0x35e/0x3a0
> [    1.481176]  [<ffffffff8103852e>] ? copy_oldmem_page+0x4e/0xc0
> [    1.486995]  [<ffffffff81041334>] ioremap_cache+0x14/0x20
> [    1.492380]  [<ffffffff8103852e>] copy_oldmem_page+0x4e/0xc0
> [    1.498031]  [<ffffffff811dc7b1>] read_from_oldmem+0xb1/0xf0
> [    1.503682]  [<ffffffff8115e4ec>] ? __kmalloc+0x5c/0x160
> [    1.508984]  [<ffffffff81cfef55>] T.635+0x6e/0x211
> [    1.513767]  [<ffffffff811dc7b1>] ? read_from_oldmem+0xb1/0xf0
> [    1.519588]  [<ffffffff8115e4ec>] ? __kmalloc+0x5c/0x160
> [    1.524887]  [<ffffffff81cff20b>] parse_crash_elf64_headers
> +0x113/0x212
> [    1.531489]  [<ffffffff81cff82f>] ? parse_crash_elf_headers
> +0x122/0x122
> [    1.538088]  [<ffffffff81cff78b>] parse_crash_elf_headers+0x7e/0x122
> [    1.544427]  [<ffffffff81cff850>] vmcore_init+0x21/0x75
> [    1.549645]  [<ffffffff81002043>] do_one_initcall+0x43/0x190
> [    1.555293]  [<ffffffff81cd8680>] kernel_init+0xcd/0x151
> [    1.560596]  [<ffffffff81608af4>] kernel_thread_helper+0x4/0x10
> [    1.566504]  [<ffffffff81cd85b3>] ? parse_early_options+0x20/0x20
> [    1.572584]  [<ffffffff81608af0>] ? gs_change+0x13/0x13
> [    1.577802] ---[ end trace a22d306b065d4a66 ]---
> 
> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.1.1-030101-generic
> root=UUID=ea7a5a27-d58f-469f-a19c-3e65b69587f6 ro console=ttyS0,115200n8
> irqpoll maxcpus=1 nousb memmap=exactmap memmap=640K@0K
> memmap=489836K@33408K elfcorehdr=523244K memmap=252K#2087484K
> 
> 00000000-0000ffff : reserved
> 00010000-0009ffff : System RAM
> 000a0000-000bffff : PCI Bus 0000:00
> 000c0000-000c7fff : Video ROM
> 000c8000-000cdbff : Adapter ROM
> 000ce000-000cefff : Adapter ROM
> 000cf000-000d15ff : Adapter ROM
> 000f0000-000fffff : System ROM
> 00100000-7f678fff : System RAM
>   01000000-0160b9e3 : Kernel code
>   0160b9e4-01cc2dff : Kernel data
>   01dc1000-01f14fff : Kernel bss
>   02000000-1fefffff : Crash kernel
> 7f679000-7f68efff : reserved
>   7f679000-7f679003 : APEI ERST
>   7f67900c-7f679016 : APEI ERST
>   7f679060-7f67906b : APEI ERST
>   7f68d000-7f68efff : APEI ERST
> 7f68f000-7f6cdfff : ACPI Tables
> 7f6ce000-7fffffff : reserved
> 80000000-fdffffff : PCI Bus 0000:00
>   d5800000-d5ffffff : PCI Bus 0000:08
>     d5800000-d5ffffff : 0000:08:03.0
>   d6000000-d9ffffff : PCI Bus 0000:01
>     d6000000-d7ffffff : 0000:01:00.0
>       d6000000-d7ffffff : bnx2
>     d8000000-d9ffffff : 0000:01:00.1
>       d8000000-d9ffffff : bnx2
>   da000000-ddffffff : PCI Bus 0000:02
>     da000000-dbffffff : 0000:02:00.0
>       da000000-dbffffff : bnx2
>     dc000000-ddffffff : 0000:02:00.1
>       dc000000-ddffffff : bnx2
>   de000000-deffffff : PCI Bus 0000:08
>     de000000-de00ffff : 0000:08:03.0
>     de7fc000-de7fffff : 0000:08:03.0
>     de800000-deffffff : 0000:08:03.0
>   df0ff800-df0ffbff : 0000:00:1a.7
>     df0ff800-df0ffbff : ehci_hcd
>   df0ffc00-df0fffff : 0000:00:1d.7
>     df0ffc00-df0fffff : ehci_hcd
>   df100000-df2fffff : PCI Bus 0000:03
>     df100000-df1fffff : 0000:03:00.0
>     df2ec000-df2effff : 0000:03:00.0
>       df2ec000-df2effff : mpt
>     df2f0000-df2fffff : 0000:03:00.0
>       df2f0000-df2fffff : mpt
>   e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
>     e0000000-efffffff : reserved
>       e0000000-efffffff : pnp 00:09
> fe000000-ffffffff : reserved
>   fec00000-fec003ff : IOAPIC 0
>   fec80000-fec803ff : IOAPIC 1
>   fed00000-fed003ff : HPET 0
>   fed40000-fed44fff : PCI Bus 0000:00
>   fed90000-fed91fff : pnp 00:0b
>   fee00000-fee00fff : Local APIC
> 100000000-c7fffffff : System RAM
> 
> 
> 
> On Tue, 2011-11-15 at 16:14 +0800, Dave Young wrote:
>> On 11/15/2011 02:50 AM, Tim Hartrick wrote:
>>
>>>
>>> Wang,
>>>
>>> Thanks for taking the time to look at this.
>>>
>>>
>>> Here is the result from a 2.6.38 kernel used as base kernel and
>>> crashkernel:
>>>
>>> [    1.314762] WARNING:
>>> at /build/buildd/linux-2.6.38/arch/x86/mm/ioremap.c:83 __ioremap_caller
>>> +0x350/0x3d0()
>>> [    1.324394] Hardware name: PowerEdge R710
>>> [    1.328390] Modules linked in:
>>> [    1.331443] Pid: 1, comm: swapper Not tainted 2.6.38-8-server
>>> #42-Ubuntu
>>> [    1.338128] Call Trace:
>>> [    1.340572]  [<ffffffff81065d1f>] ? warn_slowpath_common+0x7f/0xc0
>>> [    1.346741]  [<ffffffff81065d7a>] ? warn_slowpath_null+0x1a/0x20
>>> [    1.352729]  [<ffffffff81040eb0>] ? __ioremap_caller+0x350/0x3d0
>>> [    1.358726]  [<ffffffff810d8575>] ? call_rcu_sched+0x15/0x20
>>> [    1.364375]  [<ffffffff8103452e>] ? copy_oldmem_page+0x4e/0xc0
>>> [    1.370194]  [<ffffffff8113c39e>] ? __purge_vmap_area_lazy+0xfe/0x1f0
>>> [    1.376622]  [<ffffffff81040f64>] ? ioremap_cache+0x14/0x20
>>> [    1.382176]  [<ffffffff8103452e>] ? copy_oldmem_page+0x4e/0xc0
>>> [    1.388002]  [<ffffffff811cad0a>] ? read_from_oldmem+0x7a/0xb0
>>> [    1.393827]  [<ffffffff81b099a0>] ? merge_note_headers_elf64.clone.3
>>> +0x6c/0x214
>>> [    1.401115]  [<ffffffff8103456a>] ? copy_oldmem_page+0x8a/0xc0
>>> [    1.406936]  [<ffffffff811cad0a>] ? read_from_oldmem+0x7a/0xb0
>>> [    1.412752]  [<ffffffff81b09e79>] ? vmcore_init+0x0/0x73
>>> [    1.418051]  [<ffffffff81b09c52>] ? parse_crash_elf64_headers
>>> +0x10a/0x211
>>> [    1.424825]  [<ffffffff8103456a>] ? copy_oldmem_page+0x8a/0xc0
>>> [    1.430640]  [<ffffffff81b09e79>] ? vmcore_init+0x0/0x73
>>> [    1.435940]  [<ffffffff81b09dd4>] ? parse_crash_elf_headers
>>> +0x7b/0x120
>>> [    1.442450]  [<ffffffff81b09e9c>] ? vmcore_init+0x23/0x73
>>> [    1.447839]  [<ffffffff81002175>] ? do_one_initcall+0x45/0x190
>>> [    1.453661]  [<ffffffff81ae1dff>] ? kernel_init+0x169/0x1f3
>>> [    1.459218]  [<ffffffff8100cde4>] ? kernel_thread_helper+0x4/0x10
>>> [    1.465298]  [<ffffffff81ae1c96>] ? kernel_init+0x0/0x1f3
>>> [    1.470680]  [<ffffffff8100cde0>] ? kernel_thread_helper+0x0/0x10
>>>
>>> The command line for the crashkernel:
>>>
>>> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-2.6.38-8-server
>>> root=UUID=ea7a5a27-d58f-469f-a19c-3e65b69587f6 ro console=ttyS0,115200n8
>>> irqpoll maxcpus=1 nousb memmap=exactmap memmap=640K@0K
>>> memmap=261484K@623232K elfcorehdr=884716K memmap=252K#2087484K
>>>
>>> The contents of /proc/iomem while running the base kernel:
>>>
>>> 00000000-0000ffff : reserved
>>> 00010000-0009ffff : System RAM
>>> 000a0000-000bffff : PCI Bus 0000:00
>>> 00100000-7f678fff : System RAM
>>>   01000000-015e1d6c : Kernel code
>>>   015e1d6d-01aca17f : Kernel data
>>>   01bae000-01d03fff : Kernel bss
>>>   26000000-35ffffff : Crash kernel
>>> 7f679000-7f68efff : reserved
>>>   7f679000-7f679003 : APEI ERST
>>>   7f67900c-7f679016 : APEI ERST
>>>   7f679060-7f67906b : APEI ERST
>>>   7f68d000-7f68efff : APEI ERST
>>> 7f68f000-7f6cdfff : ACPI Tables
>>> 7f6ce000-7fffffff : reserved
>>> 80000000-fdffffff : PCI Bus 0000:00
>>>   d5800000-d5ffffff : PCI Bus 0000:08
>>>     d5800000-d5ffffff : 0000:08:03.0
>>>   d6000000-d9ffffff : PCI Bus 0000:01
>>>     d6000000-d7ffffff : 0000:01:00.0
>>>       d6000000-d7ffffff : bnx2
>>>     d8000000-d9ffffff : 0000:01:00.1
>>>       d8000000-d9ffffff : bnx2
>>>   da000000-ddffffff : PCI Bus 0000:02
>>>     da000000-dbffffff : 0000:02:00.0
>>>       da000000-dbffffff : bnx2
>>>     dc000000-ddffffff : 0000:02:00.1
>>>       dc000000-ddffffff : bnx2
>>>   de000000-deffffff : PCI Bus 0000:08
>>>     de000000-de00ffff : 0000:08:03.0
>>>     de7fc000-de7fffff : 0000:08:03.0
>>>     de800000-deffffff : 0000:08:03.0
>>>   df0ff800-df0ffbff : 0000:00:1a.7
>>>     df0ff800-df0ffbff : ehci_hcd
>>>   df0ffc00-df0fffff : 0000:00:1d.7
>>>     df0ffc00-df0fffff : ehci_hcd
>>>   df100000-df2fffff : PCI Bus 0000:03
>>>     df100000-df1fffff : 0000:03:00.0
>>>     df2ec000-df2effff : 0000:03:00.0
>>>       df2ec000-df2effff : mpt
>>>     df2f0000-df2fffff : 0000:03:00.0
>>>       df2f0000-df2fffff : mpt
>>>   e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
>>>     e0000000-efffffff : reserved
>>>       e0000000-efffffff : pnp 00:09
>>> fe000000-ffffffff : reserved
>>>   fec00000-fec003ff : IOAPIC 0
>>>   fec80000-fec803ff : IOAPIC 1
>>>   fed00000-fed003ff : HPET 0
>>>   fed40000-fed44fff : PCI Bus 0000:00
>>>   fed90000-fed91fff : pnp 00:0b
>>>   fee00000-fee00fff : Local APIC
>>> 100000000-c7fffffff : System RAM
>>>
>>>
>>> tim
>>>
>>>
>>>
>>> On Mon, 2011-11-14 at 13:39 +0000, WANG Cong wrote:
>>>> On Tue, 11 Oct 2011 16:39:05 -0700, Tim Hartrick wrote:
>>>>
>>>>> Kexec,
>>>>>
>>>>> I have been experiencing the crash below on Ubuntu 10.04 running
>>>>> 2.6.32-34-server and 2.6.38-8-server as the crashkernel on X86_64. The
>>>>> tools are:
>>>>>
>>>>> kexec-tools	1:2.0.2-1ubuntu3
>>>>> makedumpfile	1.3.7-2
>>>>> kdump-tools	1.3.7-2
>>>>>
>>>>> I would be interested to know if this is a known problem and if so
>>>>> whether or not there is a patch in the pipeline to correct the problem.
>>>>>
>>>>> I will be happy to provide any other details that are required including
>>>>> debug builds if necessary.
>>>> ....
>>>>>
>>>>> [    1.322100] ioremap: invalid physical address db74000000000000 [  
>>
>>
>> Searching for db74000000000000 turns up several similar cases, all
>> involving an invalid per-cpu crash_notes address. Is this one more?
>>
>> Also, can you test the latest mainline kernel?
>>
>> CCing lkml and Tejun Heo.
>>
>>  
>>>>> 1.327919] ------------[ cut here ]------------ [    1.332530] WARNING:
>>>>> at /build/buildd/linux-2.6.32/arch/x86/mm/ioremap.c:120 __ioremap_caller
>>>>> +0x360/0x3d0()
>>>>
>>>> This probably means that kexec-tools passed some incorrect
>>>> kernel parameter to the second kernel.
>>>>
>>>> So, what is the cmdline of your second kernel? And what is your
>>>> /proc/iomem of your first kernel?
>>>>
>>>> Cheers.
>>>>
>>>>
>>>> _______________________________________________
>>>> kexec mailing list
>>>> kexec@lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/kexec



-- 
Thanks
Dave

[-- Attachment #2: percpu-test.patch --]
[-- Type: text/plain, Size: 960 bytes --]

--- linux-2.6.orig/mm/percpu.c	2011-11-16 09:38:58.000000000 +0800
+++ linux-2.6/mm/percpu.c	2011-11-16 10:05:36.804771014 +0800
@@ -987,6 +987,7 @@ phys_addr_t per_cpu_ptr_to_phys(void *ad
 	unsigned long first_start, first_end;
 	unsigned int cpu;
 
+	printk(KERN_INFO "per cpu addr %p\n", addr);
 	/*
 	 * The following test on first_start/end isn't strictly
 	 * necessary but will speed up lookups of addresses which
@@ -1002,11 +1003,19 @@ phys_addr_t per_cpu_ptr_to_phys(void *ad
 
 			if (addr >= start && addr < start + pcpu_unit_size) {
 				in_first_chunk = true;
+				printk(KERN_INFO "addr is in first chunk\n");
+				printk(KERN_INFO "cpu %u, %p - %p\n",
+						cpu, start, start + pcpu_unit_size);
 				break;
 			}
 		}
 	}
 
+	if (is_vmalloc_addr(addr))
+		printk(KERN_INFO "addr is in vmalloc area\n");
+	else
+		printk(KERN_INFO "addr is not in vmalloc area\n");
+
 	if (in_first_chunk) {
 		if (!is_vmalloc_addr(addr))
 			return __pa(addr);
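
For readers who want to see what the "addr is in first chunk" message
corresponds to, here is a simplified user-space model of the range test
that the patch instruments. It is only a sketch and not kernel code: the
function name, the unit_start[] array, and the use of plain integers in
place of per-cpu pointers are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model: unit_start[] holds a hypothetical start address for
 * each CPU's unit in the first chunk, and unit_size stands in for
 * pcpu_unit_size. Returns the index of the CPU whose unit contains
 * addr, or -1 if no unit does.
 */
static int find_first_chunk_cpu(uintptr_t addr, const uintptr_t *unit_start,
				size_t nr_cpus, size_t unit_size)
{
	size_t cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		uintptr_t start = unit_start[cpu];

		/* Half-open range test: [start, start + unit_size) */
		if (addr >= start && addr < start + unit_size)
			return (int)cpu;
	}
	return -1;
}
```

An address that falls in the gap between two units matches neither of
them, which in this model means the function returns -1 and the address
would not be treated as belonging to the first chunk.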

