public inbox for linux-ia64@vger.kernel.org
* kdump broken on Altix 350
@ 2008-08-29 16:03 Bernhard Walle
  2008-08-29 16:05 ` Bernhard Walle
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Bernhard Walle @ 2008-08-29 16:03 UTC (permalink / raw)
  To: Tony Luck, jlan; +Cc: kexec, linux-ia64

Hi Tony,

your commit

    commit 10617bbe84628eb18ab5f723d3ba35005adde143
    Author: Tony Luck <tony.luck@intel.com>
    Date:   Tue Aug 12 10:34:20 2008 -0700

    [IA64] Ensure cpu0 can access per-cpu variables in early boot code

broke kdump on our Altix 350. I get the following early crash in the
kdump kernel:

------------------------------------- 8< ------------------------------

Pid: 1, CPU 0, comm:              swapper
psr : 00001010085a6010 ifs : 800000000000038a ip  : [<a0000001004faaf0>]    Not tainted (2.6.27-rc2-default)
ip is at __rtnl_register+0x150/0x1a0
unat: 0000000000000000 pfs : 000000000000038b rsc : 0000000000000003
rnat: 0000000000000014 bsps: 000000000001003e pr  : 0000000000006581
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a0000001004fab70 b6  : a0000001002a8de0 b7  : a000000100434340
f6  : 1003e0000000000000000 f7  : 1003e8888888888888889
f8  : 1003e0000000000000000 f9  : 1003e0000000000000001
f10 : 1003e0000000000000f00 f11 : 1003e00000000000000a0
r1  : a000000100c27bf0 r2  : a000000100a3db68 r3  : a000000100a32be0
r8  : 0000005200000051 r9  : a000000100a0ca40 r10 : 000000019873d109
r11 : fffffffe602b9dae r12 : e0000030192cfdf0 r13 : e0000030192c8000
r14 : 0000005200000071 r15 : 0000000000000000 r16 : 0000000000000000
r17 : a000000100a0be40 r18 : a000000100a3acb0 r19 : a000000100a3acb8
r20 : a000000100a32208 r21 : a000000100a321e8 r22 : 0000000000000000
r23 : e000003037449454 r24 : 0000000000000001 r25 : 0000000000000000
r26 : 0000000000000001 r27 : 00000010085a6010 r28 : e000003019298040
r29 : e000003019298030 r30 : 0000000000000000 r31 : a000000100a0ca38

Call Trace:
 [<a000000100016320>] show_stack+0x40/0xa0
                                sp=e0000030192cf9c0 bsp=e0000030192c90b0
 [<a000000100016c30>] show_regs+0x850/0x8a0
                                sp=e0000030192cfb90 bsp=e0000030192c9058
 [<a000000100039d90>] die+0x1b0/0x2c0
                                sp=e0000030192cfb90 bsp=e0000030192c9010
 [<a000000100609a00>] ia64_do_page_fault+0x9a0/0xb00
                                sp=e0000030192cfb90 bsp=e0000030192c8fb0
 [<a00000010000c720>] ia64_native_leave_kernel+0x0/0x270
                                sp=e0000030192cfc20 bsp=e0000030192c8fb0
 [<a0000001004faaf0>] __rtnl_register+0x150/0x1a0
                                sp=e0000030192cfdf0 bsp=e0000030192c8f60
 [<a0000001004fab70>] rtnl_register+0x30/0x80
                                sp=e0000030192cfdf0 bsp=e0000030192c8f28
 [<a000000100808a00>] rtnetlink_init+0x180/0x2a0
                                sp=e0000030192cfdf0 bsp=e0000030192c8f08
 [<a000000100809a40>] netlink_proto_init+0x380/0x3e0
                                sp=e0000030192cfdf0 bsp=e0000030192c8ec8
 [<a00000010000a960>] do_one_initcall+0xa0/0x2e0
                                sp=e0000030192cfdf0 bsp=e0000030192c8e88
 [<a0000001007c4700>] kernel_init+0x4c0/0x580
                                sp=e0000030192cfe30 bsp=e0000030192c8e68
 [<a000000100014870>] kernel_thread_helper+0xd0/0x100
                                sp=e0000030192cfe30 bsp=e0000030192c8e40
 [<a00000010000a4c0>] start_kernel_thread+0x20/0x40
                                sp=e0000030192cfe30 bsp=e0000030192c8e40
Kernel panic - not syncing: Attempted to kill init!

------------------------------------- >8 ------------------------------

Since the code is very IA64-specific and I don't have the time now to
read all data sheets, I need your help to resolve that issue. :)


Bernhard
-- 
Bernhard Walle, SUSE Linux Products GmbH, Architecture Development

* Re: kdump broken on Altix 350
  2008-08-29 16:03 kdump broken on Altix 350 Bernhard Walle
@ 2008-08-29 16:05 ` Bernhard Walle
  2008-08-29 20:42   ` Luck, Tony
  2008-09-10 12:19 ` Bernhard Walle
  2008-09-29 23:42 ` Luck, Tony
  2 siblings, 1 reply; 13+ messages in thread
From: Bernhard Walle @ 2008-08-29 16:05 UTC (permalink / raw)
  To: Tony Luck; +Cc: jlan, linux-ia64, kexec

* Bernhard Walle [2008-08-29 18:03]:
>
> your commit
> 
>     commit 10617bbe84628eb18ab5f723d3ba35005adde143
>     Author: Tony Luck <tony.luck@intel.com>
>     Date:   Tue Aug 12 10:34:20 2008 -0700
> 
>     [IA64] Ensure cpu0 can access per-cpu variables in early boot code
> 
> broke kdump on our Altix 350. I get following early crash in kdump
> kernel:

Ah, and the kexec call was

/sbin/kexec -p /boot/vmlinuz-2.6.27-rc2-default --append="
root=/dev/disk/by-id/scsi-SATA_HDS722580VLSA80_VNRB3KC2CZY0RL-part4
kdb=on sysrq=1 console=tty1 console=ttySG0,38400 thash_entries=2097152
elevator=deadline sysrq=1 reset_devices irqpoll maxcpus=1 "
--initrd=/boot/initrd-2.6.27-rc2-default-kdump  --noio

The interesting thing should be the kernel command line here.



Bernhard
-- 
Bernhard Walle, SUSE Linux Products GmbH, Architecture Development
--
To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* RE: kdump broken on Altix 350
  2008-08-29 16:05 ` Bernhard Walle
@ 2008-08-29 20:42   ` Luck, Tony
  2008-08-29 20:48     ` Bernhard Walle
  2008-09-10 11:48     ` Bernhard Walle
  0 siblings, 2 replies; 13+ messages in thread
From: Luck, Tony @ 2008-08-29 20:42 UTC (permalink / raw)
  To: Bernhard Walle
  Cc: jlan@sgi.com, linux-ia64@vger.kernel.org,
	kexec@lists.infradead.org

> your commit
>
>     commit 10617bbe84628eb18ab5f723d3ba35005adde143
>     Author: Tony Luck <tony.luck@intel.com>
>     Date:   Tue Aug 12 10:34:20 2008 -0700
>
>     [IA64] Ensure cpu0 can access per-cpu variables in early boot code
>
> broke kdump on our Altix 350. I get following early crash in kdump
> kernel

Sorry about that.  I'll try to reproduce it here.  Do you (or anyone
else reading this) know if the version of kexec that ships with RHEL5.2
works with current 2.6.27-rc kernels?  (Perhaps not a politically correct
question to ask someone with a @suse.de address :-)

-Tony

* RE: kdump broken on Altix 350
  2008-08-29 20:42   ` Luck, Tony
@ 2008-08-29 20:48     ` Bernhard Walle
  2008-09-10 11:48     ` Bernhard Walle
  1 sibling, 0 replies; 13+ messages in thread
From: Bernhard Walle @ 2008-08-29 20:48 UTC (permalink / raw)
  To: Luck, Tony
  Cc: jlan@sgi.com, linux-ia64@vger.kernel.org,
	kexec@lists.infradead.org

On Fri, 29 Aug 2008 22:42:40 CEST, "Luck, Tony" <tony.luck@intel.com> wrote:
>> your commit
>>
>>     commit 10617bbe84628eb18ab5f723d3ba35005adde143
>>     Author: Tony Luck <tony.luck@intel.com>
>>     Date:   Tue Aug 12 10:34:20 2008 -0700
>>
>>     [IA64] Ensure cpu0 can access per-cpu variables in early boot code
>>
>> broke kdump on our Altix 350. I get following early crash in kdump
>> kernel
>
> Sorry about that.  I'll try to reproduce it here.  Do you (or anyone
> else reading this) know if the version of kexec that ships with RHEL5.2
> works with current 2.6.27-rc kernels

There's the danger of a zero-size /proc/vmcore, but that doesn't
matter here. If you get a /proc/vmcore, regardless of the size, the
bug did not hit you. :-)
So you can use it here.


Bernhard


* Re: kdump broken on Altix 350
  2008-08-29 20:42   ` Luck, Tony
  2008-08-29 20:48     ` Bernhard Walle
@ 2008-09-10 11:48     ` Bernhard Walle
  2008-09-10 20:21       ` Jay Lan
  1 sibling, 1 reply; 13+ messages in thread
From: Bernhard Walle @ 2008-09-10 11:48 UTC (permalink / raw)
  To: kexec, jlan

* "Luck, Tony" <tony.luck@intel.com> [2008-08-29]: 

> > your commit
> >
> >     commit 10617bbe84628eb18ab5f723d3ba35005adde143
> >     Author: Tony Luck <tony.luck@intel.com>
> >     Date:   Tue Aug 12 10:34:20 2008 -0700
> >
> >     [IA64] Ensure cpu0 can access per-cpu variables in early boot
> > code
> >
> > broke kdump on our Altix 350. I get following early crash in kdump
> > kernel
> 
> Sorry about that.  I'll try to reproduce it here.

I discussed this with Jay Lan, who could not reproduce it on his
machine. We thought it was a different config, but now I can verify
that the problem is reproducible here with the default configuration
(plus CONFIG_SATA_VITESSE).


Bernhard

* Re: kdump broken on Altix 350
  2008-08-29 16:03 kdump broken on Altix 350 Bernhard Walle
  2008-08-29 16:05 ` Bernhard Walle
@ 2008-09-10 12:19 ` Bernhard Walle
  2008-09-29 23:42 ` Luck, Tony
  2 siblings, 0 replies; 13+ messages in thread
From: Bernhard Walle @ 2008-09-10 12:19 UTC (permalink / raw)
  To: Bernhard Walle; +Cc: jlan, linux-ia64, Tony Luck, kexec

* Bernhard Walle <bwalle@suse.de> [2008-08-29]: 

> broke kdump on our Altix 350. I get following early crash in kdump
> kernel:

Just as a side note: I still have the problem with 2.6.27-rc6.


Bernhard

* Re: kdump broken on Altix 350
  2008-09-10 11:48     ` Bernhard Walle
@ 2008-09-10 20:21       ` Jay Lan
  2008-09-27  1:00         ` Jay Lan
  0 siblings, 1 reply; 13+ messages in thread
From: Jay Lan @ 2008-09-10 20:21 UTC (permalink / raw)
  To: Bernhard Walle; +Cc: linux-ia64, Luck, Tony, kexec, Simon Horman

Bernhard Walle wrote:
> * "Luck, Tony" <tony.luck@intel.com> [2008-08-29]: 
> 
>>> your commit
>>>
>>>     commit 10617bbe84628eb18ab5f723d3ba35005adde143
>>>     Author: Tony Luck <tony.luck@intel.com>
>>>     Date:   Tue Aug 12 10:34:20 2008 -0700
>>>
>>>     [IA64] Ensure cpu0 can access per-cpu variables in early boot
>>> code
>>>
>>> broke kdump on our Altix 350. I get following early crash in kdump
>>> kernel
>> Sorry about that.  I'll try to reproduce it here.
> 
> I had some discussion about that with Jay Lan that he could not
> reproduce that on his machine. We thought it was different config, but
> now I can verify that the problem is reproducible here with the default
> configuration (plus CONFIG_SATA_VITESSE).

Hi Bernhard and Tony,

I started seeing this problem, and it affected A4700 in addition to
A350.

It was not clear that the system hang was related to this problem. I saw
a kdump kernel hang in cpu_init() on an A350, and a hang in find_memory
while handling the pernode space on an A4700. There were no error records
and no backtrace, so I did not relate my problem to this one at first.

Out of curiosity, I backed out Tony's patch mentioned above from
2.6.27-rc5, and the kdump kernel hangs were gone on both systems.

Also, I had a kdump kernel MCA problem that was caused by kexec
underallocating kernel memory for the kdump kernel. The problem
has not happened again since I backed out the patch.

Regards,
jay

> 
> 
> Bernhard
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec


* Re: kdump broken on Altix 350
  2008-09-10 20:21       ` Jay Lan
@ 2008-09-27  1:00         ` Jay Lan
  2008-09-29 20:55           ` Luck, Tony
  0 siblings, 1 reply; 13+ messages in thread
From: Jay Lan @ 2008-09-27  1:00 UTC (permalink / raw)
  To: Luck, Tony; +Cc: kexec, Bernhard Walle, Simon Horman, linux-ia64

Jay Lan wrote:
> Bernhard Walle wrote:
>> * "Luck, Tony" <tony.luck@intel.com> [2008-08-29]: 
>>
>>>> your commit
>>>>
>>>>     commit 10617bbe84628eb18ab5f723d3ba35005adde143
>>>>     Author: Tony Luck <tony.luck@intel.com>
>>>>     Date:   Tue Aug 12 10:34:20 2008 -0700
>>>>
>>>>     [IA64] Ensure cpu0 can access per-cpu variables in early boot
>>>> code
>>>>
>>>> broke kdump on our Altix 350. I get following early crash in kdump
>>>> kernel
>>> Sorry about that.  I'll try to reproduce it here.
>> I had some discussion about that with Jay Lan that he could not
>> reproduce that on his machine. We thought it was different config, but
>> now I can verify that the problem is reproducible here with the default
>> configuration (plus CONFIG_SATA_VITESSE).
> 
> Hi Bernhard and Tony,
> 
> I started seeing this problem, and it affected A4700 in addition to
> A350.
> 
> It was not clear the system hang was related to this problem. I saw a
> kdump kernel hang at cpu_init() at an A350, and a hang in find_memory
> on handling pernode space thing at an A4700. No error records and no
> backtrace, so i did not relate my problem to this one at first.
> 
> Out of curiosity, i backed out Tony's patch mentioned from 2.6.27-rc5
> and the kdump kernel hangs were gone on both systems.
> 
> Also, i had a kdump kernel MCA problem that was caused by kexec
> underallocating kernel memory for the kdump kernel. The  problem
> does not happen again after i backed out the patch.

Tony and Simon,

The program headers (PT_LOAD) of vmlinux before Tony's patch look
like these:

Program Headers:
Type     Offset             VirtAddr           PhysAddr
         FileSiz            MemSiz              Flags  Align
LOAD     0x0000000000010000 0xa000000100000000 0x0000000004000000
         0x0000000000d04480 0x0000000000d04480  RWE    10000
LOAD     0x0000000000d20000 0xffffffffffff0000 0x0000000004d10000
         0x0000000000009620 0x0000000000009620  RW     10000
LOAD     0x0000000000d30000 0xa000000100d20000 0x0000000004d20000
         0x00000000000bef50 0x0000000000564c90  RW     10000

The program headers of vmlinux after Tony's patch look like
these:
Program Headers:
Type     Offset             VirtAddr           PhysAddr
         FileSiz            MemSiz              Flags  Align
LOAD     0x0000000000010000 0xa000000100000000 0x0000000004000000
         0x0000000000d04480 0x0000000000d04480  RWE    10000
LOAD     0x0000000000d20000 0xffffffffffff0000 0x0000000004d20000
         0x0000000000009620 0x0000000000009620  RW     10000
LOAD     0x0000000000d30000 0xa000000100d30000 0x0000000004d30000
         0x00000000000bef58 0x0000000000564c90  RW     10000

The first PT_LOAD is for code, the second for percpu, and the
third for data. The FileSiz and MemSiz of the code and percpu
headers are identical in both cases. The only difference is that
the PhysAddr of the percpu header after the patch is 0x10000
greater than before the patch.

Tony's patch put the per-cpu area for cpu0 in vmlinux itself
(in the percpu section of the ELF executable). If I read the
code correctly, he added an extra PERCPU_PAGE_SIZE (0x10000 on ia64)
to the code segment. That explains why the PhysAddr of the percpu
segment became 0x10000 greater after the patch.

However, shouldn't the MemSiz of the code segment then be 0x10000
larger? The current logic of add_loaded_segments_info() in
kexec/arch/ia64/crashdump-ia64.c counts on that information to
correctly determine how much memory is needed for vmlinux.

I could not figure out how the MemSiz of the code PT_LOAD
header in vmlinux is determined and set.

Regards,
 - jay
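
For reference, the arithmetic behind the question above can be sketched as
follows. This is only an illustration in Python using the header values
quoted in this message; it is not the kexec-tools code, and the helper names
(round_up, summed_size, actual_span) are made up for this sketch. It compares
the sum of the per-segment MemSiz values (each rounded up to Align) with the
physical span that the three PT_LOAD segments actually cover.

```python
# Illustrative only: values are the PT_LOAD headers quoted in this message.
# Each tuple is (PhysAddr, MemSiz, Align); the helper names are hypothetical.

def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align

def summed_size(headers):
    """Memory needed if one simply adds up each segment's rounded MemSiz."""
    return sum(round_up(memsz, align) for _, memsz, align in headers)

def actual_span(headers):
    """Memory covered from the first PhysAddr to the rounded end of the last segment."""
    first_phys = headers[0][0]
    last_phys, last_memsz, last_align = headers[-1]
    return round_up(last_phys + last_memsz, last_align) - first_phys

BEFORE_PATCH = [
    (0x04000000, 0xd04480, 0x10000),  # code
    (0x04d10000, 0x009620, 0x10000),  # percpu
    (0x04d20000, 0x564c90, 0x10000),  # data
]
AFTER_PATCH = [
    (0x04000000, 0xd04480, 0x10000),  # code
    (0x04d20000, 0x009620, 0x10000),  # percpu, PhysAddr moved up by 0x10000
    (0x04d30000, 0x564c90, 0x10000),  # data
]

for name, headers in (("before", BEFORE_PATCH), ("after", AFTER_PATCH)):
    shortfall = actual_span(headers) - summed_size(headers)
    print(f"{name}: summed={summed_size(headers):#x} "
          f"span={actual_span(headers):#x} shortfall={shortfall:#x}")
```

With these values the two figures agree before the patch and differ by
0x10000 after it, which is the size of the unaccounted hole between the code
and percpu segments.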

* RE: kdump broken on Altix 350
  2008-09-27  1:00         ` Jay Lan
@ 2008-09-29 20:55           ` Luck, Tony
  0 siblings, 0 replies; 13+ messages in thread
From: Luck, Tony @ 2008-09-29 20:55 UTC (permalink / raw)
  To: Jay Lan
  Cc: kexec@lists.infradead.org, Bernhard Walle, Simon Horman,
	linux-ia64@vger.kernel.org

Maybe I'm starting to see what happened ... and it could well
be my fault.

I wanted to allocate the per-cpu memory for cpu0 statically
in the vmlinux ... so it would be available in head.S to set
up everything before we move to any C code that might try to
access per cpu variables.  To make life easy for myself I just
made this allocation in vmlinux.lds.S immediately before the
initialized block where all the percpu variables live (which
means no extra labels ... and I could initialize this data
with a simple copy of PERCPU_PAGE_SIZE bytes from (the poorly
named) __phys_per_cpu_start to the unnamed block before it
that will be the cpu0 copy).

But my extra allocation is in the "percpu" block in vmlinux.lds.S,
so it ends up in that PT_LOAD section.  Which ultimately confuses
the kexec code.

Probably the cpu0 percpu space should be placed in the data section.

-Tony

* RE: kdump broken on Altix 350
  2008-08-29 16:03 kdump broken on Altix 350 Bernhard Walle
  2008-08-29 16:05 ` Bernhard Walle
  2008-09-10 12:19 ` Bernhard Walle
@ 2008-09-29 23:42 ` Luck, Tony
  2008-09-30  0:30   ` Jay Lan
  2008-10-02  5:13   ` Simon Horman
  2 siblings, 2 replies; 13+ messages in thread
From: Luck, Tony @ 2008-09-29 23:42 UTC (permalink / raw)
  To: Jay Lan
  Cc: kexec@lists.infradead.org, Bernhard Walle, Simon Horman,
	linux-ia64@vger.kernel.org

Does this make kexec/kdump happier?  Bare minimum testing so far
(builds and boots on tiger ... didn't try kexec yet).



[IA64] Put the space for cpu0 per-cpu area into .data section

The initial fix for making sure that we can access percpu variables
in all C code (commit 10617bbe84628eb18ab5f723d3ba35005adde143)
inadvertently allocated the memory in the "percpu" section of
the vmlinux ELF executable.  This confused kexec.

Signed-off-by: Tony Luck <tony.luck@intel.com>

diff --git a/arch/ia64/include/asm/sections.h b/arch/ia64/include/asm/sections.h
index f667998..1a873b3 100644
--- a/arch/ia64/include/asm/sections.h
+++ b/arch/ia64/include/asm/sections.h
@@ -11,6 +11,9 @@
 #include <asm-generic/sections.h>
 
 extern char __per_cpu_start[], __per_cpu_end[], __phys_per_cpu_start[];
+#ifdef	CONFIG_SMP
+extern char __cpu0_per_cpu[];
+#endif
 extern char __start___vtop_patchlist[], __end___vtop_patchlist[];
 extern char __start___rse_patchlist[], __end___rse_patchlist[];
 extern char __start___mckinley_e9_bundles[], __end___mckinley_e9_bundles[];
diff --git a/arch/ia64/kernel/head.S b/arch/ia64/kernel/head.S
index 8bdea8e..66e491d 100644
--- a/arch/ia64/kernel/head.S
+++ b/arch/ia64/kernel/head.S
@@ -367,16 +367,17 @@ start_ap:
 	;;
 #else
 (isAP)	br.few 2f
-	mov r20=r19
-	sub r19=r19,r18
+	movl r20=__cpu0_per_cpu
 	;;
 	shr.u r18=r18,3
 1:
-	ld8 r21=[r20],8;;
-	st8[r19]=r21,8
+	ld8 r21=[r19],8;;
+	st8[r20]=r21,8
 	adds r18=-1,r18;;
 	cmp4.lt p7,p6=0,r18
 (p7)	br.cond.dptk.few 1b
+	mov r19=r20
+	;;
 2:
 #endif
 	tpa r19=r19
diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S
index de71da8..10a7d47 100644
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -215,9 +215,6 @@ SECTIONS
   /* Per-cpu data: */
   percpu : { } :percpu
   . = ALIGN(PERCPU_PAGE_SIZE);
-#ifdef	CONFIG_SMP
-  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
-#endif
   __phys_per_cpu_start = .;
   .data.percpu PERCPU_ADDR : AT(__phys_per_cpu_start - LOAD_OFFSET)
 	{
@@ -233,6 +230,11 @@ SECTIONS
   data : { } :data
   .data : AT(ADDR(.data) - LOAD_OFFSET)
 	{
+#ifdef	CONFIG_SMP
+  . = ALIGN(PERCPU_PAGE_SIZE);
+		__cpu0_per_cpu = .;
+  . = . + PERCPU_PAGE_SIZE;	/* cpu0 per-cpu space */
+#endif
 		DATA_DATA
 		*(.data1)
 		*(.gnu.linkonce.d*)
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index e566ff4..0ee085e 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -163,7 +163,7 @@ per_cpu_init (void)
 	 * get_zeroed_page().
 	 */
 	if (first_time) {
-		void *cpu0_data = __phys_per_cpu_start - PERCPU_PAGE_SIZE;
+		void *cpu0_data = __cpu0_per_cpu;
 
 		first_time=0;
 
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 78026aa..d8c5fcd 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -144,7 +144,7 @@ static void *per_cpu_node_setup(void *cpu_data, int node)
 
 	for_each_possible_early_cpu(cpu) {
 		if (cpu == 0) {
-			void *cpu0_data = __phys_per_cpu_start - PERCPU_PAGE_SIZE;
+			void *cpu0_data = __cpu0_per_cpu;
 			__per_cpu_offset[cpu] = (char*)cpu0_data -
 				__per_cpu_start;
 		} else if (node == node_cpuid[cpu].nid) {

* Re: kdump broken on Altix 350
  2008-09-29 23:42 ` Luck, Tony
@ 2008-09-30  0:30   ` Jay Lan
  2008-10-02  5:13   ` Simon Horman
  1 sibling, 0 replies; 13+ messages in thread
From: Jay Lan @ 2008-09-30  0:30 UTC (permalink / raw)
  To: Luck, Tony
  Cc: kexec@lists.infradead.org, Bernhard Walle, Simon Horman,
	linux-ia64@vger.kernel.org

Luck, Tony wrote:
> Does this make kexec/kdump happier?  Bare minimum testing so far
> (builds and boots on tiger ... didn't try kexec yet).

Hi Tony,

Yep, the 2.6.27-rc7 kdump kernel built with this patch worked fine!

Actually you probably can predict the results by doing 'readelf -l
vmlinux'. If the PT_LOAD headers do not have a gap between them,
it is good. In other words, if the (PhysAddr+MemSiz) rounded up to
the Align value of one header is the same as the PhysAddr of the next
header, kexec should produce a good boot memmap for the kdump kernel.
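
That check can be sketched in a few lines of Python. This is a rough
illustration, not part of kexec-tools: it assumes the two-lines-per-header
layout that 'readelf -l' printed in the excerpts earlier in this thread, and
the names (LOAD_RE, parse_load_headers, find_gaps) are invented here. The
sample text is the set of headers posted for the kernel with commit
10617bbe84628eb18ab5f723d3ba35005adde143 applied.

```python
import re

# One LOAD entry as laid out in the readelf excerpts in this thread:
# "LOAD offset virtaddr physaddr" followed by "filesiz memsiz flags align".
LOAD_RE = re.compile(
    r"LOAD\s+0x[0-9a-f]+\s+0x[0-9a-f]+\s+0x(?P<phys>[0-9a-f]+)"
    r"\s+0x[0-9a-f]+\s+0x(?P<memsz>[0-9a-f]+)"
    r"\s+[RWE ]*[RWE]\s+(?:0x)?(?P<align>[0-9a-f]+)"
)

def parse_load_headers(readelf_text):
    """Return (phys, memsz, align) for each LOAD entry, in file order."""
    return [
        tuple(int(m.group(k), 16) for k in ("phys", "memsz", "align"))
        for m in LOAD_RE.finditer(readelf_text)
    ]

def find_gaps(headers):
    """Compare each header's rounded-up end with the next header's PhysAddr."""
    gaps = []
    for (phys, memsz, align), (next_phys, _, _) in zip(headers, headers[1:]):
        end = (phys + memsz + align - 1) // align * align  # round up to Align
        if end != next_phys:
            gaps.append((hex(end), hex(next_phys)))
    return gaps

AFTER_COMMIT_10617BBE = """\
LOAD     0x0000000000010000 0xa000000100000000 0x0000000004000000
         0x0000000000d04480 0x0000000000d04480  RWE    10000
LOAD     0x0000000000d20000 0xffffffffffff0000 0x0000000004d20000
         0x0000000000009620 0x0000000000009620  RW     10000
LOAD     0x0000000000d30000 0xa000000100d30000 0x0000000004d30000
         0x00000000000bef58 0x0000000000564c90  RW     10000
"""

# Each reported pair is (rounded end of one header, PhysAddr of the next).
print(find_gaps(parse_load_headers(AFTER_COMMIT_10617BBE)))
```

An empty list means the headers are contiguous; for the sample above the
code segment ends (rounded) at 0x4d10000 while the percpu segment starts at
0x4d20000, so one gap is reported.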


Thanks for the patch!

jay

* Re: kdump broken on Altix 350
  2008-09-29 23:42 ` Luck, Tony
  2008-09-30  0:30   ` Jay Lan
@ 2008-10-02  5:13   ` Simon Horman
  2008-10-02 17:04     ` Jay Lan
  1 sibling, 1 reply; 13+ messages in thread
From: Simon Horman @ 2008-10-02  5:13 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Jay Lan, Bernhard Walle, linux-ia64@vger.kernel.org,
	kexec@lists.infradead.org

On Mon, Sep 29, 2008 at 04:42:52PM -0700, Luck, Tony wrote:
> Does this make kexec/kdump happier?  Bare minimum testing so far
> (builds and boots on tiger ... didn't try kexec yet).

Hi Tony,

Your analysis (in your previous email) reached more or less the same
conclusion that I had come to, though I was puzzling over
why you had put the reserved area for cpu0 where you had - I assumed
I was misunderstanding things.

This patch looks good to me.

Jay,

With this patch I assume that we still need an order of operations fix for
kexec-tools but no section merging changes. Is that correct?

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en


* Re: kdump broken on Altix 350
  2008-10-02  5:13   ` Simon Horman
@ 2008-10-02 17:04     ` Jay Lan
  0 siblings, 0 replies; 13+ messages in thread
From: Jay Lan @ 2008-10-02 17:04 UTC (permalink / raw)
  To: Simon Horman
  Cc: kexec@lists.infradead.org, Bernhard Walle, Luck, Tony,
	linux-ia64@vger.kernel.org

Simon Horman wrote:
> On Mon, Sep 29, 2008 at 04:42:52PM -0700, Luck, Tony wrote:
>> Does this make kexec/kdump happier?  Bare minimum testing so far
>> (builds and boots on tiger ... didn't try kexec yet).
> 
> Hi Tony,
> 
> your analysis (in your previous email) was more or less the same
> conclusion that I had come too, though I was puzzling over
> why you had put the reserved area for cpu0 where you had - I assumed
> I was misunderstanding things.
> 
> This patch looks good to me.
> 
> Jay,
> 
> With this patch I assume that we still need an order of operations fix for
> kexec-tools but no section merging changes. Is that correct?

I think the code should still be simplified.

The 'break' in the if-statement has never been executed due to
the mistake in operator precedence. Thus, the code has been
doing segment merging by calculating p_memsz of each segment
without having to deal with a 'gap' between PT_LOAD headers.

As demonstrated by this incident, when a gap does occur,
the kernel boot fails. So, if we assume the PT_LOAD headers will
be generated correctly, then the segment merging logic should be
simplified. It does not make sense to pick up p_memsz of each
segment and do all those calculations. It caused confusion.

Regards,
jay
