* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
@ 2004-11-05 16:26 ` Bjorn Helgaas
2004-11-05 21:04 ` Cliff Larsen
` (10 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2004-11-05 16:26 UTC (permalink / raw)
To: linux-ia64
On Friday 05 November 2004 6:55 am, Cliff Larsen wrote:
> The patch is off 2.4.27
There's not much happening on 2.4 these days. And there's
plenty of room for improvement in the 2.6 INIT handler,
hint, hint ;-)
* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
2004-11-05 16:26 ` Bjorn Helgaas
@ 2004-11-05 21:04 ` Cliff Larsen
2004-11-05 22:04 ` Bjorn Helgaas
` (9 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Cliff Larsen @ 2004-11-05 21:04 UTC (permalink / raw)
To: linux-ia64
On Fri, 2004-11-05 at 11:26, Bjorn Helgaas wrote:
> On Friday 05 November 2004 6:55 am, Cliff Larsen wrote:
> > The patch is off 2.4.27
>
> There's not much happening on 2.4 these days. And there's
> plenty of room for improvement in the 2.6 INIT handler,
> hint, hint ;-)
I've been working with 2.4 so I thought it would be appropriate
to submit the patch with its latest version. I've not gotten to
2.6 yet. I have looked at 2.6 sources and essentially the same
patch would apply. What do you think of the concept of the patch
and its utility in 2.6?
* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
2004-11-05 16:26 ` Bjorn Helgaas
2004-11-05 21:04 ` Cliff Larsen
@ 2004-11-05 22:04 ` Bjorn Helgaas
2004-11-05 22:57 ` Cliff Larsen
` (8 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2004-11-05 22:04 UTC (permalink / raw)
To: linux-ia64
On Friday 05 November 2004 2:04 pm, Cliff Larsen wrote:
> I've been working with 2.4 so I thought it would be appropriate
> to submit the patch with its latest version. I've not gotten to
> 2.6 yet. I have looked at 2.6 sources and essentially the same
> patch would apply. What do you think of the concept of the patch
> and its utility in 2.6?
Yeah, I'm sure it would apply easily to 2.6. Sorry, I guess I
was just being lazy because I haven't paid much attention to
the MCA/INIT path recently. Some of the folks who have will
probably jump in.
My $0.02 is that it *is* annoying that we just hang after printing
the INIT register state and backtraces. However, I wonder if we
could just leverage the existing panic_timeout (set by "panic=")
so we don't need a new parameter.
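A minimal user-space sketch of that idea, assuming the INIT path would
interpret the "panic=" value the way panic() does: a positive value means
wait that many seconds and then restart, zero means stay halted. The enum,
struct and function names are hypothetical and are not taken from the
posted patch.

```c
#include <assert.h>

/* Hypothetical outcome of the INIT path; not kernel API. */
enum init_action {
	INIT_HANG,	/* spin forever, as the handler does today */
	INIT_REBOOT	/* wait, then restart the machine */
};

struct init_decision {
	enum init_action action;
	int delay_secs;	/* only meaningful for INIT_REBOOT */
};

/*
 * Model of reusing the "panic=" value once the INIT dump is printed.
 * Assumption: timeout > 0 requests a reboot after that many seconds,
 * timeout == 0 requests a hang. Negative values are treated as a hang
 * here purely to keep the sketch small.
 */
struct init_decision init_after_dump(int panic_timeout)
{
	struct init_decision d = { INIT_HANG, 0 };

	if (panic_timeout > 0) {
		d.action = INIT_REBOOT;
		d.delay_secs = panic_timeout;
	}
	/* A reboot decision always carries a positive delay. */
	assert(d.action == INIT_HANG || d.delay_secs > 0);
	return d;
}
```

With this shape no new boot parameter is needed: booting with "panic=30"
would make the modeled INIT path restart after 30 seconds, and leaving it
unset keeps the current hang.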
I don't have an opinion about whether calling panic from
init_handler_platform() is the right thing to do or not.
Certainly it is a good place for some sort of hook for a
debugger and/or crashdump.
My personal preference would be something like this:
1) dump register state (for all CPUs, not just the INIT monarch)
on the console
2) print backtraces (maybe just for currently-running tasks;
currently we do the task on the INIT monarch plus all other
non-running tasks, which is definitely non-optimal)
3) optional debugger/crashdump hook
4) call panic (maybe)
5) optional timeout, then reboot (if not calling panic)
Part 5 would be trivial and probably not *too* controversial.
Part 1 is harder but extremely useful, and I think someone (Zoltan?)
posted a start. Part 2 should be simple given part 1.
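The ordering of those five parts can be sketched in user space as below.
This is only a model under stated assumptions: the step names, the trace
structure and the hook type are hypothetical, and the real handler would
dump registers and backtraces instead of recording step numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for parts 1-5 of the list above. */
enum init_step {
	STEP_DUMP_REGS = 1,
	STEP_BACKTRACE,
	STEP_DUMP_HOOK,
	STEP_PANIC,
	STEP_TIMED_REBOOT
};

#define INIT_MAX_STEPS 5

struct init_trace {
	enum init_step steps[INIT_MAX_STEPS];
	size_t count;
};

static void trace_add(struct init_trace *t, enum init_step s)
{
	assert(t->count < INIT_MAX_STEPS);
	t->steps[t->count++] = s;
}

/* Hypothetical debugger/crashdump hook type. */
typedef void (*init_dump_hook_t)(void);

/* Sample hook used to show that the hook really runs. */
int sample_hook_calls;
void sample_hook(void)
{
	sample_hook_calls++;
}

/* Record the steps in the order the list proposes. */
void init_sequence(struct init_trace *t, init_dump_hook_t hook,
		   int call_panic, int reboot_timeout)
{
	t->count = 0;
	trace_add(t, STEP_DUMP_REGS);		/* part 1 */
	trace_add(t, STEP_BACKTRACE);		/* part 2 */
	if (hook) {				/* part 3, optional */
		hook();
		trace_add(t, STEP_DUMP_HOOK);
	}
	if (call_panic) {			/* part 4 */
		trace_add(t, STEP_PANIC);
		return;		/* panic would handle its own timeout */
	}
	if (reboot_timeout > 0)			/* part 5 */
		trace_add(t, STEP_TIMED_REBOOT);
}
```

Parts 4 and 5 are mutually exclusive in this model, matching the "(if not
calling panic)" qualifier on part 5.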
* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (2 preceding siblings ...)
2004-11-05 22:04 ` Bjorn Helgaas
@ 2004-11-05 22:57 ` Cliff Larsen
2004-11-05 23:04 ` Russ Anderson
` (7 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Cliff Larsen @ 2004-11-05 22:57 UTC (permalink / raw)
To: linux-ia64
On Fri, 2004-11-05 at 17:04, Bjorn Helgaas wrote:
> My $0.02 is that it *is* annoying that we just hang after printing
> the INIT register state and backtraces. However, I wonder if we
> could just leverage the existing panic_timeout (set by "panic=")
> so we don't need a new parameter.
I'm fine with that.
> I don't have an opinion about whether calling panic from
> init_handler_platform() is the right thing to do or not.
> Certainly it is a good place for some sort of hook for a
> debugger and/or crashdump.
My major motivation was to get to a crashdump hook and get
to restart, and panic does both, so I chose it.
> My personal preference would be something like this:
> 1) dump register state (for all CPUs, not just the INIT monarch)
> on the console
> 2) print backtraces (maybe just for currently-running tasks;
> currently we do the task on the INIT monarch plus all other
> non-running tasks, which is definitely non-optimal)
> 3) optional debugger/crashdump hook
> 4) call panic (maybe)
> 5) optional timeout, then reboot (if not calling panic)
>
> Part 5 would be trivial and probably not *too* controversial.
> Part 1 is harder but extremely useful, and I think someone (Zoltan?)
> posted a start. Part 2 should be simple given part 1.
I'll see what I can do about most of these. Part 1 would be
difficult since the hardware/firmware we've currently got
available makes both processors the monarch on INIT.
Thanks for your feedback,
Cliff
* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (3 preceding siblings ...)
2004-11-05 22:57 ` Cliff Larsen
@ 2004-11-05 23:04 ` Russ Anderson
2004-11-08 12:14 ` Takao Indoh
` (6 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Russ Anderson @ 2004-11-05 23:04 UTC (permalink / raw)
To: linux-ia64
Bjorn Helgaas wrote:
>
> My personal preference would be something like this:
> 1) dump register state (for all CPUs, not just the INIT monarch)
> on the console
> 2) print backtraces (maybe just for currently-running tasks;
> currently we do the task on the INIT monarch plus all other
> non-running tasks, which is definitely non-optimal)
> 3) optional debugger/crashdump hook
> 4) call panic (maybe)
> 5) optional timeout, then reboot (if not calling panic)
>
> Part 5 would be trivial and probably not *too* controversial.
> Part 1 is harder but extremely useful, and I think someone (Zoltan?)
> posted a start. Part 2 should be simple given part 1.
I agree. I am working on part 1 (per cpu MCA/INIT save areas).
For example, the following sample patch:
1) Reserves ar.k3 for a pointer to this cpu's mca info save area.
2) Defines the struct layout of the save area.
3) Allocates the memory for the save area (at boot time).
The part that I'm debugging is tying this into mca_asm.S.
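The sizing and carving arithmetic behind steps 2 and 3 can be modeled in
user space roughly as follows. This is a sketch, not the patch itself: the
128-byte line size, the simplified struct and the function names are
assumptions for illustration, and the address returned by mca_area_addr()
stands in for the value that would be stored in ar.k3 for each cpu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 128u	/* assumed L1 line size for this sketch */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((uintptr_t)(a) - 1))

/* Stand-in for the per-cpu save area; field sizes are illustrative. */
struct mca_cpu_area {
	uint64_t proc_state_dump[512];
	uint64_t stack[1024];
	uint64_t bspstore[1024];
};

/* Bytes one node reserves for the save areas of its physical cpus. */
size_t mca_node_bytes(unsigned int phys_cpus)
{
	return ALIGN_UP(sizeof(struct mca_cpu_area), CACHE_LINE) * phys_cpus;
}

/*
 * Address of the save area for the idx-th physical cpu on a node whose
 * MCA region starts at node_base. Each area starts on a cache line.
 */
uintptr_t mca_area_addr(uintptr_t node_base, unsigned int idx)
{
	assert(node_base % CACHE_LINE == 0);
	return node_base +
	       idx * ALIGN_UP(sizeof(struct mca_cpu_area), CACHE_LINE);
}
```

Rounding each area up to a cache line before multiplying keeps every cpu's
save area on its own line, which is the same reason the per-node size is
computed from the aligned struct size rather than the raw one.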
-----------------------------------------------------------
Index: sles9-sgidev/linux/include/asm-ia64/kregs.h
===================================================================
--- sles9-sgidev.orig/linux/include/asm-ia64/kregs.h 2004-02-23 22:44:17.000000000 -0600
+++ sles9-sgidev/linux/include/asm-ia64/kregs.h 2004-11-04 11:12:06.000000000 -0600
@@ -14,6 +14,7 @@
*/
#define IA64_KR_IO_BASE 0 /* ar.k0: legacy I/O base address */
#define IA64_KR_TSSD 1 /* ar.k1: IVE uses this as the TSSD */
+#define IA64_KR_MCA_INFO 3 /* ar.k3: phys addr of this cpu's mca_info struct */
#define IA64_KR_CURRENT_STACK 4 /* ar.k4: what's mapped in IA64_TR_CURRENT_STACK */
#define IA64_KR_FPU_OWNER 5 /* ar.k5: fpu-owner (UP only, at the moment) */
#define IA64_KR_CURRENT 6 /* ar.k6: "current" task pointer */
Index: sles9-sgidev/linux/include/asm-ia64/mca.h
===================================================================
--- sles9-sgidev.orig/linux/include/asm-ia64/mca.h 2004-02-23 23:57:45.000000000 -0600
+++ sles9-sgidev/linux/include/asm-ia64/mca.h 2004-11-04 12:38:23.000000000 -0600
@@ -107,6 +107,15 @@
*/
} ia64_mca_os_to_sal_state_t;
+typedef struct ia64_mca_cpu_s {
+ u64 ia64_mca_proc_state_dump[512];
+ u64 ia64_mca_stack[1024] __attribute__((aligned(16)));
+ u64 ia64_mca_stackframe[32];
+ u64 ia64_mca_bspstore[1024];
+ u64 ia64_init_stack[KERNEL_STACK_SIZE/8] __attribute__((aligned(16)));
+ struct ia64_mca_tlb_info ia64_mca_cpu_tlb;
+} ia64_mca_cpu_t;
+
extern void ia64_mca_init(void);
extern void ia64_os_mca_dispatch(void);
extern void ia64_os_mca_dispatch_end(void);
Index: sles9-sgidev/linux/arch/ia64/mm/discontig.c
===================================================================
--- sles9-sgidev.orig/linux/arch/ia64/mm/discontig.c 2004-09-24 08:43:54.000000000 -0500
+++ sles9-sgidev/linux/arch/ia64/mm/discontig.c 2004-11-04 14:36:23.000000000 -0600
@@ -4,6 +4,10 @@
* Copyright (c) 2001 Tony Luck <tony.luck@intel.com>
* Copyright (c) 2002 NEC Corp.
* Copyright (c) 2002 Kimio Suganuma <k-suganuma@da.jp.nec.com>
+ * Copyright (c) 2003-2004 Silicon Graphics, Inc
+ * Russ Anderson <rja@sgi.com>
+ * Jesse Barnes <jbarnes@sgi.com>
+ * Jack Steiner <steiner@sgi.com>
*/
/*
@@ -21,6 +25,7 @@
#include <asm/meminit.h>
#include <asm/numa.h>
#include <asm/sections.h>
+#include <asm/mca.h>
/*
* Track per-node information needed to setup the boot memory allocator, the
@@ -203,12 +208,33 @@
}
/**
+ * early_nr_phys_cpus_node - return number of physical cpus on a given node
+ * @node: node to check
+ *
+ * Count the number of physical cpus on @node. These are cpus that actually
+ * exist. We can't use nr_cpus_node() yet because
+ * acpi_boot_init() (which builds the node_to_cpu_mask array) hasn't been
+ * called yet.
+ */
+static int early_nr_phys_cpus_node(int node)
+{
+ int cpu, n = 0;
+
+ for (cpu = 0; cpu < NR_CPUS; cpu++)
+ if (node == node_cpuid[cpu].nid)
+ if ((cpu == 0) || node_cpuid[cpu].phys_id)
+ n++;
+
+ return n;
+}
+
+/**
* early_nr_cpus_node - return number of cpus on a given node
* @node: node to check
*
* Count the number of cpus on @node. We can't use nr_cpus_node() yet because
* acpi_boot_init() (which builds the node_to_cpu_mask array) hasn't been
- * called yet.
+ * called yet. Note that node 0 will also count all non-existent cpus.
*/
static int early_nr_cpus_node(int node)
{
@@ -235,12 +261,15 @@
* | |
* |~~~~~~~~~~~~~~~~~~~~~~~~| <-- NODEDATA_ALIGN(start, node) for the first
* | PERCPU_PAGE_SIZE * | start and length big enough
- * | NR_CPUS |
+ * | cpus_on_this_node | Node 0 will also have entries for all non-existent cpus.
* |------------------------|
* | local pg_data_t * |
* |------------------------|
* | local ia64_node_data |
* |------------------------|
+ * | MCA/INIT data * |
+ * | cpus_on_this_node |
+ * |------------------------|
* | ??? |
* |________________________|
*
@@ -252,9 +281,9 @@
static int __init find_pernode_space(unsigned long start, unsigned long len,
int node)
{
- unsigned long epfn, cpu, cpus;
+ unsigned long epfn, cpu, cpus, phys_cpus;
unsigned long pernodesize = 0, pernode, pages, mapsize;
- void *cpu_data;
+ void *cpu_data, *mca_data_phys;
struct bootmem_data *bdp = &mem_data[node].bootmem_data;
epfn = (start + len) >> PAGE_SHIFT;
@@ -278,9 +307,11 @@
* for good alignment and alias prevention.
*/
cpus = early_nr_cpus_node(node);
+ phys_cpus = early_nr_phys_cpus_node(node);
pernodesize += PERCPU_PAGE_SIZE * cpus;
pernodesize += L1_CACHE_ALIGN(sizeof(pg_data_t));
pernodesize += L1_CACHE_ALIGN(sizeof(struct ia64_node_data));
+ pernodesize += L1_CACHE_ALIGN(sizeof(ia64_mca_cpu_t)) * phys_cpus;
pernodesize = PAGE_ALIGN(pernodesize);
pernode = NODEDATA_ALIGN(start, node);
@@ -299,6 +330,9 @@
mem_data[node].node_data = __va(pernode);
pernode += L1_CACHE_ALIGN(sizeof(struct ia64_node_data));
+ mca_data_phys = (void *)pernode;
+ pernode += L1_CACHE_ALIGN(sizeof(ia64_mca_cpu_t)) * phys_cpus;
+
mem_data[node].pgdat->bdata = bdp;
pernode += L1_CACHE_ALIGN(sizeof(pg_data_t));
@@ -311,6 +345,14 @@
if (node == node_cpuid[cpu].nid) {
memcpy(__va(cpu_data), __phys_per_cpu_start,
__per_cpu_end - __per_cpu_start);
+ if ((cpu == 0) || (node_cpuid[cpu].phys_id > 0)) {
+ ia64_set_kr(IA64_KR_MCA_INFO, __pa(mca_data_phys));
+ mca_data_phys += L1_CACHE_ALIGN(sizeof(ia64_mca_cpu_t));
+ }
__per_cpu_offset[cpu] = (char*)__va(cpu_data) -
__per_cpu_start;
cpu_data += PERCPU_PAGE_SIZE;
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (4 preceding siblings ...)
2004-11-05 23:04 ` Russ Anderson
@ 2004-11-08 12:14 ` Takao Indoh
2004-11-10 15:53 ` Philip R Auld
` (5 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Takao Indoh @ 2004-11-08 12:14 UTC (permalink / raw)
To: linux-ia64
Hi,
On Fri, 05 Nov 2004 17:57:29 -0500, Cliff Larsen wrote:
>> I don't have an opinion about whether calling panic from
>> init_handler_platform() is the right thing to do or not.
>> Certainly it is a good place for some sort of hook for a
>> debugger and/or crashdump.
>
>My major motivation was to get to a crashdump hook and get
>to restart, and panic does both, so I chose it.
IIRC, LKCD is invoked by panic_notifier_list in the panic(), so
LKCD may work correctly. But diskdump/netdump may not. They
are called via BUG(). For example, netdump is called from the following
BUG().
NORET_TYPE void panic(const char * fmt, ...)
{
(snipped)
	bust_spinlocks(1);
	va_start(args, fmt);
	vsprintf(buf, fmt, args);
	va_end(args);
	printk(KERN_EMERG "Kernel panic: %s\n",buf);
	if (netdump_func)
		BUG();
Normally BUG() invokes the exception handler and the dump function is
called. But I am not sure the exception handler is invoked correctly
from INIT context.
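To make the difference between the two hook styles concrete, here is a
small user-space model of notifier-list dispatch, the style recalled above
for LKCD. It is a sketch only: the kernel's real mechanism uses
struct notifier_block and notifier_call_chain(), which this model does not
reproduce, and every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NOTIFIERS 8

/* Hypothetical callback type for the model. */
typedef int (*panic_notifier_fn)(const char *msg);

static panic_notifier_fn panic_chain[MAX_NOTIFIERS];
static size_t panic_chain_len;

/* Add a callback to the chain; returns 0 on success, -1 on failure. */
int panic_chain_register(panic_notifier_fn fn)
{
	if (fn == NULL || panic_chain_len == MAX_NOTIFIERS)
		return -1;
	panic_chain[panic_chain_len++] = fn;
	return 0;
}

/* Call every registered callback in order; returns how many ran. */
size_t panic_chain_call(const char *msg)
{
	size_t i;

	assert(msg != NULL);
	for (i = 0; i < panic_chain_len; i++)
		panic_chain[i](msg);
	return panic_chain_len;
}

/* Sample dump callback that only counts how often it was asked to dump. */
int dump_requests;
int sample_dump_notifier(const char *msg)
{
	(void)msg;
	dump_requests++;
	return 0;
}
```

A dumper registered this way runs as a plain function call from the panic
path, so it does not depend on an exception being taken. A dumper reached
through BUG() does depend on that, which is the uncertainty raised above
for INIT context.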
>> My personal preference would be something like this:
>> 1) dump register state (for all CPUs, not just the INIT monarch)
>> on the console
>> 2) print backtraces (maybe just for currently-running tasks;
>> currently we do the task on the INIT monarch plus all other
>> non-running tasks, which is definitely non-optimal)
>> 3) optional debugger/crashdump hook
>> 4) call panic (maybe)
>> 5) optional timeout, then reboot (if not calling panic)
>>
>> Part 5 would be trivial and probably not *too* controversial.
>> Part 1 is harder but extremely useful, and I think someone (Zoltan?)
>> posted a start. Part 2 should be simple given part 1.
>
>I'll see what I can do about most of these. Part 1 would be
>difficult since the hardware/firmware we've currently got
>available makes both processors the monarch on INIT.
Even if a crashdump hook is added to init_handler, the dump does not
work correctly because of the single INIT stack. Therefore Russ Anderson's
patch, which separates the INIT stacks, is also indispensable.
Regards,
Takao Indoh
* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (5 preceding siblings ...)
2004-11-08 12:14 ` Takao Indoh
@ 2004-11-10 15:53 ` Philip R Auld
2004-11-11 0:55 ` Takao Indoh
` (4 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Philip R Auld @ 2004-11-10 15:53 UTC (permalink / raw)
To: linux-ia64
Hi,
Rumor has it that on Mon, Nov 08, 2004 at 09:14:21PM +0900 Takao Indoh said:
> Hi,
>
> On Fri, 05 Nov 2004 17:57:29 -0500, Cliff Larsen wrote:
>
> >> I don't have an opinion about whether calling panic from
> >> init_handler_platform() is the right thing to do or not.
> >> Certainly it is a good place for some sort of hook for a
> >> debugger and/or crashdump.
> >
> >My major motivation was to get to a crashdump hook and get
> >to restart, and panic does both, so I chose it.
>
> IIRC, LKCD is invoked by panic_notifier_list in the panic(), so
> LKCD may work correctly. But diskdump/netdump may not. They
> are called via BUG(). For example, netdump is called from the following
> BUG().
>
Calling BUG would also work, assuming the hooks are in the
BUG path. I'm not seeing that in 2.6.8 anyway.
> Normally BUG() invokes the exception handler and the dump function is
> called. But I am not sure the exception handler is invoked correctly
> from INIT context.
This doesn't currently do much in ia64 as far as I can tell. It ends up
in die via die_if_kernel, but that doesn't look like it will ever get to a
machine restart, much less a crash dump or even a for(;;) loop. I may be
missing something though. I'm pretty new to Itanium.
In i386 there is panic_on_oops in die which can at least get to the
panic call chain (as there used to be in ia64).
None of the dump stuff is in the stock kernels yet, is it?
>
>
> >> My personal preference would be something like this:
> >> 1) dump register state (for all CPUs, not just the INIT monarch)
> >> on the console
> >> 2) print backtraces (maybe just for currently-running tasks;
> >> currently we do the task on the INIT monarch plus all other
> >> non-running tasks, which is definitely non-optimal)
> >> 3) optional debugger/crashdump hook
> >> 4) call panic (maybe)
> >> 5) optional timeout, then reboot (if not calling panic)
> >>
> >> Part 5 would be trivial and probably not *too* controversial.
> >> Part 1 is harder but extremely useful, and I think someone (Zoltan?)
> >> posted a start. Part 2 should be simple given part 1.
> >
> >I'll see what I can do about most of these. Part 1 would be
> >difficult since the hardware/firmware we've currently got
> >available makes both processors the monarch on INIT.
>
> Even if a crashdump hook is added to init_handler, the dump does not
> work correctly because of the single INIT stack. Therefore Russ Anderson's
> patch, which separates the INIT stacks, is also indispensable.
>
We are still mostly working with 2.4 (rhel3 which has netdump_func hooks)
and this all worked fine. A crashdump hook, a call to panic, or a call
to BUG each worked.
It doesn't look like anything but the crashdump hook can work in stock
2.6.8 since there are no dump routine calls in the panic or die paths.
Cheers,
Phil
> Regards,
> Takao Indoh
--
Philip R. Auld, Ph.D. Egenera, Inc.
Software Architect 165 Forest St.
(508) 858-2628 Marlboro, MA 01752
* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (6 preceding siblings ...)
2004-11-10 15:53 ` Philip R Auld
@ 2004-11-11 0:55 ` Takao Indoh
2004-11-11 1:14 ` Luck, Tony
` (3 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Takao Indoh @ 2004-11-11 0:55 UTC (permalink / raw)
To: linux-ia64
On Wed, 10 Nov 2004 10:53:38 -0500, Philip R Auld wrote:
>> Normally BUG() invokes the exception handler and the dump function is
>> called. But I am not sure the exception handler is invoked correctly
>> from INIT context.
>
>This doesn't currently do much in ia64 as far as I can tell. It ends up
>in die via die_if_kernel, but that doesn't look like it will ever get to a
>machine restart, much less a crash dump or even a for(;;) loop. I may be
>missing something though. I'm pretty new to Itanium.
>
>In i386 there is panic_on_oops in die which can at least get to the
>panic call chain (as there used to be in ia64).
>
>None of the dump stuff is in the stock kernels yet is it?
No, there is no dump stuff in the stock kernels.
>> >> My personal preference would be something like this:
>> >> 1) dump register state (for all CPUs, not just the INIT monarch)
>> >> on the console
>> >> 2) print backtraces (maybe just for currently-running tasks;
>> >> currently we do the task on the INIT monarch plus all other
>> >> non-running tasks, which is definitely non-optimal)
>> >> 3) optional debugger/crashdump hook
>> >> 4) call panic (maybe)
>> >> 5) optional timeout, then reboot (if not calling panic)
>> >>
>> >> Part 5 would be trivial and probably not *too* controversial.
>> >> Part 1 is harder but extremely useful, and I think someone (Zoltan?)
>> >> posted a start. Part 2 should be simple given part 1.
>> >
>> >I'll see what I can do about most of these. Part 1 would be
>> >difficult since the hardware/firmware we've currently got
>> >available makes both processors the monarch on INIT.
>>
>> Even if a crashdump hook is added to init_handler, the dump does not
>> work correctly because of the single INIT stack. Therefore Russ Anderson's
>> patch, which separates the INIT stacks, is also indispensable.
>>
>
>We are still mostly working with 2.4 (rhel3 which has netdump_func hooks)
>and this all worked fine. A crashdump hook, a call to panic, or a call
>to BUG each worked.
The crashdump itself succeeds, but isn't there a problem when analyzing
the dump? The backtrace of "current" on each cpu does not seem to work
because switch_stack is not saved correctly.
Regards,
Takao Indoh
* RE: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (7 preceding siblings ...)
2004-11-11 0:55 ` Takao Indoh
@ 2004-11-11 1:14 ` Luck, Tony
2004-11-11 17:12 ` Cliff Larsen
` (2 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Luck, Tony @ 2004-11-11 1:14 UTC (permalink / raw)
To: linux-ia64
>> 1) dump register state (for all CPUs, not just the INIT monarch)
>> on the console
>I'll see what I can do about most of these. Part 1 would be
>difficult since the hardware/firmware we've currently got
>available makes both processors the monarch on INIT.
You could change the call to ia64_sal_set_vectors in ia64_mca_init
to point all cpus to just one routine (pass pointer to the same
routine for monarch/slave) ... and then have the OS init code
handle the serialization. That would work on both correct and
buggy SAL implementations.
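One way the OS-side serialization could look, sketched in user space with
C11 atomics: every cpu enters the same routine and the first one to swap
its id into a shared word becomes the monarch, the rest act as slaves.
This is an assumption about how the suggestion might be implemented, not
code from any posted patch; the variable and function names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_MONARCH (-1)

/* Shared election word; NO_MONARCH until some cpu claims it. */
static atomic_int init_monarch = NO_MONARCH;

/*
 * Called by every cpu entering the common INIT routine.
 * Returns 1 if this cpu won the election and should act as monarch,
 * 0 if another cpu already holds that role.
 */
int init_claim_monarch(int cpu)
{
	int expected = NO_MONARCH;

	assert(cpu >= 0);
	return atomic_compare_exchange_strong(&init_monarch, &expected, cpu);
}

/* Report which cpu is monarch, or NO_MONARCH if none has claimed it. */
int init_current_monarch(void)
{
	return atomic_load(&init_monarch);
}
```

Because the election no longer depends on which entry point SAL picked for
a cpu, the same code path behaves identically whether the firmware reports
one monarch or, as on the buggy platform, several.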
-Tony
* Re: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (8 preceding siblings ...)
2004-11-11 1:14 ` Luck, Tony
@ 2004-11-11 17:12 ` Cliff Larsen
2004-11-11 17:18 ` Cliff Larsen
2004-11-11 17:33 ` Luck, Tony
11 siblings, 0 replies; 13+ messages in thread
From: Cliff Larsen @ 2004-11-11 17:12 UTC (permalink / raw)
To: linux-ia64
On Wed, 2004-11-10 at 19:55, Takao Indoh wrote:
> On Wed, 10 Nov 2004 10:53:38 -0500, Philip R Auld wrote:
> >
> >We are still mostly working with 2.4 (rhel3 which has netdump_func hooks)
> >and this all worked fine. A crashdump hook, a call to panic, or a call
> >to BUG each worked.
>
> The crashdump itself succeeds, but isn't there a problem when analyzing
> the dump? The backtrace of "current" on each cpu does not seem to work
> because switch_stack is not saved correctly.
We are seeing the same behavior with our 2.4: we can backtrace all
processes but the active ones.
--
Cliff Larsen <clarsen@egenera.com>
* RE: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (9 preceding siblings ...)
2004-11-11 17:12 ` Cliff Larsen
@ 2004-11-11 17:18 ` Cliff Larsen
2004-11-11 17:33 ` Luck, Tony
11 siblings, 0 replies; 13+ messages in thread
From: Cliff Larsen @ 2004-11-11 17:18 UTC (permalink / raw)
To: linux-ia64
On Wed, 2004-11-10 at 20:14, Luck, Tony wrote:
> You could change the call to ia64_sal_set_vectors in ia64_mca_init
> to point all cpus to just one routine (pass pointer to the same
> routine for monarch/slave) ... and then have the OS init code
> handle the serialization. That would work on both correct and
> buggy SAL implementations.
>
> -Tony
Certainly true. Do you have any sense of how widespread the problem
is? Being relatively new to Itanium and having just a SR870BH2
to work with, I'm wondering whether such a workaround would be
generally useful.
--
Cliff Larsen <clarsen@egenera.com>
* RE: [PATCH] make INIT# handler call panic
2004-11-05 13:55 [PATCH] make INIT# handler call panic Cliff Larsen
` (10 preceding siblings ...)
2004-11-11 17:18 ` Cliff Larsen
@ 2004-11-11 17:33 ` Luck, Tony
11 siblings, 0 replies; 13+ messages in thread
From: Luck, Tony @ 2004-11-11 17:33 UTC (permalink / raw)
To: linux-ia64
>> You could change the call to ia64_sal_set_vectors in ia64_mca_init
>> to point all cpus to just one routine (pass pointer to the same
>> routine for monarch/slave) ... and then have the OS init code
>> handle the serialization. That would work on both correct and
>> buggy SAL implementations.
>>
>> -Tony
>
>Certainly true. Do you have any sense of how widespread the problem
>is? Being relatively new to Itanium and having just a SR870BH2
>to work with, I'm wondering whether such a workaround would be
>generally useful.
The only platform that I _know_ has this SAL bug is ... shuffles feet
in embarrassment ... the Intel Tiger. But it is possible that others
have copied this bug.
-Tony