* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609
@ 2008-02-25 17:27 Bjorn Helgaas
2008-02-25 23:08 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: Alex Chiang
` (25 more replies)
0 siblings, 26 replies; 27+ messages in thread
From: Bjorn Helgaas @ 2008-02-25 17:27 UTC (permalink / raw)
To: linux-ia64
On Friday 22 February 2008 12:28:26 am Shaohua Li wrote:
> My tiger machine has hung since 2.6.23 with the commit above. I always see an
> oops in ia64_sal_physical_id_info(). In 2.6.22, if ia64_pal_logical_to_phys
> returns UNIMPLEMENTED, ia64_sal_physical_id_info() isn't called. The patch
> below fixes the issue.
I added a descriptive subject and copied the author of the change.
He's been travelling for a month or so and might not be able to respond
immediately.
> diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c
> index 32ee597..6e0290b 100644
> --- a/arch/ia64/kernel/smpboot.c
> +++ b/arch/ia64/kernel/smpboot.c
> @@ -878,13 +878,10 @@ identify_siblings(struct cpuinfo_ia64 *c)
> printk(KERN_ERR
> "ia64_pal_logical_to_phys failed with %ld\n",
> status);
> - return;
> }
> -
> - info.overview_ppid = 0;
> - info.overview_cpp = 1;
> - info.overview_tpc = 1;
> + return;
> }
> +
> if ((status = ia64_sal_physical_id_info(&pltid)) != PAL_STATUS_SUCCESS) {
> printk(KERN_ERR "ia64_sal_pltid failed with %ld\n", status);
> return;
> @@ -892,9 +889,6 @@ identify_siblings(struct cpuinfo_ia64 *c)
>
> c->socket_id = (pltid << 8) | info.overview_ppid;
>
> - if (info.overview_cpp == 1 && info.overview_tpc == 1)
> - return;
> -
> c->cores_per_socket = info.overview_cpp;
> c->threads_per_core = info.overview_tpc;
> c->num_log = info.overview_num_log;
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:
From: Alex Chiang @ 2008-02-25 23:08 UTC (permalink / raw)
To: linux-ia64
* Bjorn Helgaas <bjorn.helgaas@hp.com>:
> On Friday 22 February 2008 12:28:26 am Shaohua Li wrote:
> > My tiger machine has hung since 2.6.23 with the commit above. I always see an
> > oops in ia64_sal_physical_id_info(). In 2.6.22, if ia64_pal_logical_to_phys
> > returns UNIMPLEMENTED, ia64_sal_physical_id_info() isn't called. The patch
> > below fixes the issue.
>
> I added a descriptive subject and copied the author of the change.
> He's been travelling for a month or so and might not be able to respond
> immediately.
Thanks Bjorn.
> > diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c
> > index 32ee597..6e0290b 100644
> > --- a/arch/ia64/kernel/smpboot.c
> > +++ b/arch/ia64/kernel/smpboot.c
> > @@ -878,13 +878,10 @@ identify_siblings(struct cpuinfo_ia64 *c)
> > printk(KERN_ERR
> > "ia64_pal_logical_to_phys failed with %ld\n",
> > status);
> > - return;
> > }
> > -
> > - info.overview_ppid = 0;
> > - info.overview_cpp = 1;
> > - info.overview_tpc = 1;
> > + return;
My original commit relied on fall-through behavior to still try
and call ia64_sal_physical_id_info(), because there are
cases/platforms where PAL_LOGICAL_TO_PHYSICAL is not implemented
but SAL_PHYSICAL_ID_INFO is.
I think the more interesting question is, why is that SAL call
hanging / oops'ing your machine rather than returning with an
error code?
In other words, why doesn't the error path work?
Thanks.
/ac
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:
From: Shaohua Li @ 2008-02-26 1:11 UTC (permalink / raw)
To: linux-ia64
On Mon, 2008-02-25 at 16:08 -0700, Alex Chiang wrote:
> * Bjorn Helgaas <bjorn.helgaas@hp.com>:
> > On Friday 22 February 2008 12:28:26 am Shaohua Li wrote:
> > > My tiger machine has hung since 2.6.23 with the commit above. I always see an
> > > oops in ia64_sal_physical_id_info(). In 2.6.22, if ia64_pal_logical_to_phys
> > > returns UNIMPLEMENTED, ia64_sal_physical_id_info() isn't called. The patch
> > > below fixes the issue.
> >
> > I added a descriptive subject and copied the author of the change.
> > He's been travelling for a month or so and might not be able to respond
> > immediately.
>
> Thanks Bjorn.
>
> > > diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c
> > > index 32ee597..6e0290b 100644
> > > --- a/arch/ia64/kernel/smpboot.c
> > > +++ b/arch/ia64/kernel/smpboot.c
> > > @@ -878,13 +878,10 @@ identify_siblings(struct cpuinfo_ia64 *c)
> > > printk(KERN_ERR
> > > "ia64_pal_logical_to_phys failed with %ld\n",
> > > status);
> > > - return;
> > > }
> > > -
> > > - info.overview_ppid = 0;
> > > - info.overview_cpp = 1;
> > > - info.overview_tpc = 1;
> > > + return;
>
> My original commit relied on fall-through behavior to still try
> and call ia64_sal_physical_id_info(), because there are
> cases/platforms where PAL_LOGICAL_TO_PHYSICAL is not implemented
> but SAL_PHYSICAL_ID_INFO is.
>
> I think the more interesting question is, why is that SAL call
> hanging / oops'ing your machine rather than returning with an
> error code?
>
> In other words, why doesn't the error path work?
Yes, this is strange. Other SAL calls are OK, so maybe it's a firmware bug
or something else I don't know about. I'm not familiar with this area; if
you need further info, let me know.
Thanks,
Shaohua
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:
From: Alex Chiang @ 2008-02-26 7:15 UTC (permalink / raw)
To: linux-ia64
* Shaohua Li <shaohua.li@intel.com>:
>
> On Mon, 2008-02-25 at 16:08 -0700, Alex Chiang wrote:
> >
> > My original commit relied on fall-through behavior to still
> > try and call ia64_sal_physical_id_info(), because there are
> > cases/platforms where PAL_LOGICAL_TO_PHYSICAL is not
> > implemented but SAL_PHYSICAL_ID_INFO is.
> >
> > I think the more interesting question is, why is that SAL
> > call hanging / oops'ing your machine rather than returning
> > with an error code?
> >
> > In other words, why doesn't the error path work?
>
> Yes, this is strange. Other SAL calls are OK, so maybe it's a
> firmware bug or something else I don't know about. I'm not familiar
> with this area; if you need further info, let me know.
Do you get anything useful (like the oops message) printed on the
console or in your syslog?
That might be a good first step.
Thanks.
/ac
* RE: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
From: Li, Shaohua @ 2008-02-26 9:24 UTC (permalink / raw)
To: linux-ia64
>-----Original Message-----
>From: Alex Chiang [mailto:achiang@hp.com]
>Sent: Tuesday, February 26, 2008 3:15 PM
>To: Li, Shaohua
>Cc: Bjorn Helgaas; Luck, Tony; ia64
>Subject: Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
>regression:113134fcbca83619be4c68d0ca66db6093777b5d)
>
>* Shaohua Li <shaohua.li@intel.com>:
>>
>> On Mon, 2008-02-25 at 16:08 -0700, Alex Chiang wrote:
>> >
>> > My original commit relied on fall-through behavior to still
>> > try and call ia64_sal_physical_id_info(), because there are
>> > cases/platforms where PAL_LOGICAL_TO_PHYSICAL is not
>> > implemented but SAL_PHYSICAL_ID_INFO is.
>> >
>> > I think the more interesting question is, why is that SAL
>> > call hanging / oops'ing your machine rather than returning
>> > with an error code?
>> >
>> > In other words, why doesn't the error path work?
>>
>> Yes, this is strange. Other SAL calls are OK, so maybe it's a
>> firmware bug or something else I don't know about. I'm not familiar
>> with this area; if you need further info, let me know.
>
>Do you get anything useful (like the oops message) printed on the
>console or in your syslog?
>
>That might be a good first step.
SAL: AP wakeup using external interrupt vector 0xf0
swapper[0]: IA-64 Illegal operation fault 0
Modules linked in:
Pid:0
CPU 0
comm: swapper
psr: 00001010084a2010 ifs:8000000000000818 ip:[<e00000017fe510f0>] Not
tainted(2.6.24)
ip is at 0xe00000017fe510f0
unat:0000000000000 pfs0000000000038f rsc0000000000003
rnat 0000000000000 bsps 0000000000000, pr 656960155aa6809b
ldrs:000000000 ccv 00000000000 fpsr 0009804c8a70433f
...........
I can't get the log over a serial console, so I copied it by hand and
there may be errors. There are a lot of other registers below; if you
need them, I'll copy them too.
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
From: Alex Chiang @ 2008-02-26 17:51 UTC (permalink / raw)
To: linux-ia64
* Li, Shaohua <shaohua.li@intel.com>:
> >
> >Do you get anything useful (like the oops message) printed on the
> >console or in your syslog?
> >
> >That might be a good first step.
> SAL: AP wakeup using external interrupt vector 0xf0
> swapper[0]: IA-64 Illegal operation fault 0
> Modules linked in:
> Pid:0
> CPU 0
> comm: swapper
> psr: 00001010084a2010 ifs:8000000000000818 ip:[<e00000017fe510f0>] Not
> tainted(2.6.24)
> ip is at 0xe00000017fe510f0
This looks like an identity-mapped firmware address. Can you send
the output of the EFI memmap command?
> unat:0000000000000 pfs0000000000038f rsc0000000000003
> rnat 0000000000000 bsps 0000000000000, pr 656960155aa6809b
> ldrs:000000000 ccv 00000000000 fpsr 0009804c8a70433f
> ...........
>
>
> I can't get the log over a serial console, so I copied it by hand and
> there may be errors. There are a lot of other registers below; if you
> need them, I'll copy them too.
How about taking a picture with a camera and sticking it
somewhere? :)
In the meantime, I'll try to reproduce this with the tiger we
have here.
Thanks.
/ac
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
From: Alex Chiang @ 2008-02-26 22:45 UTC (permalink / raw)
To: linux-ia64
* Li, Shaohua <shaohua.li@intel.com>:
>
> I can't get the log over a serial console, so I copied it by
> hand and there may be errors. There are a lot of other registers
> below; if you need them, I'll copy them too.
I was able to reproduce this on my tiger:
PAL Version 5.37
SAL Version 3.00
FPSWA Version 1.18
memmap output:
Type       Start            End              # Pages          Attributes
BS_data    0000000000000000-0000000000000FFF 0000000000000001 0000000000000009
available  0000000000001000-0000000000006FFF 0000000000000006 0000000000000009
BS_data    0000000000007000-0000000000008FFF 0000000000000002 0000000000000009
available  0000000000009000-0000000000081FFF 0000000000000079 0000000000000009
RT_data    0000000000082000-0000000000083FFF 0000000000000002 8000000000000009
available  0000000000084000-0000000000084FFF 0000000000000001 0000000000000009
BS_data    0000000000085000-000000000009FFFF 000000000000001B 0000000000000009
RT_code    00000000000C0000-00000000000FFFFF 0000000000000040 8000000000000009
available  0000000000100000-000000000FF7FFFF 000000000000FE80 000000000000000B
BS_data    000000000FF80000-000000000FFFFFFF 0000000000000080 000000000000000B
available  0000000010000000-000000007D8FFFFF 000000000006D900 000000000000000B
BS_code    000000007D900000-000000007F97FFFF 0000000000002080 000000000000000B
available  000000007F980000-000000007F9FFFFF 0000000000000080 000000000000000B
RT_code    000000007FA00000-000000007FDFFFFF 0000000000000400 8000000000000009
PAL_code   000000007FE00000-000000007FE3FFFF 0000000000000040 8000000000000009
RT_code    000000007FE40000-000000007FE95FFF 0000000000000056 8000000000000009
available  000000007FE96000-000000007FF27FFF 0000000000000092 000000000000000B
BS_data    000000007FF28000-000000007FF2FFFF 0000000000000008 000000000000000B
RT_data    000000007FF30000-000000007FFFFFFF 00000000000000D0 8000000000000009
MemMapIO   00000000FE000000-00000000FEFFFFFF 0000000000001000 0000000000000001
RT_data    00000000FF000000-00000000FFFFFFFF 0000000000001000 8000000000000001
Uncompressing Linux... done
Linux version 2.6.25-rc3-00081-g7704a8b (achiang@blender) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #213 SMP Tue Feb 26 12:59:43 MST 2008
EFI v1.10 by INTEL: SALsystab=0x7fe4c8c0 ACPI=0x7ff84000 ACPI 2.0=0x7ff83000 MPS=0x7ff82000 SMBIOS=0xf0000
booting generic kernel on platform dig
Early serial console at I/O port 0x2f8 (options '115200')
console [uart0] enabled
ACPI: RSDP 7FF83000, 0024 (r2 INTEL )
ACPI: XSDT 7FF83090, 0034 (r1 INTEL SR870BN4 1072002 MSFT 10013)
ACPI: FACP 7FF83138, 00F4 (r3 INTEL SR870BN4 1072002 MSFT 10013)
ACPI: DSDT 7FF85000, 6D62 (r1 Intel SR870BN4 0 MSFT 100000D)
ACPI: FACS 7FF83318, 0040
ACPI: APIC 7FF83230, 00E6 (r1 INTEL SR870BN4 1072002 MSFT 10013)
Entering add_active_range(0, 256, 32672) 0 entries of 51200 used
Entering add_active_range(0, 32746, 32755) 1 entries of 51200 used
Entering add_active_range(0, 65536, 147455) 2 entries of 51200 used
Entering add_active_range(0, 294912, 327649) 3 entries of 51200 used
Entering add_active_range(0, 327656, 327675) 4 entries of 51200 used
SAL 3.1: Intel Corp SR870BN4 version 3.0
SAL Platform features: BusLock IRQ_Redirection
SAL: AP wakeup using external interrupt vector 0xf0
swapper[0]: IA-64 Illegal operation fault 0 [1]
Pid: 0, CPU 0, comm: swapper
psr : 00001010084a2010 ifs : 8000000000000818 ip : [<e00000017fe50f00>] Not tainted (2.6.25-rc3-00081-g7704a8b)
ip is at 0xe00000017fe50f00
unat: 0000000000000000 pfs : 000000000000038f rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : 80000000afb580ab
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0930ffff00090000 ssd : 0930ffff00090000
b0 : a000000100cd7800 b6 : e00000017fe50f00 b7 : e00000007fe08010
f6 : 000000000000000000000 f7 : 1003e0a7c5ac471b47843
f8 : 1003e00000000000027ff f9 : 10004c000000000000000
f10 : 10004cbffffffff340000 f11 : 1003e0000000000000033
r1 : e00000008008c4c0 r2 : e00000017fe50f00 r3 : e00000007fe50f40
r8 : 0000000000000000 r9 : 0000000000000000 r10 : 0000000000000000
r11 : 0000000000000000 r12 : a00000010120f8f0 r13 : a000000101200000
r14 : a00000010120fa18 r15 : a00000010120fa00 r16 : 0000000000000020
r17 : a00000010120f938 r18 : a00000010120f939 r19 : a00000010120fa80
r20 : a00000010120f904 r21 : 0000000000000001 r22 : a00000010120f906
r23 : a00000010120f902 r24 : 0000000000000000 r25 : 000000000000000f
r26 : 0000000000000000 r27 : 00000010084a2010 r28 : 0000000000000000
r29 : a000000101465ab0 r30 : e00000007fe48020 r31 : a000000101436af0
kernel unaligned access to 0xffffffffffffffff, ip=0xa0000001001440b0
kernel unaligned access to 0xffffffffffffffff, ip=0xa0000001001440c1
swapper[0]: error during unaligned kernel access
-1 [2]
I looked through some SAL specs, and it turns out that
SAL_PHYSICAL_ID_INFO was introduced in v3.2, but this tiger
implements v3.1.
SAL *should* be returning -1 for unimplemented calls, but
something is going fantastically wrong here. Bjorn pointed out
that both r2 and b6 contain the IP. Maybe SAL isn't computing
branches correctly or something?
So what to do to work around a broken SAL? Seems like a chicken
and egg problem to me -- the only way to try and check if a call
is implemented or not is to call it, and calling it hangs the
machine... :(
Thoughts?
/ac
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
From: Matthew Wilcox @ 2008-02-26 23:07 UTC (permalink / raw)
To: linux-ia64
On Tue, Feb 26, 2008 at 03:45:40PM -0700, Alex Chiang wrote:
> I looked through some SAL specs, and it turns out that
> SAL_PHYSICAL_ID_INFO was introduced in v3.2, but this tiger
> implements v3.1.
>
> SAL *should* be returning -1 for unimplemented calls, but
> something is going fantastically wrong here. Bjorn pointed out
> that both r2 and b6 contain the IP. Maybe SAL isn't computing
> branches correctly or something?
>
> So what to do to work around a broken SAL? Seems like a chicken
> and egg problem to me -- the only way to try and check if a call
> is implemented or not is to call it, and calling it hangs the
> machine... :(
While you can check to see what SAL revision is supported, do be wary
of some prototype HP SAL implementations which report numbers in the
60-90 range. It's probably safe to say 'if sal revision < 3.2
answer = -1', but we were burned with extended PCI config space many
moons ago when we said 'if sal > 3.1 use new method'.
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
From: Russ Anderson @ 2008-02-26 23:46 UTC (permalink / raw)
To: linux-ia64
On Tue, Feb 26, 2008 at 03:45:40PM -0700, Alex Chiang wrote:
>
> I looked through some SAL specs, and it turns out that
> SAL_PHYSICAL_ID_INFO was introduced in v3.2, but this tiger
> implements v3.1.
>
> SAL *should* be returning -1 for unimplemented calls, but
> something is going fantastically wrong here. Bjorn pointed out
> that both r2 and b6 contain the IP. Maybe SAL isn't computing
> branches correctly or something?
>
> So what to do to work around a broken SAL? Seems like a chicken
> and egg problem to me -- the only way to try and check if a call
> is implemented or not is to call it, and calling it hangs the
> machine... :(
>
> Thoughts?
How about putting back some of the code that avoided the problem?
The previous code must have bailed out before getting to
ia64_sal_physical_id_info(). Did it print out an error message,
such as "No logical to physical processor mapping " or
"ia64_pal_logical_to_phys failed with"? What does ia64_pal_logical_to_phys()
return on a tiger box?
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
From: Alex Chiang @ 2008-02-26 23:50 UTC (permalink / raw)
To: linux-ia64
* Matthew Wilcox <matthew@wil.cx>:
>
> While you can check to see what SAL revision is supported, do
> be wary of some prototype HP SAL implementations which report
> numbers in the 60-90 range. It's probably safe to say 'if sal
> revision < 3.2 answer = -1', but we were burned with extended
> PCI config space many moons ago when we said 'if sal > 3.1 use
> new method'.
Ah, good idea, Willy.
I believe that the current implementation of check_versions()
does the proper workaround for this bug?
        /* Check for broken firmware */
        if ((sal_revision == SAL_VERSION_CODE(49, 29))
            && (sal_version == SAL_VERSION_CODE(49, 29)))
        {
                /*
                 * Old firmware for zx2000 prototypes have this weird version number,
                 * reset it to something sane.
                 */
                sal_revision = SAL_VERSION_CODE(2, 8);
                sal_version = SAL_VERSION_CODE(0, 0);
        }
In light of that, how about this patch? It allows my Tiger to
boot.
/ac
From: Alex Chiang <achiang@hp.com>
Subject: [PATCH] ia64: ia64_sal_physical_id_info work around broken SAL
Unimplemented SAL calls should return -1, but on at least one
platform (Tiger with SAL v3.1), attempting to call SAL_PHYSICAL_ID_INFO
(which was defined in SAL v3.2 and later) results in an oops and
a hang.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
index 2251118..f4904db 100644
--- a/include/asm-ia64/sal.h
+++ b/include/asm-ia64/sal.h
@@ -807,6 +807,10 @@ static inline s64
ia64_sal_physical_id_info(u16 *splid)
{
struct ia64_sal_retval isrv;
+
+ if (sal_revision < SAL_VERSION_CODE(3,2))
+ return -1;
+
SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
if (splid)
*splid = isrv.v0;
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
From: Matthew Wilcox @ 2008-02-27 0:00 UTC (permalink / raw)
To: linux-ia64
On Tue, Feb 26, 2008 at 04:50:48PM -0700, Alex Chiang wrote:
> I believe that the current implementation of check_versions()
> does the proper workaround for this bug?
>
> /* Check for broken firmware */
> if ((sal_revision == SAL_VERSION_CODE(49, 29))
> && (sal_version == SAL_VERSION_CODE(49, 29)))
> {
Oh, there were many other broken versions available ;-) I believe this
particular workaround was for a machine David Mosberger had that
couldn't be upgraded.
> In light of that, how about this patch? It allows my Tiger to
> boot.
>
> /ac
>
> From: Alex Chiang <achiang@hp.com>
> Subject: [PATCH] ia64: ia64_sal_physical_id_info work around broken SAL
>
> Unimplemented SAL calls should return -1, but on at least one
> platform (Tiger with SAL v3.1), attempting to call SAL_PHYSICAL_ID_INFO
> (which was defined in SAL v3.2 and later) results in an oops and
> a hang.
>
> Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Matthew Wilcox <willy@linux.intel.com>
> ---
> diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
> index 2251118..f4904db 100644
> --- a/include/asm-ia64/sal.h
> +++ b/include/asm-ia64/sal.h
> @@ -807,6 +807,10 @@ static inline s64
> ia64_sal_physical_id_info(u16 *splid)
> {
> struct ia64_sal_retval isrv;
> +
> + if (sal_revision < SAL_VERSION_CODE(3,2))
> + return -1;
> +
> SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
> if (splid)
> *splid = isrv.v0;
>
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
From: Alex Chiang @ 2008-02-27 0:10 UTC (permalink / raw)
To: linux-ia64
* Russ Anderson <rja@sgi.com>:
>
> How about putting back some of the code that avoided the problem?
>
> The previous code must have bailed out before getting to
> ia64_sal_physical_id_info().
Yes, the previous code actually did this:
-       if (smp_num_cpucores == 1 && smp_num_siblings == 1)
-               return;
-
        if ((status = ia64_pal_logical_to_phys(-1, &info)) != PAL_STATUS_SUCCESS) {
-               printk(KERN_ERR "ia64_pal_logical_to_phys failed with %ld\n",
-                      status);
-               return;
So it never called ia64_pal_logical_to_phys, nor did it call
ia64_sal_physical_id_info.
My patch changed the logic so that we would at least try to call
both to extract what useful information we could (because various
HP platforms implement either one, both, or neither calls).
> Did it print out an error message, such as "No logical to
> physical processor mapping " or "ia64_pal_logical_to_phys
> failed with"? What does ia64_pal_logical_to_phys() return on
> a tiger box?
On a Tiger, we didn't see any printks because we bailed before
even making the PAL call. But if it *did* make the PAL call, we
would have seen that printk above.
My earlier patch (that caused a regression) changed that code
path to:
- always make the PAL call
- if return value was not success *and* something other
than "not implemented" then print the error and return
- else, if the PAL call was merely unimplemented, then
make the SAL call to try and get at least something
useful
- if the SAL call was unsuccessful as well (where
unsuccessful *includes* unimplemented condition) then
bail
- finally, combine what we could successfully figure out
and stash it away for later so when a user does a cat
/proc/cpuinfo, at best they'll get something more
useful than before, and at worst, there will be no
change from prior behavior
I think that was a pretty reasonable approach, but I admit it was
based on an assumption that an unimplemented SAL call would
return with -1 rather than doing something nasty like hang the
box.
I think that the Tiger firmware is actually buggy and should be
returning -1 rather than doing the Bad Thing(tm).
The patch I just sent out a bit ago should be a reasonable
workaround.
Thanks.
/ac
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
From: Shaohua Li @ 2008-02-27 0:15 UTC (permalink / raw)
To: linux-ia64
On Tue, 2008-02-26 at 16:50 -0700, Alex Chiang wrote:
> * Matthew Wilcox <matthew@wil.cx>:
> >
> > While you can check to see what SAL revision is supported, do
> > be wary of some prototype HP SAL implementations which report
> > numbers in the 60-90 range. It's probably safe to say 'if sal
> > revision < 3.2 answer = -1', but we were burned with extended
> > PCI config space many moons ago when we said 'if sal > 3.1 use
> > new method'.
>
> Ah, good idea, Willy.
>
> I believe that the current implementation of check_versions()
> does the proper workaround for this bug?
>
>         /* Check for broken firmware */
>         if ((sal_revision == SAL_VERSION_CODE(49, 29))
>             && (sal_version == SAL_VERSION_CODE(49, 29)))
>         {
>                 /*
>                  * Old firmware for zx2000 prototypes have this weird version number,
>                  * reset it to something sane.
>                  */
>                 sal_revision = SAL_VERSION_CODE(2, 8);
>                 sal_version = SAL_VERSION_CODE(0, 0);
>         }
>
> In light of that, how about this patch? It allows my Tiger to
> boot.
>
> /ac
>
> From: Alex Chiang <achiang@hp.com>
> Subject: [PATCH] ia64: ia64_sal_physical_id_info work around broken SAL
>
> Unimplemented SAL calls should return -1, but on at least one
> platform (Tiger with SAL v3.1), attempting to call SAL_PHYSICAL_ID_INFO
> (which was defined in SAL v3.2 and later) results in an oops and
> a hang.
>
> Signed-off-by: Alex Chiang <achiang@hp.com>
> ---
> diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
> index 2251118..f4904db 100644
> --- a/include/asm-ia64/sal.h
> +++ b/include/asm-ia64/sal.h
> @@ -807,6 +807,10 @@ static inline s64
> ia64_sal_physical_id_info(u16 *splid)
> {
> struct ia64_sal_retval isrv;
> +
> + if (sal_revision < SAL_VERSION_CODE(3,2))
> + return -1;
> +
> SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
> if (splid)
> *splid = isrv.v0;
This works for me too.
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
From: Russ Anderson @ 2008-02-27 0:23 UTC (permalink / raw)
To: linux-ia64
On Tue, Feb 26, 2008 at 04:50:48PM -0700, Alex Chiang wrote:
> From: Alex Chiang <achiang@hp.com>
> Subject: [PATCH] ia64: ia64_sal_physical_id_info work around broken SAL
>
> Unimplemented SAL calls should return -1, but on at least one
> platform (Tiger with SAL v3.1), attempting to call SAL_PHYSICAL_ID_INFO
> (which was defined in SAL v3.2 and later) results in an oops and
> a hang.
>
> Signed-off-by: Alex Chiang <achiang@hp.com>
> ---
> diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
> index 2251118..f4904db 100644
> --- a/include/asm-ia64/sal.h
> +++ b/include/asm-ia64/sal.h
> @@ -807,6 +807,10 @@ static inline s64
> ia64_sal_physical_id_info(u16 *splid)
> {
> struct ia64_sal_retval isrv;
> +
> + if (sal_revision < SAL_VERSION_CODE(3,2))
> + return -1;
> +
> SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
> if (splid)
> *splid = isrv.v0;
That causes ia64_sal_physical_id_info() to fail on my Altix. :-(
------------------------------------------------------------------
Shell> fs1:efi\suse\elilo net0:rja/vmlinux.rja.2624 root=/dev/sda8 console=ttySG0 kdb=on
ELILO
001c01 DEBUG: extInt.c line 501 ; PS [EINT4] interrupts enabled
001c01 DEBUG: extInt.c line 490 ; PS [EINT4] interrupts disabled
Uncompressing Linux... done
Initializing cgroup subsys cpuset
Linux version 2.6.24 (rja@attica) (gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)) #39 SMP Tue Feb 26 18:08:59 CST 2008
EFI v1.10 by INTEL: SALsystab=0x6002c25290 ACPI 2.0=0x6002c25b10
console [sn_sal0] enabled
ACPI: RSDP 6002C25B10, 0024 (r2 SGI)
ACPI: XSDT 6002C29270, 0044 (r1 SGI XSDTSN2 10001 5C)
ACPI: APIC 6002C25BB0, 00D4 (r1 SGI APICSN2 10001 1)
ACPI: SRAT 6002C25CA0, 0200 (r1 SGI SRATSN2 10001 1)
ACPI: SLIT 6002C25EB0, 0050 (r1 SGI SLITSN2 10001 1)
ACPI: FACP 6002C25F20, 00F4 (r3 SGI FACPSN2 30001 1)
ACPI: DSDT 6002C28D90, 0024 (r2 SGI DSDTSN2 20001 4C4)
ACPI: FACS 6002C25380, 0040
Number of logical nodes in system = 6
Number of memory chunks in system = 6
SAL 2.9: SGI SN2 version 1.30
SAL Platform features: ITC_Drift
SAL: AP wakeup using external interrupt vector 0x12
ia64_sal_pltid failed with -1
ACPI: Local APIC address c0000000fee00000
register_intr: No IOSAPIC for GSI 52
14 CPUs available, 14 CPUs total
MCA related initialization done
ACPI: RSDP 6002C25B10, 0024 (r2 SGI)
ACPI: XSDT 6002C29270, 005C (r1 SGI XSDTSN2 10001 5C)
ACPI: APIC 6002C25BB0, 00D4 (r1 SGI APICSN2 10001 1)
ACPI: SRAT 6002C25CA0, 0200 (r1 SGI SRATSN2 10001 1)
ACPI: SLIT 6002C25EB0, 0050 (r1 SGI SLITSN2 10001 1)
ACPI: FACP 6002C25F20, 00F4 (r3 SGI FACPSN2 30001 1)
ACPI: DSDT 6002C28D90, 04C4 (r2 SGI DSDTSN2 20101 4C4)
ACPI: FACS 6002C25380, 0040
ACPI: SSDT 6002C27FF0, 0095 (r2 SGI SSDTSN2 20101 95)
ACPI: SSDT 6002C28100, 00F5 (r2 SGI SSDTSN2 20101 F5)
ACPI: SSDT 6002C28450, 024B (r2 SGI SSDTSN2 20101 24B)
SGI SAL version 1.30
Virtual mem_map starts at 0xa07ffffed2c80000
Zone PFN ranges:
Normal 6292224 -> 90241024
Movable zone start PFN for each node
early_node_map[11] active PFN ranges
0: 6292224 -> 6323200
0: 6815744 -> 6847488
0: 7340032 -> 7371776
1: 23069440 -> 23132160
2: 39846656 -> 39972863
3: 56623872 -> 56686592
4: 73401088 -> 73432064
5: 90178304 -> 90240639
5: 90240896 -> 90240981
5: 90240986 -> 90240993
5: 90241000 -> 90241006
Built 6 zonelists in Node order, mobility grouping on. Total pages: 438306
Policy zone: Normal
Kernel command line: BOOT_IMAGE=net0:rja/vmlinux.rja.2624 ro root=/dev/sda8 console=ttySG0 kdb=on
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour dummy device 80x25
console [ttySG0] enabled
Memory: 27886912k/28111104k available (8301k code, 242368k reserved, 5967k data, 1792k init)
SLUB: Genslabs=16, HWalign=128, Order=0-2, MinObjects=8, CPUs=14, Nodes=1024
Dentry cache hash table entries: 4194304 (order: 9, 33554432 bytes)
Inode-cache hash table entries: 2097152 (order: 8, 16777216 bytes)
Mount-cache hash table entries: 4096
ACPI: Core revision 20070126
Boot processor id 0x0/0x0
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
ia64_sal_pltid failed with -1
Brought up 14 CPUs
Total of 14 processors activated (44793.85 BogoMIPS).
------------------------------------------------------------------------------
saturn1-10:~ # cat /proc/cpuinfo
processor : 0
vendor : GenuineIntel
arch : IA-64
family : 32
model : 1
model name : Dual-Core Intel(R) Itanium(R) Processor 9150M
revision : 0
archrev : 0
features : branchlong, 16-byte atomic ops
cpu number : 0
cpu regs : 4
cpu MHz : 1669.503
itc MHz : 416.875000
BogoMIPS : 3325.95
siblings : 1
processor : 1
vendor : GenuineIntel
arch : IA-64
family : 32
model : 1
model name : Dual-Core Intel(R) Itanium(R) Processor 9150M
revision : 0
archrev : 0
features : branchlong, 16-byte atomic ops
cpu number : 0
cpu regs : 4
cpu MHz : 1669.503
itc MHz : 416.875000
BogoMIPS : 3325.95
siblings : 1
------------------------------------------------------------------------------
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (12 preceding siblings ...)
2008-02-27 0:23 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093 Russ Anderson
@ 2008-02-27 0:34 ` Alex Chiang
2008-02-27 1:05 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093 Russ Anderson
` (11 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Alex Chiang @ 2008-02-27 0:34 UTC (permalink / raw)
To: linux-ia64
* Russ Anderson <rja@sgi.com>:
>
> That causes ia64_sal_physical_id_info() to fail on my Altix. :-(
Did it work before?
> ------------------------------------------------------------------
> Shell> fs1:efi\suse\elilo net0:rja/vmlinux.rja.2624 root=/dev/sda8 console=ttySG0 kdb=on
> ELILO
> 001c01 DEBUG: extInt.c line 501 ; PS [EINT4] interrupts enabled
> 001c01 DEBUG: extInt.c line 490 ; PS [EINT4] interrupts disabled
> Uncompressing Linux... done
> Initializing cgroup subsys cpuset
> Linux version 2.6.24 (rja@attica) (gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)) #39 SMP Tue Feb 26 18:08:59 CST 2008
> EFI v1.10 by INTEL: SALsystab=0x6002c25290 ACPI 2.0=0x6002c25b10
> console [sn_sal0] enabled
> ACPI: RSDP 6002C25B10, 0024 (r2 SGI)
> ACPI: XSDT 6002C29270, 0044 (r1 SGI XSDTSN2 10001 5C)
> ACPI: APIC 6002C25BB0, 00D4 (r1 SGI APICSN2 10001 1)
> ACPI: SRAT 6002C25CA0, 0200 (r1 SGI SRATSN2 10001 1)
> ACPI: SLIT 6002C25EB0, 0050 (r1 SGI SLITSN2 10001 1)
> ACPI: FACP 6002C25F20, 00F4 (r3 SGI FACPSN2 30001 1)
> ACPI: DSDT 6002C28D90, 0024 (r2 SGI DSDTSN2 20001 4C4)
> ACPI: FACS 6002C25380, 0040
> Number of logical nodes in system = 6
> Number of memory chunks in system = 6
> SAL 2.9: SGI SN2 version 1.30
I wouldn't expect SAL 2.9 to implement a call defined in SAL 3.2,
unless I'm seriously misunderstanding something?
> SAL Platform features: ITC_Drift
> SAL: AP wakeup using external interrupt vector 0x12
> ia64_sal_pltid failed with -1
Annoying noise, I agree with you.
> ACPI: Local APIC address c0000000fee00000
> register_intr: No IOSAPIC for GSI 52
> 14 CPUs available, 14 CPUs total
> MCA related initialization done
> ACPI: RSDP 6002C25B10, 0024 (r2 SGI)
> ACPI: XSDT 6002C29270, 005C (r1 SGI XSDTSN2 10001 5C)
> ACPI: APIC 6002C25BB0, 00D4 (r1 SGI APICSN2 10001 1)
> ACPI: SRAT 6002C25CA0, 0200 (r1 SGI SRATSN2 10001 1)
> ACPI: SLIT 6002C25EB0, 0050 (r1 SGI SLITSN2 10001 1)
> ACPI: FACP 6002C25F20, 00F4 (r3 SGI FACPSN2 30001 1)
> ACPI: DSDT 6002C28D90, 04C4 (r2 SGI DSDTSN2 20101 4C4)
> ACPI: FACS 6002C25380, 0040
> ACPI: SSDT 6002C27FF0, 0095 (r2 SGI SSDTSN2 20101 95)
> ACPI: SSDT 6002C28100, 00F5 (r2 SGI SSDTSN2 20101 F5)
> ACPI: SSDT 6002C28450, 024B (r2 SGI SSDTSN2 20101 24B)
> SGI SAL version 1.30
> Virtual mem_map starts at 0xa07ffffed2c80000
> Zone PFN ranges:
> Normal 6292224 -> 90241024
> Movable zone start PFN for each node
> early_node_map[11] active PFN ranges
> 0: 6292224 -> 6323200
> 0: 6815744 -> 6847488
> 0: 7340032 -> 7371776
> 1: 23069440 -> 23132160
> 2: 39846656 -> 39972863
> 3: 56623872 -> 56686592
> 4: 73401088 -> 73432064
> 5: 90178304 -> 90240639
> 5: 90240896 -> 90240981
> 5: 90240986 -> 90240993
> 5: 90241000 -> 90241006
> Built 6 zonelists in Node order, mobility grouping on. Total pages: 438306
> Policy zone: Normal
> Kernel command line: BOOT_IMAGE=net0:rja/vmlinux.rja.2624 ro root=/dev/sda8 console=ttySG0 kdb=on
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Console: colour dummy device 80x25
> console [ttySG0] enabled
> Memory: 27886912k/28111104k available (8301k code, 242368k reserved, 5967k data, 1792k init)
> SLUB: Genslabs=16, HWalign=128, Order=0-2, MinObjects=8, CPUs=14, Nodes=1024
> Dentry cache hash table entries: 4194304 (order: 9, 33554432 bytes)
> Inode-cache hash table entries: 2097152 (order: 8, 16777216 bytes)
> Mount-cache hash table entries: 4096
> ACPI: Core revision 20070126
> Boot processor id 0x0/0x0
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
> ia64_sal_pltid failed with -1
More annoying noise; I thought about removing or changing that
printk the first time around to something like KERN_INFO since
failure of that particular SAL call doesn't really affect any
functionality.
Opinions?
> Brought up 14 CPUs
> Total of 14 processors activated (44793.85 BogoMIPS).
> ------------------------------------------------------------------------------
>
>
> saturn1-10:~ # cat /proc/cpuinfo
> processor : 0
> vendor : GenuineIntel
> arch : IA-64
> family : 32
> model : 1
> model name : Dual-Core Intel(R) Itanium(R) Processor 9150M
This looks like a Montvale? That means we *should* be getting
meaningful "physical id" information from /proc/cpuinfo. :(
What values were you getting before my workaround above?
And again, the more interesting question is, why is your SAL
reporting a revision of 2.9?
Thanks.
/ac
> revision : 0
> archrev : 0
> features : branchlong, 16-byte atomic ops
> cpu number : 0
> cpu regs : 4
> cpu MHz : 1669.503
> itc MHz : 416.875000
> BogoMIPS : 3325.95
> siblings : 1
>
> processor : 1
> vendor : GenuineIntel
> arch : IA-64
> family : 32
> model : 1
> model name : Dual-Core Intel(R) Itanium(R) Processor 9150M
> revision : 0
> archrev : 0
> features : branchlong, 16-byte atomic ops
> cpu number : 0
> cpu regs : 4
> cpu MHz : 1669.503
> itc MHz : 416.875000
> BogoMIPS : 3325.95
> siblings : 1
> ------------------------------------------------------------------------------
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (13 preceding siblings ...)
2008-02-27 0:34 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] Alex Chiang
@ 2008-02-27 1:05 ` Russ Anderson
2008-02-27 14:38 ` Luck, Tony
` (10 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Russ Anderson @ 2008-02-27 1:05 UTC (permalink / raw)
To: linux-ia64
On Tue, Feb 26, 2008 at 05:34:19PM -0700, Alex Chiang wrote:
> * Russ Anderson <rja@sgi.com>:
> >
> > That causes ia64_sal_physical_id_info() to fail on my Altix. :-(
>
> Did it work before?
Yes.
> > ------------------------------------------------------------------
> > Shell> fs1:efi\suse\elilo net0:rja/vmlinux.rja.2624 root=/dev/sda8 console=ttySG0 kdb=on
> > ELILO
> > Uncompressing Linux... done
> > Initializing cgroup subsys cpuset
> > Linux version 2.6.24 (rja@attica) (gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)) #39 SMP Tue Feb 26 18:08:59 CST 2008
> > EFI v1.10 by INTEL: SALsystab=0x6002c25290 ACPI 2.0=0x6002c25b10
> > console [sn_sal0] enabled
> > ACPI: RSDP 6002C25B10, 0024 (r2 SGI)
> > ACPI: XSDT 6002C29270, 0044 (r1 SGI XSDTSN2 10001 5C)
> > ACPI: APIC 6002C25BB0, 00D4 (r1 SGI APICSN2 10001 1)
> > ACPI: SRAT 6002C25CA0, 0200 (r1 SGI SRATSN2 10001 1)
> > ACPI: SLIT 6002C25EB0, 0050 (r1 SGI SLITSN2 10001 1)
> > ACPI: FACP 6002C25F20, 00F4 (r3 SGI FACPSN2 30001 1)
> > ACPI: DSDT 6002C28D90, 0024 (r2 SGI DSDTSN2 20001 4C4)
> > ACPI: FACS 6002C25380, 0040
> > Number of logical nodes in system = 6
> > Number of memory chunks in system = 6
> > SAL 2.9: SGI SN2 version 1.30
>
> I wouldn't expect SAL 2.9 to implement a call defined in SAL 3.2,
> unless I'm seriously misunderstanding something?
I will ask one of our SAL developers where the 2.9 comes from.
The 1.30 number is our SAL (prom) version. It started at 1.00 with
SGI Altix 4700.
> > SAL Platform features: ITC_Drift
> > SAL: AP wakeup using external interrupt vector 0x12
> > ia64_sal_pltid failed with -1
>
> Annoying noise, I agree with you.
More than annoying. An indication of a problem.
> > ACPI: Local APIC address c0000000fee00000
> > register_intr: No IOSAPIC for GSI 52
> > 14 CPUs available, 14 CPUs total
>
> More annoying noise; I thought about removing or changing that
> printk the first time around to something like KERN_INFO since
> failure of that particular SAL call doesn't really affect any
> functionality.
>
> Opinions?
>
> > Brought up 14 CPUs
> > Total of 14 processors activated (44793.85 BogoMIPS).
> > ------------------------------------------------------------------------------
> >
> >
> > saturn1-10:~ # cat /proc/cpuinfo
> > processor : 0
> > vendor : GenuineIntel
> > arch : IA-64
> > family : 32
> > model : 1
> > model name : Dual-Core Intel(R) Itanium(R) Processor 9150M
>
> This looks like a Montvale? That means we *should* be getting
> meaningful "physical id" information from /proc/cpuinfo. :(
Correct.
> What values were you getting before my workaround above?
See below.
> And again, the more interesting question is, why is your SAL
> reporting a revision of 2.9?
I'll find out.
> Thanks.
>
> /ac
>
> > revision : 0
> > archrev : 0
> > features : branchlong, 16-byte atomic ops
> > cpu number : 0
> > cpu regs : 4
> > cpu MHz : 1669.503
> > itc MHz : 416.875000
> > BogoMIPS : 3325.95
> > siblings : 1
-----------------------------------------------------------
Without the change:
saturn1-10:~ # cat /proc/cpuinfo
processor : 0
vendor : GenuineIntel
arch : IA-64
family : 32
model : 1
model name : Dual-Core Intel(R) Itanium(R) Processor 9150M
revision : 0
archrev : 0
features : branchlong, 16-byte atomic ops
cpu number : 0
cpu regs : 4
cpu MHz : 1669.000503
itc MHz : 416.875000
BogoMIPS : 3325.95
siblings : 2
physical id: 0
core id : 0
thread id : 0
processor : 1
vendor : GenuineIntel
arch : IA-64
family : 32
model : 1
model name : Dual-Core Intel(R) Itanium(R) Processor 9150M
revision : 0
archrev : 0
features : branchlong, 16-byte atomic ops
cpu number : 0
cpu regs : 4
cpu MHz : 1669.000503
itc MHz : 416.875000
BogoMIPS : 3325.95
siblings : 2
physical id: 0
core id : 1
thread id : 0
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* RE: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (14 preceding siblings ...)
2008-02-27 1:05 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093 Russ Anderson
@ 2008-02-27 14:38 ` Luck, Tony
2008-02-27 15:19 ` Russ Anderson
` (9 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Luck, Tony @ 2008-02-27 14:38 UTC (permalink / raw)
To: linux-ia64
> > Did it work before?
>
> Yes.
How about a more drastic approach to avoiding the
problem ... avoid any poking around looking for
siblings on pre-montecito processors?
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index ebd1a09..7b0e396 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -736,7 +736,8 @@ identify_cpu (struct cpuinfo_ia64 *c)
c->threads_per_core = c->cores_per_socket = c->num_log = 1;
c->socket_id = -1;
- identify_siblings(c);
+ if (cpuid.field.family > 0x1f)
+ identify_siblings(c);
if (c->threads_per_core > smp_num_siblings)
smp_num_siblings = c->threads_per_core;
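
The family test above can be tried out in user space. This is only a
sketch: should_identify_siblings() is a hypothetical name, and it
assumes, as the patch does, that every family up to 0x1f has no
siblings to find. The Montvale log earlier in this thread reports
family 32 (0x20), which passes the test.

```c
/*
 * Hypothetical user-space stand-in for the gate in identify_cpu().
 * Assumption: families <= 0x1f are pre-Montecito parts.
 */
static int should_identify_siblings(unsigned int family)
{
	return family > 0x1f;
}
```

A gate like this skips the SAL call on every older CPU, whether or not
the platform firmware could have answered it.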
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (15 preceding siblings ...)
2008-02-27 14:38 ` Luck, Tony
@ 2008-02-27 15:19 ` Russ Anderson
2008-02-27 16:50 ` Russ Anderson
` (8 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Russ Anderson @ 2008-02-27 15:19 UTC (permalink / raw)
To: linux-ia64
On Wed, Feb 27, 2008 at 06:38:04AM -0800, Luck, Tony wrote:
> > > Did it work before?
> >
> > Yes.
>
> How about a more drastic approach to avoiding the
> problem ... avoid any poking around looking for
> siblings on pre-montecito processors?
I like that idea. No need to look for siblings
on CPUs that do not support siblings.
> diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
> index ebd1a09..7b0e396 100644
> --- a/arch/ia64/kernel/setup.c
> +++ b/arch/ia64/kernel/setup.c
> @@ -736,7 +736,8 @@ identify_cpu (struct cpuinfo_ia64 *c)
> c->threads_per_core = c->cores_per_socket = c->num_log = 1;
> c->socket_id = -1;
>
> - identify_siblings(c);
> + if (cpuid.field.family > 0x1f)
> + identify_siblings(c);
>
> if (c->threads_per_core > smp_num_siblings)
> smp_num_siblings = c->threads_per_core;
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (16 preceding siblings ...)
2008-02-27 15:19 ` Russ Anderson
@ 2008-02-27 16:50 ` Russ Anderson
2008-02-27 23:43 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] Alex Chiang
` (7 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Russ Anderson @ 2008-02-27 16:50 UTC (permalink / raw)
To: linux-ia64
On Tue, Feb 26, 2008 at 05:34:19PM -0700, Alex Chiang wrote:
> > SAL 2.9: SGI SN2 version 1.30
>
> I wouldn't expect SAL 2.9 to implement a call defined in SAL 3.2,
> unless I'm seriously misunderstanding something?
The answer is the 2.9 value is hardcoded in the Altix prom and
was not updated to 3.2 even though the prom supports SAL 3.2.
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (17 preceding siblings ...)
2008-02-27 16:50 ` Russ Anderson
@ 2008-02-27 23:43 ` Alex Chiang
2008-02-28 0:12 ` Alex Chiang
` (6 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Alex Chiang @ 2008-02-27 23:43 UTC (permalink / raw)
To: linux-ia64
* Luck, Tony <tony.luck@intel.com>:
>
> How about a more drastic approach to avoiding the
> problem ... avoid any poking around looking for
> siblings on pre-montecito processors?
That's a good idea, but there are a few issues...
Primarily, SAL_PHYSICAL_ID_INFO returns useful information on HP
pre-montecito platforms. And as we're learning, there isn't a
nice uniform way of figuring out when we should be making this
call and when we shouldn't because:
(a) Tiger firmware should be returning -1 for an
unimplemented SAL call, but is hanging instead
(b) SGI Altix has hard-coded the incorrect SAL version in
their firmware, so we can't do a simple version check.
In the ideal world, we would just fix (a) to return -1.
In a less ideal world, I would beg SGI fw guys to set their SAL
revision id to 3.2, since that's what they implement.
In reality, I think we have to do some hacky stuff to work around
these firmware issues so that all parties are happy, where happy
is defined as:
(a) no longer hang Tiger boxes
(b) SGI machines have proper entries in /proc/cpuinfo
(c) legacy HP machines have useful information in
/proc/cpuinfo
So here are the two ideas I had. First, a slight modification to
my earlier patch, where we now check for SGI machines too:
> ia64_sal_physical_id_info(u16 *splid)
> {
> struct ia64_sal_retval isrv;
> +
> + if (!ia64_platform_is("sn2") && (sal_revision < SAL_VERSION_CODE(3,2)))
> + return -1;
> +
> SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
> if (splid)
> *splid = isrv.v0;
As a slight optimization, I could probably do that check in
ia64_sal_init() and save the results for later for
ia64_sal_physical_id_info to look at.
Second idea is a bit more involved and follows afterwards.
And if anyone has other suggestions too, I'm happy to hear them.
Thanks.
/ac
From: Alex Chiang <achiang@hp.com>
Subject: [PATCH] ia64: workaround tiger hang in ia64_sal_get_physical_id_info
Intel Tiger systems hang if SAL_PHYSICAL_ID_INFO is called
instead of returning -1 like they should.
We can't just check the SAL revision number and avoid this call
if sal_revision < 3.2, because SGI Altix systems have hard-coded
their revision number to 2.9, even though they really implement 3.2.
So look in the XSDT to avoid making the call on Tiger platforms.
Create an interface exposing the XSDT to do so.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index 78f28d8..e5b9cc0 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -69,8 +69,7 @@ unsigned int acpi_cpei_phys_cpuid;
unsigned long acpi_wakeup_address = 0;
-#ifdef CONFIG_IA64_GENERIC
-static unsigned long __init acpi_find_rsdp(void)
+static unsigned long acpi_find_rsdp(void)
{
unsigned long rsdp_phys = 0;
@@ -81,7 +80,30 @@ static unsigned long __init acpi_find_rsdp(void)
"v1.0/r0.71 tables no longer supported\n");
return rsdp_phys;
}
-#endif
+
+struct acpi_table_xsdt *
+acpi_find_xsdt(void)
+{
+ unsigned long rsdp_phys;
+ struct acpi_table_rsdp *rsdp;
+ struct acpi_table_xsdt *xsdt;
+ struct acpi_table_header *hdr;
+
+ rsdp_phys = acpi_find_rsdp();
+ if (!rsdp_phys)
+ return NULL;
+
+ rsdp = (struct acpi_table_rsdp *)__va(rsdp_phys);
+ if (strncmp(rsdp->signature, ACPI_SIG_RSDP, sizeof(ACPI_SIG_RSDP) - 1))
+ return NULL;
+
+ xsdt = (struct acpi_table_xsdt *)__va(rsdp->xsdt_physical_address);
+ hdr = &xsdt->header;
+ if (strncmp(hdr->signature, ACPI_SIG_XSDT, sizeof(ACPI_SIG_XSDT) - 1))
+ return NULL;
+
+ return xsdt;
+}
const char __init *
acpi_get_sysname(void)
diff --git a/arch/ia64/kernel/sal.c b/arch/ia64/kernel/sal.c
index f44fe84..a75de51 100644
--- a/arch/ia64/kernel/sal.c
+++ b/arch/ia64/kernel/sal.c
@@ -286,6 +286,54 @@ ia64_sal_cache_flush (u64 cache_type)
}
EXPORT_SYMBOL_GPL(ia64_sal_cache_flush);
+/*
+ * Intel Tiger systems implement SAL revision 3.1, which does not
+ * define SAL_PHYSICAL_ID_INFO. If this call is made on those platforms,
+ * they *should* return -1 to indicate the call is unimplemented, but
+ * instead, they hang.
+ *
+ * It might be easy to simply check the SAL revision number and avoid
+ * this call if sal_revision < 3.2, but SGI Altix systems have hard-coded
+ * their revision number to 2.9, even though they really implement 3.2.
+ *
+ * So we have to grovel in ACPI's XSDT to try and detect Tiger systems
+ * and avoid making this SAL call.
+ */
+#include <linux/acpi.h>
+static int
+is_intel_tiger(void)
+{
+ struct acpi_table_xsdt *xsdt;
+ struct acpi_table_header *hdr;
+
+ xsdt = acpi_find_xsdt();
+ if (!xsdt)
+ return 0;
+
+ hdr = &xsdt->header;
+ if (strncmp(hdr->oem_id, "INTEL", 5) ||
+ (strncmp(hdr->oem_table_id, "SR870BH2", 8) &&
+ strncmp(hdr->oem_table_id, "SR870BN4", 8)))
+ return 0;
+
+ return 1;
+}
+
+s64
+ia64_sal_physical_id_info(u16 *splid)
+{
+ struct ia64_sal_retval isrv;
+
+ if (is_intel_tiger())
+ return -1;
+
+ SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
+ if (splid)
+ *splid = isrv.v0;
+ return isrv.status;
+}
+EXPORT_SYMBOL_GPL(ia64_sal_physical_id_info);
+
void __init
ia64_sal_init (struct ia64_sal_systab *systab)
{
diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
index 2251118..d30100d 100644
--- a/include/asm-ia64/sal.h
+++ b/include/asm-ia64/sal.h
@@ -652,6 +652,8 @@ typedef struct err_rec {
extern s64 ia64_sal_cache_flush (u64 cache_type);
extern void __init check_sal_cache_flush (void);
+/* Get physical processor die mapping in the platform. */
+extern s64 ia64_sal_physical_id_info(u16 *splid);
/* Initialize all the processor and platform level instruction and data caches */
static inline s64
@@ -802,17 +804,6 @@ ia64_sal_update_pal (u64 param_buf, u64 scratch_buf, u64 scratch_buf_size,
return isrv.status;
}
-/* Get physical processor die mapping in the platform. */
-static inline s64
-ia64_sal_physical_id_info(u16 *splid)
-{
- struct ia64_sal_retval isrv;
- SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
- if (splid)
- *splid = isrv.v0;
- return isrv.status;
-}
-
extern unsigned long sal_platform_features;
extern int (*salinfo_platform_oemdata)(const u8 *, u8 **, u64 *);
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 2c7e003..35e973d 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -90,6 +90,7 @@ int __init acpi_table_parse_entries(char *id, unsigned long table_size,
int acpi_table_parse_madt (enum acpi_madt_type id, acpi_table_entry_handler handler, unsigned int max_entries);
int acpi_parse_mcfg (struct acpi_table_header *header);
void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
+struct acpi_table_xsdt * acpi_find_xsdt(void);
/* the following four functions are architecture-dependent */
#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
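
The OEM string test that is_intel_tiger() is meant to perform can be
checked outside the kernel. A minimal sketch follows, assuming a
cut-down header struct (struct fake_acpi_header and looks_like_tiger()
are hypothetical names, not kernel interfaces). The board ids are the
two named in the patch; the intent is "INTEL, and one of the two Tiger
table ids".

```c
#include <string.h>

/*
 * Hypothetical stand-in for the two XSDT header fields the patch reads.
 * Each array is one byte larger than the ACPI field so that test
 * strings stay NUL-terminated.
 */
struct fake_acpi_header {
	char oem_id[7];
	char oem_table_id[9];
};

/* Return 1 only for an INTEL table whose table id names a Tiger board. */
static int looks_like_tiger(const struct fake_acpi_header *hdr)
{
	if (strncmp(hdr->oem_id, "INTEL", 5) != 0)
		return 0;

	return !strncmp(hdr->oem_table_id, "SR870BH2", 8) ||
	       !strncmp(hdr->oem_table_id, "SR870BN4", 8);
}
```

An SGI table ("SGI" / "XSDTSN2" in the boot log above) fails the first
comparison, and an INTEL table with some other board id fails the
second, so only the two Tiger boards avoid the SAL call.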
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (18 preceding siblings ...)
2008-02-27 23:43 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] Alex Chiang
@ 2008-02-28 0:12 ` Alex Chiang
2008-02-28 0:30 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093 Matthew Wilcox
` (5 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Alex Chiang @ 2008-02-28 0:12 UTC (permalink / raw)
To: linux-ia64
* Alex Chiang <achiang@hp.com>:
>
> And if anyone has other suggestions too, I'm happy to hear them.
Actually, this might be a relatively clean approach. As far as I
can tell, the only purpose for sal_revision / sal_version is to
display some boot messages. No one (other than my use below) is
keying off them for anything.
My fixup is based on the mail that Russ sent out with his
bootlog. It contained this line:
SAL 2.9: SGI SN2 version 1.30
So that's what I'm keying off of below.
Thanks.
/ac
From: Alex Chiang <achiang@hp.com>
Subject: [PATCH] ia64: workaround tiger ia64_sal_get_physical_id_info hang
Intel Tiger platforms hang when calling SAL_GET_PHYSICAL_ID_INFO
instead of properly returning -1 for unimplemented, so add a
version check.
SGI Altix platforms have an incorrect SAL version hard-coded into
their prom -- they encode 2.9, but actually implement 3.2 -- so
fix it up and allow ia64_sal_get_physical_id_info to keep
working.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
diff --git a/arch/ia64/kernel/sal.c b/arch/ia64/kernel/sal.c
index f44fe84..4f3686a 100644
--- a/arch/ia64/kernel/sal.c
+++ b/arch/ia64/kernel/sal.c
@@ -109,6 +109,15 @@ check_versions (struct ia64_sal_systab *systab)
sal_revision = SAL_VERSION_CODE(2, 8);
sal_version = SAL_VERSION_CODE(0, 0);
}
+
+ if (ia64_platform_is("sn2")
+ && (sal_revision == SAL_VERSION_CODE(2, 9))
+ && (sal_version == SAL_VERSION_CODE(1, 90)))
+ /*
+ * SGI Altix has hard-coded version 2.9 in their prom
+ * but they actually implement 3.2, so let's fix it here.
+ */
+ sal_revision = SAL_VERSION_CODE(3, 2);
}
static void __init
diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
index 2251118..f4904db 100644
--- a/include/asm-ia64/sal.h
+++ b/include/asm-ia64/sal.h
@@ -807,6 +807,10 @@ static inline s64
ia64_sal_physical_id_info(u16 *splid)
{
struct ia64_sal_retval isrv;
+
+ if (sal_revision < SAL_VERSION_CODE(3,2))
+ return -1;
+
SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
if (splid)
*splid = isrv.v0;
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (19 preceding siblings ...)
2008-02-28 0:12 ` Alex Chiang
@ 2008-02-28 0:30 ` Matthew Wilcox
2008-02-28 0:31 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] Alex Chiang
` (4 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Matthew Wilcox @ 2008-02-28 0:30 UTC (permalink / raw)
To: linux-ia64
On Wed, Feb 27, 2008 at 10:50:25AM -0600, Russ Anderson wrote:
> On Tue, Feb 26, 2008 at 05:34:19PM -0700, Alex Chiang wrote:
> > > SAL 2.9: SGI SN2 version 1.30
> >
> > I wouldn't expect SAL 2.9 to implement a call defined in SAL 3.2,
> > unless I'm seriously misunderstanding something?
>
> The answer is the 2.9 value is hardcoded in the Altix prom and
> was not updated to 3.2 even though the prom supports SAL 3.2.
Sounds like we should just set the sal revision to 3.2 if
ia64_platform_is("sn2"). Would that cause any other problems? What
about older versions of the prom?
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
2008-02-25 17:27 Tiger oops in ia64_sal_physical_id_info (was [RFC] regression: 113134fcbca83619be4c68d0ca66db609 Bjorn Helgaas
` (20 preceding siblings ...)
2008-02-28 0:30 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093 Matthew Wilcox
@ 2008-02-28 0:31 ` Alex Chiang
2008-02-28 0:34 ` Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093 Russ Anderson
` (3 subsequent siblings)
25 siblings, 0 replies; 27+ messages in thread
From: Alex Chiang @ 2008-02-28 0:31 UTC (permalink / raw)
To: linux-ia64
* Alex Chiang <achiang@hp.com>:
>
> My fixup is based on the mail that Russ sent out with his
> bootlog. It contained this line:
>
> SAL 2.9: SGI SN2 version 1.30
Duh, which means I should actually key off the right value. :(
Try #2. Sorry for all the noise.
/ac
From: Alex Chiang <achiang@hp.com>
Subject: [PATCH] ia64: workaround tiger ia64_sal_get_physical_id_info hang
Intel Tiger platforms hang when calling SAL_GET_PHYSICAL_ID_INFO
instead of properly returning -1 for unimplemented, so add a
version check.
SGI Altix platforms have an incorrect SAL version hard-coded into
their prom -- they encode 2.9, but actually implement 3.2 -- so
fix it up and allow ia64_sal_get_physical_id_info to keep
working.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
diff --git a/arch/ia64/kernel/sal.c b/arch/ia64/kernel/sal.c
index f44fe84..4f3686a 100644
--- a/arch/ia64/kernel/sal.c
+++ b/arch/ia64/kernel/sal.c
@@ -109,6 +109,15 @@ check_versions (struct ia64_sal_systab *systab)
sal_revision = SAL_VERSION_CODE(2, 8);
sal_version = SAL_VERSION_CODE(0, 0);
}
+
+ if (ia64_platform_is("sn2")
+ && (sal_revision == SAL_VERSION_CODE(2, 9))
+ && (sal_version == SAL_VERSION_CODE(1, 30)))
+ /*
+ * SGI Altix has hard-coded version 2.9 in their prom
+ * but they actually implement 3.2, so let's fix it here.
+ */
+ sal_revision = SAL_VERSION_CODE(3, 2);
}
static void __init
diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
index 2251118..f4904db 100644
--- a/include/asm-ia64/sal.h
+++ b/include/asm-ia64/sal.h
@@ -807,6 +807,10 @@ static inline s64
ia64_sal_physical_id_info(u16 *splid)
{
struct ia64_sal_retval isrv;
+
+ if (sal_revision < SAL_VERSION_CODE(3,2))
+ return -1;
+
SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
if (splid)
*splid = isrv.v0;
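
The guard added to ia64_sal_physical_id_info() above can be tried outside
the kernel. What follows is a user-space sketch, not the kernel code:
bin2bcd(), fake_sal_physical_id_info(), firmware_calls and
physical_id_info() are hypothetical stand-ins, and the BCD packing used for
SAL_VERSION_CODE is an assumption about how the kernel macro encodes
major/minor. It only illustrates that firmware reporting a revision below
3.2 is never called.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: two BCD digits each for major (high byte) and minor. */
static unsigned bin2bcd(unsigned v)
{
	assert(v < 100);
	return ((v / 10) << 4) | (v % 10);
}
#define SAL_VERSION_CODE(major, minor) ((bin2bcd(major) << 8) | bin2bcd(minor))

static unsigned sal_revision;	/* stand-in for the kernel global */
static int firmware_calls;	/* how often the fake firmware was entered */

/* Hypothetical replacement for SAL_CALL(..., SAL_PHYSICAL_ID_INFO, ...). */
static int64_t fake_sal_physical_id_info(uint16_t *splid)
{
	firmware_calls++;
	*splid = 7;		/* arbitrary id, for illustration only */
	return 0;
}

static int64_t physical_id_info(uint16_t *splid)
{
	uint16_t id = 0;
	int64_t status;

	/* Older firmware may hang instead of returning -1, so never call it. */
	if (sal_revision < SAL_VERSION_CODE(3, 2))
		return -1;

	status = fake_sal_physical_id_info(&id);
	if (splid)
		*splid = id;
	return status;
}
```

With sal_revision set to SAL_VERSION_CODE(2, 9) the function returns -1 and
firmware_calls stays at zero. That is also why the sn2 fix-up in
check_versions() is needed: without it Altix would be gated out as well.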
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
From: Russ Anderson @ 2008-02-28 0:34 UTC (permalink / raw)
To: linux-ia64
On Wed, Feb 27, 2008 at 05:12:26PM -0700, Alex Chiang wrote:
> * Alex Chiang <achiang@hp.com>:
> >
> > And if anyone has other suggestions too, I'm happy to hear them.
>
> Actually, this might be a relatively clean approach. As far as I
> can tell, the only purpose for sal_revision / sal_version is to
> display some boot messages. No one (other than my use below) is
> keying off them for anything.
>
> My fixup is based on the mail that Russ sent out with his
> bootlog. It contained this line:
>
> SAL 2.9: SGI SN2 version 1.30
>
> So that's what I'm keying off of below.
That may be as clean as anything.
> Thanks.
>
> /ac
>
> From: Alex Chiang <achiang@hp.com>
> Subject: [PATCH] ia64: workaround tiger ia64_sal_get_physical_id_info hang
>
> Intel Tiger platforms hang when calling SAL_GET_PHYSICAL_ID_INFO
> instead of properly returning -1 for unimplemented, so add a
> version check.
>
> SGI Altix platforms have an incorrect SAL version hard-coded into
> their prom -- they encode 2.9, but actually implement 3.2 -- so
> fix it up and allow ia64_sal_get_physical_id_info to keep
> working.
>
> Signed-off-by: Alex Chiang <achiang@hp.com>
> ---
> diff --git a/arch/ia64/kernel/sal.c b/arch/ia64/kernel/sal.c
> index f44fe84..4f3686a 100644
> --- a/arch/ia64/kernel/sal.c
> +++ b/arch/ia64/kernel/sal.c
> @@ -109,6 +109,15 @@ check_versions (struct ia64_sal_systab *systab)
> sal_revision = SAL_VERSION_CODE(2, 8);
> sal_version = SAL_VERSION_CODE(0, 0);
> }
> +
> + if (ia64_platform_is("sn2")
> + && (sal_revision == SAL_VERSION_CODE(2, 9))
> + && (sal_version == SAL_VERSION_CODE(1, 90)))
The sal_version check should be removed. The revision has been
stuck at 2.9 but the version has been changing.
> + /*
> + * SGI Altix has hard-coded version 2.9 in their prom
> + * but they actually implement 3.2, so let's fix it here.
> + */
> + sal_revision = SAL_VERSION_CODE(3, 2);
> }
>
> static void __init
> diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
> index 2251118..f4904db 100644
> --- a/include/asm-ia64/sal.h
> +++ b/include/asm-ia64/sal.h
> @@ -807,6 +807,10 @@ static inline s64
> ia64_sal_physical_id_info(u16 *splid)
> {
> struct ia64_sal_retval isrv;
> +
> + if (sal_revision < SAL_VERSION_CODE(3,2))
> + return -1;
> +
> SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
> if (splid)
> *splid = isrv.v0;
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
From: Matthew Wilcox @ 2008-02-28 0:42 UTC (permalink / raw)
To: linux-ia64
On Wed, Feb 27, 2008 at 06:34:02PM -0600, Russ Anderson wrote:
> > + if (ia64_platform_is("sn2")
> > + && (sal_revision == SAL_VERSION_CODE(2, 9))
> > + && (sal_version == SAL_VERSION_CODE(1, 90)))
>
> The sal_version check should be removed. The revision has been
> stuck at 2.9 but the version has been changing.
What would the right values be for a, b, c, d, e, f to make this work?
if (ia64_platform_is("sn2") &&
sal_revision <= SAL_VERSION_CODE(3, 2)) {
if (sal_version >= SAL_VERSION_CODE(a, b))
sal_revision = SAL_VERSION_CODE(3, 2);
else if (sal_version >= SAL_VERSION_CODE(c, d))
sal_revision = SAL_VERSION_CODE(3, 1);
else if (sal_version >= SAL_VERSION_CODE(e, f))
sal_revision = SAL_VERSION_CODE(3, 0);
}
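
The tiered mapping asked about above could also be written table-driven, so
that the thresholds live in one place once they are known. This is only a
sketch under assumptions: struct prom_rev_map, map_revision() and bin2bcd()
are hypothetical names, the BCD packing of SAL_VERSION_CODE is assumed, and
the actual a..f values are deliberately left to the caller because nothing
here supplies them.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encoding: two BCD digits each for major (high byte) and minor. */
static unsigned bin2bcd(unsigned v)
{
	assert(v < 100);
	return ((v / 10) << 4) | (v % 10);
}
#define SAL_VERSION_CODE(major, minor) ((bin2bcd(major) << 8) | bin2bcd(minor))

struct prom_rev_map {
	unsigned min_version;	/* lowest prom sal_version that qualifies */
	unsigned revision;	/* SAL revision that prom really implements */
};

/*
 * The table must be sorted by min_version, highest first; the first entry
 * whose threshold is met wins. If none matches, the reported revision is
 * returned unchanged.
 */
static unsigned map_revision(const struct prom_rev_map *map, size_t n,
			     unsigned sal_version, unsigned reported)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (sal_version >= map[i].min_version)
			return map[i].revision;
	return reported;
}
```

The caller would still wrap this in the ia64_platform_is("sn2") test, and
the table entries would have to come from someone who knows which prom
versions picked up which SAL revision.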
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC]
From: Alex Chiang @ 2008-02-28 1:41 UTC (permalink / raw)
To: linux-ia64
* Russ Anderson <rja@sgi.com>:
>
> The sal_version check should be removed. The revision has been
> stuck at 2.9 but the version has been changing.
As per Russ' email, here is try #3.
Thanks.
/ac
From: Alex Chiang <achiang@hp.com>
Subject: [PATCH] ia64: workaround tiger ia64_sal_get_physical_id_info hang
Intel Tiger platforms hang when calling SAL_GET_PHYSICAL_ID_INFO
instead of properly returning -1 for unimplemented, so add a
version check.
SGI Altix platforms have an incorrect SAL version hard-coded into
their prom -- they encode 2.9, but actually implement 3.2 -- so
fix it up and allow ia64_sal_get_physical_id_info to keep
working.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
diff --git a/arch/ia64/kernel/sal.c b/arch/ia64/kernel/sal.c
index f44fe84..a3022dc 100644
--- a/arch/ia64/kernel/sal.c
+++ b/arch/ia64/kernel/sal.c
@@ -109,6 +109,13 @@ check_versions (struct ia64_sal_systab *systab)
sal_revision = SAL_VERSION_CODE(2, 8);
sal_version = SAL_VERSION_CODE(0, 0);
}
+
+ if (ia64_platform_is("sn2") && (sal_revision == SAL_VERSION_CODE(2, 9)))
+ /*
+ * SGI Altix has hard-coded version 2.9 in their prom
+ * but they actually implement 3.2, so let's fix it here.
+ */
+ sal_revision = SAL_VERSION_CODE(3, 2);
}
static void __init
diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
index 2251118..f4904db 100644
--- a/include/asm-ia64/sal.h
+++ b/include/asm-ia64/sal.h
@@ -807,6 +807,10 @@ static inline s64
ia64_sal_physical_id_info(u16 *splid)
{
struct ia64_sal_retval isrv;
+
+ if (sal_revision < SAL_VERSION_CODE(3,2))
+ return -1;
+
SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
if (splid)
*splid = isrv.v0;
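
The check_versions() fix-up in this version keys only on the platform name
and the reported revision, so it applies regardless of the prom's
sal_version. A user-space sketch of that rule follows. It is not the kernel
code: fixup_sal_revision() and bin2bcd() are hypothetical names, the
platform is passed in as a string instead of calling ia64_platform_is(),
and the BCD packing of SAL_VERSION_CODE is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Assumed encoding: two BCD digits each for major (high byte) and minor. */
static unsigned bin2bcd(unsigned v)
{
	assert(v < 100);
	return ((v / 10) << 4) | (v % 10);
}
#define SAL_VERSION_CODE(major, minor) ((bin2bcd(major) << 8) | bin2bcd(minor))

/* Returns the SAL revision the kernel should act on. */
static unsigned fixup_sal_revision(const char *platform, unsigned revision)
{
	/*
	 * Altix proms report 2.9 but implement 3.2. The prom version is
	 * intentionally not consulted, since it changes between releases.
	 */
	if (strcmp(platform, "sn2") == 0 &&
	    revision == SAL_VERSION_CODE(2, 9))
		return SAL_VERSION_CODE(3, 2);

	return revision;
}
```

Any other platform reporting 2.9 keeps 2.9, and so stays below the 3.2
threshold checked in ia64_sal_physical_id_info().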
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: Tiger oops in ia64_sal_physical_id_info (was [RFC] regression:113134fcbca83619be4c68d0ca66db6093
From: Russ Anderson @ 2008-02-28 3:47 UTC (permalink / raw)
To: linux-ia64
On Wed, Feb 27, 2008 at 06:41:38PM -0700, Alex Chiang wrote:
> * Russ Anderson <rja@sgi.com>:
> >
> > The sal_version check should be removed. The revision has been
> > stuck at 2.9 but the version has been changing.
>
> As per Russ' email, here is try #3.
Looks good. Boots on both new and old hardware.
Sorry about the inconvenience.
Acked-by: Russ Anderson <rja@sgi.com>
> Thanks.
>
> /ac
>
> From: Alex Chiang <achiang@hp.com>
> Subject: [PATCH] ia64: workaround tiger ia64_sal_get_physical_id_info hang
>
> Intel Tiger platforms hang when calling SAL_GET_PHYSICAL_ID_INFO
> instead of properly returning -1 for unimplemented, so add a
> version check.
>
> SGI Altix platforms have an incorrect SAL version hard-coded into
> their prom -- they encode 2.9, but actually implement 3.2 -- so
> fix it up and allow ia64_sal_get_physical_id_info to keep
> working.
>
> Signed-off-by: Alex Chiang <achiang@hp.com>
> ---
> diff --git a/arch/ia64/kernel/sal.c b/arch/ia64/kernel/sal.c
> index f44fe84..a3022dc 100644
> --- a/arch/ia64/kernel/sal.c
> +++ b/arch/ia64/kernel/sal.c
> @@ -109,6 +109,13 @@ check_versions (struct ia64_sal_systab *systab)
> sal_revision = SAL_VERSION_CODE(2, 8);
> sal_version = SAL_VERSION_CODE(0, 0);
> }
> +
> + if (ia64_platform_is("sn2") && (sal_revision == SAL_VERSION_CODE(2, 9)))
> + /*
> + * SGI Altix has hard-coded version 2.9 in their prom
> + * but they actually implement 3.2, so let's fix it here.
> + */
> + sal_revision = SAL_VERSION_CODE(3, 2);
> }
>
> static void __init
> diff --git a/include/asm-ia64/sal.h b/include/asm-ia64/sal.h
> index 2251118..f4904db 100644
> --- a/include/asm-ia64/sal.h
> +++ b/include/asm-ia64/sal.h
> @@ -807,6 +807,10 @@ static inline s64
> ia64_sal_physical_id_info(u16 *splid)
> {
> struct ia64_sal_retval isrv;
> +
> + if (sal_revision < SAL_VERSION_CODE(3,2))
> + return -1;
> +
> SAL_CALL(isrv, SAL_PHYSICAL_ID_INFO, 0, 0, 0, 0, 0, 0, 0);
> if (splid)
> *splid = isrv.v0;
>
--
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc rja@sgi.com
^ permalink raw reply [flat|nested] 27+ messages in thread