* APIC handling on x86-64
@ 2006-03-17 16:49 Jan Beulich
  2006-03-17 17:06 ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2006-03-17 16:49 UTC (permalink / raw)
  To: xen-devel

We had a report of Xen failing to boot on an IBM x460, dying on the BUG_ON() in init_apic_ldr() in
xen/include/asm-x86/mach-summit/mach_apic.h, so I started comparing 32- and 64-bit APIC handling. I quickly found that
the same case is handled gracefully on 64-bit, by just tying any extra CPUs to the highest bit. (I suppose, and will
try to verify with the originator, that the same machine also doesn't boot with native 32-bit Linux, as the exact same
issue should exist there.)
While doing the same generally shouldn't be a problem, I wonder why this hasn't been discovered so far and how many
other differences exist.

Thanks, Jan


* Re: APIC handling on x86-64
  2006-03-17 16:49 Jan Beulich
@ 2006-03-17 17:06 ` Keir Fraser
  0 siblings, 0 replies; 13+ messages in thread
From: Keir Fraser @ 2006-03-17 17:06 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel


On 17 Mar 2006, at 16:49, Jan Beulich wrote:

> As we had a report of a problem booting Xen on an IBM x460, dying on 
> the BUG_ON() in init_apic_ldr() in
> xen/include/asm-x86/mach-summit/mach_apic.h, I started comparing 32- 
> and 64-bit APIC handling. Quickly I found that the
> same case is handled gracefully in 64-bits, by just tying any extra 
> CPUs to the highest bit. (I suppose, will try to
> verify this with the originator, that the same machine also doesn't 
> boot with native 32-bit Linux, as the exact same
> issue should exist there).
> While doing the same generally shouldn't be a problem, I wonder why 
> this hasn't been discovered so far and how many
> other differences exist.

Differences between i386 and x86/64 native Linux APIC handling? A fair 
few, although mostly it's because crufty old code has been removed from 
x86/64. I guess there are occasions where Andi Kleen has improved 
correctness at the same time as cleaning up. :-)

This is the first time that the strategy of taking latest i386 APIC 
code has let us down I think.

I guess we just patch it in Xen with a comment explaining the extra 
diff vs native i386 Linux version of the same file.

  -- Keir


* Re: APIC handling on x86-64
@ 2006-03-19 19:15 Jan Beulich
  2006-03-19 23:45 ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2006-03-19 19:15 UTC (permalink / raw)
  To: Keir.Fraser; +Cc: xen-devel

>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 03/17/06 6:06 PM >>>
>
>On 17 Mar 2006, at 16:49, Jan Beulich wrote:
>
>> As we had a report of a problem booting Xen on an IBM x460, dying on 
>> the BUG_ON() in init_apic_ldr() in
>> xen/include/asm-x86/mach-summit/mach_apic.h, I started comparing 32- 
>> and 64-bit APIC handling. Quickly I found that the
>> same case is handled gracefully in 64-bits, by just tying any extra 
>> CPUs to the highest bit. (I suppose, will try to
>> verify this with the originator, that the same machine also doesn't 
>> boot with native 32-bit Linux, as the exact same
>> issue should exist there).
>> While doing the same generally shouldn't be a problem, I wonder why 
>> this hasn't been discovered so far and how many
>> other differences exist.
>
>Differences between i386 and x86/64 native Linux APIC handling? A fair 
>few, although mostly it's because crufty old code has been removed from 
>x86/64. I guess there are occasions where Andi Kleen has improved 
>correctness at the same time as cleaning up. :-)
>
>This is the first time that the strategy of taking latest i386 APIC 
>code has let us down I think.
>
>I guess we just patch it in Xen with a comment explaining the extra 
>diff vs native i386 Linux version of the same file.

Actually, looking further, the originally mentioned adjustment can't work. x86-64
Linux can do this because it delivers IPIs (on large systems) in physical
destination mode, and hence doesn't really need the values written to DFR and,
specifically, LDR. Since Xen inherited the 32-bit code, IPIs get sent in logical
(cluster) mode, and playing games like this with the LDR is only going to
cause problems (you'll hit multiple processors with a send that's intended for a
single one). Hence I would think that more extensive changes are going to
be needed; entirely taking x86-64's model may not be feasible either (aside from
the fact that this would mean quite extensive code changes), as x86-64 doesn't
need to care about pre-Pentium 4 processors.
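
A minimal sketch of why sharing an LDR bit breaks logical (cluster) delivery.
It is a standalone model, not Xen or Linux code: make_ldr(), cluster_accepts()
and count_receivers() are hypothetical names, the clamping of extra CPUs onto
bit 3 is only my reading of "tying any extra CPUs to the highest bit", and
broadcast destinations are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cluster-model logical ID: high nibble = cluster number,
 * low nibble = one-hot bit selecting a CPU within the cluster. */
static uint8_t make_ldr(unsigned cluster, unsigned cpu_in_cluster)
{
    /* Assumed "graceful" variant: CPUs beyond the fourth share bit 3. */
    unsigned bit = cpu_in_cluster < 4 ? cpu_in_cluster : 3;

    return (uint8_t)(((cluster & 0xFu) << 4) | (1u << bit));
}

/* Does a local APIC with logical ID `ldr` accept a message addressed
 * to `mda`? Broadcast handling is deliberately omitted. */
static bool cluster_accepts(uint8_t ldr, uint8_t mda)
{
    return (ldr >> 4) == (mda >> 4) && (ldr & mda & 0x0Fu) != 0;
}

/* Count how many CPUs would take an IPI sent to `mda`. */
static unsigned count_receivers(const uint8_t *ldrs, size_t n, uint8_t mda)
{
    unsigned hits = 0;

    for (size_t i = 0; i < n; i++)
        if (cluster_accepts(ldrs[i], mda))
            hits++;
    return hits;
}
```

With five CPUs placed in cluster 0, the fourth and fifth both end up with the
same logical ID, so a send meant for one of them is accepted by both. In
physical destination mode the LDR is not consulted at all, which is why the
same trick is harmless there.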

Jan


* Re: APIC handling on x86-64
  2006-03-19 19:15 Jan Beulich
@ 2006-03-19 23:45 ` Keir Fraser
  0 siblings, 0 replies; 13+ messages in thread
From: Keir Fraser @ 2006-03-19 23:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel


On 19 Mar 2006, at 19:15, Jan Beulich wrote:

> Actually, looking further, the originally mentioned adjustment can't 
> work. x86-64
> Linux can do this because they deliver IPIs (on large systems) in 
> physical
> destination mode, and hence they don't really need the values written 
> to DFR and
> specifically LDR. Since Xen inherited the 32-bit code, IPIs get sent 
> in logical
> (cluster) mode, and then playing games like this with the LDR is only 
> going to
> cause problems (you'll hit multiple processors with a send that's 
> intended for a
> single one only). Hence I would think that more extensive changes are 
> going to
> be needed; entirely taking x86-64's model may not be feasible either 
> (aside from
> the fact that this would mean quite extensive code changes), as they 
> don't need
> to care about pre-Pentium4 processors.

Yes, that's what stopped me taking x86/64 Linux code originally. I 
wonder if we can take just the IPI code from x86/64, plus whatever 
small changes we need to fix problems like this x460 bug. IPI code is 
reasonably self contained relative to the bulk of the apic code in 
apic.c. The x86/64 IPI code was a bunch cleaner than the i386 code last 
time I looked, and I don't think there are lurking legacy 32-bit issues 
in that area that would bite us?

  -- Keir


* Re: APIC handling on x86-64
@ 2006-03-20 13:53 Jan Beulich
  2006-03-20 16:00 ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2006-03-20 13:53 UTC (permalink / raw)
  To: Keir.Fraser; +Cc: xen-devel

>Yes, that's what stopped me taking x86/64 Linux code originally. I 
>wonder if we can take just the IPI code from x86/64, plus whatever 
>small changes we need to fix problems like this x460 bug. IPI code is 
>reasonably self contained relative to the bulk of the apic code in 
>apic.c. The x86/64 IPI code was a bunch cleaner than the i386 code last 
>time I looked, and I don't think there are lurking legacy 32-bit issues 
>in that area that would bite us?

Hmm, I'm not sure, both because it wouldn't address all issues (the
I/O APIC redirection table entries also need to be programmed in
physical mode) and because I'm not so certain about legacy 32-bit
issues (on 32-bit you still have to account for the Pentium Pro/II/III
behavior, namely the APIC IDs being only 4 bits wide, which at least
seems to imply different cut-off criteria for deciding which destination
mode to use).
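
To make the "different cut-off criteria" concrete, here is a rough sketch of
how the choice could depend on the width of the APIC ID. Everything in it is
an assumption for illustration: choose_dest_mode() and the enum are made-up
names, and the thresholds are one plausible policy (flat logical mode while
the 8-bit one-hot mask suffices, physical mode while no ID collides with the
all-ones broadcast ID, clustered logical mode otherwise), not what Xen or
Linux actually implement.

```c
/* Destination modes considered by this sketch (hypothetical names). */
enum dest_mode { DEST_LOGICAL_FLAT, DEST_LOGICAL_CLUSTER, DEST_PHYSICAL };

/*
 * ncpus       - number of CPUs to address
 * max_apic_id - highest physical APIC ID in the system
 * id_bits     - width of the physical APIC ID: 4 on Pentium Pro/II/III,
 *               8 on Pentium 4 and later
 */
static enum dest_mode choose_dest_mode(unsigned ncpus, unsigned max_apic_id,
                                       unsigned id_bits)
{
    /* The all-ones ID is the broadcast address in physical mode. */
    unsigned broadcast_id = (1u << id_bits) - 1u;

    if (ncpus <= 8)
        return DEST_LOGICAL_FLAT;    /* one bit per CPU fits in 8 bits */
    if (max_apic_id < broadcast_id)
        return DEST_PHYSICAL;        /* every ID is individually addressable */
    return DEST_LOGICAL_CLUSTER;     /* IDs too narrow for physical mode */
}
```

Under this policy the same 16-CPU system with a highest APIC ID of 15 lands
in physical mode with 8-bit IDs but has to fall back to clustered logical
mode with 4-bit IDs, which is the kind of divergence x86-64 never has to
consider.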

Jan


* Re: APIC handling on x86-64
  2006-03-20 13:53 Jan Beulich
@ 2006-03-20 16:00 ` Keir Fraser
  2006-03-20 16:21   ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2006-03-20 16:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen Developers


On 20 Mar 2006, at 13:53, Jan Beulich wrote:

> Hmm, I'm not sure. Both because it wouldn't address all issues (the
> I/O APIC redirection table entries also need to be programmed in
> physical mode) and because I'm not so certain about legacy 32-bit
> issues (on 32-bits you still have to account for the PentiumPro/II/III
> behavior, namely the only 4-bit wide IDs, as that at least seems to
> imply different cut-off criteria for the decision what destination mode
> to use).

We can just steal x86/64 phys cluster code for summit. ES7000 already 
has its own phys cluster code that doesn't look like it will have this 
issue, Bigsmp uses phys flat, and Default uses logical flat. So summit 
is the only subarch that needs fixing.

Xen's io_apic.c will program logical/phys dest mode appropriately 
(although it's taken from i386 io_apic.c, those parts of the i386 file 
are identical to x86/64 io_apic.c).

I'd also like to fix send_IPI_mask_sequence() to use physical 
destination mode. It scares me a bit that 2.6.16 native i386 bigsmp 
mode uses physical flat model, but uses send_IPI_mask_sequence with 
logical delivery. Seems weird to me.

Both this and the summit fix will be after 3.0.2 now.

  -- Keir


* Re: APIC handling on x86-64
  2006-03-20 16:00 ` Keir Fraser
@ 2006-03-20 16:21   ` Keir Fraser
  0 siblings, 0 replies; 13+ messages in thread
From: Keir Fraser @ 2006-03-20 16:21 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Xen Developers, Jan Beulich


On 20 Mar 2006, at 16:00, Keir Fraser wrote:

> I'd also like to fix send_IPI_mask_sequence() to use physical 
> destination mode. It scares me a bit that 2.6.16 native i386 bigsmp 
> mode uses physical flat model, but uses send_IPI_mask_sequence with 
> logical delivery. Seems weird to me.

Oh wow. I see that in fact mach-bigsmp and mach-es7000 both redefine 
APIC_DEST_LOGICAL to zero. That's very sleazy. :-)
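
A small sketch of what that redefinition does to the interrupt command
register. The destination mode is bit 11 of the low ICR word (0 = physical,
1 = logical), so a "logical" flag that is defined as zero silently produces a
physical-mode send. The SKETCH_* macros and both helper functions are
hypothetical stand-ins, not the kernel's own definitions.

```c
#include <stdint.h>

/* Stand-ins for the per-subarch value of APIC_DEST_LOGICAL. */
#define SKETCH_DEST_LOGICAL_GENERIC 0x00800u /* bit 11 set: logical mode */
#define SKETCH_DEST_LOGICAL_BIGSMP  0x00000u /* redefined to zero */
#define SKETCH_DM_FIXED             0x00000u /* fixed delivery mode */

/* Build the low ICR word the way a "send logical IPI" path would. */
static uint32_t icr_low(uint32_t dest_mode_flag, uint8_t vector)
{
    return dest_mode_flag | SKETCH_DM_FIXED | vector;
}

/* Check bit 11 to see which destination mode the hardware will use. */
static int icr_is_logical(uint32_t icr)
{
    return (icr & 0x00800u) != 0;
}
```

Fed the zero-valued flag, the code path that looks like logical delivery
yields an ICR word with bit 11 clear, which would explain how
send_IPI_mask_sequence() can coexist with the physical flat model on bigsmp.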

  -- Keir


* Re: APIC handling on x86-64
@ 2006-03-21 14:23 Jan Beulich
  2006-03-21 14:34 ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2006-03-21 14:23 UTC (permalink / raw)
  To: Keir.Fraser; +Cc: xen-devel

>Xen's io_apic.c will program logical/phys dest mode appropriately 
>(although it's taken from i386 io_apic.c, those parts of the i386 file 
>are identical to x86/64 io_apic.c).

Good - I didn't check this explicitly.

>I'd also like to fix send_IPI_mask_sequence() to use physical 
>destination mode. It scares me a bit that 2.6.16 native i386 bigsmp 
>mode uses physical flat model, but uses send_IPI_mask_sequence with 
>logical delivery. Seems weird to me.

I haven't been able to download 2.6.16 yet (the WLANs I'm currently moving
between seem to be rather unreliable, and I have no copy of the unmodified
sources of any of the RCs around), but in any case our bigsmp kernel already
appears to be using physical mode consistently.

>Both this and the summit fix will be after 3.0.2 now.

We'll have to pick them up anyway (if you get around to doing it before I
return from BrainShare, where I can't really do any significant work, and in
particular no testing).

Jan


* Re: APIC handling on x86-64
  2006-03-21 14:23 APIC handling on x86-64 Jan Beulich
@ 2006-03-21 14:34 ` Keir Fraser
  2006-03-21 15:26   ` Paul Larson
  0 siblings, 1 reply; 13+ messages in thread
From: Keir Fraser @ 2006-03-21 14:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel


On 21 Mar 2006, at 14:23, Jan Beulich wrote:

>> Both this and the summit fix will be after 3.0.2 now.
>
> We'll have to pick them up anyway (if you get around to do it before I 
> return
> from BrainShare, where I can't really do any significant work, and 
> namely no
> testing).

Testing is the scary thing for me. I have no access to Summit or ES7000 
boxes, and I don't know what external testing is being done on those 
systems and on what schedule. For all I know, bugs could be introduced 
and not fixed for an arbitrary time. If I know there is testing 
infrastructure I would like to move closer to x86/64's genapic code 
(which uses generic mechanisms to determine what APIC mode to use) and 
keep the minimal amount possible of default/bigsmp/summit/es7000 
specific subarchitecture code. Still, some incremental progress on that 
path is definitely achievable.

  -- Keir


* Re: APIC handling on x86-64
@ 2006-03-21 14:47 Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2006-03-21 14:47 UTC (permalink / raw)
  To: Keir.Fraser; +Cc: xen-devel

>Testing is the scary thing for me. I have no access to Summit or ES7000 
>boxes, and I don't know what external testing is being done on those 
>systems and on what schedule. For all I know, bugs could be introduced 
>and not fixed for an arbitrary time. If I know there is testing 
>infrastructure I would like to move closer to x86/64's genapic code 
>(which uses generic mechanisms to determine what APIC mode to use) and 
>keep the minimal amount possible of default/bigsmp/summit/es7000 
>specific subarchitecture code. Still, some incremental progress on that 
>path is definitely achievable.

Whatever change we're going to make here, we'll have IBM test it on the
x460s they reported the problem on. While we supposedly have at least
one x460 in our lab, it's only a 16-(logical-)processor one, and there it
doesn't matter whether clustered logical or physical mode gets used.

However, even if they disable HT on the 4-node system (reducing it to 32
logical processors), they apparently also have a BIOS problem there:
intending to make the OS use clustered logical mode, the BIOS
assigns processors to 16 clusters (2 processors each), ending up with
two processors with APIC IDs of the form 0xFx, which to my
understanding is valid only for physical mode.
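
A sketch of the check implied here, under the assumption that cluster
address 0xF is reserved (an all-ones destination being the broadcast), so
that clustered logical mode can address only 15 clusters. The function names
are hypothetical and the ID list in the usage note is made up; the thread
does not give the actual IDs beyond "of the form 0xFx".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An ID whose high nibble is 0xF would sit in the reserved cluster. */
static bool usable_in_cluster_mode(uint8_t apic_id)
{
    return (apic_id >> 4) != 0xF;
}

/* Count IDs that could only be addressed in physical mode. */
static size_t count_unusable(const uint8_t *ids, size_t n)
{
    size_t bad = 0;

    for (size_t i = 0; i < n; i++)
        if (!usable_in_cluster_mode(ids[i]))
            bad++;
    return bad;
}
```

For an ID list such as { 0x00, 0x01, 0xE0, 0xE1, 0xF0, 0xF1 }, the last two
entries are flagged, matching the two processors of the sixteenth cluster
described above.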

Jan


* Re: APIC handling on x86-64
  2006-03-21 14:34 ` Keir Fraser
@ 2006-03-21 15:26   ` Paul Larson
  0 siblings, 0 replies; 13+ messages in thread
From: Paul Larson @ 2006-03-21 15:26 UTC (permalink / raw)
  To: xen-devel



On Tuesday 21 March 2006 08:34, Keir Fraser wrote:
> Testing is the scary thing for me. I have no access to Summit or ES7000
> boxes, and I don't know what external testing is being done on those
> systems and on what schedule. For all I know, bugs could be introduced
> and not fixed for an arbitrary time. If I know there is testing
> infrastructure I would like to move closer to x86/64's genapic code
> (which uses generic mechanisms to determine what APIC mode to use) and
> keep the minimal amount possible of default/bigsmp/summit/es7000
> specific subarchitecture code. Still, some incremental progress on that
> path is definitely achievable.
We have not seen this issue on our 2-node 460 or any of our other summit 
boxes.  However we do have a couple of summit boxes we could use for testing 
if you have some code to throw at us.


Thanks,
Paul Larson


* RE: APIC handling on x86-64
@ 2006-03-21 15:30 Puthiyaparambil, Aravindh
  2006-03-21 15:38 ` Keir Fraser
  0 siblings, 1 reply; 13+ messages in thread
From: Puthiyaparambil, Aravindh @ 2006-03-21 15:30 UTC (permalink / raw)
  To: Keir Fraser, Jan Beulich; +Cc: xen-devel

> Testing is the scary thing for me. I have no access to Summit or ES7000
> boxes, and I don't know what external testing is being done on those
> systems and on what schedule. For all I know, bugs could be introduced
> and not fixed for an arbitrary time. If I know there is testing
> infrastructure I would like to move closer to x86/64's genapic code
> (which uses generic mechanisms to determine what APIC mode to use) and
> keep the minimal amount possible of default/bigsmp/summit/es7000
> specific subarchitecture code. Still, some incremental progress on that
> path is definitely achievable.

Keir,

We can help with the testing on ES7000 boxes. I have always wanted to
move closer to the x86_64 genapic code, so I will help in whatever
way possible.

Thanks,
Aravindh Puthiyaparambil
Xen Development Team
Unisys Tredyffrin E240
E-Mail:  aravindh.puthiyaparambil@unisys.com


* Re: APIC handling on x86-64
  2006-03-21 15:30 Puthiyaparambil, Aravindh
@ 2006-03-21 15:38 ` Keir Fraser
  0 siblings, 0 replies; 13+ messages in thread
From: Keir Fraser @ 2006-03-21 15:38 UTC (permalink / raw)
  To: Puthiyaparambil, Aravindh; +Cc: xen-devel, Jan Beulich


On 21 Mar 2006, at 15:30, Puthiyaparambil, Aravindh wrote:

> We can help with the testing on ES7000 boxes. I have always wanted to
> move closer to the x86_64 genapic code so I will help in whatever
> possible way.

Thanks (also to Paul Larson re Summit testing). I'll look into this 
after 3.0.2 then.

  -- Keir
