public inbox for linux-kernel@vger.kernel.org
* RE: Intel vs AMD64
@ 2004-02-26  4:28 richard.brunner
  0 siblings, 0 replies; 13+ messages in thread
From: richard.brunner @ 2004-02-26  4:28 UTC (permalink / raw)
  To: linux-kernel

Not sure about other architectures, but in the
AMD64 architecture, the 66h and 67h prefixes 
can be applied to the near branch
instructions and have an *architecturally* 
defined action (rather than implementation-defined 
action) which all AMD64 processors follow. It's all 
described in the AMD64 Architecture Programmer's 
Manuals ...

(http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_739_7044,00.html)

But, I definitely agree that it is sorta hard to figure 
out what a 64-bit general purpose compiler would 
actually *do* with some of them. However, there are 
embedded/special-purpose scenarios where this might 
be just fine.

For example, for JMP (near):

In 64-bit mode, if the JMP target is specified by a
displacement in the instruction, the signed displacement is
added to the rIP (of the following instruction), and the
result is truncated to 16 or 64 bits depending on operand
size. [rb: 64-bit is default, 66h forces 16-bit].  The
signed displacement can be 8 bits, 16 bits, or 32 bits,
depending on the opcode and the operand size.  [rb: 8-bit
has its own opcode (EB); for the E9 opcode: 32-bit is
default and 66h forces 16-bit].
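
The rule quoted above can be modelled in a few lines. This is only a sketch of the arithmetic as the manual excerpt describes it, not a description of any particular processor; the function names and the sample addresses are made up for illustration.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def near_jmp_target(next_rip, disp, disp_bits, operand_bits=64):
    """Model of a displacement-form near JMP in 64-bit mode.

    next_rip     -- rIP of the instruction following the JMP
    disp         -- raw displacement field from the instruction
    disp_bits    -- 8 (EB), 32 (E9), or 16 (E9 with a 66h prefix)
    operand_bits -- 64 by default, 16 when a 66h prefix is present
    """
    if disp_bits not in (8, 16, 32):
        raise ValueError("displacement must be 8, 16 or 32 bits")
    if operand_bits not in (16, 64):
        raise ValueError("operand size must be 16 or 64 bits")
    # Add the signed displacement, then truncate to the operand size.
    return (next_rip + sign_extend(disp, disp_bits)) & ((1 << operand_bits) - 1)


# Hypothetical addresses, chosen only to show the truncation.
plain = near_jmp_target(0x401005, 0x10, 32)          # E9 rel32
prefixed = near_jmp_target(0x401005, 0x10, 16, 16)   # 66 E9 rel16
backward = near_jmp_target(0x1000, 0xFE, 8)          # EB rel8, disp = -2
```

With the default operand size the target keeps all 64 bits; with 66h the same jump lands in the low 64K, which is the behavior the rest of the thread calls useless for ordinary 64-bit code.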

] -Rich ...
] AMD Fellow
] richard.brunner at amd com  

> -----Original Message-----
> From: Nakajima, Jun [mailto:jun.nakajima@intel.com] 
> Sent: Wednesday, February 25, 2004 5:20 PM
> To: H. Peter Anvin; Timothy Miller
> Cc: linux-kernel@vger.kernel.org
> Subject: RE: Intel vs AMD x86-64
> 
> Yes, that's the very reason I said "useless for compilers." The way
> IP/RIP is updated is different (and implementation specific) on those
> processors if 66H is used with a near branch. For example, RIP may be
> zero-extended to 64 bits (from IP), as you observed before.
> 
> Jun
> >-----Original Message-----
> >From: H. Peter Anvin [mailto:hpa@zytor.com]
> >Sent: Wednesday, February 25, 2004 4:14 PM
> >To: Timothy Miller
> >Cc: Nakajima, Jun; linux-kernel@vger.kernel.org
> >Subject: Re: Intel vs AMD x86-64
> >
> >Timothy Miller wrote:
> >>
> >>
> >> Nakajima, Jun wrote:
> >>
> >>> For near branches (CALL, RET, JCC, JCXZ, JMP, etc.), the operand size is
> >>> forced to 64 bits on both processors in 64-bit mode, basically meaning
> >>> RIP is updated.
> >>>
> >>> Compilers would typically use a JMP short for "intraprocedural jumps",
> >>> which requires just an 8-bit displacement relative to RIP.
> >>
> >> I see.  It's too bad you can't have a 16-bit displacement.
> >>
> >> Ummm... so if 66H were used with a near branch, would that affect the
> >> size of the immediate operand which gets added to RIP, or would that
> >> affect the portion of IP/EIP/RIP affected?  If it's the latter,
> >> that's pretty silly.
> >>
> >
> >Yes, that would be pretty silly.
> >
> >I honestly don't remember off the top of my head what "o16 jmp blah"
> >does on i386; I have a vague memory that it zero-extends %eip to 32
> >bits, which makes it useless, of course.
> >
> >	-hpa
> 



* RE: Intel vs AMD64
@ 2004-02-26  5:32 Nakajima, Jun
  2004-02-26 13:39 ` Chris Wedgwood
  2004-02-26 19:18 ` Timothy Miller
  0 siblings, 2 replies; 13+ messages in thread
From: Nakajima, Jun @ 2004-02-26  5:32 UTC (permalink / raw)
  To: richard.brunner, linux-kernel

Thanks for the clarification.

Yes, "implementation specific" is one of the differences between IA-32e
and AMD64, i.e. that behavior is architecturally defined on AMD64, but
on IA-32e (as I posted): 
  Near branch with 66H prefix:
    As documented in PRM the behavior is implementation specific and
    should avoid using 66H prefix on near branches.

Jun
>-----Original Message-----
>From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of richard.brunner@amd.com
>Sent: Wednesday, February 25, 2004 8:28 PM
>To: linux-kernel@vger.kernel.org
>Subject: RE: Intel vs AMD64
>
>Not sure about other architectures, but in the
>AMD64 architecture, the 66h and 67h prefixes
>can be applied to the near branch
>instructions and have an *architecturally*
>defined action (rather than implementation-defined
>action) which all AMD64 processors follow. It's all
>described in the AMD64 Architecture Programmer's
>Manuals ...
>
>(http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_739_7044,00.html)
>
>But, I definitely agree that it is sorta hard to figure
>out what a 64-bit general purpose compiler would
>actually *do* with some of them. However, there are
>embedded/special-purpose scenarios where this might
>be just fine.
>
>For example, for JMP (near):
>
>In 64-bit mode, if the JMP target is specified by a
>displacement in the instruction, the signed displacement is
>added to the rIP (of the following instruction), and the
>result is truncated to 16 or 64 bits depending on operand
>size. [rb: 64-bit is default, 66h forces 16-bit].  The
>signed displacement can be 8 bits, 16 bits, or 32 bits,
>depending on the opcode and the operand size.  [rb: 8-bit
>has its own opcode (EB); for the E9 opcode: 32-bit is
>default and 66h forces 16-bit].
>
>] -Rich ...
>] AMD Fellow
>] richard.brunner at amd com
>
>> -----Original Message-----
>> From: Nakajima, Jun [mailto:jun.nakajima@intel.com]
>> Sent: Wednesday, February 25, 2004 5:20 PM
>> To: H. Peter Anvin; Timothy Miller
>> Cc: linux-kernel@vger.kernel.org
>> Subject: RE: Intel vs AMD x86-64
>>
>> Yes, that's the very reason I said "useless for compilers." The way
>> IP/RIP is updated is different (and implementation specific) on those
>> processors if 66H is used with a near branch. For example, RIP may be
>> zero-extended to 64 bits (from IP), as you observed before.
>>
>> Jun
>> >-----Original Message-----
>> >From: H. Peter Anvin [mailto:hpa@zytor.com]
>> >Sent: Wednesday, February 25, 2004 4:14 PM
>> >To: Timothy Miller
>> >Cc: Nakajima, Jun; linux-kernel@vger.kernel.org
>> >Subject: Re: Intel vs AMD x86-64
>> >
>> >Timothy Miller wrote:
>> >>
>> >>
>> >> Nakajima, Jun wrote:
>> >>
>> >>> For near branches (CALL, RET, JCC, JCXZ, JMP, etc.), the operand size is
>> >>> forced to 64 bits on both processors in 64-bit mode, basically meaning
>> >>> RIP is updated.
>> >>>
>> >>> Compilers would typically use a JMP short for "intraprocedural jumps",
>> >>> which requires just an 8-bit displacement relative to RIP.
>> >>
>> >> I see.  It's too bad you can't have a 16-bit displacement.
>> >>
>> >> Ummm... so if 66H were used with a near branch, would that affect the
>> >> size of the immediate operand which gets added to RIP, or would that
>> >> affect the portion of IP/EIP/RIP affected?  If it's the latter,
>> >> that's pretty silly.
>> >>
>> >
>> >Yes, that would be pretty silly.
>> >
>> >I honestly don't remember off the top of my head what "o16 jmp blah"
>> >does on i386; I have a vague memory that it zero-extends %eip to 32
>> >bits, which makes it useless, of course.
>> >
>> >	-hpa
>>
>


* Re: Intel vs AMD64
  2004-02-26  5:32 Intel vs AMD64 Nakajima, Jun
@ 2004-02-26 13:39 ` Chris Wedgwood
  2004-02-26 14:35   ` Richard B. Johnson
  2004-02-27 18:44   ` H. Peter Anvin
  2004-02-26 19:18 ` Timothy Miller
  1 sibling, 2 replies; 13+ messages in thread
From: Chris Wedgwood @ 2004-02-26 13:39 UTC (permalink / raw)
  To: Nakajima, Jun; +Cc: richard.brunner, linux-kernel

On Wed, Feb 25, 2004 at 09:32:08PM -0800, Nakajima, Jun wrote:

> Yes, "implementation specific" is one of the differences between
> IA-32e and AMD64, i.e. that behavior is architecturally defined on
> AMD64, but on IA-32e (as I posted):

>   Near branch with 66H prefix:
>     As documented in PRM the behavior is implementation specific and
>     should avoid using 66H prefix on near branches.


Not that it really matters that much --- but I'm curious to know why
Intel made this decision?

It seems really dumb to make such differences when Intel is already
sorely lagging behind their competitor here, I would think given the
circumstances Intel would try to be as compatible as possible on all
fronts.

I'd almost be nervous about getting an IA-32e CPU right now given that
the AMD64 chips work just fine, have had lots of testing and there is
plenty of code out there which is *known* to work reliably.





* Re: Intel vs AMD64
  2004-02-26 13:39 ` Chris Wedgwood
@ 2004-02-26 14:35   ` Richard B. Johnson
  2004-02-26 19:25     ` Timothy Miller
  2004-02-27 18:44   ` H. Peter Anvin
  1 sibling, 1 reply; 13+ messages in thread
From: Richard B. Johnson @ 2004-02-26 14:35 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Nakajima, Jun, richard.brunner, linux-kernel

On Thu, 26 Feb 2004, Chris Wedgwood wrote:

> On Wed, Feb 25, 2004 at 09:32:08PM -0800, Nakajima, Jun wrote:
>
> > Yes, "implementation specific" is one of the differences between
> > IA-32e and AMD64, i.e. that behavior is architecturally defined on
> > AMD64, but on IA-32e (as I posted):
>
> >   Near branch with 66H prefix:
> >     As documented in PRM the behavior is implementation specific and
> >     should avoid using 66H prefix on near branches.
>
>
> Not that it really matters that much --- but I'm curious to know why
> Intel made this decision?
>
> It seems really dumb to make such differences when Intel is already
> sorely lagging behind their competitor here, I would think given the
> circumstances Intel would try to be as compatible as possible on all
> fronts.
>
> I'd almost be nervous about getting an IA-32e CPU right now given that
> the AMD64 chips work just fine, have had lots of testing and there is
> plenty of code out there which is *known* to work reliably.
>

Errmm. The 0x66 prefix is used to change the implied register size
when using a register. It has nothing to do with a branch. It
can be used to cause the CPU to load the full-width of the
instruction pointer and the segment descriptor when transitioning
from 16-bit to 32-bit execution mode. This is an unconditional
jump-immediate instruction, not a branch. If it's used with a branch
instruction, near or far, the CPU should execute an "illegal operand
fault" (actually invalid opcode fault is the correct grammar since
neither AMD nor Intel make law).

Whether or not the CPU traps this invalid instruction is moot. No
compiler would emit junk like this and anybody horsing around with
an assembler deserves whatever they get, although you shouldn't
be able to smoke the CPU on a multi-user multitasking system because
it can be used as a DoS attack.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: Intel vs AMD64
  2004-02-26  5:32 Intel vs AMD64 Nakajima, Jun
  2004-02-26 13:39 ` Chris Wedgwood
@ 2004-02-26 19:18 ` Timothy Miller
  2004-02-26 19:45   ` Scott Robert Ladd
  2004-03-03 17:34   ` Pavel Machek
  1 sibling, 2 replies; 13+ messages in thread
From: Timothy Miller @ 2004-02-26 19:18 UTC (permalink / raw)
  To: Nakajima, Jun; +Cc: richard.brunner, linux-kernel



Nakajima, Jun wrote:
> Thanks for the clarification.
> 
> Yes, "implementation specific" is one of the differences between IA-32e
> and AMD64, i.e. that behavior is architecturally defined on AMD64, but
> on IA-32e (as I posted): 
>   Near branch with 66H prefix:
>     As documented in PRM the behavior is implementation specific and
>     should avoid using 66H prefix on near branches.


In other words, Intel's implementation deviates from the architecture as 
defined by AMD.  So it's not 100% compatible.  I just want this point to 
be clear.


If these sorts of branches are common enough (and I suspect they are), 
then this sort of deviation could have a notable code-size (and L1) 
impact on code which is compiled to be compatible with both implementations.

Why did Intel decide to do that?



* Re: Intel vs AMD64
  2004-02-26 14:35   ` Richard B. Johnson
@ 2004-02-26 19:25     ` Timothy Miller
  2004-02-26 19:46       ` Richard B. Johnson
  0 siblings, 1 reply; 13+ messages in thread
From: Timothy Miller @ 2004-02-26 19:25 UTC (permalink / raw)
  To: root; +Cc: Chris Wedgwood, Nakajima, Jun, richard.brunner, linux-kernel



Richard B. Johnson wrote:

> Whether or not the CPU traps this invalid instruction is moot. No
> compiler would emit junk like this and anybody horsing around with
> an assembler deserves whatever they get, although you shouldn't
> be able to smoke the CPU on a multi-user multitasking system because
> it can be used as a DoS attack.


If this is junk that's invalid, why was it mentioned in the first place?



* Re: Intel vs AMD64
  2004-02-26 19:18 ` Timothy Miller
@ 2004-02-26 19:45   ` Scott Robert Ladd
  2004-02-27 14:43     ` Timothy Miller
  2004-03-03 17:34   ` Pavel Machek
  1 sibling, 1 reply; 13+ messages in thread
From: Scott Robert Ladd @ 2004-02-26 19:45 UTC (permalink / raw)
  To: Timothy Miller; +Cc: Nakajima, Jun, richard.brunner, linux-kernel

Timothy Miller wrote:
> In other words, Intel's implementation deviates from the architecture as 
> defined by AMD.  So it's not 100% compatible.  I just want this point to 
> be clear.

There may exist non-instruction-set differences between the chips as 
well. Opteron systems (which have per-CPU memory control) operate as 
NUMA machines; will the same be true for any of Intel's ia32e chips?

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing



* Re: Intel vs AMD64
  2004-02-26 19:25     ` Timothy Miller
@ 2004-02-26 19:46       ` Richard B. Johnson
  0 siblings, 0 replies; 13+ messages in thread
From: Richard B. Johnson @ 2004-02-26 19:46 UTC (permalink / raw)
  To: Timothy Miller
  Cc: Chris Wedgwood, Nakajima, Jun, richard.brunner, linux-kernel

On Thu, 26 Feb 2004, Timothy Miller wrote:

>
>
> Richard B. Johnson wrote:
>
> > Whether or not the CPU traps this invalid instruction is moot. No
> > compiler would emit junk like this and anybody horsing around with
> > an assembler deserves whatever they get, although you shouldn't
> > be able to smoke the CPU on a multi-user multitasking system because
> > it can be used as a DoS attack.
>
>
> If this is junk that's invalid, why was it mentioned in the first place?
>

Because there are hobbyists who look for undocumented op-codes, see
	http://www.x86.org

They find some interesting things and then they wonder if what
they've found will work with other vendors' CPUs.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: Intel vs AMD64
       [not found] ` <403E4681.20603@techsource.com.suse.lists.linux.kernel>
@ 2004-02-26 20:17   ` Andi Kleen
  2004-02-27 14:50     ` Timothy Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2004-02-26 20:17 UTC (permalink / raw)
  To: Timothy Miller; +Cc: Nakajima, Jun, richard.brunner, linux-kernel

Timothy Miller <miller@techsource.com> writes:
> 
> If these sorts of branches are common enough (and I suspect they are),

No they are not at all. Did you really read the descriptions of
their semantics in this thread from Richard or Jun? They are
completely useless for a 64bit program because they will truncate your
64bit program counter to 16bits. 

They may make sense in 16bit compat mode with 64K segment, but there
there is no incompatibility because this difference only applies to
64bit programs. I doubt anybody has ever used them in a 64bit or even
in a 32bit program.
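
To make the truncation point concrete, here is a minimal sketch that checks whether a jump target survives being cut down to the operand size. The helper names and the addresses are hypothetical and only illustrate the arithmetic described in this thread.

```python
def truncate_rip(rip, operand_bits):
    """Keep only the low `operand_bits` bits of the instruction pointer."""
    return rip & ((1 << operand_bits) - 1)


def reaches_intended_target(intended, operand_bits=16):
    """True if truncating to `operand_bits` leaves the target unchanged."""
    return truncate_rip(intended, operand_bits) == intended


# Hypothetical targets: one in a typical text mapping, one below 64K.
high_target = 0x401000
low_target = 0x8000

lost = reaches_intended_target(high_target)      # upper bits are dropped
kept = reaches_intended_target(low_target)       # fits in 16 bits
full = reaches_intended_target(high_target, 64)  # default operand size
```

Only targets that already fit in 16 bits come through unchanged, which is why such a jump can only be meaningful for code living entirely in the low 64K.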

> Why did Intel decide to do that?

Most likely they didn't plan to, but it happened by accident 
and is obscure enough to be not worth fixing. I would agree with
them that it's not worth fixing.

-Andi


* Re: Intel vs AMD64
  2004-02-26 19:45   ` Scott Robert Ladd
@ 2004-02-27 14:43     ` Timothy Miller
  0 siblings, 0 replies; 13+ messages in thread
From: Timothy Miller @ 2004-02-27 14:43 UTC (permalink / raw)
  To: Scott Robert Ladd; +Cc: Nakajima, Jun, richard.brunner, linux-kernel



Scott Robert Ladd wrote:
> Timothy Miller wrote:
> 
>> In other words, Intel's implementation deviates from the architecture 
>> as defined by AMD.  So it's not 100% compatible.  I just want this 
>> point to be clear.
> 
> 
> There may exist non-instruction-set differences between the chips as 
> well. Opteron systems (which have per-CPU memory control) operate as 
> NUMA machines; will the same be true for any of Intel's ia32e chips?
> 

Any difference which is transparent to software is irrelevant.  Hardware 
differences which can be dealt with and hidden by the kernel are things 
Linux can just deal with.  The only thing that really matters here is 
user space.

People don't seem to have problems compiling kernels for specific 
processors, and the kernel can be designed to also do run-time detection 
for the generic case (i.e. boot CD's, etc.).



* Re: Intel vs AMD64
  2004-02-26 20:17   ` Andi Kleen
@ 2004-02-27 14:50     ` Timothy Miller
  0 siblings, 0 replies; 13+ messages in thread
From: Timothy Miller @ 2004-02-27 14:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Nakajima, Jun, richard.brunner, linux-kernel



Andi Kleen wrote:
> Timothy Miller <miller@techsource.com> writes:
> 
[snip]

> 
>>Why did Intel decide to do that?
> 
> 
> Most likely they didn't plan to, but it happened by accident 
> and is obscure enough to be not worth fixing. I would agree with
> them that it's not worth fixing.

Yes, I have come to understand that this is merely a case of Intel 
documenting that an "undocumented" instruction doesn't behave in a 
useful or consistent way.

My apologies for delving into this pointless discussion.



* Re: Intel vs AMD64
  2004-02-26 13:39 ` Chris Wedgwood
  2004-02-26 14:35   ` Richard B. Johnson
@ 2004-02-27 18:44   ` H. Peter Anvin
  1 sibling, 0 replies; 13+ messages in thread
From: H. Peter Anvin @ 2004-02-27 18:44 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <20040226133959.GA19254@dingdong.cryptoapps.com>
By author:    Chris Wedgwood <cw@f00f.org>
In newsgroup: linux.dev.kernel
> 
> Not that it really matters that much --- but I'm curious to know why
> Intel made this decision?
> 
> It seems really dumb to make such differences when Intel is already
> sorely lagging behind their competitor here, I would think given the
> circumstances Intel would try to be as compatible as possible on all
> fronts.

This, and a whole bunch of the other "IA-32e differences", I think
really should be classified as "Intel bugs."

They really have more the flavour of errata than anything else...

	-hpa


* Re: Intel vs AMD64
  2004-02-26 19:18 ` Timothy Miller
  2004-02-26 19:45   ` Scott Robert Ladd
@ 2004-03-03 17:34   ` Pavel Machek
  1 sibling, 0 replies; 13+ messages in thread
From: Pavel Machek @ 2004-03-03 17:34 UTC (permalink / raw)
  To: Timothy Miller; +Cc: Nakajima, Jun, richard.brunner, linux-kernel

Hi!

> >Yes, "implementation specific" is one of the differences between IA-32e
> >and AMD64, i.e. that behavior is architecturally defined on AMD64, but
> >on IA-32e (as I posted): 
> >  Near branch with 66H prefix:
> >    As documented in PRM the behavior is implementation specific and
> >    should avoid using 66H prefix on near branches.
> 
> 
> In other words, Intel's implementation deviates from the architecture 
> as defined by AMD.  So it's not 100% compatible.  I just want this 
> point to be clear.
> 
> 
> If these sorts of branches are common enough (and I suspect they 
> are), then this sort of deviation could have a notable code-size (and 

They are not.
-- 
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms         


