public inbox for linux-kernel@vger.kernel.org
* Immediate values
@ 2009-09-24 12:31 Mathieu Desnoyers
  2009-09-24 12:34 ` Ingo Molnar
  0 siblings, 1 reply; 20+ messages in thread
From: Mathieu Desnoyers @ 2009-09-24 12:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andi Kleen, linux-kernel

Hi Ingo,

Andi asked me this week when we should expect to see the "immediate
values" make it into mainline. I remember you pulled them at one point.
He would like to use them to encode some very hot-path variables into
the instruction stream.

How should I proceed to get that upstream ? Would a repost be
appropriate ?

I currently have support for powerpc, x86 and sparc64.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Immediate values
  2009-09-24 12:31 Immediate values Mathieu Desnoyers
@ 2009-09-24 12:34 ` Ingo Molnar
  2009-09-24 14:02   ` Jason Baron
  0 siblings, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2009-09-24 12:34 UTC (permalink / raw)
  To: Mathieu Desnoyers, H. Peter Anvin, Thomas Gleixner,
	Steven Rostedt, Jason Baron
  Cc: Andi Kleen, linux-kernel


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> Hi Ingo,
> 
> Andi asked me this week when we should expect to see the "immediate 
> values" make it into mainline. I remember you pulled them at one 
> point. He would like to use them to encode some very hot-path 
> variables into the instruction stream.
> 
> How should I proceed to get that upstream ? Would a repost be 
> appropriate ?

Would have to see it in full context i guess, with before/after 
measurements, etc.

	Ingo

* Re: Immediate values
  2009-09-24 12:34 ` Ingo Molnar
@ 2009-09-24 14:02   ` Jason Baron
  2009-09-24 14:10     ` H. Peter Anvin
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Jason Baron @ 2009-09-24 14:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, H. Peter Anvin, Thomas Gleixner,
	Steven Rostedt, Andi Kleen, linux-kernel

On Thu, Sep 24, 2009 at 02:34:28PM +0200, Ingo Molnar wrote:
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > Hi Ingo,
> > 
> > Andi asked me this week when we should expect to see the "immediate 
> > values" make it into mainline. I remember you pulled them at one 
> > point. He would like to use them to encode some very hot-path 
> > variables into the instruction stream.
> > 
> > How should I proceed to get that upstream ? Would a repost be 
> > appropriate ?
> 
> Would have to see it in full context i guess, with before/after 
> measurements, etc.
> 
> 	Ingo

right we've proposed an alternative to the immediate values, which I've
been calling 'jump label', here:

http://marc.info/?l=linux-kernel&m=125200966226921&w=2

The basic idea is that gcc 4.5 will have support for an 'asm goto'
construct which can refer to C code labels. Thus, we can replace a nop
in the code stream with a 'jmp' instruction to various branch targets.

In terms of a comparison between the two, IMO, I think that the syntax
for the immediate variables can be more readable, since it just looks
like a conditional expression.

The immediate values do a 'mov', 'test' and then a jump, whereas jump
label can just do a jump. So in this respect, I believe jump label can
be more optimal. Additionally, if we want to mark sections 'cold' so they
don't impact the instruction cache, the jump label already has the labels
for doing so. Obviously, a performance comparison would be interesting
as well.
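
To make the comparison concrete, here is a minimal user-space sketch of
what a jump-label style site could look like with 'asm goto'. It is not
the code from the patch linked above: the function name is made up for
illustration, and it assumes a compiler with 'asm goto' support (gcc 4.5
or later). On x86 the placeholder is a 5-byte nop, the same size as a
'jmp rel32'. A real implementation would also record the site address
and the label so that the nop can later be patched into a jump; nothing
patches it here, so the disabled path is always taken.

```c
#include <stdbool.h>

/*
 * Hypothetical jump-label style site (illustration only, not the
 * posted patch).  The asm statement is the patch site; "enabled" is
 * the C label a patched-in jmp would target.
 */
bool tracepoint_site_enabled(void)
{
#if defined(__i386__) || defined(__x86_64__)
	/* 5-byte nop (0f 1f 44 00 00): same size as a jmp rel32. */
	__asm__ goto(".byte 0x0f, 0x1f, 0x44, 0x00, 0x00"
		     : : : : enabled);
#else
	/* Other architectures: empty placeholder that keeps the label live. */
	__asm__ goto("" : : : : enabled);
#endif
	return false;	/* fall-through path: site not patched */
enabled:
	return true;	/* reached only after the nop is patched to a jmp */
}
```

Note that the disabled path executes no load, no test and no conditional
branch, which is the difference from the 'mov', 'test', jump sequence
described above.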

thanks,

-Jason



* Re: Immediate values
  2009-09-24 14:02   ` Jason Baron
@ 2009-09-24 14:10     ` H. Peter Anvin
  2009-09-24 14:16     ` Mathieu Desnoyers
  2009-09-24 14:16     ` H. Peter Anvin
  2 siblings, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2009-09-24 14:10 UTC (permalink / raw)
  To: Jason Baron
  Cc: Ingo Molnar, Mathieu Desnoyers, Thomas Gleixner, Steven Rostedt,
	Andi Kleen, linux-kernel

Jason Baron wrote:
> 
> right we've proposed an alternative to the immediate values, which I've
> been calling 'jump label', here:
> 
> http://marc.info/?l=linux-kernel&m=125200966226921&w=2
> 
> The basic idea is that gcc 4.5 will have support for an 'asm goto'
> construct which can refer to C code labels. Thus, we can replace a nop
> in the code stream with a 'jmp' instruction to various branch targets.
> 
> In terms of a comparison between the two, IMO, I think that the syntax
> for the immediate variables can be more readable, since it just looks
> like a conditional expression.
> 
> The immediate values do a 'mov', 'test' and then a jump, whereas jump
> label can just do a jump. So in this respect, I believe jump label can
> be more optimal. Additionally, if we want to mark sections 'cold' so they
> don't impact the instruction cache, the jump label already has the labels
> for doing so. Obviously, a performance comparison would be interesting
> as well.
> 

Direct jumps should at least theoretically be able to have better 
performance, but it would still be nice to have measurements of both.

	-hpa


* Re: Immediate values
  2009-09-24 14:02   ` Jason Baron
  2009-09-24 14:10     ` H. Peter Anvin
@ 2009-09-24 14:16     ` Mathieu Desnoyers
  2009-09-24 19:16       ` Ingo Molnar
  2009-09-24 14:16     ` H. Peter Anvin
  2 siblings, 1 reply; 20+ messages in thread
From: Mathieu Desnoyers @ 2009-09-24 14:16 UTC (permalink / raw)
  To: Jason Baron
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Steven Rostedt,
	Andi Kleen, linux-kernel

* Jason Baron (jbaron@redhat.com) wrote:
> On Thu, Sep 24, 2009 at 02:34:28PM +0200, Ingo Molnar wrote:
> > * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> > 
> > > Hi Ingo,
> > > 
> > > Andi asked me this week when we should expect to see the "immediate 
> > > values" make it into mainline. I remember you pulled them at one 
> > > point. He would like to use them to encode some very hot-path 
> > > variables into the instruction stream.
> > > 
> > > How should I proceed to get that upstream ? Would a repost be 
> > > appropriate ?
> > 
> > Would have to see it in full context i guess, with before/after 
> > measurements, etc.
> > 
> > 	Ingo
> 
> right we've proposed an alternative to the immediate values, which I've
> been calling 'jump label', here:
> 
> http://marc.info/?l=linux-kernel&m=125200966226921&w=2
> 
> The basic idea is that gcc 4.5 will have support for an 'asm goto'
> construct which can refer to C code labels. Thus, we can replace a nop
> in the code stream with a 'jmp' instruction to various branch targets.
> 
> In terms of a comparison between the two, IMO, I think that the syntax
> for the immediate variables can be more readable, since it just looks
> like a conditional expression.
> 
> The immediate values do a 'mov', 'test' and then a jump, whereas jump
> label can just do a jump. So in this respect, I believe jump label can
> be more optimal. Additionally, if we want to mark sections 'cold' so they
> don't impact the instruction cache, the jump label already has the labels
> for doing so. Obviously, a performance comparison would be interesting
> as well.
> 

For branches, I'm convinced that a "static jump" approach will beat
immediate values anytime, because you save the BPB hit completely.

However, there are other use-cases involving a variable read, and in
that case immediate values are useful. Andi has been bugging me for a
while to re-post this patchset, I'm pretty sure he has precise ideas
about how he would like to use it.

Until we get the static jump support mainlined in gcc, immediate values
at least save the d-cache hit. So it would be a step in the right
direction.
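
As a rough illustration of the variable-read case, the sketch below
contrasts an ordinary global read with a read whose value is carried as
an immediate operand in the instruction stream. This is not the
immediate values patchset itself: both function names are hypothetical,
only x86 is covered, and the step that makes the scheme useful (recording
the address of the immediate so it can be patched when the variable
changes) is left out, so the immediate stays at its build-time value of 0.

```c
/* Conventional form: the value is loaded from a data cache line. */
int hot_flag;

int plain_read(void)
{
	return hot_flag;
}

/*
 * Hypothetical immediate-value read (illustration only).  The value is
 * the imm32 operand of a mov, so reading it touches no data cache line.
 */
int imv_read_sketch(void)
{
	int value;

#if defined(__i386__) || defined(__x86_64__)
	/* "movl $imm32, reg": a patcher would rewrite the imm32 in place. */
	__asm__ ("movl $0, %0" : "=r" (value));
#else
	value = 0;	/* placeholder on architectures not covered here */
#endif
	return value;
}
```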

Thanks,

Mathieu

> thanks,
> 
> -Jason
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

* Re: Immediate values
  2009-09-24 14:02   ` Jason Baron
  2009-09-24 14:10     ` H. Peter Anvin
  2009-09-24 14:16     ` Mathieu Desnoyers
@ 2009-09-24 14:16     ` H. Peter Anvin
  2009-09-24 15:39       ` Jason Baron
  2 siblings, 1 reply; 20+ messages in thread
From: H. Peter Anvin @ 2009-09-24 14:16 UTC (permalink / raw)
  To: Jason Baron
  Cc: Ingo Molnar, Mathieu Desnoyers, Thomas Gleixner, Steven Rostedt,
	Andi Kleen, linux-kernel

Jason Baron wrote:
> 
> http://marc.info/?l=linux-kernel&m=125200966226921&w=2
> 
> The basic idea is that gcc 4.5 will have support for an 'asm goto'
> construct which can refer to C code labels. Thus, we can replace a nop
> in the code stream with a 'jmp' instruction to various branch targets.
> 

Looking at the above, I'm a bit unclear on the need for a NOP5.  We
obviously need a *total* of 5 bytes, but at least I don't seem to 
understand why we need 7 bytes per tracepoint.

	-hpa


* Re: Immediate values
  2009-09-24 14:16     ` H. Peter Anvin
@ 2009-09-24 15:39       ` Jason Baron
  2009-09-24 16:52         ` H. Peter Anvin
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Baron @ 2009-09-24 15:39 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, Mathieu Desnoyers, Thomas Gleixner, Steven Rostedt,
	Andi Kleen, linux-kernel

On Thu, Sep 24, 2009 at 07:16:56AM -0700, H. Peter Anvin wrote:
> Jason Baron wrote:
>>
>> http://marc.info/?l=linux-kernel&m=125200966226921&w=2
>>
>> The basic idea is that gcc 4.5 will have support for an 'asm goto'
>> construct which can refer to C code labels. Thus, we can replace a nop
>> in the code stream with a 'jmp' instruction to various branch targets.
>>
>
> Looking at the above, I'm a bit unclear on the need for a NOP5.  We
> obviously need a *total* of 5 bytes, but at least I don't seem to  
> understand why we need 7 bytes per tracepoint.
>
> 	-hpa

That's right. The optimal solution doesn't require the NOP5 at all,
and I've been playing around with an implementation that doesn't have
it. The problem I've been running into is that sometimes the compiler
will put in a short jump - '0xeb', with a 1 byte offset, but the jump
target is further away. Thus, I need to either ensure the target is
close, or somehow force a longer jump '0xe9' into the code so I always
have the space. The other advantage of not including the nop is easier
support for all x86 implementations, since I'm not sure a 5 byte atomic
nop is always available, whereas a jump is always atomic. I'm pretty
sure we can come up with a patch that avoids the nop...I'll keep working
on it.
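
To spell out the size problem: the short form is 2 bytes (0xeb plus a
1-byte displacement) and the near form is 5 bytes (0xe9 plus a 4-byte
displacement), and each displacement is relative to the end of the
instruction. The helper below is only a sketch of that arithmetic; its
name and signature are made up for illustration and it is not part of
the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode an x86 jmp from address 'from' to address 'to' into out[].
 * Returns the instruction length: 2 for the short form (0xeb rel8),
 * 5 for the near form (0xe9 rel32).  With force_near set, the 5-byte
 * form is always used so the site always has room for a rel32.
 * Assumption: the caller guarantees the target is within rel32 range.
 */
size_t encode_jmp(uint8_t out[5], int64_t from, int64_t to, int force_near)
{
	int64_t rel8 = to - (from + 2);		/* from end of 2-byte insn */
	int64_t rel32 = to - (from + 5);	/* from end of 5-byte insn */
	uint32_t disp = (uint32_t)rel32;

	if (!force_near && rel8 >= INT8_MIN && rel8 <= INT8_MAX) {
		out[0] = 0xeb;
		out[1] = (uint8_t)rel8;
		return 2;
	}

	out[0] = 0xe9;
	out[1] = disp & 0xff;			/* little-endian rel32 */
	out[2] = (disp >> 8) & 0xff;
	out[3] = (disp >> 16) & 0xff;
	out[4] = (disp >> 24) & 0xff;
	return 5;
}
```

A site that was assembled in the 2-byte form has no room for a rel32
later, which is why the longer encoding has to be forced up front.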

thanks,

-Jason

* Re: Immediate values
  2009-09-24 15:39       ` Jason Baron
@ 2009-09-24 16:52         ` H. Peter Anvin
  0 siblings, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2009-09-24 16:52 UTC (permalink / raw)
  To: Jason Baron
  Cc: Ingo Molnar, Mathieu Desnoyers, Thomas Gleixner, Steven Rostedt,
	Andi Kleen, linux-kernel

Jason Baron wrote:
> 
> That's right. The optimal solution doesn't require the NOP5 at all,
> and I've been playing around with an implementation that doesn't have
> it. The problem I've been running into is that sometimes the compiler
> will put in a short jump - '0xeb', with a 1 byte offset, but the jump
> target is further away. Thus, I need to either ensure the target is
> close, or somehow force a longer jump '0xe9' into the code so I always
> have the space. The other advantage of not including the nop is easier
> support for all x86 implementations, since I'm not sure a 5 byte atomic
> nop is always available, whereas a jump is always atomic. I'm pretty
> sure we can come up with a patch that avoids the nop...I'll keep working
> on it.
> 

Unfortunately gas doesn't have any equivalent of the NASM "strict" 
operand modifier, which would be ideal here.  The following *seems* to 
work on binutils-2.18.50.0.9-8.fc10.x86_64 at least:

	.byte 0xe9
	.long %0-1f
1:
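
For context, here is a sketch of how the fragment above might sit inside
an 'asm goto' statement. It assumes that %0 stands for the jump target
label, written here as the named label operand %l[target]; the function
name is hypothetical and only x86 is covered. The hand-encoded 0xe9
opcode keeps the assembler from choosing the 2-byte short form, and the
displacement is taken relative to local label 1, which marks the end of
the 5-byte instruction.

```c
/*
 * Hypothetical user-space demonstration of a forced 5-byte near jump.
 * At run time the hand-encoded jmp is unconditional, so control always
 * reaches "target" on x86.
 */
int forced_near_jump(void)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ goto(".byte 0xe9\n\t"			/* jmp rel32 opcode */
		     ".long %l[target] - 1f\n"		/* rel32 from insn end */
		     "1:"
		     : : : : target);
	return 0;	/* skipped: the jmp above is always taken */
target:
	return 1;
#else
	return 1;	/* not x86: nothing to demonstrate */
#endif
}
```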

	-hpa


* Re: Immediate values
  2009-09-24 14:16     ` Mathieu Desnoyers
@ 2009-09-24 19:16       ` Ingo Molnar
  2009-09-24 19:34         ` Ingo Molnar
  0 siblings, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2009-09-24 19:16 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Jason Baron, H. Peter Anvin, Thomas Gleixner, Steven Rostedt,
	Andi Kleen, linux-kernel


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> * Jason Baron (jbaron@redhat.com) wrote:
> > 
> > right we've proposed an alternative to the immediate values, which 
> > I've been calling 'jump label', here:
> > 
> > http://marc.info/?l=linux-kernel&m=125200966226921&w=2
> > 
> > The basic idea is that gcc 4.5 will have support for an 'asm goto'
> > construct which can refer to C code labels. Thus, we can replace a
> > nop in the code stream with a 'jmp' instruction to various branch 
> > targets.
> > 
> > In terms of a comparison between the two, IMO, I think that the 
> > syntax for the immediate variables can be more readable, since it 
> > just looks like a conditional expression.
> > 
> > The immediate values do a 'mov', 'test' and then a jump, whereas 
> > jump label can just do a jump. So in this respect, I believe jump 
> > label can be more optimal. Additionally, if we want to mark sections
> > 'cold' so they don't impact the instruction cache, the jump label
> > already has the labels for doing so. Obviously, a performance 
> > comparison would be interesting as well.
> 
> For branches, I'm convinced that a "static jump" approach will beat 
> immediate values anytime, because you save the BPB hit completely.
> 
> However, there are other use-cases involving a variable read, and in 
> that case immediate values are useful. Andi has been bugging me for a 
> while to re-post this patchset, I'm pretty sure he has precise ideas 
> about how he would like to use it.

It depends on how significant that usecase is.

Tracepoints used to be the biggest use-case for immediate values, and 
without that the thing becomes rather complex to maintain, for probably 
very little benefit.

	Ingo

* Re: Immediate values
  2009-09-24 19:16       ` Ingo Molnar
@ 2009-09-24 19:34         ` Ingo Molnar
  2009-09-25  6:51           ` Arjan van de Ven
  0 siblings, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2009-09-24 19:34 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Mathieu Desnoyers, Jason Baron, Thomas Gleixner, Steven Rostedt,
	Andi Kleen, linux-kernel


* H. Peter Anvin <hpa@zytor.com> wrote:

> I would like to get an official ACK or NAK for this patching technique 
> from inside Intel, and preferably from AMD as well.  If it does work
> as described it would provide a very clean way to do one-shot 
> alternative functions, which probably would be higher value than 
> immediate data values.

Sounds tempting. Things like the CONFIG_SECURITY hookery could use it?

But ... since it's patched under stopmachine, is there any reason why 
this wouldn't work?

	Ingo

* Re: Immediate values
  2009-09-24 19:34         ` Ingo Molnar
@ 2009-09-25  6:51           ` Arjan van de Ven
  2009-09-25  7:35             ` Mathieu Desnoyers
  0 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2009-09-25  6:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: H. Peter Anvin, Mathieu Desnoyers, Jason Baron, Thomas Gleixner,
	Steven Rostedt, Andi Kleen, linux-kernel

On Thu, 24 Sep 2009 21:34:22 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * H. Peter Anvin <hpa@zytor.com> wrote:
> 
> > I would like to get an official ACK or NAK for this patching
> > technique from inside Intel, and preferably from AMD as well.  If
> > it does work as described it would provide a very clean way to do
> > one-shot alternative functions, which probably would be higher
> > value than immediate data values.
> 
> Sounds tempting. Things like the CONFIG_SECURITY hookery could use it?
> 
> But ... since it's patched under stopmachine, is there any reason why 
> this wouldn't work?
> 

stopmachine is fine.

more aggressive tricks are rather dicey.

(cross modifying code that's being executed in ring 0 is ... not
something CPU designers had in mind)

-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

* Re: Immediate values
  2009-09-25  6:51           ` Arjan van de Ven
@ 2009-09-25  7:35             ` Mathieu Desnoyers
  2009-09-25  8:25               ` Arjan van de Ven
  2009-09-25 10:02               ` Alan Cox
  0 siblings, 2 replies; 20+ messages in thread
From: Mathieu Desnoyers @ 2009-09-25  7:35 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, H. Peter Anvin, Jason Baron, Thomas Gleixner,
	Steven Rostedt, Andi Kleen, linux-kernel, Masami Hiramatsu,
	Prasanna S Panchamukhi, Rusty Lynch, Jim Keniston,
	Vamsi Krishna S, Suparna Bhattacharya, Nathan Sidwell,
	Dominique Toupin, Anton Massoud, Richard J Moore

* Arjan van de Ven (arjan@infradead.org) wrote:
> On Thu, 24 Sep 2009 21:34:22 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
> 
[context for people CCed: see
http://lkml.org/lkml/2009/9/24/262]

> > 
> > * H. Peter Anvin <hpa@zytor.com> wrote:
> > 
> > > I would like to get an official ACK or NAK for this patching
> > > technique from inside Intel, and preferably from AMD as well.  If
> > > it does work as described it would provide a very clean way to do
> > > one-shot alternative functions, which probably would be higher
> > > value than immediate data values.
> > 
> > Sounds tempting. Things like the CONFIG_SECURITY hookery could use it?
> > 
> > But ... since it's patched under stopmachine, is there any reason why 
> > this wouldn't work?
> > 
> 
> stopmachine is fine.
> 
> more aggressive tricks are rather dicey.
> 
> (cross modifying code that's being executed in ring 0 is ... not
> something CPU designers had in mind)
> 

Then, following your advice, kprobes should be re-designed to do a
stop_machine around the int3 breakpoint insertion ? And gdb
should be stopping all threads of a target process before inserting a
breakpoint. Therefore, I do not seem to be the only one confused about
Intel's statement on this issue.

Mathieu

> -- 
> Arjan van de Ven 	Intel Open Source Technology Centre
> For development, discussion and tips for power savings, 
> visit http://www.lesswatts.org

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

* Re: Immediate values
  2009-09-25  7:35             ` Mathieu Desnoyers
@ 2009-09-25  8:25               ` Arjan van de Ven
  2009-09-25 10:02               ` Alan Cox
  1 sibling, 0 replies; 20+ messages in thread
From: Arjan van de Ven @ 2009-09-25  8:25 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, H. Peter Anvin, Jason Baron, Thomas Gleixner,
	Steven Rostedt, Andi Kleen, linux-kernel, Masami Hiramatsu,
	Prasanna S Panchamukhi, Rusty Lynch, Jim Keniston,
	Vamsi Krishna S, Suparna Bhattacharya, Nathan Sidwell,
	Dominique Toupin, Anton Massoud, Richard J Moore

On Fri, 25 Sep 2009 03:35:13 -0400
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> * Arjan van de Ven (arjan@infradead.org) wrote:
> > On Thu, 24 Sep 2009 21:34:22 +0200
> > Ingo Molnar <mingo@elte.hu> wrote:
> > 
> [context for people CCed: see
> http://lkml.org/lkml/2009/9/24/262]
> 
> > > 
> > 
> > stopmachine is fine.
> > 
> > more aggressive tricks are rather dicey.
> > 
> > (cross modifying code that's being executed in ring 0 is ... not
> > something CPU designers had in mind)
> > 
> 
> Then, following your advice, kprobes should be re-designed to do a
> stop_machine around the int3 breakpoint insertion ? And gdb
> should be stopping all threads of a target process before inserting a
> breakpoint. Therefore, I do not seem to be the only one confused about
> Intel's statement on this issue.

you are oversimplifying what you are trying to do, and overstating what
a ring 3 app and others do. 

But I'm not the one whom you'd need to convince, I don't design the
CPU. The people who do frown heavily on cross-modifying code,
and Peter and I need to sit down with people who designed many generations
of CPUs to figure out if your scheme is actually safe. And on the AMD
side someone will need to do the same.


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

* Re: Immediate values
  2009-09-25  7:35             ` Mathieu Desnoyers
  2009-09-25  8:25               ` Arjan van de Ven
@ 2009-09-25 10:02               ` Alan Cox
  2009-09-25 10:14                 ` Arjan van de Ven
  2009-09-25 10:18                 ` Richard J Moore
  1 sibling, 2 replies; 20+ messages in thread
From: Alan Cox @ 2009-09-25 10:02 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Arjan van de Ven, Ingo Molnar, H. Peter Anvin, Jason Baron,
	Thomas Gleixner, Steven Rostedt, Andi Kleen, linux-kernel,
	Masami Hiramatsu, Prasanna S Panchamukhi, Rusty Lynch,
	Jim Keniston, Vamsi Krishna S, Suparna Bhattacharya,
	Nathan Sidwell, Dominique Toupin, Anton Massoud, Richard J Moore

> Then, following your advice, kprobes should be re-designed to do a
> stop_machine around the int3 breakpoint insertion ? And gdb
> should be stopping all threads of a target process before inserting a
> breakpoint. Therefore, I do not seem to be the only one confused about
> Intel's statement on this issue.

There was considerable discussion about this when the kprobe stuff went
in. If I remember rightly it was stated by someone @intel.com then that
int3 was ok (even though it's not strictly documented as such). The same
is not true for all instructions on all x86 processors unfortunately.

Alan

* Re: Immediate values
  2009-09-25 10:02               ` Alan Cox
@ 2009-09-25 10:14                 ` Arjan van de Ven
  2009-09-25 16:19                   ` H. Peter Anvin
  2009-09-25 10:18                 ` Richard J Moore
  1 sibling, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2009-09-25 10:14 UTC (permalink / raw)
  To: Alan Cox
  Cc: Mathieu Desnoyers, Ingo Molnar, H. Peter Anvin, Jason Baron,
	Thomas Gleixner, Steven Rostedt, Andi Kleen, linux-kernel,
	Masami Hiramatsu, Prasanna S Panchamukhi, Rusty Lynch,
	Jim Keniston, Vamsi Krishna S, Suparna Bhattacharya,
	Nathan Sidwell, Dominique Toupin, Anton Massoud, Richard J Moore

On Fri, 25 Sep 2009 11:02:06 +0100
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> > Then, following your advice, kprobes should be re-designed to do a
> > stop_machine around the int3 breakpoint insertion ? And gdb
> > should be stopping all threads of a target process before inserting
> > a breakpoint. Therefore, I do not seem to be the only one confused
> > about Intel's statement on this issue.
> 
> There was considerable discussion about this when the kprobe stuff went
> in. If I remember rightly it was stated by someone @intel.com then
> that int3 was ok (even though it's not strictly documented as such).
> The same is not true for all instructions on all x86 processors
> unfortunately.

specifically, using int3 *and then going back to the old value*.



-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

* Re: Immediate values
  2009-09-25 10:02               ` Alan Cox
  2009-09-25 10:14                 ` Arjan van de Ven
@ 2009-09-25 10:18                 ` Richard J Moore
  2009-09-25 11:12                   ` Masami Hiramatsu
  1 sibling, 1 reply; 20+ messages in thread
From: Richard J Moore @ 2009-09-25 10:18 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andi Kleen, Anton Massoud, Arjan van de Ven, Dominique Toupin,
	H. Peter Anvin, Jason Baron, Jim Keniston, linux-kernel,
	Mathieu Desnoyers, Masami Hiramatsu, Ingo Molnar, Nathan Sidwell,
	Prasanna S Panchamukhi, Steven Rostedt, Rusty Lynch,
	Suparna Bhattacharya, Thomas Gleixner, Vamsi Krishna S



Alan Cox <alan@lxorguk.ukuu.org.uk> wrote on 25/09/2009 11:02:06:


>
> There was considerable discussion about this when the kprobe stuff went
> in. If I remember rightly it was stated by someone @intel.com then that
> int3 was ok (even though it's not strictly documented as such). The same
> is not true for all instructions on all x86 processors unfortunately.
>
> Alan

Alan, I had that discussion with Intel, and yes int3 is a special case
because of the interrupt processing associated with it. The discussion
went along these lines: int3 is practically useless in an MP environment
if it's troubled by the cross-modifying erratum.

I suppose it is possible the more recent microarchitectures have
changed things. And yes, we might need to have that conversation again.

Richard




* Re: Immediate values
  2009-09-25 10:18                 ` Richard J Moore
@ 2009-09-25 11:12                   ` Masami Hiramatsu
  0 siblings, 0 replies; 20+ messages in thread
From: Masami Hiramatsu @ 2009-09-25 11:12 UTC (permalink / raw)
  To: Richard J Moore
  Cc: Alan Cox, Andi Kleen, Anton Massoud, Arjan van de Ven,
	Dominique Toupin, H. Peter Anvin, Jason Baron, Jim Keniston,
	linux-kernel, Mathieu Desnoyers, Ingo Molnar, Nathan Sidwell,
	Prasanna S Panchamukhi, Steven Rostedt, Rusty Lynch,
	Suparna Bhattacharya, Thomas Gleixner, Vamsi Krishna S

Richard J Moore wrote:
> 
> 
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote on 25/09/2009 11:02:06:
> 
> 
>>
>> There was considerable discussion about this when the kprobe stuff went
>> in. If I remember rightly it was stated by someone @intel.com then that
>> int3 was ok (even though it's not strictly documented as such). The same
>> is not true for all instructions on all x86 processors unfortunately.
>>
>> Alan
> 
> Alan, I had that discussion with Intel, and yes int3 is a special case
> because of the interrupt processing associated with it. The discussion
> went along these lines: int3 is practically useless in an MP environment
> if it's troubled by the cross-modifying erratum.
> 
> I suppose it is possible the more recent microarchitectures have
> changed things. And yes, we might need to have that conversation again.

Hi,

I'm also very interested in this topic, since I'd like to replace
kprobe's int3 with a jump instruction by using bypass code, which
Mathieu's new imv also uses.
http://lkml.org/lkml/2009/9/14/549

Actually, it is OK even if I need to use stop_machine(), because
the main goal is reducing the overhead of probing, not reducing
the time it takes to replace the instruction. :)

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: Immediate values
  2009-09-25 10:14                 ` Arjan van de Ven
@ 2009-09-25 16:19                   ` H. Peter Anvin
  2009-09-25 16:45                     ` Arjan van de Ven
  0 siblings, 1 reply; 20+ messages in thread
From: H. Peter Anvin @ 2009-09-25 16:19 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Mathieu Desnoyers, Ingo Molnar, Jason Baron,
	Thomas Gleixner, Steven Rostedt, Andi Kleen, linux-kernel,
	Masami Hiramatsu, Prasanna S Panchamukhi, Rusty Lynch,
	Jim Keniston, Vamsi Krishna S, Suparna Bhattacharya,
	Nathan Sidwell, Dominique Toupin, Anton Massoud, Richard J Moore

Arjan van de Ven wrote:
> On Fri, 25 Sep 2009 11:02:06 +0100
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> 
>>> Then, following your advice, kprobes should be re-designed to do a
>>> stop_machine around the int3 breakpoint insertion ? And gdb
>>> should be stopping all threads of a target process before inserting
>>> a breakpoint. Therefore, I do not seem to be the only one confused
>>> about Intel's statement on this issue.
>> There was considerable discussion about this when the kprobe stuff went
>> in. If I remember rightly it was stated by someone @intel.com then
>> that int3 was ok (even though it's not strictly documented as such).
>> The same is not true for all instructions on all x86 processors
>> unfortunately.
> 
> specifically, using int3 *and then going back to the old value*.
> 

As I told Mathieu in person yesterday:

1. We have no information if this is safe or not.  It is most certainly 
not documented as safe, and trying to play language lawyer with the 
errata text is pointless, as it's trying to interpret something that 
isn't there.

2. There are some reasons to believe there might be a safe technique 
somewhere in here (the one he described is a possibility, but not the 
only one.)

3. Being able to patch code without stopping all cores has other uses, 
and so spending some time doing legwork on it is probably worth it.

4. "Someone at Intel" isn't a reference... we need to track down actual 
CPU architects with real names who can give us a thumbs up or down.

	-hpa

* Re: Immediate values
  2009-09-25 16:19                   ` H. Peter Anvin
@ 2009-09-25 16:45                     ` Arjan van de Ven
  2009-09-25 17:05                       ` H. Peter Anvin
  0 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2009-09-25 16:45 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Alan Cox, Mathieu Desnoyers, Ingo Molnar, Jason Baron,
	Thomas Gleixner, Steven Rostedt, Andi Kleen, linux-kernel,
	Masami Hiramatsu, Prasanna S Panchamukhi, Rusty Lynch,
	Jim Keniston, Vamsi Krishna S, Suparna Bhattacharya,
	Nathan Sidwell, Dominique Toupin, Anton Massoud, Richard J Moore

On Fri, 25 Sep 2009 09:19:32 -0700
"H. Peter Anvin" <hpa@zytor.com> wrote:

> Arjan van de Ven wrote:
> > On Fri, 25 Sep 2009 11:02:06 +0100
> > Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> > 
> >>> Then, following your advice, kprobes should be re-designed to do a
> >>> stop_machine around the int3 breakpoint insertion ? And gdb
> >>> should be stopping all threads of a target process before
> >>> inserting a breakpoint. Therefore, I do not seem to be the only
> >>> one confused about Intel's statement on this issue.
> >> There was considerable discussion about this when the kprobe stuff
> >> went in. If I remember rightly it was stated by someone @intel.com
> >> then that int3 was ok (even though it's not strictly documented as
> >> such). The same is not true for all instructions on all x86
> >> processors unfortunately.
> > 
> > specifically, using int3 *and then going back to the old value*.
> > 
> 
> As I told Mathieu in person yesterday:
> 
> 1. We have no information if this is safe or not.  It is most
> certainly not documented as safe, and trying to play language lawyer
> with the errata text is pointless, as it's trying to interpret
> something that isn't there.
> 
> 2. There are some reasons to believe there might be a safe technique 
> somewhere in here (the one he described is a possibility, but not the 
> only one.)
> 
> 3. Being able to patch code without stopping all cores has other
> uses, and so spending some time doing legwork on it is probably worth
> it.
> 
> 4. "Someone at Intel" isn't a reference... we need to track down
> actual CPU architects with real names who can give us a thumbs up or
> down.

and we need to talk to about 5 or so generations at least.
We know whom to talk to, it just will take time
(and first indication is frowned faces)

AMD will also do the same, and VIA (I think they have dual core/smp as
well)

-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

* Re: Immediate values
  2009-09-25 16:45                     ` Arjan van de Ven
@ 2009-09-25 17:05                       ` H. Peter Anvin
  0 siblings, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2009-09-25 17:05 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Mathieu Desnoyers, Ingo Molnar, Jason Baron,
	Thomas Gleixner, Steven Rostedt, Andi Kleen, linux-kernel,
	Masami Hiramatsu, Prasanna S Panchamukhi, Rusty Lynch,
	Jim Keniston, Vamsi Krishna S, Suparna Bhattacharya,
	Nathan Sidwell, Dominique Toupin, Anton Massoud, Richard J Moore

Arjan van de Ven wrote:
> 
> and we need to talk to about 5 or so generations at least.
> We know whom to talk to, it just will take time
> (and first indication is frowned faces)
> 
> AMD will also do the same, and VIA (I think they have dual core/smp as
> well)
> 

Of course.

	-hpa
