* Re: Single-stepping
@ 2000-11-16 9:01 John Marvin
2000-11-16 12:00 ` Single-stepping Richard Hirst
2000-11-20 3:03 ` Single-stepping Alan Modra
0 siblings, 2 replies; 12+ messages in thread
From: John Marvin @ 2000-11-16 9:01 UTC (permalink / raw)
To: parisc-linux
> I've been helping Alan Modra out with kernel changes to support
> single stepping for gdb. Paul Bame suggested I bounce our ideas
> off you in case you (or anyone else) had any comments. I haven't
> actually committed my changes yet.
>
I've decided to respond to the whole list, since others are now
participating in the discussion.
> The basic approach is to use the recovery counter to generate
> a trap every instruction. The scheme is complicated because a
> suspended process may or may not return to user space via an RFI.
>
There is no easy way to do single stepping on parisc. So any single
stepping design will be complicated.
> If it was suspended as a result of an interrupt then we can
> simply set PSW bit R in the task's saved registers and it will
> get loaded by the RFI. On every task switch I set the
> recovery counter to 0, just in case the new process is being
> single-stepped.
>
> If a process is suspended during a syscall, then there is no
> RFI on the return path to userland, and we have to handle things
> differently. I have changed the syscall return path such that
> it loads the recovery counter with 3 before updating the PSW
> with a value from the task's saved registers. If that PSW has
> the R bit set, then the count of 3 will generate a trap on the
> first instruction following the branch back to user space.
> Note that PSW wasn't previously restored on the syscall return
> path.
>
Just to be clear, it is impossible to restore the entire PSW without
an RFI. So, I assume you are referring to the system mask subset of
the PSW that can be manipulated by the ssm, rsm, and mtsm instructions.
You mention restoring from the task's saved registers, but we currently
do not save the system mask during a syscall (because it should be the
same for all processes). Have you added code to do that also? If not,
you are restoring from whatever the state was at the last interruption.
Which in this case works (since the R bit state will be changed
by another process while the debugged process is suspended, this should
guarantee that the R bit state is up to date), but it seems a little ugly.
In my opinion, you should just be checking a bit in the ptrace flags
in the task structure, and setting the R bit with an ssm instruction
based on that.
> To avoid further complications of interrupts during the three
> instructions when the recovery counter is decrementing, whenever
> we set the R bit, we also clear the I bit to disable interrupts.
Yuck, but I agree that it would be messier to have to deal with this in
the interrupt handlers. Please make sure that a comment is added that
explains what you are doing, and clearly documents the dependency on the
number of remaining instructions before we return to user privilege level.
I assume you restore the I bit in the recovery counter trap handler. I
can think of alternative ways of doing this, but they are probably just as
ugly (e.g. one possibility would be to do an rfi to set the L bit).
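To make the suggestion above concrete, here is a minimal C sketch of deriving the syscall-exit system mask from a ptrace flag, and of clearing the I bit whenever the R bit is set. The names SM_I, SM_R, PT_SINGLESTEP_FLAG and syscall_exit_mask() are hypothetical, and the bit values are assumptions that should be checked against the architecture manual. The real code would be a few assembly instructions on the syscall return path, not a C function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed system-mask bit values; hypothetical names for this sketch. */
#define SM_I 0x01u /* external interrupt enable */
#define SM_R 0x10u /* recovery counter enable */

/* Hypothetical ptrace flag meaning "single-step requested". */
#define PT_SINGLESTEP_FLAG 0x1u

/*
 * Return the system mask to load just before branching back to user space.
 * base_mask is the mask every ordinary task returns with.
 */
static uint32_t syscall_exit_mask(uint32_t base_mask, uint32_t ptrace_flags)
{
    /* Ordinary tasks are assumed never to run with R already set. */
    assert((base_mask & SM_R) == 0);

    if (ptrace_flags & PT_SINGLESTEP_FLAG) {
        /* Enable the recovery counter and hold off interrupts until
           the counter trap has been taken. */
        return (base_mask | SM_R) & ~SM_I;
    }
    return base_mask;
}
```

With this shape the R bit never needs to be saved per task during a syscall: it is recomputed from the flag on every return.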
>
> Nullified instructions are handled by the controlling process
> manually moving the child's IAOQ over the instruction without
> actually setting it running, because the recovery counter isn't
> decremented for nullified instructions.
Does this code properly handle branches in the delay slot of another
branch? (you need to make sure you are not advancing the queues by just
adding 4 to each element). One concern I have about this method is that
the userland debugger has to cooperate to make this design work, i.e. the
single stepping is not accomplished entirely within the kernel, so we
cannot easily change the design for single stepping at a later date.
I wonder if it is necessary to do this. So what if we don't stop on the
nullified instruction. Since it is nullified, it doesn't actually do
anything, so why does the user have to see it, i.e. just let the recovery
counter trap happen on the next truly executed instruction (i.e. the
debugger performs a "double step" in this case). Am I missing something
here?
>
> I need to do some more testing before committing this, but would
> welcome any comments on the basic approach taken, areas I have
> mis-understood, or problems with it that might not yet have
> occurred to me.
OK, well here are some issues that you didn't mention, so I don't
know whether or not you addressed them:
1) When single stepping over a syscall, when do you actually stop the
single stepping and execute the syscall? Hopefully you are not
allowing single stepping after the gate instruction on the gateway
page (and returning control to a non privileged debugging process).
The recovery counter trap should detect when the user code gets
to the gateway page.
2) Does your solution properly handle single stepping into and out of
a signal handler? Note that the debugger will trap the signal as part
of this process. Since the return is handled through a hidden syscall
you may not have to do anything special here.
Note that HP-UX does not use the recovery counter for single stepping. I
made a few phone calls to various engineers to find out what the design
process was, and why they chose the solution they did, but I could not
find anyone who knew. Looking at the code in HP-UX it looks like someone
implemented that code a long time ago, and some of the engineers who have
worked on it since don't understand it, because some of the comments added
since then clearly show a lack of understanding of what is really going
on.
Others on this list have mentioned that MPE does use the recovery counter
for single stepping. Of course, MPE is not a Unix clone, so just because
it could be done on MPE doesn't mean that the recovery counter can cover
all cases on Unix (e.g. I have no idea how signals and syscalls are
implemented on MPE). But since I have no idea why the recovery counter
was not used for HP-UX, I can't say it is the wrong way to go. I can't
think of anything that will definitely rule it out, I'm just a little
uncomfortable with the fact that HP-UX chose not to use it.
One advantage of the HP-UX method is that it completely encapsulates the
single stepping inside the kernel, so it can be changed if necessary,
without having to modify gdb (and having to worry about old versions of
gdb).
Anyway, for reference, HP-UX does single stepping by using a combination
of the taken branch trap, and loading the instruction queues such that the
front of the queue points to the next instruction to be single stepped and
the back of the queue points to the first of two break instructions on a
"break" page. It does NOT insert break instructions into the code, so it
does not adversely affect execution on a SMP machine. Note that we
already put a bunch of break instructions before the syscall entry point
on the gateway page, so it would be easy to use our gateway page for the
"break page". This way, if the single stepped instruction branches, a
taken branch trap will be taken (which is important in the case where the
branch nullifies its delay slot). Otherwise, the instruction will be
executed and then the break instruction at the known location on the
"break" page will be executed. If the single stepped instruction
nullifies the next instruction, the second break instruction on the
"break" page will be executed.
Note that this is the short explanation. It is not as simple as it sounds.
One major complication is that branches with links don't work properly
with the instruction queue magic, so the link register has to be updated
in the taken branch trap handler. Also branch externals won't update
the space of the space queue tail properly (again, that has to be fixed
in the taken branch handler). I can provide more details if the recovery
counter method doesn't work out.
Sincerely,
John Marvin
jsm@fc.hp.com
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-16 9:01 Single-stepping John Marvin
@ 2000-11-16 12:00 ` Richard Hirst
2000-11-20 3:03 ` Single-stepping Alan Modra
1 sibling, 0 replies; 12+ messages in thread
From: Richard Hirst @ 2000-11-16 12:00 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux
Hi John,
On Thu, Nov 16, 2000 at 02:01:12AM -0700, John Marvin wrote:
> Just to be clear, it is impossible to restore the entire PSW without
> an RFI. So, I assume you are referring to the system mask subset of
> the PSW that can be manipulated by the ssm, rsm, and mtsm instructions.
Yes, mtsm in this case.
> You mention restoring from the task's saved registers, but we currently
> do not save the system mask during a syscall (because it should be the
> same for all processes). Have you added code to do that also? If not,
Yes I have.
> you are restoring from whatever the state was at the last interruption.
> Which in this case works (since the R bit state will be changed
> by another process while the debugged process is suspended, this should
> guarantee that the R bit state is up to date), but it seems a little ugly.
> In my opinion, you should just be checking a bit in the ptrace flags
> in the task structure, and setting the R bit with an ssm instruction
> based on that.
Sounds better, I'll look into it.
> > Nullified instructions are handled by the controlling process
> > manually moving the child's IAOQ over the instruction without
> > actually setting it running, because the recovery counter isn't
> > decremented for nullified instructions.
Sorry, I worded that very badly. The code that moves the child's
IAOQ on is in the kernel, invoked as a result of the controlling
process calling ptrace(PTRACE_SINGLESTEP...) when the child's N
bit is set.
> Does this code properly handle branches in the delay slot of another
> branch? (you need to make sure you are not advancing the queues by just
> adding 4 to each element). One concern I have about this method is that
Current code does
/* Nullified, just crank over the queue. */
task_regs(child)->iaoq[0] = task_regs(child)->iaoq[1];
task_regs(child)->iasq[0] = task_regs(child)->iasq[1];
task_regs(child)->iaoq[1] = task_regs(child)->iaoq[0] + 4;
Does that look right to you?
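For reference, the same queue advance can be written as a small self-contained C sketch. The struct ia_queues type and skip_nullified() are hypothetical stand-ins for the task_regs(child) fields quoted above. Clearing the saved N bit is not shown in the quoted code and is left out here too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved instruction address queues. */
struct ia_queues {
    uint32_t iasq[2]; /* space:  [0] = front, [1] = back */
    uint32_t iaoq[2]; /* offset: [0] = front, [1] = back */
};

/* Step over the nullified instruction at the front of the queue. */
static void skip_nullified(struct ia_queues *q)
{
    assert(q != NULL);

    q->iaoq[0] = q->iaoq[1];     /* back becomes the new front */
    q->iasq[0] = q->iasq[1];
    q->iaoq[1] = q->iaoq[0] + 4; /* new front + 4, i.e. old back + 4 */
    /* iasq[1] is untouched: the new back is assumed to stay in the
       same space as the new front. */
}
```

If the nullified instruction sits in the delay slot of a taken branch, the front might be 0x1004 and the back the branch target 0x2000. The code above gives front 0x2000 and back 0x2004, whereas adding 4 to each element would give 0x1008 and 0x2004 and lose the branch.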
> I wonder if it is necessary to do this. So what if we don't stop on the
> nullified instruction. Since it is nullified, it doesn't actually do
> anything, so why does the user have to see it, i.e. just let the recovery
> counter trap happen on the next truly executed instruction (i.e. the
> debugger performs a "double step" in this case). Am I missing something
> here?
I don't see why we really need to stop on a nullified instruction, but
I'll wait for Alan to comment as he wrote this initially.
> 1) When single stepping over a syscall, when do you actually stop the
> single stepping and execute the syscall? Hopefully you are not
> allowing single stepping after the gate instruction on the gateway
> page (and returning control to a non privileged debugging process).
> The recovery counter trap should detect when the user code gets
> to the gateway page.
At the moment my test harness notes IAOQ=0x100 and stops single stepping,
but obviously the kernel needs to enforce that.
> 2) Does your solution properly handle single stepping into and out of
> a signal handler? Note that the debugger will trap the signal as part
> of this process. Since the return is handled through a hidden syscall
> you may not have to do anything special here.
Haven't looked at signal handling yet.
> Note that HP-UX does not use the recovery counter for single stepping. I
Thanks for the description of how HP-UX does it. I'll stick with
the recovery counter for now as it does seem to be basically working.
I'll also try to ensure that it is completely encapsulated within the kernel
so it is less painful to change later, if need be.
Thanks,
Richard
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-16 9:01 Single-stepping John Marvin
2000-11-16 12:00 ` Single-stepping Richard Hirst
@ 2000-11-20 3:03 ` Alan Modra
1 sibling, 0 replies; 12+ messages in thread
From: Alan Modra @ 2000-11-20 3:03 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux
On Thu, 16 Nov 2000, John Marvin wrote:
> Anyway, for reference, HP-UX does single stepping by using a combination
> of the taken branch trap, and loading the instruction queues such that the
> front of the queue points to the next instruction to be single stepped and
> the back of the queue points to the first of two break instructions on a
> "break" page. It does NOT insert break instructions into the code, so it
> does not adversely affect execution on a SMP machine. Note that we
> already put a bunch of break instructions before the syscall entry point
> on the gateway page, so it would be easy to use our gateway page for the
> "break page". This way, if the single stepped instruction branches, a
> taken branch trap will be taken (which is important in the case where the
> branch nullifies its delay slot). Otherwise, the instruction will be
> executed and then the break instruction at the known location on the
> "break" page will be executed. If the single stepped instruction
> nullifies the next instruction, the second break instruction on the
> "break" page will be executed.
This is the path I started out on for hppa-linux, then hit the problem of
a branch that nullifies its delay slot. At that point, I decided playing
with IAOQ_back wouldn't work as I missed the solution of enabling taken
branch traps. :-( If I'd seen this trick, then I would not have tried
using the recovery counter, and even now, it may be better to go back to
IAOQ fiddling. The recovery counter scheme has the disadvantage that
there's only one of them so you need to save/restore over task swaps or
introduce extra instructions in the syscall path - and be very careful.
> Note that this is the short explanation. It is not as simple as it sounds.
> One major complication is that branches with links don't work properly
> with the instruction queue magic, so the link register has to be updated
> in the taken branch trap handler. Also branch externals won't update
> the space of the space queue tail properly (again, that has to be fixed
> in the taken branch handler). I can provide more details if the recovery
> counter method doesn't work out.
I'm a little intrigued about these "complications". How can the link
register or space _not_ be updated properly? As far as I can see, the
only really tricky instruction to single-step is RFI - which shouldn't
ever occur in userspace, and which we'd just emulate if it was important.
Alan Modra
--
Linuxcare. Support for the Revolution.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
@ 2000-11-16 12:44 John Marvin
2000-11-16 13:20 ` Single-stepping Richard Hirst
2000-11-16 19:00 ` Single-stepping Frank Rowand
0 siblings, 2 replies; 12+ messages in thread
From: John Marvin @ 2000-11-16 12:44 UTC (permalink / raw)
To: parisc-linux
Richard,
>
> Sorry, I worded that very badly. The code that moves the child's
> IAOQ on is in the kernel, invoked as a result of the controlling
> process calling ptrace(PTRACE_SINGLESTEP...) when the child's N
> bit is set.
>
Great.
> > Does this code properly handle branches in the delay slot of another
> > branch? (you need to make sure you are not advancing the queues by just
> > adding 4 to each element). One concern I have about this method is that
>
> Current code does
>
> /* Nullified, just crank over the queue. */
> task_regs(child)->iaoq[0] = task_regs(child)->iaoq[1];
> task_regs(child)->iasq[0] = task_regs(child)->iasq[1];
> task_regs(child)->iaoq[1] = task_regs(child)->iaoq[0] + 4;
>
> Does that look right to you?
>
Yes, that is the correct way to do it (I'll assume the duplicated line
is just a cut/paste error).
> > I wonder if it is necessary to do this. So what if we don't stop on the
> > nullified instruction. Since it is nullified, it doesn't actually do
> > anything, so why does the user have to see it, i.e. just let the recovery
> > counter trap happen on the next truly executed instruction (i.e. the
> > debugger performs a "double step" in this case). Am I missing something
> > here?
>
> I don't see why we really need to stop on a nullified instruction, but
> I'll wait for Alan to comment as he wrote this initially.
>
Given the above, i.e. that this is being handled in the kernel anyway, I
guess I don't really care which way this goes. Probably it is best to
do it whatever way gdb on hp-ux presents it.
> > 1) When single stepping over a syscall, when do you actually stop the
> > single stepping and execute the syscall? Hopefully you are not
> > allowing single stepping after the gate instruction on the gateway
> > page (and returning control to a non privileged debugging process).
> > The recovery counter trap should detect when the user code gets
> > to the gateway page.
>
> At the moment my test harness notes IAOQ=0x100 and stops single stepping,
> but obviously the kernel needs to enforce that.
>
You should also be checking the space. But yes, the kernel needs to enforce
this for security reasons. You should be able to do it in the recovery
counter trap handler (rather than having to test for it in the syscall
path, which affects all processes).
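A minimal C sketch of the check described above, as it might be called from the recovery counter trap handler. Every name and constant here is an assumption made for illustration: the gateway page is taken to be one 4 KB page at offset 0 of space 0 (consistent with the IAOQ=0x100 entry point mentioned above), and struct step_regs is a hypothetical view of the saved front-of-queue values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GATEWAY_SPACE     0u      /* assumed space id of the gateway page */
#define GATEWAY_PAGE_BASE 0x0u    /* assumed offset of the gateway page   */
#define GATEWAY_PAGE_SIZE 0x1000u /* assumed 4 KB page                    */
#define IAOQ_PRIV_MASK    0x3u    /* low two IAOQ bits hold the privilege level */

/* Hypothetical view of the interrupted task's front-of-queue registers. */
struct step_regs {
    uint32_t iasq_front;
    uint32_t iaoq_front;
};

/* True if the next instruction to run lies on the gateway page. */
static bool stepped_onto_gateway(const struct step_regs *regs)
{
    assert(regs != NULL);

    /* Strip the privilege bits before comparing offsets. */
    uint32_t offset = regs->iaoq_front & ~IAOQ_PRIV_MASK;

    /* Both the space and the offset must match. */
    return regs->iasq_front == GATEWAY_SPACE &&
           (offset - GATEWAY_PAGE_BASE) < GATEWAY_PAGE_SIZE;
}
```

When this returns true, the handler would stop stepping and let the syscall run to completion rather than hand control back to the debugger.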
> > 2) Does your solution properly handle single stepping into and out of
> > a signal handler? Note that the debugger will trap the signal as part
> > of this process. Since the return is handled through a hidden syscall
> > you may not have to do anything special here.
>
> Haven't looked at signal handling yet.
>
I'm not sure whether there is a real issue here. HP-UX has some code
for single stepping with respect to signal handlers, but I believe it may
only be necessary due to the saved state necessary as part of the iaoq
manipulation. Obviously you should test this case.
> > Note that HP-UX does not use the recovery counter for single stepping. I
>
> Thanks for the description of how HP-UX does it. I'll stick with
> the recovery counter for now as it does seem to be basically working.
> I'll also try to ensure that it is completely encapsulated within the kernel
> so it is less painful to change later, if need be.
>
Sounds ok with me. And as long as there are no corner cases, it probably
is the best solution, assuming we don't find another application for
the recovery counter.
John
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-16 12:44 Single-stepping John Marvin
@ 2000-11-16 13:20 ` Richard Hirst
2000-11-16 19:00 ` Single-stepping Frank Rowand
1 sibling, 0 replies; 12+ messages in thread
From: Richard Hirst @ 2000-11-16 13:20 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux
On Thu, Nov 16, 2000 at 05:44:55AM -0700, John Marvin wrote:
> > Current code does
> >
> > /* Nullified, just crank over the queue. */
> > task_regs(child)->iaoq[0] = task_regs(child)->iaoq[1];
> > task_regs(child)->iasq[0] = task_regs(child)->iasq[1];
> > task_regs(child)->iaoq[1] = task_regs(child)->iaoq[0] + 4;
> >
> > Does that look right to you?
>
> Yes, that is the correct way to do it (I'll assume the duplicated line
> is just a cut/paste error).
It's not duplicated (iaoq v. iasq).
> > At the moment my test harness notes IAOQ=0x100 and stops single stepping,
> > but obviously the kernel needs to enforce that.
> >
> You should also be checking the space. But yes, the kernel needs to enforce
> this for security reasons. You should be able to do it in the recovery
> counter trap handler (rather than having to test for it in the syscall
> path, which affects all processes).
I might come back to you on that when I've thought some more.
Thanks,
Richard
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-16 12:44 Single-stepping John Marvin
2000-11-16 13:20 ` Single-stepping Richard Hirst
@ 2000-11-16 19:00 ` Frank Rowand
2000-11-16 20:28 ` Single-stepping Richard Hirst
1 sibling, 1 reply; 12+ messages in thread
From: Frank Rowand @ 2000-11-16 19:00 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux
John Marvin wrote:
>
> Richard,
>
> >
> > Sorry, I worded that very badly. The code that moves the child's
> > IAOQ on is in the kernel, invoked as a result of the controlling
> > process calling ptrace(PTRACE_SINGLESTEP...) when the child's N
> > bit is set.
> >
>
> Great.
>
> > > Does this code properly handle branches in the delay slot of another
> > > branch? (you need to make sure you are not advancing the queues by just
> > > adding 4 to each element). One concern I have about this method is that
> >
> > Current code does
> >
> > /* Nullified, just crank over the queue. */
> > task_regs(child)->iaoq[0] = task_regs(child)->iaoq[1];
> > task_regs(child)->iasq[0] = task_regs(child)->iasq[1];
> > task_regs(child)->iaoq[1] = task_regs(child)->iaoq[0] + 4;
> >
> > Does that look right to you?
> >
>
> Yes, that is the correct way to do it (I'll assume the duplicated line
> is just a cut/paste error).
If iaoq[0] contains a branch, iaoq[1] is in the delay slot. The instruction
executed after iaoq[1] would then typically _not_ be iaoq[0] + 4 (the next
instruction would be the target of the branch at iaoq[0]).
> Sounds ok with me. And as long as there are no corner cases, it probably
> is the best solution, assuming we don't find another application for
> the recovery counter.
The recovery counter is very useful for performance measurement tools to
understand the cycles per instruction of a code path. (Using the recovery
counter for the debugger doesn't preclude using it for performance tools -
you just can't easily use it for both purposes at the same instant in time.)
> John
-Frank
--
Frank Rowand <frank_rowand@mvista.com>
MontaVista Software, Inc
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-16 19:00 ` Single-stepping Frank Rowand
@ 2000-11-16 20:28 ` Richard Hirst
0 siblings, 0 replies; 12+ messages in thread
From: Richard Hirst @ 2000-11-16 20:28 UTC (permalink / raw)
To: frowand; +Cc: John Marvin, parisc-linux
On Thu, Nov 16, 2000 at 11:00:48AM -0800, Frank Rowand wrote:
> John Marvin wrote:
> > > > Does this code properly handle branches in the delay slot of another
> > > > branch? (you need to make sure you are not advancing the queues by just
> > > > adding 4 to each element). One concern I have about this method is that
> > >
> > > Current code does
> > >
> > > /* Nullified, just crank over the queue. */
> > > task_regs(child)->iaoq[0] = task_regs(child)->iaoq[1];
> > > task_regs(child)->iasq[0] = task_regs(child)->iasq[1];
> > > task_regs(child)->iaoq[1] = task_regs(child)->iaoq[0] + 4;
> > >
> > > Does that look right to you?
> > >
> >
> > Yes, that is the correct way to do it (I'll assume the duplicated line
> > is just a cut/paste error).
>
> If iaoq[0] contains a branch, iaoq[1] is in the delay slot. The instruction
> executed after iaoq[1] would then typically _not_ be iaoq[0] + 4 (the next
> instruction would be the target of the branch at iaoq[0]).
But the above code is only executed if the current instruction is
nullified. In your example, the branch in iaoq[0] would be
nullified and therefore never taken.
Richard
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
@ 2000-11-20 5:43 John Marvin
2000-11-20 6:53 ` Single-stepping Alan Modra
0 siblings, 1 reply; 12+ messages in thread
From: John Marvin @ 2000-11-20 5:43 UTC (permalink / raw)
To: parisc-linux
> > Note that this is the short explanation. It is not as simple as it sounds.
> > One major complication is that branches with links don't work properly
> > with the instruction queue magic, so the link register has to be updated
> > in the taken branch trap handler. Also branch externals won't update
> > the space of the space queue tail properly (again, that has to be fixed
> > in the taken branch handler). I can provide more details if the recovery
> > counter method doesn't work out.
>
> I'm a little intrigued about these "complications". How can the link
> register or space _not_ be updated properly? As far as I can see, the
> only really tricky instruction to single-step is RFI - which shouldn't
> ever occur in userspace, and which we'd just emulate if it was important.
The problem is that the link register is set to IAOQ_Back + 4, and in
the case of ble, sr0 is set to IASQ_Back. Since we've played games with
the queues, IAOQ_Back and IASQ_Back are pointing at the break page, not
at the instruction following the branch.
The additional complication is that the taken branch trap traps at the
branch destination, not at the branch, so at the point of the trap you
don't know where you came from in order to fix the problem easily. So,
what HP-UX does is check each instruction before it executes it to see if
it is a branch, and if so, what the link register is (and that is all that
needs to be parsed, since we are not emulating the instruction). It then
stores the branch location, and also sets some branch state flags (e.g.
UBE for a branch external, and UBL for a branch with a link, both flags
being set for a ble instruction). Then in the taken branch handler you
have all the information you need to fix the queue. You also need
to check this saved state if a signal handler is invoked while single
stepping, so that the proper pc queue values can be saved in the signal
context.
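The fix-up described above could be sketched in C roughly as follows. This is not the HP-UX code. The flag names UBL and UBE come from the description, but their values, the struct layouts, and the choice to save the original back-of-queue values (rather than the branch location) are assumptions made to keep the sketch short. Repairing the tail of the space queue for branch externals is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UBL 0x1u /* stepped instruction is a branch with link (assumed value) */
#define UBE 0x2u /* stepped instruction is a branch external (assumed value)  */

/* State recorded before the stepped instruction runs (hypothetical layout). */
struct step_state {
    uint32_t flags;            /* UBL and/or UBE */
    unsigned link_reg;         /* general register receiving the link */
    uint32_t real_back_offset; /* IAOQ_Back before it was aimed at the break page */
    uint32_t real_back_space;  /* IASQ_Back before it was aimed at the break page */
};

/* Hypothetical subset of the saved user registers. */
struct cpu_regs {
    uint32_t gr[32];
    uint32_t sr0;
};

/* Called from the taken branch trap handler to undo the queue games. */
static void fix_after_taken_branch(const struct step_state *st,
                                   struct cpu_regs *regs)
{
    assert(st != NULL && regs != NULL);

    if (st->flags & UBL) {
        assert(st->link_reg < 32);
        /* The hardware stored break-page IAOQ_Back + 4; replace it. */
        regs->gr[st->link_reg] = st->real_back_offset + 4;
    }
    if ((st->flags & UBL) && (st->flags & UBE)) {
        /* ble also copies IASQ_Back into sr0. */
        regs->sr0 = st->real_back_space;
    }
}
```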
John Marvin
jsm@fc.hp.com
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-20 5:43 Single-stepping John Marvin
@ 2000-11-20 6:53 ` Alan Modra
2000-11-20 7:24 ` Single-stepping Stan Sieler
0 siblings, 1 reply; 12+ messages in thread
From: Alan Modra @ 2000-11-20 6:53 UTC (permalink / raw)
To: John Marvin; +Cc: parisc-linux, parisc-linux
On Sun, 19 Nov 2000, John Marvin wrote:
> > I'm a little intrigued about these "complications". How can the link
> > register or space _not_ be updated properly? As far as I can see, the
> > only really tricky instruction to single-step is RFI - which shouldn't
> > ever occur in userspace, and which we'd just emulate if it was important.
>
> The problem is that the link register is set to IAOQ_Back + 4, and in
> the case of ble, sr0 is set to IASQ_Back. Since we've played games with
> the queues, IAOQ_Back and IASQ_Back are pointing at the break page, not
> at the instruction following the branch.
Ah. That is a little nasty, especially given the effect on signal
handlers you mention below. Maybe using the recovery counter isn't such a
bad idea after all, especially since the added syscall and task switch
overhead can be quite small if the kernel only supports single-step by
one instruction.
> The additional complication is that the taken branch trap traps at the
> branch destination, not at the branch, so at the point of the trap you
> don't know where you came from in order to fix the problem easily. So,
> what HP-UX does is check each instruction before it executes it to see if
> it is a branch, and if so, what the link register is (and that is all that
> needs to be parsed, since we are not emulating the instruction). It then
> stores the branch location, and also sets some branch state flags (e.g.
> UBE for a branch external, and UBL for a branch with a link, both flags
> being set for a ble instruction). Then in the taken branch handler you
> have all the information you need to fix the queue. You also need
> to check this saved state if a signal handler is invoked while single
> stepping, so that the proper pc queue values can be saved in the signal
> context.
Another question for you and/or the list in general:
Why does struct pt_regs have an ipsw field? Seems like it currently is
unused.
Regards, Alan Modra
--
Linuxcare. Support for the Revolution.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-20 6:53 ` Single-stepping Alan Modra
@ 2000-11-20 7:24 ` Stan Sieler
2000-11-20 9:05 ` Single-stepping Alan Modra
0 siblings, 1 reply; 12+ messages in thread
From: Stan Sieler @ 2000-11-20 7:24 UTC (permalink / raw)
To: Alan Modra; +Cc: John Marvin, parisc-linux, parisc-linux
Re:
> handlers you mention below. Maybe using the recovery counter isn't such a
quite true.
> bad idea after all, especially since the added syscall and task switch
> overhead can be quite small if the kernel only supports single-step by
> one instruction.
why the limit? We've used multi-instruction "single step" (oxymoron :)
for about 15 years on PA-RISC...no problems, efficient, and *very*
useful!
--
Stan Sieler sieler@allegro.com
www.allegro.com/sieler/wanted/index.html www.sieler.com
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-20 7:24 ` Single-stepping Stan Sieler
@ 2000-11-20 9:05 ` Alan Modra
2000-11-20 18:47 ` Single-stepping Stan Sieler
0 siblings, 1 reply; 12+ messages in thread
From: Alan Modra @ 2000-11-20 9:05 UTC (permalink / raw)
To: Stan Sieler; +Cc: John Marvin, parisc-linux, parisc-linux
On Sun, 19 Nov 2000, Stan Sieler wrote:
> > bad idea after all, especially since the added syscall and task switch
> > overhead can be quite small if the kernel only supports single-step by
> > one instruction.
>
> why the limit? We've used multi-instruction "single step" (oxymoron :)
> for about 15 years on PA-RISC...no problems, efficient, and *very*
> useful!
Because you would then need to save and restore cr0 on task switches (or
only allow one task to be single-stepped at a time). That's four
instructions and two extra memory accesses per task switch. Which might
not seem very much, but at some point somebody will no doubt start caring
about pa-linux performance. For a single-step by one, you can simply set
cr0 to zero on a task switch, and possibly avoid touching cr0 on a task
switch at all with careful attention to various trap handlers.
Here's the idea. The tail of syscall_restore (64-bit stuff pruned) will
look like the following, with the first three instructions being the added
code to support single-step (and also wide/narrow switching for 64-bit)
ldi 3,%r20
mtctl %r20,%cr0 /* recovery counter, ptrace single-step */
LDREG TASK_PT_PSW(%r1),%r20
mtctl %r1,%cr30 /* intrhandler okay. */
mfsp %sr3,%r1 /* Get users space id */
mtsp %r1,%sr4 /* Restore sr4 */
mtsp %r1,%sr5 /* Restore sr5 */
mtsp %r1,%sr6 /* Restore sr6 */
depi 3,31,2,%r31 /* ensure return to user mode. */
mtsm %r20 /* restore irq state */
mfctl %cr27,%r20
be 0(%sr3,%r31) /* return to user space */
mtsp %r1,%sr7 /* Restore sr7 */
ptrace will fiddle with TASK_PT_PSW, setting the R bit and clearing the I
bit to enable the recovery counter - which will start counting down at the
mtsm instruction above, and reach zero on the user-space instruction, so
we'll trap after executing one user-space instruction. The task-switch
nonsense is to handle the case where we page-fault on the instruction and
switch to another task also doing single-stepping. You want to ensure cr0
is zero when we finally get back to the original task.
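The count of 3 can be checked with a toy model in C. It is a deliberately simplified sketch built only from the description above, not a statement of the architecture: it assumes the counter drops by one for each non-nullified instruction run with R set, and that the trap is taken before the first instruction that starts with a negative counter. The function name trap_index() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk an instruction stream that runs with the R bit set.
 * nullified[i] is non-zero if instruction i is nullified.
 * Returns the index of the first instruction that does not run because
 * the trap is taken first, or n if no trap occurs within the stream.
 */
static size_t trap_index(int32_t initial_count,
                         const uint8_t *nullified, size_t n)
{
    assert(nullified != NULL);

    int32_t counter = initial_count;
    for (size_t i = 0; i < n; i++) {
        if (counter < 0) {
            return i; /* trap taken before instruction i */
        }
        if (!nullified[i]) {
            counter--; /* nullified instructions do not count */
        }
    }
    return n;
}
```

Taking the stream after the mtsm as mfctl, be, mtsp, then user instructions, a count of 3 puts the trap at index 4, that is, after exactly one user instruction. A count of 0 on a stream that starts in user space (the RFI case) puts it at index 1, which is why zeroing cr0 on a task switch is sufficient. If the first user instruction is nullified, the trap moves to index 5, which is the "double step" discussed earlier in the thread.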
Now it might turn out that having extra instructions in the syscall path
is worse than extra code and memory accesses on task switch. If that
turns out to be true, then you'll probably get your multi-step ptrace. :-)
Alan Modra
--
Linuxcare. Support for the Revolution.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Single-stepping
2000-11-20 9:05 ` Single-stepping Alan Modra
@ 2000-11-20 18:47 ` Stan Sieler
0 siblings, 0 replies; 12+ messages in thread
From: Stan Sieler @ 2000-11-20 18:47 UTC (permalink / raw)
To: Alan Modra; +Cc: John Marvin, parisc-linux, parisc-linux
Re:
> Because you would then need to save and restore cr0 on task switches (or
> only allow one task to be single-stepped at a time). That's four
> instructions and two extra memory accesses per task switch. Which might
> not seem very much, but at some point somebody will no doubt start caring
> about pa-linux performance.
And it still won't seem like much, then!
Non-memory-access instructions are cheap. An extra memory reference (from
something probably already in cache) and two extra instructions
would probably cost less than an hour per CPU over the next 10 *years*,
assuming 10 years of 1000 task switches per second on a slow 100 MHz CPU.
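The arithmetic behind that estimate can be worked through in a few lines of C. The number of extra cycles per task switch is an assumption (it depends on cache behaviour and pipeline details), and overhead_seconds() is a hypothetical helper written for this sketch. A year is taken as 365 days.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Total CPU time, in whole seconds, spent on extra_cycles additional
 * cycles per task switch over the given number of 365-day years.
 */
static uint64_t overhead_seconds(uint64_t years, uint64_t switches_per_sec,
                                 uint64_t extra_cycles, uint64_t cpu_hz)
{
    assert(cpu_hz != 0);

    uint64_t seconds = years * 365u * 24u * 60u * 60u;
    uint64_t cycles = seconds * switches_per_sec * extra_cycles;
    return cycles / cpu_hz;
}
```

For 10 years at 1000 switches per second on a 100 MHz CPU, one extra cycle per switch comes to 3153 seconds (about 53 minutes), and four extra cycles to 12614 seconds (about 3.5 hours). So the total is on the order of an hour to a few hours per decade, depending on the per-switch cost assumed.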
Of course, at the cost of an extra non-memory-referencing instruction or so,
you could say "at switch-to-task time: if PSW R-bit set, then load the saved
CR0 from memory and move it to CR0", saving one memory reference 99.99999%
of the time, resulting in an average of only one memory reference
per task switch normally.
I haven't looked at interrupt handling / system calls closely, but I
hope there aren't other false savings. (E.g., failure to save/restore
the PID check flag ... sure, user processes *now* probably never have
pid checking disabled, but that's a very useful feature to have
available (with proper security controls, of course).)
(Yes, I'm one of the very few who use that feature on MPE/iX ... carefully,
of course :)
--
Stan Sieler sieler@allegro.com
www.allegro.com/sieler/wanted/index.html www.sieler.com
^ permalink raw reply [flat|nested] 12+ messages in thread