* Re: [parisc] double restarts on multiple signal arrivals
2012-05-18 22:24 ` Al Viro
@ 2012-05-19 1:36 ` Carlos O'Donell
2012-05-19 5:26 ` Al Viro
2012-05-20 8:40 ` Al Viro
2 siblings, 0 replies; 21+ messages in thread
From: Carlos O'Donell @ 2012-05-19 1:36 UTC (permalink / raw)
To: Al Viro
Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
linux-arch
On Fri, May 18, 2012 at 6:24 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, May 18, 2012 at 04:15:42PM -0400, Carlos O'Donell wrote:
>
>> I've hit "Mark as unread" twice on this thread now because I *am*
>> interested in this conversation.
>>
>> However, I must confess that the original email reads like a rant
>> without a clear explanation of exactly what you think is wrong.
>>
>> I'm familiar with the PARISC signal code because I worked on the
>> 32/64-bit compat support for the frame layout, I also helped Randolph
>> work on the syscall restart logic.
>>
>> What I don't understand is what's wrong with it now, and your subject
>> begins to hint at the problem.
>>
>> Can you describe in more detail exactly what problem we need to solve?
>
> Sure. Have e.g. read(2) on a pipe called when you happen to have -512 in
> r28 and have two signals arrive at once while inside that read(). Both
> with user handlers. Both with SA_RESTART.
>
> When we enter the kernel, we have pt_regs created on kernel stack and
> filled with the values we had in registers. Among other things, the
> value of r28 is stored in regs->orig_r28. Note that arguments and the
> syscall number are all passed through other registers, so setting the
> arguments for syscall did not overwrite the value we used to have in
> r28. It's still -512.
>
> We find which C function to call (sys_read) and go there, with return
> address set to syscall_exit.
>
> sys_read() ends up calling pipe_read(), which finds pipe empty and goes
> to sleep. The signals arrive and pipe_read() is woken up. It sees that
> * signal_pending() is true
> * nothing has been read yet
> so it return -ERESTARSYS. sys_read() returns what it got from pipe_read().
>
> Now we are at syscall_exit. We set r1 to pt_regs created on stack when
> we entered the syscall. Return value of sys_read() is taken from r28 and
> stored into regs->gr[28]. We check TIF_NEED_RESCHED and it's not set.
> We'd reached syscall_check_sig and check TIF_SIGPENDING, which is set.
> We go through syscall_do_signal and reach the call of
> do_notify_resume(regs, 1), which calls do_signal(regs, 1).
>
> There we call get_signal_to_deliver() and get one of those two signals
> dequeued and returned. Note that TIF_SIGPENDING is still set - there is
> the second signal left. We got positive signr, so we do not leave the
> loop yet. in_syscall is 1, so we call syscall_restart().
>
> We check regs->gr[28], which is -ERESTARTSYS. SA_RESTART is set in
> sa_flags, so we fall through to the next case, which is
> regs->gr[31] -= 8; /* delayed branching */
> /* Preserve original r28. */
> regs->gr[28] = regs->orig_r28;
> Note that ERESTARTSYS is 512, so the second assignment does nothing - we
> still have regs->gr[28] equal to -ERESTARTSYS. We are done with
> syscall_restart().
>
> Now we call handle_signal(), which calls setup_rt_frame(). sigframe is
> allocated on userland stack and filled accordingly to the contents of *regs.
> No errors occured, so we set regs->gr[31] to entry point of signal handler,
> regs->gr[2] to address of sigreturn trampoline we'd created in the sigframe,
> regs->gr[26..24] to arguments of handler, and regs->gr[30] to new value of
> userland stack pointer.
>
> Back in handle_signal(), we adjust the set of blocked signals and return 1.
> do_signal() sees that RESTORE_SIGMASK is not set and it's done - we return
> to do_notify_resume() and from there back into entry.S.
>
> We reload registers from pt_regs and go to syscall_check_sig again. There
> we reread the thread flags and see that SIGPENDING is still set - we have
> another signal to deal with. We hit do_notify_resume() and do_signal()
> again. And again, in_syscall is 1.
>
> do_signal() gets the remaining pending signal from get_signal_to_deliver().
> Again, we call syscall_restart(). And neither regs->gr[28] nor regs->orig_r28
> has been changed, so we hit the same case. Assignment to regs->gr[28] is
> a no-op again. But look what happened to regs->gr[31] - it got decremented
> by 8. So it points 8 bytes before the entry point of the first handler now.
>
> We proceed to set the second sigframe up. The registers are saved in
> sigcontext, adjusted so that they correspond to the call of the second
> handler with future return to the second trampoline and we return all the
> way out to entry.S. There we go to syscall_check_sig again, but this
> time we have no pending signals left, so we reach syscall_restore and
> proceed to userland.
>
> Now we are in the second handler. It's executed to the end and when it
> returns, we go to the address we had in r2. I.e. the second trampoline.
> There we get r25 set to 1, r20 set to __NR_rt_sigreturn and we go to
> execute the syscall. Which restores the registers from sigcontext and,
> since its in_syscall argument is 1, sets ->gr[31] to sc->sc_iaoq. Which
> had been set by setup_rt_sigcontext() from regs->gr[31] at the moment it
> had been called, i.e. the entry point of the first handler decremented by 8.
> Then we proceed to return to userland. And arrive there are the right
> arguments for the first handler, but instruction pointer is 8 bytes too
> early. Bad things happen, obviously.
>
> The point where the things go wrong is the second call of syscall_restart().
> It should've done nothing. We need to do syscall restarts only when building
> the first sigframe. And usually it does work out that way, simply because
> the value you happened to have in r28 is probably not one of the 4 that
> make syscall_restart() do things. So after the first time around it does
> nothing. Very similar logics on x86 does, indeed, work in all cases.
> There the register used for return values happens to be the same as we use
> to pass syscall number, so you can't arrange a call of syscall_restart()
> analog that would trigger restart *and* leave the return value register
> unchanged - you'd need %eax to be -ERESTART<something> when you enter the
> syscall and that would immediately result in regs->ax set to -ENOSYS. I.e.
> no restarts would be done. However, on parisc that doesn't work - you can
> start with r28 being -ERESTARTSYS and get -ERESTARTSYS from syscall.
>
> And no, I don't have any good suggestions on how to fix it right now. Depends
> on how does gdb on parisc play with pt_regs, among other things...
>
> Sorry if the original posting sounded like a rant - I'd spent the last month
> crawling through the assembler glue around signal hanlding all over arch/*
> and that probably affected the tone. Definitely had been affecting the amount
> of spoken obscenities lately...
Thanks Al. This is description is a work of art, I wish all
reports looked like this. I'll go over this code tomorrow and
get back to you with my thoughts. I really appreciate the
thorough review.
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [parisc] double restarts on multiple signal arrivals
2012-05-18 22:24 ` Al Viro
2012-05-19 1:36 ` Carlos O'Donell
@ 2012-05-19 5:26 ` Al Viro
2012-05-19 13:37 ` Al Viro
2012-05-20 8:40 ` Al Viro
2 siblings, 1 reply; 21+ messages in thread
From: Al Viro @ 2012-05-19 5:26 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
linux-arch
On Fri, May 18, 2012 at 11:24:22PM +0100, Al Viro wrote:
> And no, I don't have any good suggestions on how to fix it right now. Depends
> on how does gdb on parisc play with pt_regs, among other things...
I think we might be able to cheat and pull off the following:
* stop restoring r28 on restarts. Behaviour of syscall should not
depend on the value of r28 at the moment we enter it and it does clobber
r28 in all cases - that's where the return value goes to, after all. And
the instruction we are restarting would better be a syscall.
* with that done, we can reuse ->orig_r28 for any purpose. In
particular, we can use it as "no restarts allowed" flag. Just have it set
to zero instead of r28 in linux_gateway_entry and hpux_gateway_page, then
in syscall_restart() and insert_restart_trampoline() check that
it's still 0, go away if it isn't and set it to 1 otherwise. Set
it to 1 in sigreturn, to make it indifferent to what we might or might
not find in r28 stored in sigcontext (strictly speaking that's not needed,
but it's less brittle than relying on rather subtle properties of what
could end up in sigcontext and trampoline; the whole thing is a nice
example of how easily subtle things break and how long does that stay
unnoticed)
Another potential solution would be a trick a-la MIPS, but you are
already using that slot in regs->gr[] for "is it a syscall" flag (psw for
interrupt, 0 for syscall) and touching that is very likely to end up with
very unhappy gdb...
So how about the following:
diff --git a/arch/parisc/hpux/gate.S b/arch/parisc/hpux/gate.S
index 38a1c1b..0114688 100644
--- a/arch/parisc/hpux/gate.S
+++ b/arch/parisc/hpux/gate.S
@@ -71,7 +71,7 @@ ENTRY(hpux_gateway_page)
STREG %r26, TASK_PT_GR26(%r1) /* 1st argument */
STREG %r27, TASK_PT_GR27(%r1) /* user dp */
STREG %r28, TASK_PT_GR28(%r1) /* return value 0 */
- STREG %r28, TASK_PT_ORIG_R28(%r1) /* return value 0 (saved for signals) */
+ STREG %r0, TASK_PT_ORIG_R28(%r1) /* don't prohibit restarts */
STREG %r29, TASK_PT_GR29(%r1) /* 8th argument */
STREG %r31, TASK_PT_GR31(%r1) /* preserve syscall return ptr */
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 12c1ed3..b4dd2c1 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -88,6 +88,7 @@ restore_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs)
err |= __copy_from_user(regs->iaoq, sc->sc_iaoq, sizeof(regs->iaoq));
err |= __copy_from_user(regs->iasq, sc->sc_iasq, sizeof(regs->iasq));
err |= __get_user(regs->sar, &sc->sc_sar);
+ regs->orig_r28 = 1; /* no restarts for sigreturn */
DBG(2,"restore_sigcontext: iaoq is 0x%#lx / 0x%#lx\n",
regs->iaoq[0],regs->iaoq[1]);
DBG(2,"restore_sigcontext: r28 is %ld\n", regs->gr[28]);
@@ -471,6 +472,9 @@ handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka,
static inline void
syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
{
+ if (regs->orig_r28)
+ return;
+ regs->orig_r28 = 1; /* no more restarts */
/* Check the return code */
switch (regs->gr[28]) {
case -ERESTART_RESTARTBLOCK:
@@ -493,8 +497,6 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
* we have to do is fiddle the return pointer.
*/
regs->gr[31] -= 8; /* delayed branching */
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
break;
}
}
@@ -502,6 +504,9 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
static inline void
insert_restart_trampoline(struct pt_regs *regs)
{
+ if (regs->orig_r28)
+ return;
+ regs->orig_r28 = 1; /* no more restarts */
switch(regs->gr[28]) {
case -ERESTART_RESTARTBLOCK: {
/* Restart the system call - no handlers present */
@@ -536,9 +541,6 @@ insert_restart_trampoline(struct pt_regs *regs)
flush_user_icache_range(regs->gr[30], regs->gr[30] + 4);
regs->gr[31] = regs->gr[30] + 8;
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
-
return;
}
case -ERESTARTNOHAND:
@@ -550,9 +552,6 @@ insert_restart_trampoline(struct pt_regs *regs)
* slot of the branch external instruction.
*/
regs->gr[31] -= 8;
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
-
return;
}
default:
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index 82a52b2..54a9cbf 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -156,7 +156,7 @@ linux_gateway_entry:
STREG %r26, TASK_PT_GR26(%r1) /* 1st argument */
STREG %r27, TASK_PT_GR27(%r1) /* user dp */
STREG %r28, TASK_PT_GR28(%r1) /* return value 0 */
- STREG %r28, TASK_PT_ORIG_R28(%r1) /* return value 0 (saved for signals) */
+ STREG %r0, TASK_PT_ORIG_R28(%r1) /* don't prohibit restarts */
STREG %r29, TASK_PT_GR29(%r1) /* return value 1 */
STREG %r31, TASK_PT_GR31(%r1) /* preserve syscall return ptr */
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [parisc] double restarts on multiple signal arrivals
2012-05-19 5:26 ` Al Viro
@ 2012-05-19 13:37 ` Al Viro
2012-05-19 13:37 ` Al Viro
0 siblings, 1 reply; 21+ messages in thread
From: Al Viro @ 2012-05-19 13:37 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
linux-arch
On Sat, May 19, 2012 at 06:26:23AM +0100, Al Viro wrote:
> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
> index 12c1ed3..b4dd2c1 100644
> --- a/arch/parisc/kernel/signal.c
> +++ b/arch/parisc/kernel/signal.c
> @@ -88,6 +88,7 @@ restore_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs)
> err |= __copy_from_user(regs->iaoq, sc->sc_iaoq, sizeof(regs->iaoq));
> err |= __copy_from_user(regs->iasq, sc->sc_iasq, sizeof(regs->iasq));
> err |= __get_user(regs->sar, &sc->sc_sar);
> + regs->orig_r28 = 1; /* no restarts for sigreturn */
> DBG(2,"restore_sigcontext: iaoq is 0x%#lx / 0x%#lx\n",
> regs->iaoq[0],regs->iaoq[1]);
> DBG(2,"restore_sigcontext: r28 is %ld\n", regs->gr[28]);
... only that part should be done in sys_rt_sigreturn() instead, or we'll miss
it for 32bit-on-64bit case.
diff --git a/arch/parisc/hpux/gate.S b/arch/parisc/hpux/gate.S
index 38a1c1b..0114688 100644
--- a/arch/parisc/hpux/gate.S
+++ b/arch/parisc/hpux/gate.S
@@ -71,7 +71,7 @@ ENTRY(hpux_gateway_page)
STREG %r26, TASK_PT_GR26(%r1) /* 1st argument */
STREG %r27, TASK_PT_GR27(%r1) /* user dp */
STREG %r28, TASK_PT_GR28(%r1) /* return value 0 */
- STREG %r28, TASK_PT_ORIG_R28(%r1) /* return value 0 (saved for signals) */
+ STREG %r0, TASK_PT_ORIG_R28(%r1) /* don't prohibit restarts */
STREG %r29, TASK_PT_GR29(%r1) /* 8th argument */
STREG %r31, TASK_PT_GR31(%r1) /* preserve syscall return ptr */
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 12c1ed3..369b1c4 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -115,6 +115,8 @@ sys_rt_sigreturn(struct pt_regs *regs, int in_syscall)
(usp - sigframe_size);
DBG(2,"sys_rt_sigreturn: frame is %p\n", frame);
+ regs->orig_r28 = 1; /* no restarts for sigreturn */
+
#ifdef CONFIG_64BIT
compat_frame = (struct compat_rt_sigframe __user *)frame;
@@ -471,6 +473,9 @@ handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka,
static inline void
syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
{
+ if (regs->orig_r28)
+ return;
+ regs->orig_r28 = 1; /* no more restarts */
/* Check the return code */
switch (regs->gr[28]) {
case -ERESTART_RESTARTBLOCK:
@@ -493,8 +498,6 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
* we have to do is fiddle the return pointer.
*/
regs->gr[31] -= 8; /* delayed branching */
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
break;
}
}
@@ -502,6 +505,9 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
static inline void
insert_restart_trampoline(struct pt_regs *regs)
{
+ if (regs->orig_r28)
+ return;
+ regs->orig_r28 = 1; /* no more restarts */
switch(regs->gr[28]) {
case -ERESTART_RESTARTBLOCK: {
/* Restart the system call - no handlers present */
@@ -536,9 +542,6 @@ insert_restart_trampoline(struct pt_regs *regs)
flush_user_icache_range(regs->gr[30], regs->gr[30] + 4);
regs->gr[31] = regs->gr[30] + 8;
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
-
return;
}
case -ERESTARTNOHAND:
@@ -550,9 +553,6 @@ insert_restart_trampoline(struct pt_regs *regs)
* slot of the branch external instruction.
*/
regs->gr[31] -= 8;
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
^ permalink raw reply related [flat|nested] 21+ messages in thread* Re: [parisc] double restarts on multiple signal arrivals
2012-05-19 13:37 ` Al Viro
@ 2012-05-19 13:37 ` Al Viro
0 siblings, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-19 13:37 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
linux-arch
On Sat, May 19, 2012 at 06:26:23AM +0100, Al Viro wrote:
> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
> index 12c1ed3..b4dd2c1 100644
> --- a/arch/parisc/kernel/signal.c
> +++ b/arch/parisc/kernel/signal.c
> @@ -88,6 +88,7 @@ restore_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs)
> err |= __copy_from_user(regs->iaoq, sc->sc_iaoq, sizeof(regs->iaoq));
> err |= __copy_from_user(regs->iasq, sc->sc_iasq, sizeof(regs->iasq));
> err |= __get_user(regs->sar, &sc->sc_sar);
> + regs->orig_r28 = 1; /* no restarts for sigreturn */
> DBG(2,"restore_sigcontext: iaoq is 0x%#lx / 0x%#lx\n",
> regs->iaoq[0],regs->iaoq[1]);
> DBG(2,"restore_sigcontext: r28 is %ld\n", regs->gr[28]);
... only that part should be done in sys_rt_sigreturn() instead, or we'll miss
it for 32bit-on-64bit case.
diff --git a/arch/parisc/hpux/gate.S b/arch/parisc/hpux/gate.S
index 38a1c1b..0114688 100644
--- a/arch/parisc/hpux/gate.S
+++ b/arch/parisc/hpux/gate.S
@@ -71,7 +71,7 @@ ENTRY(hpux_gateway_page)
STREG %r26, TASK_PT_GR26(%r1) /* 1st argument */
STREG %r27, TASK_PT_GR27(%r1) /* user dp */
STREG %r28, TASK_PT_GR28(%r1) /* return value 0 */
- STREG %r28, TASK_PT_ORIG_R28(%r1) /* return value 0 (saved for signals) */
+ STREG %r0, TASK_PT_ORIG_R28(%r1) /* don't prohibit restarts */
STREG %r29, TASK_PT_GR29(%r1) /* 8th argument */
STREG %r31, TASK_PT_GR31(%r1) /* preserve syscall return ptr */
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 12c1ed3..369b1c4 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -115,6 +115,8 @@ sys_rt_sigreturn(struct pt_regs *regs, int in_syscall)
(usp - sigframe_size);
DBG(2,"sys_rt_sigreturn: frame is %p\n", frame);
+ regs->orig_r28 = 1; /* no restarts for sigreturn */
+
#ifdef CONFIG_64BIT
compat_frame = (struct compat_rt_sigframe __user *)frame;
@@ -471,6 +473,9 @@ handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka,
static inline void
syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
{
+ if (regs->orig_r28)
+ return;
+ regs->orig_r28 = 1; /* no more restarts */
/* Check the return code */
switch (regs->gr[28]) {
case -ERESTART_RESTARTBLOCK:
@@ -493,8 +498,6 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
* we have to do is fiddle the return pointer.
*/
regs->gr[31] -= 8; /* delayed branching */
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
break;
}
}
@@ -502,6 +505,9 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
static inline void
insert_restart_trampoline(struct pt_regs *regs)
{
+ if (regs->orig_r28)
+ return;
+ regs->orig_r28 = 1; /* no more restarts */
switch(regs->gr[28]) {
case -ERESTART_RESTARTBLOCK: {
/* Restart the system call - no handlers present */
@@ -536,9 +542,6 @@ insert_restart_trampoline(struct pt_regs *regs)
flush_user_icache_range(regs->gr[30], regs->gr[30] + 4);
regs->gr[31] = regs->gr[30] + 8;
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
-
return;
}
case -ERESTARTNOHAND:
@@ -550,9 +553,6 @@ insert_restart_trampoline(struct pt_regs *regs)
* slot of the branch external instruction.
*/
regs->gr[31] -= 8;
- /* Preserve original r28. */
- regs->gr[28] = regs->orig_r28;
-
return;
}
default:
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index 82a52b2..54a9cbf 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -156,7 +156,7 @@ linux_gateway_entry:
STREG %r26, TASK_PT_GR26(%r1) /* 1st argument */
STREG %r27, TASK_PT_GR27(%r1) /* user dp */
STREG %r28, TASK_PT_GR28(%r1) /* return value 0 */
- STREG %r28, TASK_PT_ORIG_R28(%r1) /* return value 0 (saved for signals) */
+ STREG %r0, TASK_PT_ORIG_R28(%r1) /* don't prohibit restarts */
STREG %r29, TASK_PT_GR29(%r1) /* return value 1 */
STREG %r31, TASK_PT_GR31(%r1) /* preserve syscall return ptr */
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [parisc] double restarts on multiple signal arrivals
2012-05-18 22:24 ` Al Viro
2012-05-19 1:36 ` Carlos O'Donell
2012-05-19 5:26 ` Al Viro
@ 2012-05-20 8:40 ` Al Viro
2012-05-20 8:40 ` Al Viro
2012-05-20 9:04 ` Al Viro
2 siblings, 2 replies; 21+ messages in thread
From: Al Viro @ 2012-05-20 8:40 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
linux-arch
BTW, after looking through the asm glue there, I've noticed something
rather odd. *All* architectures other than parisc have the following thing
done on syscall entry:
* check if we have TIF_SYSCALL_TRACE in flags; bugger off to slow
path if we do.
* there check if we have PT_PTRACED in current->ptrace and if so
do the standard song and dance with raising SIGTRAP or SIGTRAP | 0x80,
etc.
parisc does that in the opposite order - it goes to slow path if
current->ptrace & PT_PTRACED is true. The same actions with raising
SIGTRAP, etc. are done only if TIF_SYSCALL_TRACE is there as well,
but by that point we'd already plonked a bunch of registers on the
stack and we actually stay on the slow path even in absense of SYSCALL_TRACE.
The thing is, other architectures have a good reason for looking at
SYSCALL_TRACE first and not bothering with all that fun if it's not set.
PT_PTRACED is set when the task is being traced; it does *not* imply
stopping on every syscall entry. E.g. a perfectly normal situation is
when in gdb you've said
break foo.c:69
cont
and let the process run until it hits that breakpoint. It has PT_PTRACED
all along. It will be stopped when it gets a signal, including SIGTRAP
we'll get when it steps on that breakpoint. But by that time it's very
likely to have passed through tons of syscalls without stopping on any of
them. SYSCALL_TRACE is set *only* when we want the process to stop on the
next syscall. That's what strace(1) and friends are using. So it doesn't
make sense to get overhead for all processes that have PT_PTRACED;
SYSCALL_TRACE is less likely, so checking it first is an obvious approach.
Moreover, checking PT_TRACED first is not even cheaper in the common case
(i.e. when branch to .Ltracesys is not taken at all). As it is, parisc does
mfctl %cr30, %r1
LDREG TI_TASK(%r1),%r1
ldw TASK_PTRACE(%r1), %r1
bb,<,n %r1,31,.Ltracesys
and that's actually an extra dereference compared to
mfctl %cr30, %r1
LDREG TI_FLAGS(%r1),%r1
bb,<,n %r1,31 - TIF_SYSCALL_TRACE,.Ltracesys
Note that tracehook_report_syscall_entry/tracehook_report_syscall_exit
are checking PT_PTRACED, so it's not like we needed to change anything
other than that spot - resulting logics will be equivalent to what we
have right now. Looks like a fairly straightforward optimisation...
Am I missing something subtle and parisc-specific here?
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [parisc] double restarts on multiple signal arrivals
2012-05-20 8:40 ` Al Viro
@ 2012-05-20 8:40 ` Al Viro
2012-05-20 9:04 ` Al Viro
1 sibling, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-20 8:40 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
linux-arch
BTW, after looking through the asm glue there, I've noticed something
rather odd. *All* architectures other than parisc have the following thing
done on syscall entry:
* check if we have TIF_SYSCALL_TRACE in flags; bugger off to slow
path if we do.
* there check if we have PT_PTRACED in current->ptrace and if so
do the standard song and dance with raising SIGTRAP or SIGTRAP | 0x80,
etc.
parisc does that in the opposite order - it goes to slow path if
current->ptrace & PT_PTRACED is true. The same actions with raising
SIGTRAP, etc. are done only if TIF_SYSCALL_TRACE is there as well,
but by that point we'd already plonked a bunch of registers on the
stack and we actually stay on the slow path even in absense of SYSCALL_TRACE.
The thing is, other architectures have a good reason for looking at
SYSCALL_TRACE first and not bothering with all that fun if it's not set.
PT_PTRACED is set when the task is being traced; it does *not* imply
stopping on every syscall entry. E.g. a perfectly normal situation is
when in gdb you've said
break foo.c:69
cont
and let the process run until it hits that breakpoint. It has PT_PTRACED
all along. It will be stopped when it gets a signal, including SIGTRAP
we'll get when it steps on that breakpoint. But by that time it's very
likely to have passed through tons of syscalls without stopping on any of
them. SYSCALL_TRACE is set *only* when we want the process to stop on the
next syscall. That's what strace(1) and friends are using. So it doesn't
make sense to get overhead for all processes that have PT_PTRACED;
SYSCALL_TRACE is less likely, so checking it first is an obvious approach.
Moreover, checking PT_TRACED first is not even cheaper in the common case
(i.e. when branch to .Ltracesys is not taken at all). As it is, parisc does
mfctl %cr30, %r1
LDREG TI_TASK(%r1),%r1
ldw TASK_PTRACE(%r1), %r1
bb,<,n %r1,31,.Ltracesys
and that's actually an extra dereference compared to
mfctl %cr30, %r1
LDREG TI_FLAGS(%r1),%r1
bb,<,n %r1,31 - TIF_SYSCALL_TRACE,.Ltracesys
Note that tracehook_report_syscall_entry/tracehook_report_syscall_exit
are checking PT_PTRACED, so it's not like we needed to change anything
other than that spot - resulting logics will be equivalent to what we
have right now. Looks like a fairly straightforward optimisation...
Am I missing something subtle and parisc-specific here?
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [parisc] double restarts on multiple signal arrivals
2012-05-20 8:40 ` Al Viro
2012-05-20 8:40 ` Al Viro
@ 2012-05-20 9:04 ` Al Viro
2012-05-20 9:04 ` Al Viro
2012-05-20 18:46 ` John David Anglin
1 sibling, 2 replies; 21+ messages in thread
From: Al Viro @ 2012-05-20 9:04 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
linux-arch
On Sun, May 20, 2012 at 09:40:42AM +0100, Al Viro wrote:
> Moreover, checking PT_TRACED first is not even cheaper in the common case
> (i.e. when branch to .Ltracesys is not taken at all). As it is, parisc does
> mfctl %cr30, %r1
> LDREG TI_TASK(%r1),%r1
> ldw TASK_PTRACE(%r1), %r1
> bb,<,n %r1,31,.Ltracesys
> and that's actually an extra dereference compared to
> mfctl %cr30, %r1
> LDREG TI_FLAGS(%r1),%r1
> bb,<,n %r1,31 - TIF_SYSCALL_TRACE,.Ltracesys
> Note that tracehook_report_syscall_entry/tracehook_report_syscall_exit
> are checking PT_PTRACED, so it's not like we needed to change anything
> other than that spot - resulting logics will be equivalent to what we
> have right now. Looks like a fairly straightforward optimisation...
> Am I missing something subtle and parisc-specific here?
Actually, looks like I am missing something, but it's not particulary subtle.
SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes do_syscall_trace_leave()
do things. So checking one bit in flags is not enough - any of those 3 is
a reason for taking the slow path. The point still stands, though -
mfctl %cr30, %r1
LDREG TI_FLAGS(%r1),%r1
ldi (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP), %r19
and,COND(=) %r19, %r1, %r0
b,n .Ltracesys
would still be no worse on the fast path and would not hit the slow path in
a lot of cases when the current code does it for no apparent reason.
Comments?
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [parisc] double restarts on multiple signal arrivals
2012-05-20 9:04 ` Al Viro
@ 2012-05-20 9:04 ` Al Viro
2012-05-20 18:46 ` John David Anglin
1 sibling, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-20 9:04 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
linux-arch
On Sun, May 20, 2012 at 09:40:42AM +0100, Al Viro wrote:
> Moreover, checking PT_TRACED first is not even cheaper in the common case
> (i.e. when branch to .Ltracesys is not taken at all). As it is, parisc does
> mfctl %cr30, %r1
> LDREG TI_TASK(%r1),%r1
> ldw TASK_PTRACE(%r1), %r1
> bb,<,n %r1,31,.Ltracesys
> and that's actually an extra dereference compared to
> mfctl %cr30, %r1
> LDREG TI_FLAGS(%r1),%r1
> bb,<,n %r1,31 - TIF_SYSCALL_TRACE,.Ltracesys
> Note that tracehook_report_syscall_entry/tracehook_report_syscall_exit
> are checking PT_PTRACED, so it's not like we needed to change anything
> other than that spot - resulting logics will be equivalent to what we
> have right now. Looks like a fairly straightforward optimisation...
> Am I missing something subtle and parisc-specific here?
Actually, looks like I am missing something, but it's not particulary subtle.
SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes do_syscall_trace_leave()
do things. So checking one bit in flags is not enough - any of those 3 is
a reason for taking the slow path. The point still stands, though -
mfctl %cr30, %r1
LDREG TI_FLAGS(%r1),%r1
ldi (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP), %r19
and,COND(=) %r19, %r1, %r0
b,n .Ltracesys
would still be no worse on the fast path and would not hit the slow path in
a lot of cases when the current code does it for no apparent reason.
Comments?
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [parisc] double restarts on multiple signal arrivals
2012-05-20 9:04 ` Al Viro
2012-05-20 9:04 ` Al Viro
@ 2012-05-20 18:46 ` John David Anglin
2012-05-20 18:46 ` John David Anglin
2012-05-20 20:38 ` Carlos O'Donell
1 sibling, 2 replies; 21+ messages in thread
From: John David Anglin @ 2012-05-20 18:46 UTC (permalink / raw)
To: Al Viro
Cc: Carlos O'Donell, Grant Grundler, linux-parisc, Linus Torvalds,
linux-arch
On 20-May-12, at 5:04 AM, Al Viro wrote:
> Actually, looks like I am missing something, but it's not
> particulary subtle.
> SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
> any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes
> do_syscall_trace_leave()
> do things. So checking one bit in flags is not enough - any of
> those 3 is
> a reason for taking the slow path. The point still stands, though -
> mfctl %cr30, %r1
> LDREG TI_FLAGS(%r1),%r1
> ldi (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP), %r19
> and,COND(=) %r19, %r1, %r0
> b,n .Ltracesys
> would still be no worse on the fast path and would not hit the slow
> path in
> a lot of cases when the current code does it for no apparent reason.
>
> Comments?
This looks good to me. There is similar code in entry.S.
I added the above change to my patch set for stable 3.3.6. System still
boots. I singled stepped /bin/ls through a directory listing and it
seemed
to work.
Dave
--
John David Anglin dave.anglin@bell.net
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [parisc] double restarts on multiple signal arrivals
2012-05-20 18:46 ` John David Anglin
@ 2012-05-20 18:46 ` John David Anglin
2012-05-20 20:38 ` Carlos O'Donell
1 sibling, 0 replies; 21+ messages in thread
From: John David Anglin @ 2012-05-20 18:46 UTC (permalink / raw)
To: Al Viro
Cc: Carlos O'Donell, Grant Grundler, linux-parisc, Linus Torvalds,
linux-arch
On 20-May-12, at 5:04 AM, Al Viro wrote:
> Actually, looks like I am missing something, but it's not
> particulary subtle.
> SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
> any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes
> do_syscall_trace_leave()
> do things. So checking one bit in flags is not enough - any of
> those 3 is
> a reason for taking the slow path. The point still stands, though -
> mfctl %cr30, %r1
> LDREG TI_FLAGS(%r1),%r1
> ldi (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP), %r19
> and,COND(=) %r19, %r1, %r0
> b,n .Ltracesys
> would still be no worse on the fast path and would not hit the slow
> path in
> a lot of cases when the current code does it for no apparent reason.
>
> Comments?
This looks good to me. There is similar code in entry.S.
I added the above change to my patch set for stable 3.3.6. System still
boots. I singled stepped /bin/ls through a directory listing and it
seemed
to work.
Dave
--
John David Anglin dave.anglin@bell.net
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [parisc] double restarts on multiple signal arrivals
2012-05-20 18:46 ` John David Anglin
2012-05-20 18:46 ` John David Anglin
@ 2012-05-20 20:38 ` Carlos O'Donell
1 sibling, 0 replies; 21+ messages in thread
From: Carlos O'Donell @ 2012-05-20 20:38 UTC (permalink / raw)
To: John David Anglin
Cc: Al Viro, Grant Grundler, linux-parisc, Linus Torvalds, linux-arch
On Sun, May 20, 2012 at 2:46 PM, John David Anglin <dave.anglin@bell.net> wrote:
> On 20-May-12, at 5:04 AM, Al Viro wrote:
>
>> Actually, looks like I am missing something, but it's not particulary
>> subtle.
>> SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
>> any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes
>> do_syscall_trace_leave()
>> do things. So checking one bit in flags is not enough - any of those 3 is
>> a reason for taking the slow path. The point still stands, though -
>> mfctl %cr30, %r1
>> LDREG TI_FLAGS(%r1),%r1
>> ldi (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP),
>> %r19
>> and,COND(=) %r19, %r1, %r0
>> b,n .Ltracesys
>> would still be no worse on the fast path and would not hit the slow path
>> in
>> a lot of cases when the current code does it for no apparent reason.
>>
>> Comments?
>
>
> This looks good to me. There is similar code in entry.S.
>
> I added the above change to my patch set for stable 3.3.6. System still
> boots. I singled stepped /bin/ls through a directory listing and it seemed
> to work.
I'm running this through the ringer on my local system.
I'll send out an update when I'm done testing.
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 21+ messages in thread