linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [parisc] double restarts on multiple signal arrivals
@ 2012-05-18 17:58 Al Viro
  2012-05-18 17:58 ` Al Viro
  2012-05-18 18:05 ` Grant Grundler
  0 siblings, 2 replies; 21+ messages in thread
From: Al Viro @ 2012-05-18 17:58 UTC (permalink / raw)
  To: linux-parisc; +Cc: Linus Torvalds, linux-arch

	Since "[PATCH] PA-RISC assembler cleanups and fixes" back in 2005,
(commit b6181a0a1999c7d4dd937d7f5234fb62eca68e89 in historical tree),
we do loop until all pending signals are dealt with.  Which is fine,
except that now we need to deal with syscall restart logics in all of
them.

	What happens if we have r28 containing e.g. -ERESTARTNOINTR
when we do a syscall?  Note that x86 analog of the code in parisc
syscall_restart() differs in one respect - the same register is used
for return value and syscall number.  So once we'd copied ->orig_ax
to ->ax, that's it - it *can't* be a restart-worthy value anymore,
since then we would've been in a syscall with negative syscall number
and ->ax before the assignment would've been -ENOSYS.  I.e. not a
restart-worthy value, so we wouldn't have hit that regs->ax = regs->orig_ax
in the first place.

	On parisc the counterpart of the above doesn't work; AFAICS,
we might have whatever we bloody please in r28 when we make a syscall.
Syscall number goes in r20, arguments are apparently in r26..r21.
Return value goes into r28.  Incidentally, why do we bother restoring
r28 on syscall restart at all?  Except for that mess with multiple
pending signals, the value we have in r28 on syscall entry doesn't
seem to affect the syscall behaviour...  Some HPUX compat fun?  But
AFAICS there calling conventions are also such that r28 on entry
doesn't affect anything...[1]

	Another fun question: what prevents syscall restarts on
rt_sigreturn()?  If original signal had been taken in syscall handler
we go through syscall_exit and that'll pass 1 as in_syscall to do_signal().
Are we relying on r28 in saved sigcontext never being restart-worthy,
seeing that we'd done syscall_restart()?  But that again relies on
->orig_r28 never containing such values; if it *did* contain one,
we are really screwed since on the path from rt_sigreturn() we'll have
->orig_r28 equal to r28 at the end of signal handler (not to mention
having moved back by twice the required amount - one time when we took
the original signal, another - now).

[1] FWIW, on other architectures the situation with restoring registers on
restarts looks so:
alpha		r0 (syscall number), r19 (arg4)
arm		r0 (arg1)
avr32		r12 (arg1)
bfin		p0 (syscall number), r0 (arg1)
c6x		a4 (arg1)
cris		r10 (arg1)
frv		gr8 (arg1)
h8300		er0 (syscall number)
hexagon		r06 (syscall number), r00 (arg1)
ia64		nothing
m32r		r0 (arg1)
m68k		d0 (syscall number)
microblaze	nothing
mips		r2 (syscall number), r7 (arg1)
mn10300		d0 (syscall number)
openrisc	r11 (syscall number)
parisc		r28
ppc		r3 (arg1)
s390		r2 (arg1)
score		r4 (syscall number), r7 (arg1)
sh		r0 or r9, depending on whether it's sh32 or sh4
sparc		i0 (arg1)
tile		r0 (arg1)
unicore32	r00 (arg1)
x86		[er]ax (syscall number)
xtensa		a2 (syscall number)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [parisc] double restarts on multiple signal arrivals
  2012-05-18 17:58 [parisc] double restarts on multiple signal arrivals Al Viro
@ 2012-05-18 17:58 ` Al Viro
  2012-05-18 18:05 ` Grant Grundler
  1 sibling, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-18 17:58 UTC (permalink / raw)
  To: linux-parisc; +Cc: Linus Torvalds, linux-arch

	Since "[PATCH] PA-RISC assembler cleanups and fixes" back in 2005,
(commit b6181a0a1999c7d4dd937d7f5234fb62eca68e89 in historical tree),
we do loop until all pending signals are dealt with.  Which is fine,
except that now we need to deal with syscall restart logics in all of
them.

	What happens if we have r28 containing e.g. -ERESTARTNOINTR
when we do a syscall?  Note that x86 analog of the code in parisc
syscall_restart() differs in one respect - the same register is used
for return value and syscall number.  So once we'd copied ->orig_ax
to ->ax, that's it - it *can't* be a restart-worthy value anymore,
since then we would've been in a syscall with negative syscall number
and ->ax before the assignment would've been -ENOSYS.  I.e. not a
restart-worthy value, so we wouldn't have hit that regs->ax = regs->orig_ax
in the first place.

	On parisc the counterpart of the above doesn't work; AFAICS,
we might have whatever we bloody please in r28 when we make a syscall.
Syscall number goes in r20, arguments are apparently in r26..r21.
Return value goes into r28.  Incidentally, why do we bother restoring
r28 on syscall restart at all?  Except for that mess with multiple
pending signals, the value we have in r28 on syscall entry doesn't
seem to affect the syscall behaviour...  Some HPUX compat fun?  But
AFAICS there calling conventions are also such that r28 on entry
doesn't affect anything...[1]

	Another fun question: what prevents syscall restarts on
rt_sigreturn()?  If original signal had been taken in syscall handler
we go through syscall_exit and that'll pass 1 as in_syscall to do_signal().
Are we relying on r28 in saved sigcontext never being restart-worthy,
seeing that we'd done syscall_restart()?  But that again relies on
->orig_r28 never containing such values; if it *did* contain one,
we are really screwed since on the path from rt_sigreturn() we'll have
->orig_r28 equal to r28 at the end of signal handler (not to mention
having moved back by twice the required amount - one time when we took
the original signal, another - now).

[1] FWIW, on other architectures the situation with restoring registers on
restarts looks so:
alpha		r0 (syscall number), r19 (arg4)
arm		r0 (arg1)
avr32		r12 (arg1)
bfin		p0 (syscall number), r0 (arg1)
c6x		a4 (arg1)
cris		r10 (arg1)
frv		gr8 (arg1)
h8300		er0 (syscall number)
hexagon		r06 (syscall number), r00 (arg1)
ia64		nothing
m32r		r0 (arg1)
m68k		d0 (syscall number)
microblaze	nothing
mips		r2 (syscall number), r7 (arg1)
mn10300		d0 (syscall number)
openrisc	r11 (syscall number)
parisc		r28
ppc		r3 (arg1)
s390		r2 (arg1)
score		r4 (syscall number), r7 (arg1)
sh		r0 or r9, depending on whether it's sh32 or sh4
sparc		i0 (arg1)
tile		r0 (arg1)
unicore32	r00 (arg1)
x86		[er]ax (syscall number)
xtensa		a2 (syscall number)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 17:58 [parisc] double restarts on multiple signal arrivals Al Viro
  2012-05-18 17:58 ` Al Viro
@ 2012-05-18 18:05 ` Grant Grundler
  2012-05-18 18:57   ` Al Viro
  2012-05-18 19:56   ` Al Viro
  1 sibling, 2 replies; 21+ messages in thread
From: Grant Grundler @ 2012-05-18 18:05 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-parisc, Linus Torvalds, linux-arch

On Fri, May 18, 2012 at 10:58 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
...
> Except for that mess with multiple
> pending signals, the value we have in r28 on syscall entry doesn't
> seem to affect the syscall behaviour...  Some HPUX compat fun?

We stopped trying to support HPUX compat support probably 8 or so
years ago. Since HP didn't care, no one else did either.  So no need
to consider it now.

thanks,
grant

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 18:05 ` Grant Grundler
@ 2012-05-18 18:57   ` Al Viro
  2012-05-18 18:57     ` Al Viro
  2012-05-18 19:56   ` Al Viro
  1 sibling, 1 reply; 21+ messages in thread
From: Al Viro @ 2012-05-18 18:57 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc, Linus Torvalds, linux-arch

On Fri, May 18, 2012 at 11:05:46AM -0700, Grant Grundler wrote:
> On Fri, May 18, 2012 at 10:58 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> ...
> >?Except for that mess with multiple
> > pending signals, the value we have in r28 on syscall entry doesn't
> > seem to affect the syscall behaviour... ?Some HPUX compat fun?
> 
> We stopped trying to support HPUX compat support probably 8 or so
> years ago. Since HP didn't care, no one else did either.  So no need
> to consider it now.

In any case, it doesn't look like something that might be HPUX-related -
there r28 is not used for arguments or syscall number either, as far
as I can tell...

That's a side story, in any case; whatever the reason for restoring
r28, it only masks the bug with double restarts.  If you enter syscall
with r28 equal to e.g. -ERESTARTNOINTR, get the same value from sys_whatever()
and have a couple of pending signals, you will have syscall_restart()
called twice, each time seeing regs->gr[28] == -ERESTARTNOINTR and leaving
it unchanged.  regs->gr[31] will be decremented by 8 on each of those
calls, first time back to your syscall (correctly), then to the entry point
of the first handler minus 8.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 18:57   ` Al Viro
@ 2012-05-18 18:57     ` Al Viro
  0 siblings, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-18 18:57 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc, Linus Torvalds, linux-arch

On Fri, May 18, 2012 at 11:05:46AM -0700, Grant Grundler wrote:
> On Fri, May 18, 2012 at 10:58 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> ...
> >?Except for that mess with multiple
> > pending signals, the value we have in r28 on syscall entry doesn't
> > seem to affect the syscall behaviour... ?Some HPUX compat fun?
> 
> We stopped trying to support HPUX compat support probably 8 or so
> years ago. Since HP didn't care, no one else did either.  So no need
> to consider it now.

In any case, it doesn't look like something that might be HPUX-related -
there r28 is not used for arguments or syscall number either, as far
as I can tell...

That's a side story, in any case; whatever the reason for restoring
r28, it only masks the bug with double restarts.  If you enter syscall
with r28 equal to e.g. -ERESTARTNOINTR, get the same value from sys_whatever()
and have a couple of pending signals, you will have syscall_restart()
called twice, each time seeing regs->gr[28] == -ERESTARTNOINTR and leaving
it unchanged.  regs->gr[31] will be decremented by 8 on each of those
calls, first time back to your syscall (correctly), then to the entry point
of the first handler minus 8.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 18:05 ` Grant Grundler
  2012-05-18 18:57   ` Al Viro
@ 2012-05-18 19:56   ` Al Viro
  2012-05-18 19:56     ` Al Viro
  2012-05-18 20:12     ` Grant Grundler
  1 sibling, 2 replies; 21+ messages in thread
From: Al Viro @ 2012-05-18 19:56 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc, Linus Torvalds, linux-arch

On Fri, May 18, 2012 at 11:05:46AM -0700, Grant Grundler wrote:
> On Fri, May 18, 2012 at 10:58 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> ...
> >?Except for that mess with multiple
> > pending signals, the value we have in r28 on syscall entry doesn't
> > seem to affect the syscall behaviour... ?Some HPUX compat fun?
> 
> We stopped trying to support HPUX compat support probably 8 or so
> years ago. Since HP didn't care, no one else did either.  So no need
> to consider it now.

BTW, what should we put into the trampolines of subsequent sigframes
when we are building more that one?  I.e. suppose we are in a syscall
and see two signals.  We build a sigframe for the first one, with
r25 = 1, call rt_sigreturn() in it.  Return address of original syscall
is stored into sigcontext of that frame, our return address set to the
entry point of handler1.  Now we see the second signal and build another
sigframe.  That one will have the entry point of the first handler stored
in sigframe and we'll set the things up so that return to userland will
land us in the entry of handler2.  Fine, but... what should we have in r25
for rt_sigreturn() in the second trampoline?

We return to userland and find ourselves in the beginning of handler2.
It's executed and eventually we return from it.  We are at the beginning
of the second trampoline now.  rt_sigreturn() is called, restores the
registers from the second sigcontext and returns to userland.  We are at
the beginning of handler1.  It is executed and returns.  We are at the
beginning of the first trampoline.  rt_sigreturn() _there_ restores the
original registers from the first sigcontext and we finally return to the
insn right after the original syscall (or to the syscall itself if we
took a syscall restart).

That final rt_sigreturn() will go through syscall_exit; OK, no problem
with that.  But should the earlier one (done after return from handler2()
and landing us at the entry of handler1) go through syscall_exit or
through syscall_exit_rfi?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 19:56   ` Al Viro
@ 2012-05-18 19:56     ` Al Viro
  2012-05-18 20:12     ` Grant Grundler
  1 sibling, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-18 19:56 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc, Linus Torvalds, linux-arch

On Fri, May 18, 2012 at 11:05:46AM -0700, Grant Grundler wrote:
> On Fri, May 18, 2012 at 10:58 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> ...
> >?Except for that mess with multiple
> > pending signals, the value we have in r28 on syscall entry doesn't
> > seem to affect the syscall behaviour... ?Some HPUX compat fun?
> 
> We stopped trying to support HPUX compat support probably 8 or so
> years ago. Since HP didn't care, no one else did either.  So no need
> to consider it now.

BTW, what should we put into the trampolines of subsequent sigframes
when we are building more that one?  I.e. suppose we are in a syscall
and see two signals.  We build a sigframe for the first one, with
r25 = 1, call rt_sigreturn() in it.  Return address of original syscall
is stored into sigcontext of that frame, our return address set to the
entry point of handler1.  Now we see the second signal and build another
sigframe.  That one will have the entry point of the first handler stored
in sigframe and we'll set the things up so that return to userland will
land us in the entry of handler2.  Fine, but... what should we have in r25
for rt_sigreturn() in the second trampoline?

We return to userland and find ourselves in the beginning of handler2.
It's executed and eventually we return from it.  We are at the beginning
of the second trampoline now.  rt_sigreturn() is called, restores the
registers from the second sigcontext and returns to userland.  We are at
the beginning of handler1.  It is executed and returns.  We are at the
beginning of the first trampoline.  rt_sigreturn() _there_ restores the
original registers from the first sigcontext and we finally return to the
insn right after the original syscall (or to the syscall itself if we
took a syscall restart).

That final rt_sigreturn() will go through syscall_exit; OK, no problem
with that.  But should the earlier one (done after return from handler2()
and landing us at the entry of handler1) go through syscall_exit or
through syscall_exit_rfi?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 19:56   ` Al Viro
  2012-05-18 19:56     ` Al Viro
@ 2012-05-18 20:12     ` Grant Grundler
  2012-05-18 20:15       ` Carlos O'Donell
  1 sibling, 1 reply; 21+ messages in thread
From: Grant Grundler @ 2012-05-18 20:12 UTC (permalink / raw)
  To: Al Viro, Carlos O'Donell, John David Anglin
  Cc: linux-parisc, Linus Torvalds, linux-arch

+patofiero,anglin

On Fri, May 18, 2012 at 12:56 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, May 18, 2012 at 11:05:46AM -0700, Grant Grundler wrote:
>> On Fri, May 18, 2012 at 10:58 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> ...
>> >?Except for that mess with multiple
>> > pending signals, the value we have in r28 on syscall entry doesn't
>> > seem to affect the syscall behaviour... ?Some HPUX compat fun?
>>
>> We stopped trying to support HPUX compat support probably 8 or so
>> years ago. Since HP didn't care, no one else did either.  So no need
>> to consider it now.
>
> BTW, what should we put into the trampolines of subsequent sigframes
> when we are building more that one?

Sorry - I only knew about the state of the HPUX support...Carlos and
Dave (and a few others) understand the syscall interface. I've include
them directly and hopefully they can respond soon.

cheers,
grant

>  I.e. suppose we are in a syscall
> and see two signals.  We build a sigframe for the first one, with
> r25 = 1, call rt_sigreturn() in it.  Return address of original syscall
> is stored into sigcontext of that frame, our return address set to the
> entry point of handler1.  Now we see the second signal and build another
> sigframe.  That one will have the entry point of the first handler stored
> in sigframe and we'll set the things up so that return to userland will
> land us in the entry of handler2.  Fine, but... what should we have in r25
> for rt_sigreturn() in the second trampoline?
>
> We return to userland and find ourselves in the beginning of handler2.
> It's executed and eventually we return from it.  We are at the beginning
> of the second trampoline now.  rt_sigreturn() is called, restores the
> registers from the second sigcontext and returns to userland.  We are at
> the beginning of handler1.  It is executed and returns.  We are at the
> beginning of the first trampoline.  rt_sigreturn() _there_ restores the
> original registers from the first sigcontext and we finally return to the
> insn right after the original syscall (or to the syscall itself if we
> took a syscall restart).
>
> That final rt_sigreturn() will go through syscall_exit; OK, no problem
> with that.  But should the earlier one (done after return from handler2()
> and landing us at the entry of handler1) go through syscall_exit or
> through syscall_exit_rfi?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 20:12     ` Grant Grundler
@ 2012-05-18 20:15       ` Carlos O'Donell
  2012-05-18 22:24         ` Al Viro
  0 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2012-05-18 20:15 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Al Viro, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

On Fri, May 18, 2012 at 4:12 PM, Grant Grundler <grantgrundler@gmail.com> wrote:
> +patofiero,anglin
>
> On Fri, May 18, 2012 at 12:56 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Fri, May 18, 2012 at 11:05:46AM -0700, Grant Grundler wrote:
>>> On Fri, May 18, 2012 at 10:58 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>> ...
>>> >?Except for that mess with multiple
>>> > pending signals, the value we have in r28 on syscall entry doesn't
>>> > seem to affect the syscall behaviour... ?Some HPUX compat fun?
>>>
>>> We stopped trying to support HPUX compat support probably 8 or so
>>> years ago. Since HP didn't care, no one else did either.  So no need
>>> to consider it now.
>>
>> BTW, what should we put into the trampolines of subsequent sigframes
>> when we are building more that one?
>
> Sorry - I only knew about the state of the HPUX support...Carlos and
> Dave (and a few others) understand the syscall interface. I've include
> them directly and hopefully they can respond soon.

I've hit "Mark as unread" twice on this thread now because I *am*
interested in this conversation.

However, I must confess that the original email reads like a rant
without a clear explanation of exactly what you think is wrong.

I'm familiar with the PARISC signal code because I worked on the
32/64-bit compat support for the frame layout, I also helped Randolph
work on the syscall restart logic.

What I don't understand is what's wrong with it now, and your subject
begins to hint at the problem.

Can you describe in more detail exactly what problem we need to solve?

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 20:15       ` Carlos O'Donell
@ 2012-05-18 22:24         ` Al Viro
  2012-05-19  1:36           ` Carlos O'Donell
                             ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Al Viro @ 2012-05-18 22:24 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

On Fri, May 18, 2012 at 04:15:42PM -0400, Carlos O'Donell wrote:

> I've hit "Mark as unread" twice on this thread now because I *am*
> interested in this conversation.
> 
> However, I must confess that the original email reads like a rant
> without a clear explanation of exactly what you think is wrong.
> 
> I'm familiar with the PARISC signal code because I worked on the
> 32/64-bit compat support for the frame layout, I also helped Randolph
> work on the syscall restart logic.
> 
> What I don't understand is what's wrong with it now, and your subject
> begins to hint at the problem.
> 
> Can you describe in more detail exactly what problem we need to solve?

Sure.  Have e.g. read(2) on a pipe called when you happen to have -512 in
r28 and have two signals arrive at once while inside that read().  Both
with user handlers.  Both with SA_RESTART.

When we enter the kernel, we have pt_regs created on kernel stack and
filled with the values we had in registers.  Among other things, the
value of r28 is stored in regs->orig_r28.  Note that arguments and the
syscall number are all passed through other registers, so setting the
arguments for syscall did not overwrite the value we used to have in
r28.  It's still -512.

We find which C function to call (sys_read) and go there, with return
address set to syscall_exit.

sys_read() ends up calling pipe_read(), which finds pipe empty and goes
to sleep.  The signals arrive and pipe_read() is woken up.  It sees that
	* signal_pending() is true
	* nothing has been read yet
so it return -ERESTARSYS.  sys_read() returns what it got from pipe_read().

Now we are at syscall_exit.  We set r1 to pt_regs created on stack when
we entered the syscall.  Return value of sys_read() is taken from r28 and
stored into regs->gr[28].  We check TIF_NEED_RESCHED and it's not set.
We'd reached syscall_check_sig and check TIF_SIGPENDING, which is set.
We go through syscall_do_signal and reach the call of
do_notify_resume(regs, 1), which calls do_signal(regs, 1).

There we call get_signal_to_deliver() and get one of those two signals
dequeued and returned.  Note that TIF_SIGPENDING is still set - there is
the second signal left.  We got positive signr, so we do not leave the
loop yet.  in_syscall is 1, so we call syscall_restart().

We check regs->gr[28], which is -ERESTARTSYS.  SA_RESTART is set in
sa_flags, so we fall through to the next case, which is
                regs->gr[31] -= 8; /* delayed branching */
                /* Preserve original r28. */
                regs->gr[28] = regs->orig_r28;
Note that ERESTARTSYS is 512, so the second assignment does nothing - we
still have regs->gr[28] equal to -ERESTARTSYS.  We are done with
syscall_restart().

Now we call handle_signal(), which calls setup_rt_frame().  sigframe is
allocated on userland stack and filled accordingly to the contents of *regs.
No errors occured, so we set regs->gr[31] to entry point of signal handler,
regs->gr[2] to address of sigreturn trampoline we'd created in the sigframe,
regs->gr[26..24] to arguments of handler, and regs->gr[30] to new value of
userland stack pointer.

Back in handle_signal(), we adjust the set of blocked signals and return 1.
do_signal() sees that RESTORE_SIGMASK is not set and it's done - we return
to do_notify_resume() and from there back into entry.S.

We reload registers from pt_regs and go to syscall_check_sig again.  There
we reread the thread flags and see that SIGPENDING is still set - we have
another signal to deal with.  We hit do_notify_resume() and do_signal()
again.  And again, in_syscall is 1.

do_signal() gets the remaining pending signal from get_signal_to_deliver().
Again, we call syscall_restart().  And neither regs->gr[28] nor regs->orig_r28
has been changed, so we hit the same case.  Assignment to regs->gr[28] is
a no-op again.  But look what happened to regs->gr[31] - it got decremented
by 8.  So it points 8 bytes before the entry point of the first handler now.

We proceed to set the second sigframe up.  The registers are saved in
sigcontext, adjusted so that they correspond to the call of the second
handler with future return to the second trampoline and we return all the
way out to entry.S.  There we go to syscall_check_sig again, but this
time we have no pending signals left, so we reach syscall_restore and
proceed to userland.

Now we are in the second handler.  It's executed to the end and when it
returns, we go to the address we had in r2.  I.e. the second trampoline.
There we get r25 set to 1, r20 set to __NR_rt_sigreturn and we go to
execute the syscall.  Which restores the registers from sigcontext and,
since its in_syscall argument is 1, sets ->gr[31] to sc->sc_iaoq.  Which
had been set by setup_rt_sigcontext() from regs->gr[31] at the moment it
had been called, i.e. the entry point of the first handler decremented by 8.  
Then we proceed to return to userland.  And arrive there are the right
arguments for the first handler, but instruction pointer is 8 bytes too
early.  Bad things happen, obviously.

The point where the things go wrong is the second call of syscall_restart().
It should've done nothing.  We need to do syscall restarts only when building
the first sigframe.  And usually it does work out that way, simply because
the value you happened to have in r28 is probably not one of the 4 that
make syscall_restart() do things.  So after the first time around it does
nothing.  Very similar logics on x86 does, indeed, work in all cases.
There the register used for return values happens to be the same as we use
to pass syscall number, so you can't arrange a call of syscall_restart()
analog that would trigger restart *and* leave the return value register
unchanged - you'd need %eax to be -ERESTART<something> when you enter the
syscall and that would immediately result in regs->ax set to -ENOSYS.  I.e.
no restarts would be done.  However, on parisc that doesn't work - you can
start with r28 being -ERESTARTSYS and get -ERESTARTSYS from syscall.

And no, I don't have any good suggestions on how to fix it right now.  Depends
on how does gdb on parisc play with pt_regs, among other things...

Sorry if the original posting sounded like a rant - I'd spent the last month
crawling through the assembler glue around signal hanlding all over arch/*
and that probably affected the tone.  Definitely had been affecting the amount
of spoken obscenities lately...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 22:24         ` Al Viro
@ 2012-05-19  1:36           ` Carlos O'Donell
  2012-05-19  5:26           ` Al Viro
  2012-05-20  8:40           ` Al Viro
  2 siblings, 0 replies; 21+ messages in thread
From: Carlos O'Donell @ 2012-05-19  1:36 UTC (permalink / raw)
  To: Al Viro
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

On Fri, May 18, 2012 at 6:24 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, May 18, 2012 at 04:15:42PM -0400, Carlos O'Donell wrote:
>
>> I've hit "Mark as unread" twice on this thread now because I *am*
>> interested in this conversation.
>>
>> However, I must confess that the original email reads like a rant
>> without a clear explanation of exactly what you think is wrong.
>>
>> I'm familiar with the PARISC signal code because I worked on the
>> 32/64-bit compat support for the frame layout, I also helped Randolph
>> work on the syscall restart logic.
>>
>> What I don't understand is what's wrong with it now, and your subject
>> begins to hint at the problem.
>>
>> Can you describe in more detail exactly what problem we need to solve?
>
> Sure.  Have e.g. read(2) on a pipe called when you happen to have -512 in
> r28 and have two signals arrive at once while inside that read().  Both
> with user handlers.  Both with SA_RESTART.
>
> When we enter the kernel, we have pt_regs created on kernel stack and
> filled with the values we had in registers.  Among other things, the
> value of r28 is stored in regs->orig_r28.  Note that arguments and the
> syscall number are all passed through other registers, so setting the
> arguments for syscall did not overwrite the value we used to have in
> r28.  It's still -512.
>
> We find which C function to call (sys_read) and go there, with return
> address set to syscall_exit.
>
> sys_read() ends up calling pipe_read(), which finds pipe empty and goes
> to sleep.  The signals arrive and pipe_read() is woken up.  It sees that
>        * signal_pending() is true
>        * nothing has been read yet
> so it return -ERESTARSYS.  sys_read() returns what it got from pipe_read().
>
> Now we are at syscall_exit.  We set r1 to pt_regs created on stack when
> we entered the syscall.  Return value of sys_read() is taken from r28 and
> stored into regs->gr[28].  We check TIF_NEED_RESCHED and it's not set.
> We'd reached syscall_check_sig and check TIF_SIGPENDING, which is set.
> We go through syscall_do_signal and reach the call of
> do_notify_resume(regs, 1), which calls do_signal(regs, 1).
>
> There we call get_signal_to_deliver() and get one of those two signals
> dequeued and returned.  Note that TIF_SIGPENDING is still set - there is
> the second signal left.  We got positive signr, so we do not leave the
> loop yet.  in_syscall is 1, so we call syscall_restart().
>
> We check regs->gr[28], which is -ERESTARTSYS.  SA_RESTART is set in
> sa_flags, so we fall through to the next case, which is
>                regs->gr[31] -= 8; /* delayed branching */
>                /* Preserve original r28. */
>                regs->gr[28] = regs->orig_r28;
> Note that ERESTARTSYS is 512, so the second assignment does nothing - we
> still have regs->gr[28] equal to -ERESTARTSYS.  We are done with
> syscall_restart().
>
> Now we call handle_signal(), which calls setup_rt_frame().  sigframe is
> allocated on userland stack and filled accordingly to the contents of *regs.
> No errors occured, so we set regs->gr[31] to entry point of signal handler,
> regs->gr[2] to address of sigreturn trampoline we'd created in the sigframe,
> regs->gr[26..24] to arguments of handler, and regs->gr[30] to new value of
> userland stack pointer.
>
> Back in handle_signal(), we adjust the set of blocked signals and return 1.
> do_signal() sees that RESTORE_SIGMASK is not set and it's done - we return
> to do_notify_resume() and from there back into entry.S.
>
> We reload registers from pt_regs and go to syscall_check_sig again.  There
> we reread the thread flags and see that SIGPENDING is still set - we have
> another signal to deal with.  We hit do_notify_resume() and do_signal()
> again.  And again, in_syscall is 1.
>
> do_signal() gets the remaining pending signal from get_signal_to_deliver().
> Again, we call syscall_restart().  And neither regs->gr[28] nor regs->orig_r28
> has been changed, so we hit the same case.  Assignment to regs->gr[28] is
> a no-op again.  But look what happened to regs->gr[31] - it got decremented
> by 8.  So it points 8 bytes before the entry point of the first handler now.
>
> We proceed to set the second sigframe up.  The registers are saved in
> sigcontext, adjusted so that they correspond to the call of the second
> handler with future return to the second trampoline and we return all the
> way out to entry.S.  There we go to syscall_check_sig again, but this
> time we have no pending signals left, so we reach syscall_restore and
> proceed to userland.
>
> Now we are in the second handler.  It's executed to the end and when it
> returns, we go to the address we had in r2.  I.e. the second trampoline.
> There we get r25 set to 1, r20 set to __NR_rt_sigreturn and we go to
> execute the syscall.  Which restores the registers from sigcontext and,
> since its in_syscall argument is 1, sets ->gr[31] to sc->sc_iaoq.  Which
> had been set by setup_rt_sigcontext() from regs->gr[31] at the moment it
> had been called, i.e. the entry point of the first handler decremented by 8.
> Then we proceed to return to userland.  And arrive there are the right
> arguments for the first handler, but instruction pointer is 8 bytes too
> early.  Bad things happen, obviously.
>
> The point where the things go wrong is the second call of syscall_restart().
> It should've done nothing.  We need to do syscall restarts only when building
> the first sigframe.  And usually it does work out that way, simply because
> the value you happened to have in r28 is probably not one of the 4 that
> make syscall_restart() do things.  So after the first time around it does
> nothing.  Very similar logics on x86 does, indeed, work in all cases.
> There the register used for return values happens to be the same as we use
> to pass syscall number, so you can't arrange a call of syscall_restart()
> analog that would trigger restart *and* leave the return value register
> unchanged - you'd need %eax to be -ERESTART<something> when you enter the
> syscall and that would immediately result in regs->ax set to -ENOSYS.  I.e.
> no restarts would be done.  However, on parisc that doesn't work - you can
> start with r28 being -ERESTARTSYS and get -ERESTARTSYS from syscall.
>
> And no, I don't have any good suggestions on how to fix it right now.  Depends
> on how does gdb on parisc play with pt_regs, among other things...
>
> Sorry if the original posting sounded like a rant - I'd spent the last month
> crawling through the assembler glue around signal hanlding all over arch/*
> and that probably affected the tone.  Definitely had been affecting the amount
> of spoken obscenities lately...

Thanks Al. This is description is a work of art, I wish all
reports looked like this. I'll go over this code tomorrow and
get back to you with my thoughts. I really appreciate the
thorough review.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 22:24         ` Al Viro
  2012-05-19  1:36           ` Carlos O'Donell
@ 2012-05-19  5:26           ` Al Viro
  2012-05-19 13:37             ` Al Viro
  2012-05-20  8:40           ` Al Viro
  2 siblings, 1 reply; 21+ messages in thread
From: Al Viro @ 2012-05-19  5:26 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

On Fri, May 18, 2012 at 11:24:22PM +0100, Al Viro wrote:

> And no, I don't have any good suggestions on how to fix it right now.  Depends
> on how does gdb on parisc play with pt_regs, among other things...

I think we might be able to cheat and pull off the following:
	* stop restoring r28 on restarts.  Behaviour of syscall should not
depend on the value of r28 at the moment we enter it and it does clobber
r28 in all cases - that's where the return value goes to, after all.  And
the instruction we are restarting would better be a syscall.
	* with that done, we can reuse ->orig_r28 for any purpose.  In
particular, we can use it as "no restarts allowed" flag.  Just have it set
to zero instead of r28 in linux_gateway_entry and hpux_gateway_page, then
in syscall_restart() and insert_restart_trampoline() check that
it's still 0, go away if it isn't and set it to 1 otherwise.  Set
it to 1 in sigreturn, to make it indifferent to what we might or might
not find in r28 stored in sigcontext (strictly speaking that's not needed,
but it's less brittle than relying on rather subtle properties of what
could end up in sigcontext and trampoline; the whole thing is a nice
example of how easily subtle things break and how long does that stay
unnoticed)

	Another potential solution would be a trick a-la MIPS, but you are
already using that slot in regs->gr[] for "is it a syscall" flag (psw for
interrupt, 0 for syscall) and touching that is very likely to end up with
very unhappy gdb...

So how about the following:

diff --git a/arch/parisc/hpux/gate.S b/arch/parisc/hpux/gate.S
index 38a1c1b..0114688 100644
--- a/arch/parisc/hpux/gate.S
+++ b/arch/parisc/hpux/gate.S
@@ -71,7 +71,7 @@ ENTRY(hpux_gateway_page)
 	STREG	%r26, TASK_PT_GR26(%r1)	 	/* 1st argument */
 	STREG	%r27, TASK_PT_GR27(%r1)		/* user dp */
 	STREG   %r28, TASK_PT_GR28(%r1)         /* return value 0 */
-	STREG   %r28, TASK_PT_ORIG_R28(%r1)     /* return value 0 (saved for signals) */
+	STREG   %r0, TASK_PT_ORIG_R28(%r1)     /* don't prohibit restarts */
 	STREG   %r29, TASK_PT_GR29(%r1)         /* 8th argument */
 	STREG	%r31, TASK_PT_GR31(%r1)		/* preserve syscall return ptr */
 	
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 12c1ed3..b4dd2c1 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -88,6 +88,7 @@ restore_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs)
 	err |= __copy_from_user(regs->iaoq, sc->sc_iaoq, sizeof(regs->iaoq));
 	err |= __copy_from_user(regs->iasq, sc->sc_iasq, sizeof(regs->iasq));
 	err |= __get_user(regs->sar, &sc->sc_sar);
+	regs->orig_r28 = 1; /* no restarts for sigreturn */
 	DBG(2,"restore_sigcontext: iaoq is 0x%#lx / 0x%#lx\n", 
 			regs->iaoq[0],regs->iaoq[1]);
 	DBG(2,"restore_sigcontext: r28 is %ld\n", regs->gr[28]);
@@ -471,6 +472,9 @@ handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka,
 static inline void
 syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 {
+	if (regs->orig_r28)
+		return;
+	regs->orig_r28 = 1; /* no more restarts */
 	/* Check the return code */
 	switch (regs->gr[28]) {
 	case -ERESTART_RESTARTBLOCK:
@@ -493,8 +497,6 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 		 * we have to do is fiddle the return pointer.
 		 */
 		regs->gr[31] -= 8; /* delayed branching */
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;
 		break;
 	}
 }
@@ -502,6 +504,9 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 static inline void
 insert_restart_trampoline(struct pt_regs *regs)
 {
+	if (regs->orig_r28)
+		return;
+	regs->orig_r28 = 1; /* no more restarts */
 	switch(regs->gr[28]) {
 	case -ERESTART_RESTARTBLOCK: {
 		/* Restart the system call - no handlers present */
@@ -536,9 +541,6 @@ insert_restart_trampoline(struct pt_regs *regs)
 		flush_user_icache_range(regs->gr[30], regs->gr[30] + 4);
 
 		regs->gr[31] = regs->gr[30] + 8;
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;
-
 		return;
 	}
 	case -ERESTARTNOHAND:
@@ -550,9 +552,6 @@ insert_restart_trampoline(struct pt_regs *regs)
 		 * slot of the branch external instruction.
 		 */
 		regs->gr[31] -= 8;
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;
-
 		return;
 	}
 	default:
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index 82a52b2..54a9cbf 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -156,7 +156,7 @@ linux_gateway_entry:
 	STREG	%r26, TASK_PT_GR26(%r1)	 	/* 1st argument */
 	STREG	%r27, TASK_PT_GR27(%r1)		/* user dp */
 	STREG   %r28, TASK_PT_GR28(%r1)         /* return value 0 */
-	STREG   %r28, TASK_PT_ORIG_R28(%r1)     /* return value 0 (saved for signals) */
+	STREG   %r0, TASK_PT_ORIG_R28(%r1)      /* don't prohibit restarts */
 	STREG	%r29, TASK_PT_GR29(%r1)		/* return value 1 */
 	STREG	%r31, TASK_PT_GR31(%r1)		/* preserve syscall return ptr */
 	

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-19  5:26           ` Al Viro
@ 2012-05-19 13:37             ` Al Viro
  2012-05-19 13:37               ` Al Viro
  0 siblings, 1 reply; 21+ messages in thread
From: Al Viro @ 2012-05-19 13:37 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

On Sat, May 19, 2012 at 06:26:23AM +0100, Al Viro wrote:
> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
> index 12c1ed3..b4dd2c1 100644
> --- a/arch/parisc/kernel/signal.c
> +++ b/arch/parisc/kernel/signal.c
> @@ -88,6 +88,7 @@ restore_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs)
>  	err |= __copy_from_user(regs->iaoq, sc->sc_iaoq, sizeof(regs->iaoq));
>  	err |= __copy_from_user(regs->iasq, sc->sc_iasq, sizeof(regs->iasq));
>  	err |= __get_user(regs->sar, &sc->sc_sar);
> +	regs->orig_r28 = 1; /* no restarts for sigreturn */
>  	DBG(2,"restore_sigcontext: iaoq is 0x%#lx / 0x%#lx\n", 
>  			regs->iaoq[0],regs->iaoq[1]);
>  	DBG(2,"restore_sigcontext: r28 is %ld\n", regs->gr[28]);

... only that part should be done in sys_rt_sigreturn() instead, or we'll miss
it for 32bit-on-64bit case.

diff --git a/arch/parisc/hpux/gate.S b/arch/parisc/hpux/gate.S
index 38a1c1b..0114688 100644
--- a/arch/parisc/hpux/gate.S
+++ b/arch/parisc/hpux/gate.S
@@ -71,7 +71,7 @@ ENTRY(hpux_gateway_page)
 	STREG	%r26, TASK_PT_GR26(%r1)	 	/* 1st argument */
 	STREG	%r27, TASK_PT_GR27(%r1)		/* user dp */
 	STREG   %r28, TASK_PT_GR28(%r1)         /* return value 0 */
-	STREG   %r28, TASK_PT_ORIG_R28(%r1)     /* return value 0 (saved for signals) */
+	STREG   %r0, TASK_PT_ORIG_R28(%r1)     /* don't prohibit restarts */
 	STREG   %r29, TASK_PT_GR29(%r1)         /* 8th argument */
 	STREG	%r31, TASK_PT_GR31(%r1)		/* preserve syscall return ptr */
 	
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 12c1ed3..369b1c4 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -115,6 +115,8 @@ sys_rt_sigreturn(struct pt_regs *regs, int in_syscall)
 		(usp - sigframe_size);
 	DBG(2,"sys_rt_sigreturn: frame is %p\n", frame);
 
+	regs->orig_r28 = 1; /* no restarts for sigreturn */
+
 #ifdef CONFIG_64BIT
 	compat_frame = (struct compat_rt_sigframe __user *)frame;
 	
@@ -471,6 +473,9 @@ handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka,
 static inline void
 syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 {
+	if (regs->orig_r28)
+		return;
+	regs->orig_r28 = 1; /* no more restarts */
 	/* Check the return code */
 	switch (regs->gr[28]) {
 	case -ERESTART_RESTARTBLOCK:
@@ -493,8 +498,6 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 		 * we have to do is fiddle the return pointer.
 		 */
 		regs->gr[31] -= 8; /* delayed branching */
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;
 		break;
 	}
 }
@@ -502,6 +505,9 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 static inline void
 insert_restart_trampoline(struct pt_regs *regs)
 {
+	if (regs->orig_r28)
+		return;
+	regs->orig_r28 = 1; /* no more restarts */
 	switch(regs->gr[28]) {
 	case -ERESTART_RESTARTBLOCK: {
 		/* Restart the system call - no handlers present */
@@ -536,9 +542,6 @@ insert_restart_trampoline(struct pt_regs *regs)
 		flush_user_icache_range(regs->gr[30], regs->gr[30] + 4);
 
 		regs->gr[31] = regs->gr[30] + 8;
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;
-
 		return;
 	}
 	case -ERESTARTNOHAND:
@@ -550,9 +553,6 @@ insert_restart_trampoline(struct pt_regs *regs)
 		 * slot of the branch external instruction.
 		 */
 		regs->gr[31] -= 8;
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-19 13:37             ` Al Viro
@ 2012-05-19 13:37               ` Al Viro
  0 siblings, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-19 13:37 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

On Sat, May 19, 2012 at 06:26:23AM +0100, Al Viro wrote:
> diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
> index 12c1ed3..b4dd2c1 100644
> --- a/arch/parisc/kernel/signal.c
> +++ b/arch/parisc/kernel/signal.c
> @@ -88,6 +88,7 @@ restore_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs)
>  	err |= __copy_from_user(regs->iaoq, sc->sc_iaoq, sizeof(regs->iaoq));
>  	err |= __copy_from_user(regs->iasq, sc->sc_iasq, sizeof(regs->iasq));
>  	err |= __get_user(regs->sar, &sc->sc_sar);
> +	regs->orig_r28 = 1; /* no restarts for sigreturn */
>  	DBG(2,"restore_sigcontext: iaoq is 0x%#lx / 0x%#lx\n", 
>  			regs->iaoq[0],regs->iaoq[1]);
>  	DBG(2,"restore_sigcontext: r28 is %ld\n", regs->gr[28]);

... only that part should be done in sys_rt_sigreturn() instead, or we'll miss
it for 32bit-on-64bit case.

diff --git a/arch/parisc/hpux/gate.S b/arch/parisc/hpux/gate.S
index 38a1c1b..0114688 100644
--- a/arch/parisc/hpux/gate.S
+++ b/arch/parisc/hpux/gate.S
@@ -71,7 +71,7 @@ ENTRY(hpux_gateway_page)
 	STREG	%r26, TASK_PT_GR26(%r1)	 	/* 1st argument */
 	STREG	%r27, TASK_PT_GR27(%r1)		/* user dp */
 	STREG   %r28, TASK_PT_GR28(%r1)         /* return value 0 */
-	STREG   %r28, TASK_PT_ORIG_R28(%r1)     /* return value 0 (saved for signals) */
+	STREG   %r0, TASK_PT_ORIG_R28(%r1)     /* don't prohibit restarts */
 	STREG   %r29, TASK_PT_GR29(%r1)         /* 8th argument */
 	STREG	%r31, TASK_PT_GR31(%r1)		/* preserve syscall return ptr */
 	
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 12c1ed3..369b1c4 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -115,6 +115,8 @@ sys_rt_sigreturn(struct pt_regs *regs, int in_syscall)
 		(usp - sigframe_size);
 	DBG(2,"sys_rt_sigreturn: frame is %p\n", frame);
 
+	regs->orig_r28 = 1; /* no restarts for sigreturn */
+
 #ifdef CONFIG_64BIT
 	compat_frame = (struct compat_rt_sigframe __user *)frame;
 	
@@ -471,6 +473,9 @@ handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka,
 static inline void
 syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 {
+	if (regs->orig_r28)
+		return;
+	regs->orig_r28 = 1; /* no more restarts */
 	/* Check the return code */
 	switch (regs->gr[28]) {
 	case -ERESTART_RESTARTBLOCK:
@@ -493,8 +498,6 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 		 * we have to do is fiddle the return pointer.
 		 */
 		regs->gr[31] -= 8; /* delayed branching */
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;
 		break;
 	}
 }
@@ -502,6 +505,9 @@ syscall_restart(struct pt_regs *regs, struct k_sigaction *ka)
 static inline void
 insert_restart_trampoline(struct pt_regs *regs)
 {
+	if (regs->orig_r28)
+		return;
+	regs->orig_r28 = 1; /* no more restarts */
 	switch(regs->gr[28]) {
 	case -ERESTART_RESTARTBLOCK: {
 		/* Restart the system call - no handlers present */
@@ -536,9 +542,6 @@ insert_restart_trampoline(struct pt_regs *regs)
 		flush_user_icache_range(regs->gr[30], regs->gr[30] + 4);
 
 		regs->gr[31] = regs->gr[30] + 8;
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;
-
 		return;
 	}
 	case -ERESTARTNOHAND:
@@ -550,9 +553,6 @@ insert_restart_trampoline(struct pt_regs *regs)
 		 * slot of the branch external instruction.
 		 */
 		regs->gr[31] -= 8;
-		/* Preserve original r28. */
-		regs->gr[28] = regs->orig_r28;
-
 		return;
 	}
 	default:
diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
index 82a52b2..54a9cbf 100644
--- a/arch/parisc/kernel/syscall.S
+++ b/arch/parisc/kernel/syscall.S
@@ -156,7 +156,7 @@ linux_gateway_entry:
 	STREG	%r26, TASK_PT_GR26(%r1)	 	/* 1st argument */
 	STREG	%r27, TASK_PT_GR27(%r1)		/* user dp */
 	STREG   %r28, TASK_PT_GR28(%r1)         /* return value 0 */
-	STREG   %r28, TASK_PT_ORIG_R28(%r1)     /* return value 0 (saved for signals) */
+	STREG   %r0, TASK_PT_ORIG_R28(%r1)      /* don't prohibit restarts */
 	STREG	%r29, TASK_PT_GR29(%r1)		/* return value 1 */
 	STREG	%r31, TASK_PT_GR31(%r1)		/* preserve syscall return ptr */
 	

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-18 22:24         ` Al Viro
  2012-05-19  1:36           ` Carlos O'Donell
  2012-05-19  5:26           ` Al Viro
@ 2012-05-20  8:40           ` Al Viro
  2012-05-20  8:40             ` Al Viro
  2012-05-20  9:04             ` Al Viro
  2 siblings, 2 replies; 21+ messages in thread
From: Al Viro @ 2012-05-20  8:40 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

	BTW, after looking through the asm glue there, I've noticed something
rather odd.  *All* architectures other than parisc have the following thing
done on syscall entry:
	* check if we have TIF_SYSCALL_TRACE in flags; bugger off to slow
path if we do.
	* there check if we have PT_PTRACED in current->ptrace and if so
do the standard song and dance with raising SIGTRAP or SIGTRAP | 0x80,
etc.

parisc does that in the opposite order - it goes to slow path if
current->ptrace & PT_PTRACED is true.  The same actions with raising
SIGTRAP, etc. are done only if TIF_SYSCALL_TRACE is there as well,
but by that point we'd already plonked a bunch of registers on the
stack and we actually stay on the slow path even in absense of SYSCALL_TRACE.

The thing is, other architectures have a good reason for looking at
SYSCALL_TRACE first and not bothering with all that fun if it's not set.
PT_PTRACED is set when the task is being traced; it does *not* imply
stopping on every syscall entry.  E.g. a perfectly normal situation is
when in gdb you've said
break foo.c:69
cont
and let the process run until it hits that breakpoint.  It has PT_PTRACED
all along.  It will be stopped when it gets a signal, including SIGTRAP
we'll get when it steps on that breakpoint.  But by that time it's very
likely to have passed through tons of syscalls without stopping on any of
them.  SYSCALL_TRACE is set *only* when we want the process to stop on the
next syscall.  That's what strace(1) and friends are using.  So it doesn't
make sense to get overhead for all processes that have PT_PTRACED;
SYSCALL_TRACE is less likely, so checking it first is an obvious approach.

Moreover, checking PT_TRACED first is not even cheaper in the common case
(i.e. when branch to .Ltracesys is not taken at all).  As it is, parisc does
        mfctl   %cr30, %r1
        LDREG   TI_TASK(%r1),%r1
        ldw     TASK_PTRACE(%r1), %r1
        bb,<,n  %r1,31,.Ltracesys
and that's actually an extra dereference compared to
        mfctl   %cr30, %r1
        LDREG   TI_FLAGS(%r1),%r1
        bb,<,n  %r1,31 - TIF_SYSCALL_TRACE,.Ltracesys
Note that tracehook_report_syscall_entry/tracehook_report_syscall_exit
are checking PT_PTRACED, so it's not like we needed to change anything
other than that spot - resulting logics will be equivalent to what we
have right now.  Looks like a fairly straightforward optimisation...
Am I missing something subtle and parisc-specific here?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-20  8:40           ` Al Viro
@ 2012-05-20  8:40             ` Al Viro
  2012-05-20  9:04             ` Al Viro
  1 sibling, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-20  8:40 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

	BTW, after looking through the asm glue there, I've noticed something
rather odd.  *All* architectures other than parisc have the following thing
done on syscall entry:
	* check if we have TIF_SYSCALL_TRACE in flags; bugger off to slow
path if we do.
	* there check if we have PT_PTRACED in current->ptrace and if so
do the standard song and dance with raising SIGTRAP or SIGTRAP | 0x80,
etc.

parisc does that in the opposite order - it goes to slow path if
current->ptrace & PT_PTRACED is true.  The same actions with raising
SIGTRAP, etc. are done only if TIF_SYSCALL_TRACE is there as well,
but by that point we'd already plonked a bunch of registers on the
stack and we actually stay on the slow path even in absense of SYSCALL_TRACE.

The thing is, other architectures have a good reason for looking at
SYSCALL_TRACE first and not bothering with all that fun if it's not set.
PT_PTRACED is set when the task is being traced; it does *not* imply
stopping on every syscall entry.  E.g. a perfectly normal situation is
when in gdb you've said
break foo.c:69
cont
and let the process run until it hits that breakpoint.  It has PT_PTRACED
all along.  It will be stopped when it gets a signal, including SIGTRAP
we'll get when it steps on that breakpoint.  But by that time it's very
likely to have passed through tons of syscalls without stopping on any of
them.  SYSCALL_TRACE is set *only* when we want the process to stop on the
next syscall.  That's what strace(1) and friends are using.  So it doesn't
make sense to get overhead for all processes that have PT_PTRACED;
SYSCALL_TRACE is less likely, so checking it first is an obvious approach.

Moreover, checking PT_TRACED first is not even cheaper in the common case
(i.e. when branch to .Ltracesys is not taken at all).  As it is, parisc does
        mfctl   %cr30, %r1
        LDREG   TI_TASK(%r1),%r1
        ldw     TASK_PTRACE(%r1), %r1
        bb,<,n  %r1,31,.Ltracesys
and that's actually an extra dereference compared to
        mfctl   %cr30, %r1
        LDREG   TI_FLAGS(%r1),%r1
        bb,<,n  %r1,31 - TIF_SYSCALL_TRACE,.Ltracesys
Note that tracehook_report_syscall_entry/tracehook_report_syscall_exit
are checking PT_PTRACED, so it's not like we needed to change anything
other than that spot - resulting logics will be equivalent to what we
have right now.  Looks like a fairly straightforward optimisation...
Am I missing something subtle and parisc-specific here?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-20  8:40           ` Al Viro
  2012-05-20  8:40             ` Al Viro
@ 2012-05-20  9:04             ` Al Viro
  2012-05-20  9:04               ` Al Viro
  2012-05-20 18:46               ` John David Anglin
  1 sibling, 2 replies; 21+ messages in thread
From: Al Viro @ 2012-05-20  9:04 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

On Sun, May 20, 2012 at 09:40:42AM +0100, Al Viro wrote:
> Moreover, checking PT_TRACED first is not even cheaper in the common case
> (i.e. when branch to .Ltracesys is not taken at all).  As it is, parisc does
>         mfctl   %cr30, %r1
>         LDREG   TI_TASK(%r1),%r1
>         ldw     TASK_PTRACE(%r1), %r1
>         bb,<,n  %r1,31,.Ltracesys
> and that's actually an extra dereference compared to
>         mfctl   %cr30, %r1
>         LDREG   TI_FLAGS(%r1),%r1
>         bb,<,n  %r1,31 - TIF_SYSCALL_TRACE,.Ltracesys
> Note that tracehook_report_syscall_entry/tracehook_report_syscall_exit
> are checking PT_PTRACED, so it's not like we needed to change anything
> other than that spot - resulting logics will be equivalent to what we
> have right now.  Looks like a fairly straightforward optimisation...
> Am I missing something subtle and parisc-specific here?

Actually, looks like I am missing something, but it's not particulary subtle.
SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes do_syscall_trace_leave()
do things.  So checking one bit in flags is not enough - any of those 3 is
a reason for taking the slow path.  The point still stands, though -
	mfctl   %cr30, %r1
	LDREG   TI_FLAGS(%r1),%r1
	ldi	(_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP), %r19
        and,COND(=) %r19, %r1, %r0
	b,n	.Ltracesys
would still be no worse on the fast path and would not hit the slow path in
a lot of cases when the current code does it for no apparent reason.

Comments?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-20  9:04             ` Al Viro
@ 2012-05-20  9:04               ` Al Viro
  2012-05-20 18:46               ` John David Anglin
  1 sibling, 0 replies; 21+ messages in thread
From: Al Viro @ 2012-05-20  9:04 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Grant Grundler, John David Anglin, linux-parisc, Linus Torvalds,
	linux-arch

On Sun, May 20, 2012 at 09:40:42AM +0100, Al Viro wrote:
> Moreover, checking PT_TRACED first is not even cheaper in the common case
> (i.e. when branch to .Ltracesys is not taken at all).  As it is, parisc does
>         mfctl   %cr30, %r1
>         LDREG   TI_TASK(%r1),%r1
>         ldw     TASK_PTRACE(%r1), %r1
>         bb,<,n  %r1,31,.Ltracesys
> and that's actually an extra dereference compared to
>         mfctl   %cr30, %r1
>         LDREG   TI_FLAGS(%r1),%r1
>         bb,<,n  %r1,31 - TIF_SYSCALL_TRACE,.Ltracesys
> Note that tracehook_report_syscall_entry/tracehook_report_syscall_exit
> are checking PT_PTRACED, so it's not like we needed to change anything
> other than that spot - resulting logics will be equivalent to what we
> have right now.  Looks like a fairly straightforward optimisation...
> Am I missing something subtle and parisc-specific here?

Actually, looks like I am missing something, but it's not particulary subtle.
SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes do_syscall_trace_leave()
do things.  So checking one bit in flags is not enough - any of those 3 is
a reason for taking the slow path.  The point still stands, though -
	mfctl   %cr30, %r1
	LDREG   TI_FLAGS(%r1),%r1
	ldi	(_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP), %r19
        and,COND(=) %r19, %r1, %r0
	b,n	.Ltracesys
would still be no worse on the fast path and would not hit the slow path in
a lot of cases when the current code does it for no apparent reason.

Comments?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-20  9:04             ` Al Viro
  2012-05-20  9:04               ` Al Viro
@ 2012-05-20 18:46               ` John David Anglin
  2012-05-20 18:46                 ` John David Anglin
  2012-05-20 20:38                 ` Carlos O'Donell
  1 sibling, 2 replies; 21+ messages in thread
From: John David Anglin @ 2012-05-20 18:46 UTC (permalink / raw)
  To: Al Viro
  Cc: Carlos O'Donell, Grant Grundler, linux-parisc, Linus Torvalds,
	linux-arch

On 20-May-12, at 5:04 AM, Al Viro wrote:

> Actually, looks like I am missing something, but it's not  
> particulary subtle.
> SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
> any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes  
> do_syscall_trace_leave()
> do things.  So checking one bit in flags is not enough - any of  
> those 3 is
> a reason for taking the slow path.  The point still stands, though -
> 	mfctl   %cr30, %r1
> 	LDREG   TI_FLAGS(%r1),%r1
> 	ldi	(_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP), %r19
>        and,COND(=) %r19, %r1, %r0
> 	b,n	.Ltracesys
> would still be no worse on the fast path and would not hit the slow  
> path in
> a lot of cases when the current code does it for no apparent reason.
>
> Comments?

This looks good to me.  There is similar code in entry.S.

I added the above change to my patch set for stable 3.3.6.  System still
boots.  I singled stepped /bin/ls through a directory listing and it  
seemed
to work.

Dave
--
John David Anglin	dave.anglin@bell.net




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-20 18:46               ` John David Anglin
@ 2012-05-20 18:46                 ` John David Anglin
  2012-05-20 20:38                 ` Carlos O'Donell
  1 sibling, 0 replies; 21+ messages in thread
From: John David Anglin @ 2012-05-20 18:46 UTC (permalink / raw)
  To: Al Viro
  Cc: Carlos O'Donell, Grant Grundler, linux-parisc, Linus Torvalds,
	linux-arch

On 20-May-12, at 5:04 AM, Al Viro wrote:

> Actually, looks like I am missing something, but it's not  
> particulary subtle.
> SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
> any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes  
> do_syscall_trace_leave()
> do things.  So checking one bit in flags is not enough - any of  
> those 3 is
> a reason for taking the slow path.  The point still stands, though -
> 	mfctl   %cr30, %r1
> 	LDREG   TI_FLAGS(%r1),%r1
> 	ldi	(_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP), %r19
>        and,COND(=) %r19, %r1, %r0
> 	b,n	.Ltracesys
> would still be no worse on the fast path and would not hit the slow  
> path in
> a lot of cases when the current code does it for no apparent reason.
>
> Comments?

This looks good to me.  There is similar code in entry.S.

I added the above change to my patch set for stable 3.3.6.  System still
boots.  I singled stepped /bin/ls through a directory listing and it  
seemed
to work.

Dave
--
John David Anglin	dave.anglin@bell.net




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [parisc] double restarts on multiple signal arrivals
  2012-05-20 18:46               ` John David Anglin
  2012-05-20 18:46                 ` John David Anglin
@ 2012-05-20 20:38                 ` Carlos O'Donell
  1 sibling, 0 replies; 21+ messages in thread
From: Carlos O'Donell @ 2012-05-20 20:38 UTC (permalink / raw)
  To: John David Anglin
  Cc: Al Viro, Grant Grundler, linux-parisc, Linus Torvalds, linux-arch

On Sun, May 20, 2012 at 2:46 PM, John David Anglin <dave.anglin@bell.net> wrote:
> On 20-May-12, at 5:04 AM, Al Viro wrote:
>
>> Actually, looks like I am missing something, but it's not particulary
>> subtle.
>> SYSCALL_TRACE is needed for do_syscall_trace_enter() to do anything;
>> any of SYSCALL_TRACE/SINGLESTEP/BLOCKSTEP is makes
>> do_syscall_trace_leave()
>> do things.  So checking one bit in flags is not enough - any of those 3 is
>> a reason for taking the slow path.  The point still stands, though -
>>        mfctl   %cr30, %r1
>>        LDREG   TI_FLAGS(%r1),%r1
>>        ldi     (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_BLOCKSTEP),
>> %r19
>>       and,COND(=) %r19, %r1, %r0
>>        b,n     .Ltracesys
>> would still be no worse on the fast path and would not hit the slow path
>> in
>> a lot of cases when the current code does it for no apparent reason.
>>
>> Comments?
>
>
> This looks good to me.  There is similar code in entry.S.
>
> I added the above change to my patch set for stable 3.3.6.  System still
> boots.  I singled stepped /bin/ls through a directory listing and it seemed
> to work.

I'm running this through the ringer on my local system.

I'll send out an update when I'm done testing.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-05-20 20:38 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-18 17:58 [parisc] double restarts on multiple signal arrivals Al Viro
2012-05-18 17:58 ` Al Viro
2012-05-18 18:05 ` Grant Grundler
2012-05-18 18:57   ` Al Viro
2012-05-18 18:57     ` Al Viro
2012-05-18 19:56   ` Al Viro
2012-05-18 19:56     ` Al Viro
2012-05-18 20:12     ` Grant Grundler
2012-05-18 20:15       ` Carlos O'Donell
2012-05-18 22:24         ` Al Viro
2012-05-19  1:36           ` Carlos O'Donell
2012-05-19  5:26           ` Al Viro
2012-05-19 13:37             ` Al Viro
2012-05-19 13:37               ` Al Viro
2012-05-20  8:40           ` Al Viro
2012-05-20  8:40             ` Al Viro
2012-05-20  9:04             ` Al Viro
2012-05-20  9:04               ` Al Viro
2012-05-20 18:46               ` John David Anglin
2012-05-20 18:46                 ` John David Anglin
2012-05-20 20:38                 ` Carlos O'Donell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).