public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Denys Vlasenko <vda.linux@googlemail.com>
To: James Hogan <james.hogan@imgtec.com>
Cc: linux-kernel@vger.kernel.org, Ralf Baechle <ralf@linux-mips.org>,
	David Daney <david.daney@cavium.com>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-mips@linux-mips.org, Al Viro <viro@zeniv.linux.org.uk>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	David Howells <dhowells@redhat.com>,
	Dave Jones <davej@redhat.com>
Subject: Re: [PATCH v2] MIPS: Reduce _NSIG from 128 to 127 to avoid BUG_ON
Date: Fri, 28 Jun 2013 21:28:27 +0200	[thread overview]
Message-ID: <201306282128.27266.vda.linux@googlemail.com> (raw)
In-Reply-To: <51BEE6A8.2050307@imgtec.com>

On Monday 17 June 2013 12:36, James Hogan wrote:
> On 14/06/13 17:03, James Hogan wrote:
> > MIPS has 128 signals, the highest of which has the number 128 (they
> > start from 1). The following command causes get_signal_to_deliver() to
> > pass this signal number straight through to do_group_exit() as the exit
> > code:
> > 
> >   strace sleep 10 & sleep 1 && kill -128 `pidof sleep`
> > 
> > However do_group_exit() checks for the core dump bit (0x80) in the exit
> > code which matches in this particular case and the kernel panics:
> > 
> >   BUG_ON(exit_code & 0x80); /* core dumps don't get here */
> > 
> > Lets avoid this by changing the ABI by reducing the number of signals to
> > 127 (so that the maximum signal number is 127). Glibc incorrectly sets
> > [__]SIGRTMAX to 127 already. uClibc sets it to 128 so it's conceivable
> > that programs built against uClibc which intentionally uses RT signals
> > from the top (SIGRTMAX-n, n>=0) would need an updated uClibc (and a
> > rebuild if it's crazy enough to use __SIGRTMAX).
> 
> Hmm, although this works around the BUG_ON, this doesn't actually seem
> to be sufficient to behave correctly.
> 
> So it appears the exit status is constructed like this:
> bits	purpose
> 0x007f	signal number (0-127)
> 0x0080	core dump
> 0xff00	exit status
> 
> but the macros in waitstatus.h and wait.h in libc
> (see also "man 2 wait"):
> WIFEXITED:   status & 0x7f == 0
> WIFSIGNALED: status & 0x7f in [1..126] (i.e. not 0 or 127)
> WIFSTOPPED:  status & 0xff == 127
> 
> So termination due to SIG127 looks like it's been stopped instead of
> terminated via a signal, unless a core dump occurs in which case none of
> the above match.
> 
> (And termination due to SIG128 hits BUG_ON, otherwise would appear to
> have exited normally with core dump).
> 
> 
> Reducing number of signals to 126 to avoid this will change the glibc
> ABI too, in which case we may as well reduce to 64 to match other
> arches, which is more likely to break something (I'm not really
> comfortable making that change).
> 
> Reducing to 127 (this patch) still leaves incorrect exit status codes
> for SIG127 ...
> 
> Any further thoughts/opinions?

Strictly speaking, exit status of 0x007f isn't ambiguous.

Currently userspace uses the following rules
(assuming that status is 16-bit (IOW, dropping PTRACE_EVENT bits)):

WIFEXITED(status)    = (status & 0x7f) == 0
WIFSIGNALED(status)  = (status & 0x7f) != 0 && (status & 0x7f) < 0x7f
WIFSTOPPED(status)   = (status & 0xff) == 0x7f
WIFCONTINUED(status) = (status == 0xffff)

WEXITSTATUS(status)  = status >> 8
WSTOPSIG(status)     = status >> 8
WCOREDUMP(status)    = status & 0x80
WTERMSIG(status)     = status & 0x7f

When process dies from signal 127, status is 0x007f and it is not a valid
"stopped by signal" indicator, since WSTOPSIG == 0 is an impossibility.

Status 0x007f get misinterpreted by the rules above, namely,
WIFSTOPPED is true, WIFSIGNALED is false.

But an alternative definition exists which works correctly with
all previous status codes, treats 0x007f as "killed by signal 127"
and isn't more convoluted.
In fact, while WIFSTOPPED needs one additional check,
WIFSIGNALED gets simpler (loses one AND'ing operation):

WIFSTOPPED(status)   = (status & 0xff) == 0x7f && (status >> 8) != 0
WIFSIGNALED(status)  = status != 0 && status <= 0xff

All other rules need no change.

I think it's feasible to ask {g,uc}libc to change their defines
(on MIPS as a minimum), and live with 127 signals.

-- 
vda

  reply	other threads:[~2013-06-28 19:28 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-14 16:03 [PATCH v2] MIPS: Reduce _NSIG from 128 to 127 to avoid BUG_ON James Hogan
2013-06-14 16:22 ` Oleg Nesterov
2013-06-14 16:28 ` David Daney
2013-06-17 10:36 ` James Hogan
2013-06-28 19:28   ` Denys Vlasenko [this message]
2013-06-28 22:03     ` James Hogan
2013-09-04  4:41       ` Rich Felker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201306282128.27266.vda.linux@googlemail.com \
    --to=vda.linux@googlemail.com \
    --cc=davej@redhat.com \
    --cc=david.daney@cavium.com \
    --cc=dhowells@redhat.com \
    --cc=james.hogan@imgtec.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@linux-mips.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=ralf@linux-mips.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox