public inbox for linux-kernel@vger.kernel.org
From: Michael Kerrisk <michael.kerrisk@gmx.net>
To: linux-kernel@vger.kernel.org
Cc: tytso@mit.edu, torvalds@osdl.com, drepper@redhat.com,
	Eric Paire <paire@ri.silicomp.fr>,
	Paul Eggert <eggert@cs.ucla.edu>,
	Manfred Spraul <manfred@colorfullife.com>,
	roland@redhat.com, Robert Love <rlove@rlove.org>,
	Michael Kerrisk <mtk-manpages@gmx.net>,
	mtk-lkml@gmx.net
Subject: Strange Linux behaviour with blocking syscalls and stop signals+SIGCONT
Date: Mon, 03 Jul 2006 16:46:32 +0200
Message-ID: <44A92DC8.9000401@gmx.net>

Gidday,

[Various parties involved in past discussions on this topic CCed.]

First off, it's worth mentioning that the topic that I'll go into below 
has been visited a few times before, and in particular during one of the 
more recent visits, Linus's position was:

     http://marc.theaimsgroup.com/?l=linux-kernel&m=104502401330898&w=2
     List:       linux-kernel
     Subject:    Re: another subtle signals issue
     From:       Linus Torvalds
     Date:       2003-02-12 4:21:02

     ...

     And I have multiple times said that as far as Linux is
     concerned, ^Z is, has always been, and certainly for the 2.6.x
     timeframe _will_ be a signal that the kernel considers "caught".

The problem is that the "2.6.x timeframe" means something different now
than it did then, and the question is when, if ever, the following
Linux-specific (mis)behaviour will be fixed.

==

Linux exhibits a unique behaviour among Unix implementations with
respect to signals.  When a program that is blocked in the middle of
certain system calls is suspended by a stop signal (SIGTSTP (^Z), SIGSTOP,
SIGTTIN, SIGTTOU) and then resumed by a SIGCONT signal, the system call
fails with the error EINTR.  ***This behaviour occurs even when no
signal handler is installed for the stop or SIGCONT signals.***  (An
example program showing the behaviour for one system call is appended to
this message.)

This can cause applications that work on other Unix systems to behave 
unexpectedly on Linux.

On Linux, the current (2.6.17) system calls and functions that exhibit 
this behaviour are: futex(FUTEX_WAIT), epoll_wait(), poll(), read() from 
an inotify file descriptor (but not read()s from any other type of file 
descriptor, AFAIK), sem_wait(3) (because of futex(2)), semop(), 
semtimedop(), sigtimedwait(), and sigwaitinfo().
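
[A minimal sketch of how such a list can be checked call by call. It is
not part of the test program appended to this message. The names
blocking_fn, block_in_poll(), pause_ms() and eintr_after_stop_cont() are
hypothetical, and the 200 ms grace periods are an arbitrary assumption.
A child with no signal handlers blocks in the call under test; the
parent stops it, continues it, and then looks at whether the child
returned with EINTR or is still blocked.]

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical type: a function that blocks in the call under test. */
typedef int (*blocking_fn)(void);

/* One candidate: poll() on no descriptors with an infinite timeout. */
static int
block_in_poll(void)
{
     return poll(NULL, 0, -1);
}

/* Sleep for roughly ms milliseconds (grace period; an assumption). */
static void
pause_ms(long ms)
{
     struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

     nanosleep(&ts, NULL);
}

/*
 * Returns 1 if the call failed with EINTR after SIGSTOP + SIGCONT,
 * 0 if the child was still blocked after the grace period,
 * -1 if the harness itself failed.
 */
static int
eintr_after_stop_cont(blocking_fn fn)
{
     int status;
     pid_t w;
     pid_t pid = fork();

     if (pid == -1)
         return -1;

     if (pid == 0) {
         int r;

         /* Child: make sure no handler is installed for SIGCONT. */
         signal(SIGCONT, SIG_DFL);
         r = fn();
         _exit((r == -1 && errno == EINTR) ? 1 : 2);
     }

     pause_ms(200);             /* let the child reach the blocking call */

     if (kill(pid, SIGSTOP) == -1 ||
         waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status)) {
         kill(pid, SIGKILL);
         waitpid(pid, NULL, 0);
         return -1;
     }

     kill(pid, SIGCONT);
     pause_ms(200);             /* let the child react to SIGCONT */

     w = waitpid(pid, &status, WNOHANG);
     if (w == 0) {              /* still blocked: call was not interrupted */
         kill(pid, SIGKILL);
         waitpid(pid, &status, 0);
         return 0;
     }
     if (w == pid && WIFEXITED(status) && WEXITSTATUS(status) == 1)
         return 1;
     return -1;
}
```

[Calling eintr_after_stop_cont(block_in_poll) gives 1 on a kernel where
poll() shows the behaviour and 0 where the call stays blocked across the
stop and continue. Other calls can be probed by passing a different
blocking_fn. The result depends on the kernel version, so no particular
value is claimed here.]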

I have never seen this behaviour on any other Unix system, and at 
various times I've explicitly tested various blocking system calls on at 
least the following: Solaris 8, FreeBSD 4.8, HP-UX 11, Tru64 5.1B, Irix 
6.5, and Darwin 7.2.

My reading of POSIX is that POSIX only permits a system call to fail
with EINTR if a signal handler is involved (i.e., a signal is only
considered "caught" if a handler is involved, a different definition from
Linus's "caught" quoted at the start of this message).  Some inquiries
on the Austin group list quite a while back:

https://www.opengroup.org/sophocles/show_archive.tpl?source=L&listname=austin-group-l&first=1&pagesize=80&searchstring=signals+and+interruption+of+system+calls&zone=G
Or: http://tinyurl.com/l5at8

     Date: Fri, 13 Feb 2004 11:44:50 +0100 (MET)
     From: "Michael T Kerrisk"
     To: The Austin Group
     Subject: Stop signals and interruption of system calls on Linux

got answers that agreed with my interpretation, including:

https://www.opengroup.org/sophocles/show_mail.tpl?CALLER=show_archive.tpl&source=L&listname=austin-group-l&id=6668
Or: http://tinyurl.com/rdd7d

Quoting Paul Eggert:

   This topic came up in a POSIX.1 standards meeting on April 22, 1993,
   and the consensus of that meeting also agreed with you.

   > (And of course I'm still curious if any other implementation behaves
   > like Linux.)

   Long-ago AIX hosts had that bug as well, but they were fixed after
   that POSIX.1 discussion of a decade ago.  For more details about this,
   please see:

   David A. Willcox (Motorola MCG - Urbana)
   Job Control and POSIX
   <http://groups.google.com/groups?selm=1r77ojINN85n%40ftp.UU.NET>
   1993-04-22

==

Given the current development model, a fix now, in kernel 2.6.x, seems in
order.  Probably all of the system calls listed above should be fixed so
that in this circumstance the system call is automatically restarted,
just as currently occurs for many other similar blocking system calls
(e.g., select(), pselect(), mq_send(), mq_receive(), accept(), connect(),
recv(), send(), wait(), flock(), fcntl(F_SETLKW), etc.).
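
[Until the kernel restarts these calls itself, an application can get
the same effect in user space by retrying on EINTR. A minimal sketch for
semop() follows; the wrapper name semop_retry() is hypothetical and this
is not a proposed kernel change.]

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * Hypothetical wrapper: call semop() again for as long as it fails
 * with EINTR.  Any other result, success or error, is returned as is.
 */
static int
semop_retry(int semid, struct sembuf *sops, size_t nsops)
{
     int r;

     do {
         r = semop(semid, sops, nsops);
     } while (r == -1 && errno == EINTR);

     return r;
}
```

[A blind retry is reasonable for semop() because it has no timeout. For
calls that take one (semtimedop(), poll(), epoll_wait()) the remaining
time would have to be recomputed before each retry. Note also that this
loop hides EINTR caused by real signal handlers, which some programs
rely on.]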

In some past threads on this topic I have seen arguments about ABI
compatibility advanced, but I don't believe these hold, for the
following reasons:

a) We are talking about signals involved in *interactive* job control; 
therefore any ABI issues are likely to be minimal (and see "c)" below).

b) There is no consistency about which system calls show this behaviour 
and which do not.  For example poll() shows it, but select() does not. 
Notably, ppoll() also does not have the behaviour, since it does signal 
processing differently from poll()!  Another notable example is read(): 
on an inotify file descriptor it demonstrates this behaviour, but on 
other file descriptors it does not.

c) The Linux behaviour has been arbitrary across kernel versions and
system calls.  In particular, the following system calls showed this
behaviour in earlier kernel versions, but then the behaviour was changed
without forewarning and (AFAIK) without subsequent complaint:

       * nanosleep() in kernel 2.4 and earlier

       * msgsnd() and msgrcv() in kernels before 2.6.9.

==

As far as I can see, there seems to be no compelling argument not to make
Linux consistent with other current and historical Unix implementations
with respect to the treatment of blocking syscalls and stop
signals+SIGCONT.  Is there any reason not to fix things now?

Cheers,

Michael


PS Just for completeness, I note that this topic has been visited before:

http://marc.theaimsgroup.com/?l=linux-kernel&m=94464821126712&w=2
List:       linux-kernel
Subject:    SIGCONT misbehaviour in Linux
From:       Eric PAIRE <eric.paire () ri ! silicomp ! fr>
Date:       1999-12-08 10:14:13

and

http://marc.theaimsgroup.com/?l=linux-kernel&m=104501574824496&w=2
List:       linux-kernel
Subject:    another subtle signals issue
From:       Roland McGrath <roland () redhat ! com>
Date:       2003-02-12 2:06:54


==

Below is an example program demonstrating the behaviour for semop(). 
This is a sample run:

$ ./a.out x
<type ^Z>
[1]+  Stopped                 ./a.out x
$ fg
./a.out x
semop: Interrupted system call

The last line of output shows that the system call failed with EINTR.

$ cat restart_semop.c
/* restart_semop.c */

#define _GNU_SOURCE
#include <string.h>
#include <signal.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>

#define errMsg(msg)     do { perror(msg); } while (0)

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                         } while (0)

static void
catcher(int sig)
{
     char msg[100];

     sprintf(msg, "Caught signal %d %s\n", sig, strsignal(sig));
     fprintf(stderr, "%s", msg);
} /* catcher */

int
main(int argc, char *argv[])
{
     struct sigaction sa;
     int semId;
     struct sembuf sops[10];

     sa.sa_handler = catcher;
     sigemptyset(&sa.sa_mask);

     /* Make system calls restartable if argc > 1 */

     sa.sa_flags = (argc > 1) ? SA_RESTART : 0;

     if (sigaction(SIGINT, &sa, NULL) == -1) errMsg("sigaction");

     semId = semget(IPC_PRIVATE, 1, IPC_CREAT | S_IRUSR | S_IWUSR);

     if (semId == -1) errExit("semget");

     if (semctl(semId, 0, SETVAL, 1) == -1) errExit("semctl");

     sops[0].sem_num = 0;
     sops[0].sem_op = 0;
     sops[0].sem_flg = 0;

     for (;;)
         if (semop(semId, sops, 1) == -1) errMsg("semop");
}
