* sys_paccept: disable paccept() until API design is resolved
@ 2008-09-16 12:05 Michael Kerrisk
2008-09-16 13:04 ` Oleg Nesterov
2008-09-16 23:17 ` Ulrich Drepper
0 siblings, 2 replies; 8+ messages in thread
From: Michael Kerrisk @ 2008-09-16 12:05 UTC (permalink / raw)
To: Andrew Morton
Cc: David Miller, Davide Libenzi, Alan Cox, Ulrich Drepper,
Jakub Jelinek, lkml, Linus Torvalds, netdev, Roland McGrath,
Oleg Nesterov, Christoph Hellwig
Andrew,
The patch below disables the new sys_paccept() for now. Please
apply for 2.6.27-rc, so that we do not release this API into
the wild before a conclusion has been reached about its design.
The reasons for disabling paccept() are as follows:
* The API is more complex than needed. There is AFAICS no demonstrated
use case that the sigset argument of this syscall serves that
couldn't equally be served by the use of pselect/ppoll/epoll_pwait +
traditional accept(). Roland seems to concur with this opinion
(http://thread.gmane.org/gmane.linux.kernel/723953/focus=732255).
I have (more than once) asked Ulrich to explain otherwise
(http://thread.gmane.org/gmane.linux.kernel/723952/focus=731018),
but he does not respond, so one is left to assume that he doesn't
know of such a case.
* The use of a sigset argument is not consistent with other I/O APIs
that can block on a single file descriptor (e.g., read(), recv(),
connect()).
* The behavior of paccept() when interrupted by a signal is IMO
strange: the kernel restarts the system call if SA_RESTART was set
for the handler. I think that it should not do this -- that it
should behave consistently with paccept()/ppoll()/epoll_pwait(),
which never restart, regardless of SA_RESTART. The reasoning here
is that the very purpose of paccept() is to wait for a connection
or a signal, and that restarting in the latter case is probably
never useful. (Note: Roland disagrees on this point, believing
that rather paccept() should be consistent with accept() in its
behavior wrt EINTR
(http://thread.gmane.org/gmane.linux.kernel/723953/focus=732255).)
I believe that instead, a simpler API, consistent with Ulrich's
other recent additions, is preferable:
accept4(int fd, struct sockaddr *sa, socklen_t *salen, ind flags);
(This simpler API was originally proposed by Ulrich:
http://thread.gmane.org/gmane.linux.network/92072)
If this simpler API is added, then if we later decide that the sigset
argument really is required, then a suitable bit in 'flags' could
be added to indicate the presence of the sigset argument.
At this point, I am hoping we either will get a counter-argument
from Ulrich about why we really do need paccept()'s sigset argument,
or that he will resubmit the original accept4() patch.
Cheers,
Michael
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
--- linux-2.6.27-rc6/net/socket.c.orig 2008-09-16 12:38:15.000000000 +0200
+++ linux-2.6.27-rc6/net/socket.c 2008-09-16 13:07:51.000000000 +0200
@@ -1511,6 +1511,7 @@
goto out_put;
}
+#if 0
#ifdef HAVE_SET_RESTORE_SIGMASK
asmlinkage long sys_paccept(int fd, struct sockaddr __user *upeer_sockaddr,
int __user *upeer_addrlen,
@@ -1564,6 +1565,7 @@
return do_accept(fd, upeer_sockaddr, upeer_addrlen, flags);
}
#endif
+#endif
asmlinkage long sys_accept(int fd, struct sockaddr __user *upeer_sockaddr,
int __user *upeer_addrlen)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: sys_paccept: disable paccept() until API design is resolved
2008-09-16 12:05 sys_paccept: disable paccept() until API design is resolved Michael Kerrisk
@ 2008-09-16 13:04 ` Oleg Nesterov
2008-09-16 23:17 ` Ulrich Drepper
1 sibling, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2008-09-16 13:04 UTC (permalink / raw)
To: Michael Kerrisk
Cc: Andrew Morton, David Miller, Davide Libenzi, Alan Cox,
Ulrich Drepper, Jakub Jelinek, lkml, Linus Torvalds, netdev,
Roland McGrath, Christoph Hellwig
On 09/16, Michael Kerrisk wrote:
>
> * The behavior of paccept() when interrupted by a signal is IMO
> strange: the kernel restarts the system call if SA_RESTART was set
> for the handler. I think that it should not do this -- that it
> should behave consistently with paccept()/ppoll()/epoll_pwait(),
> which never restart, regardless of SA_RESTART. The reasoning here
> is that the very purpose of paccept() is to wait for a connection
> or a signal, and that restarting in the latter case is probably
> never useful. (Note: Roland disagrees on this point, believing
> that rather paccept() should be consistent with accept() in its
> behavior wrt EINTR
> (http://thread.gmane.org/gmane.linux.kernel/723953/focus=732255).)
Also, the implementation of sys_paccept() is not "perfect", imho.
sys_paccept:
ret = do_accept(...);
if (ret < 0 && signal_pending()) {
set_restore_sigmask();
return ret;
}
It doesn't check that ret == ERESTARTSYS/EINTR. I can't say this
is bug, but let's suppose that do_accept() returns (say) -EINVAL,
and then the task is interrupted by the signal.
Now, if the signal comes after sys_paccept() checks signal_pending(),
we return -EINVAL, and the signal handler runs with the original
current->blocked mask, as expected.
However, if the signal happens in the window before signal_pending(),
we still return -EINVAL, but the signal handler runs with
->blocked == sigmask. A bit odd, but probably harmless.
Note also that unless I misread the code, do_paccept() returns
ERESTARTSYS or EINTR depending on ->sk_rcvtimeo. Yes, it is very
clear why sock_intr_errno() does this, but this doesn't make the
behaviour of paccept() more understandable.
Oleg.
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: sys_paccept: disable paccept() until API design is resolved
2008-09-16 12:05 sys_paccept: disable paccept() until API design is resolved Michael Kerrisk
2008-09-16 13:04 ` Oleg Nesterov
@ 2008-09-16 23:17 ` Ulrich Drepper
2008-09-17 0:24 ` Michael Kerrisk
` (3 more replies)
1 sibling, 4 replies; 8+ messages in thread
From: Ulrich Drepper @ 2008-09-16 23:17 UTC (permalink / raw)
To: Michael Kerrisk
Cc: Andrew Morton, David Miller, Davide Libenzi, Alan Cox,
Jakub Jelinek, lkml, Linus Torvalds, netdev, Roland McGrath,
Oleg Nesterov, Christoph Hellwig
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Michael Kerrisk wrote:
> The patch below disables the new sys_paccept() for now. Please
> apply for 2.6.27-rc, so that we do not release this API into
> the wild before a conclusion has been reached about its design.
There is no reason for that.
> The reasons for disabling paccept() are as follows:
>
> * The API is more complex than needed. There is AFAICS no demonstrated
> use case that the sigset argument of this syscall serves that
> couldn't equally be served by the use of pselect/ppoll/epoll_pwait +
It would unnecessarily require programs to be changed. I've explained
that programs cannot efficiently use accept() and poll() when multiple
threads are involved. This means in these situations you'll find a
single thread handling only the accept() calls.
> * The use of a sigset argument is not consistent with other I/O APIs
> that can block on a single file descriptor (e.g., read(), recv(),
> connect()).
This is because none of the other interfaces had (so far) be revised.
With this flawed argumentation you'd prevent any program ever to be made.
> * The behavior of paccept() when interrupted by a signal is IMO
> strange:
You use your own opinion as the deciding factor? The behavior differs
from other uses but is consistent with the accept() behavior.
> I believe that instead, a simpler API, consistent with Ulrich's
> other recent additions, is preferable:
>
> accept4(int fd, struct sockaddr *sa, socklen_t *salen, ind flags);
The signal set wasn't actually my idea. See:
http://marc.info/?l=linux-kernel&m=120909788728078&w=2
> At this point, I am hoping we either will get a counter-argument
> from Ulrich about why we really do need paccept()'s sigset argument,
> or that he will resubmit the original accept4() patch.
I have explained the need already. you just chose to ignore it.
- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org
iEYEARECAAYFAkjQPoQACgkQ2ijCOnn/RHTNZwCfaXdw5Yhy/chAUMqR2kZE8Rsm
wzUAnA7PtvODGyAMeahl44+mqasqGS1U
=Gh2E
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: sys_paccept: disable paccept() until API design is resolved
2008-09-16 23:17 ` Ulrich Drepper
@ 2008-09-17 0:24 ` Michael Kerrisk
2008-09-17 16:46 ` Evgeniy Polyakov
2008-09-17 1:22 ` Nick Piggin
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Michael Kerrisk @ 2008-09-17 0:24 UTC (permalink / raw)
To: Ulrich Drepper
Cc: Andrew Morton, David Miller, Davide Libenzi, Alan Cox,
Jakub Jelinek, lkml, Linus Torvalds, netdev, Roland McGrath,
Oleg Nesterov, Christoph Hellwig, Evgeniy Polyakov
On Wed, Sep 17, 2008 at 1:17 AM, Ulrich Drepper <drepper@redhat.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Michael Kerrisk wrote:
>> The patch below disables the new sys_paccept() for now. Please
>> apply for 2.6.27-rc, so that we do not release this API into
>> the wild before a conclusion has been reached about its design.
>
> There is no reason for that.
>
>
>> The reasons for disabling paccept() are as follows:
>>
>> * The API is more complex than needed. There is AFAICS no demonstrated
>> use case that the sigset argument of this syscall serves that
>> couldn't equally be served by the use of pselect/ppoll/epoll_pwait +
>
> It would unnecessarily require programs to be changed.
What is "it"? What "would unnecessarily require programs to be changed"?
> I've explained
> that programs cannot efficiently use accept() and poll() when multiple
> threads are involved. This means in these situations you'll find a
> single thread handling only the accept() calls.
I'm assuming that you mean this text from another thread:
[[
> accept, like select/poll, is used often as a function to dealy
> operation. Unlike read, recv, etc, which are handled using O_NONBLOCK
> and select/poll. pselect/ppoll do not really have a sigset parameter
> to handle signals in general. You use it to enable special handling
> in case of blocking. Example: if you want to implement userlevel
> context switching, you dedicate a signal to wake up any blocked
> thread. Since accept falls more into the same category than poll,
> this means the sigset parameter is justified. In theory we could add
> it to all functions but there is no reason to do this without any
> other reason to change the interface.
]]
I find this very difficult to understand -- what makes it difficult is
the wording of the explanation. Could you please take some time to
(much) more clearly explain the above, and especially to very clearly
explain why paccept() serves needs that can't be dealt with by
pseelect/ppoll/epoll_pwait + accept().
>> * The use of a sigset argument is not consistent with other I/O APIs
>> that can block on a single file descriptor (e.g., read(), recv(),
>> connect()).
>
> This is because none of the other interfaces had (so far) be revised.
> With this flawed argumentation you'd prevent any program ever to be made.
>
>
>> * The behavior of paccept() when interrupted by a signal is IMO
>> strange:
>
> You use your own opinion as the deciding factor?
I already clearly stated it was my opinion, and pointed out that
Roland had a diffferent opinion. Really, what is your point?
> The behavior differs
> from other uses but is consistent with the accept() behavior.
>
>> I believe that instead, a simpler API, consistent with Ulrich's
>> other recent additions, is preferable:
>>
>> accept4(int fd, struct sockaddr *sa, socklen_t *salen, ind flags);
>
> The signal set wasn't actually my idea. See:
>
> http://marc.info/?l=linux-kernel&m=120909788728078&w=2
[CC=+ Evgeniy Polyakov <johnpol@2ka.mipt.ru>]
But, what are you trying to say in pointing out that this wasn't your idea?
>> At this point, I am hoping we either will get a counter-argument
>> from Ulrich about why we really do need paccept()'s sigset argument,
>> or that he will resubmit the original accept4() patch.
>
> I have explained the need already. you just chose to ignore it.
No. You still have given no clear explanation. Please give one.
Cheers,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
man-pages online: http://www.kernel.org/doc/man-pages/online_pages.html
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: sys_paccept: disable paccept() until API design is resolved
2008-09-17 0:24 ` Michael Kerrisk
@ 2008-09-17 16:46 ` Evgeniy Polyakov
0 siblings, 0 replies; 8+ messages in thread
From: Evgeniy Polyakov @ 2008-09-17 16:46 UTC (permalink / raw)
To: mtk.manpages
Cc: Ulrich Drepper, Andrew Morton, David Miller, Davide Libenzi,
Alan Cox, Jakub Jelinek, lkml, Linus Torvalds, netdev,
Roland McGrath, Oleg Nesterov, Christoph Hellwig
Hi Michael.
On Wed, Sep 17, 2008 at 02:24:32AM +0200, Michael Kerrisk (mtk.manpages@googlemail.com) wrote:
> >> accept4(int fd, struct sockaddr *sa, socklen_t *salen, ind flags);
> >
> > The signal set wasn't actually my idea. See:
> >
> > http://marc.info/?l=linux-kernel&m=120909788728078&w=2
>
> [CC=+ Evgeniy Polyakov <johnpol@2ka.mipt.ru>]
>
> But, what are you trying to say in pointing out that this wasn't your idea?
I asked to put signal mask there to be able to siplify life of those who
use accept() and signals. It is not 1005 _required_, but just like it is
not required to be in ppoll(). It is an optimization which should allow
to block signals during code execution without lots of additional steps
(like disabling/enabling them around the call via additional syscalls).
--
Evgeniy Polyakov
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: sys_paccept: disable paccept() until API design is resolved
2008-09-16 23:17 ` Ulrich Drepper
2008-09-17 0:24 ` Michael Kerrisk
@ 2008-09-17 1:22 ` Nick Piggin
2008-09-17 6:50 ` Rémi Denis-Courmont
2008-09-17 14:30 ` Oleg Nesterov
3 siblings, 0 replies; 8+ messages in thread
From: Nick Piggin @ 2008-09-17 1:22 UTC (permalink / raw)
To: Ulrich Drepper
Cc: Michael Kerrisk, Andrew Morton, David Miller, Davide Libenzi,
Alan Cox, Jakub Jelinek, lkml, Linus Torvalds, netdev,
Roland McGrath, Oleg Nesterov, Christoph Hellwig
On Wednesday 17 September 2008 09:17, Ulrich Drepper wrote:
> Michael Kerrisk wrote:
> > The patch below disables the new sys_paccept() for now. Please
> > apply for 2.6.27-rc, so that we do not release this API into
> > the wild before a conclusion has been reached about its design.
>
> There is no reason for that.
There is a good reason, and that is that if there is any questioning
of a patch that adds a userspace API, then it is much better to be
safe than sorry.
We don't get enough people reviewing these things as is, so
ignoring the review we do get is not the right thing to do. There
is much more harm in releasing the kernel with a poor API than
just holding off for another release until issues are sorted out.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: sys_paccept: disable paccept() until API design is resolved
2008-09-16 23:17 ` Ulrich Drepper
2008-09-17 0:24 ` Michael Kerrisk
2008-09-17 1:22 ` Nick Piggin
@ 2008-09-17 6:50 ` Rémi Denis-Courmont
2008-09-17 14:30 ` Oleg Nesterov
3 siblings, 0 replies; 8+ messages in thread
From: Rémi Denis-Courmont @ 2008-09-17 6:50 UTC (permalink / raw)
To: ext Ulrich Drepper
Cc: Michael Kerrisk, Andrew Morton, David Miller, Davide Libenzi,
Alan Cox, Jakub Jelinek, lkml, Linus Torvalds, netdev,
Roland McGrath, Oleg Nesterov, Christoph Hellwig
On Wednesday 17 September 2008 02:17:25 ext Ulrich Drepper, you wrote:
> It would unnecessarily require programs to be changed. I've explained
> that programs cannot efficiently use accept() and poll() when multiple
> threads are involved. This means in these situations you'll find a
> single thread handling only the accept() calls.
Hmm. In a multithreaded program, it makes a lot of sense to use blocking I/O
in general - not just blocking accept(). Of course, this assumes that there
is only one file descriptor to wait on at a time. This is not necessarily
true for I/O, nor is it for accept(). For instance, modern TCP servers should
have an IPv4 and an IPv6 socket to accept() from...
If the code path is such that only one file descriptor is being waited on at a
time, then using blocking I/O halves the number of syscalls, and thirds the
number of context switch between hardware packet reception, and userland data
deliver (3 to 1).
> > * The use of a sigset argument is not consistent with other I/O APIs
> > that can block on a single file descriptor (e.g., read(), recv(),
> > connect()).
>
> This is because none of the other interfaces had (so far) be revised.
> With this flawed argumentation you'd prevent any program ever to be made.
Right.
Then again... Why not recommend threaded programs to use sigwait() in a
dedicated task and give up on the asynchronous signal handlers completely?
--
Rémi Denis-Courmont
Maemo Software, Nokia Devices R&D
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: sys_paccept: disable paccept() until API design is resolved
2008-09-16 23:17 ` Ulrich Drepper
` (2 preceding siblings ...)
2008-09-17 6:50 ` Rémi Denis-Courmont
@ 2008-09-17 14:30 ` Oleg Nesterov
3 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2008-09-17 14:30 UTC (permalink / raw)
To: Ulrich Drepper
Cc: Michael Kerrisk, Andrew Morton, David Miller, Davide Libenzi,
Alan Cox, Jakub Jelinek, lkml, Linus Torvalds, netdev,
Roland McGrath, Christoph Hellwig
On 09/16, Ulrich Drepper wrote:
>
> Michael Kerrisk wrote:
>
> > * The behavior of paccept() when interrupted by a signal is IMO
> > strange:
>
> You use your own opinion as the deciding factor?
It would be very strange if Michael used the somebody else's opinion ;)
> The behavior differs
> from other uses but is consistent with the accept() behavior.
And Michael asks why this behaviour (and paccept() itself) is useful.
I must admit I don't understand this too.
It is very possible that we both just need the help from expert (you).
(Ulrich, there is no irony, seriously).
Oleg.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-09-17 16:51 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-16 12:05 sys_paccept: disable paccept() until API design is resolved Michael Kerrisk
2008-09-16 13:04 ` Oleg Nesterov
2008-09-16 23:17 ` Ulrich Drepper
2008-09-17 0:24 ` Michael Kerrisk
2008-09-17 16:46 ` Evgeniy Polyakov
2008-09-17 1:22 ` Nick Piggin
2008-09-17 6:50 ` Rémi Denis-Courmont
2008-09-17 14:30 ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).