netdev.vger.kernel.org archive mirror
* Re: select implementation not POSIX compliant?
       [not found] <37062.66.93.180.209.1092243659.squirrel@66.93.180.209>
@ 2004-08-11 19:40 ` Alex Riesen
  2004-08-11 20:33   ` khandelw
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Alex Riesen @ 2004-08-11 19:40 UTC (permalink / raw)
  To: linux-kernel; +Cc: Nick Palmer, netdev

On linux-kernel, Nick Palmer wrote:
> I am working on porting some software from Solaris to Linux 2.6.7. I
> have run into a problem with the interaction of select and/or
> recvmsg and close in our multi-threaded application. The application
> expects that a close call on a socket that another thread is
> blocking in select and/or recvmsg on will cause select and/or
> recvmsg to return with an error. Linux does not seem to do this. (I
> also verified that the same issue exists in Linux 2.4.25, just to be
> sure it wasn't introduced in 2.6 in case you were wondering.)

It always works for stream sockets and does not work at all (even with
shutdown, even using poll(2) or read(2) instead of select) for dgram
sockets.

What domain (inet, local) are your sockets in?
What type (stream, dgram)?

Changing the behaviour would probably cause problems anyway: there is
surely a lot of code that would start complaining about select and
poll finishing "unexpectedly".

I used this to check:

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <netinet/in.h>
#include <fcntl.h>

int main(int argc, char* argv[])
{
    int status;
    int fds[2];
    fd_set set;
    /* Flip the #if to switch between the stream and the dgram case. */
#if 0
    puts("stream");
    if (  socketpair(PF_LOCAL, SOCK_STREAM, 0, fds) < 0 )
#else
    puts("dgram");
    if (  socketpair(PF_LOCAL, SOCK_DGRAM, 0, fds) < 0 )
#endif
    {
        perror("socketpair");
        exit(1);
    }
    fcntl(fds[0], F_SETFL, fcntl(fds[0], F_GETFL) | O_NONBLOCK);
    fcntl(fds[1], F_SETFL, fcntl(fds[1], F_GETFL) | O_NONBLOCK);
    switch ( fork() )
    {
    case 0:
        /* Child: it holds its own duplicates of both descriptors.
         * After a second it closes them, so that no reference to
         * fds[1], the peer of the socket the parent waits on, remains. */
        sleep(1);
        close(fds[0]);
        shutdown(fds[1], SHUT_RD);
        close(fds[1]);
        exit(0);
    case -1:
        perror("fork");
        exit(1);
    }
    /* Parent: drop its copy of the peer, then block in select
     * without a timeout until fds[0] is reported readable. */
    close(fds[1]);
    FD_ZERO(&set);
    FD_SET(fds[0], &set);
    select(fds[0] + 1, &set, NULL, NULL, NULL);
    wait(&status);
    return 0;
}


* Re: select implementation not POSIX compliant?
  2004-08-11 19:40 ` select implementation not POSIX compliant? Alex Riesen
@ 2004-08-11 20:33   ` khandelw
  2004-08-11 21:23     ` Alex Riesen
  2004-08-13 20:13     ` Nick Palmer
  2004-08-11 21:57   ` Steven Dake
  2004-08-13 20:12   ` Nick Palmer
  2 siblings, 2 replies; 6+ messages in thread
From: khandelw @ 2004-08-11 20:33 UTC (permalink / raw)
  To: Alex Riesen; +Cc: linux-kernel, Nick Palmer, netdev

select should work for any type of socket. It's based on the type of file
descriptor, not on whether it is stream/dgram.

man recvmsg -

     recvmsg() may be used to receive data on a socket whether it
     is in a connected state or not. s is a socket created with
     socket(3SOCKET).

So why should recvmsg return an error when the socket is closed in
another thread? Wouldn't the socket linger around for some time?

     If no messages are available at the socket, the receive call
     waits for a message to arrive, unless the socket is
     non-blocking (see fcntl(2)), in which case -1 is returned with
     the external variable errno set to EWOULDBLOCK.




* Re: select implementation not POSIX compliant?
  2004-08-11 20:33   ` khandelw
@ 2004-08-11 21:23     ` Alex Riesen
  2004-08-13 20:13     ` Nick Palmer
  1 sibling, 0 replies; 6+ messages in thread
From: Alex Riesen @ 2004-08-11 21:23 UTC (permalink / raw)
  To: khandelw; +Cc: linux-kernel, Nick Palmer, netdev

I missed the point: threads! _Not_ duplicated handles.
Ignore me.


* Re: select implementation not POSIX compliant?
  2004-08-11 19:40 ` select implementation not POSIX compliant? Alex Riesen
  2004-08-11 20:33   ` khandelw
@ 2004-08-11 21:57   ` Steven Dake
  2004-08-13 20:12   ` Nick Palmer
  2 siblings, 0 replies; 6+ messages in thread
From: Steven Dake @ 2004-08-11 21:57 UTC (permalink / raw)
  To: Alex Riesen; +Cc: linux-kernel, Nick Palmer, netdev

You will find poll works as you desire but select does not.  I recommend
porting to poll anyway; select sucks bad.  You might even try out epoll
in 2.6.
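
For what it is worth, one difference between the two calls is easy to
check: given a descriptor that is already closed when the call is made,
poll(2) flags just that entry with POLLNVAL, while select(2) fails as a
whole with EBADF. The sketch below only shows that case. It says nothing
about a descriptor that is closed while another thread is already
blocked on it, which is the situation discussed in this thread. The
helper names poll_revents and select_errno are made up for illustration.

```c
#include <errno.h>
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: returns the revents that poll() reports for fd,
 * or -1 if poll() itself failed. */
static short poll_revents(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    if (poll(&pfd, 1, 0) < 0)       /* timeout 0: do not block */
        return -1;
    return pfd.revents;
}

/* Hypothetical helper: returns the errno that select() fails with
 * for fd, or 0 if select() succeeded. */
static int select_errno(int fd)
{
    fd_set set;
    struct timeval tv = { 0, 0 };   /* zero timeout: do not block */

    FD_ZERO(&set);
    FD_SET(fd, &set);
    if (select(fd + 1, &set, NULL, NULL, &tv) < 0)
        return errno;
    return 0;
}
```

With a descriptor that has just been closed, poll_revents() should have
POLLNVAL set and select_errno() should report EBADF.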

Thanks
Good luck



* Re: select implementation not POSIX compliant?
  2004-08-11 19:40 ` select implementation not POSIX compliant? Alex Riesen
  2004-08-11 20:33   ` khandelw
  2004-08-11 21:57   ` Steven Dake
@ 2004-08-13 20:12   ` Nick Palmer
  2 siblings, 0 replies; 6+ messages in thread
From: Nick Palmer @ 2004-08-13 20:12 UTC (permalink / raw)
  To: Alex Riesen; +Cc: linux-kernel, netdev

Alex Riesen wrote:
> On linux-kernel, Nick Palmer wrote:
>>The application
>>expects that a close call on a socket that another thread is
>>blocking in select and/or recvmsg on will cause select and/or
>>recvmsg to return with an error. Linux does not seem to do this.
> 
> 
> It works always for stream sockets and does not at all (even with
> shutdown, even using poll(2) or read(2) instead of select) for dgram
> sockets.
> 
> What domain (inet, local) are your sockets in?

inet.

> What type (stream, dgram)?

We use both, though the breakage I was trying to fix was with a dgram 
socket.

You are correct that it does not work for dgram sockets at all! I had
not noticed the difference between the two in the test case I wrote,
since I hadn't tested streams. Thanks for pointing that out. Note that
shutdown will cause a dgram socket to exit from a recv* call, though;
this is the workaround I am using right now. On Solaris, close will do
the job. However, when the recvfrom ends on Linux it returns 0 but does
not set errno, which indicates that there may be more data that can be
retrieved with another call to recv. On Solaris, both shutdown and close
cause errno to be set.

So there is no way to make a select on a dgram socket break out at all,
short of kludging some dgram packet transmission to trigger it.

Yech!
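
A common alternative to sending a dummy datagram is to have select
watch a second descriptor as well, for example the read end of a pipe,
and to write one byte to that pipe from the thread that wants the waiter
to stop. The following is only a minimal sketch of that idea, not
something from our application: wait_sock_or_wake and wake_waiter are
hypothetical names, and the pipe is assumed to be created once with
pipe(2) and shared between the two threads.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: blocks until sock is readable or a byte
 * arrives on wake_rd (the read end of the wake-up pipe).
 * Returns 1 if sock is readable, 0 if woken through the pipe,
 * -1 on failure. A pending wake-up takes priority over data. */
static int wait_sock_or_wake(int sock, int wake_rd)
{
    fd_set set;
    int maxfd = sock > wake_rd ? sock : wake_rd;

    FD_ZERO(&set);
    FD_SET(sock, &set);
    FD_SET(wake_rd, &set);

    if (select(maxfd + 1, &set, NULL, NULL, NULL) < 0)
        return -1;

    if (FD_ISSET(wake_rd, &set)) {
        char c;
        if (read(wake_rd, &c, 1) < 0)   /* drain the wake-up byte */
            return -1;
        return 0;
    }
    return 1;
}

/* Hypothetical helper: called by the thread that wants the waiter
 * to stop. wake_wr is the write end of the wake-up pipe. */
static int wake_waiter(int wake_wr)
{
    return write(wake_wr, "x", 1) == 1 ? 0 : -1;
}
```

The closing thread would call wake_waiter() first and close the socket
only after the waiting thread has returned from wait_sock_or_wake().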

Thanks for looking into the issue more,
-Nick


* Re: select implementation not POSIX compliant?
  2004-08-11 20:33   ` khandelw
  2004-08-11 21:23     ` Alex Riesen
@ 2004-08-13 20:13     ` Nick Palmer
  1 sibling, 0 replies; 6+ messages in thread
From: Nick Palmer @ 2004-08-13 20:13 UTC (permalink / raw)
  To: khandelw; +Cc: Alex Riesen, linux-kernel, netdev

khandelw@cs.fsu.edu wrote:
 > select should work for any type of socket. It's based on the type of file
 > descriptor, not on whether it is stream/dgram.

Agreed, but as Alex Riesen has shown with his test case, the behavior
differs based on the type of socket. This doesn't seem quite right, but
was not my original point.

 > So why should recvmsg return an error when the socket is closed in
 > another thread? Wouldn't the socket linger around for some time?

Only if SO_LINGER is on, and then only for the linger time. I would
expect recvmsg to fail with errno set to EINTR or EINVAL, indicating
that the call was interrupted or is no longer valid because the socket
has been closed. That is not what happens: it returns 0 and does not
set errno.
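
The reason a bare 0 is awkward on a dgram socket is that datagram
sockets permit zero-length datagrams, so 0 is also what recv returns for
an empty packet. One way to tell the two apart, given as a rough sketch
only: have the closing thread set a flag before it calls shutdown, and
check that flag whenever recv returns 0. The names shutting_down and
recv_datagram, and the -2 return value, are made up for illustration.

```c
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical flag: the closing thread sets it before shutdown(). */
static atomic_int shutting_down;

/* Hypothetical helper: receive one datagram.
 * Returns the payload length (0 means a valid, empty datagram),
 * -1 on error (errno is set by recv), or -2 if recv returned 0
 * while the shutdown flag was set. */
static ssize_t recv_datagram(int sock, void *buf, size_t len)
{
    ssize_t n = recv(sock, buf, len, 0);

    if (n == 0 && atomic_load(&shutting_down))
        return -2;
    return n;
}
```

An empty datagram that arrives just as the flag is set would be reported
as a shutdown, which seems acceptable when the socket is going away
anyway.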

-Nick

