From: linux@horizon.com
To: aeb@cwi.nl, linux-kernel@vger.kernel.org
Cc: cfriesen@nortelnetworks.com
Subject: [PATCH] Re: UDP recvmsg blocks after select(), 2.6 bug?
Date: 7 Oct 2004 12:49:09 -0000
Message-ID: <20041007124909.12995.qmail@science.horizon.com>

How about the following?  Should I make a similar addition to
poll(2)?

Legalese:
These changes are works of original authorship.
These changes are hereby released into the public domain; copyright abandoned.

--- man2/select.2.old	2004-10-07 07:58:46.000000000 -0400
+++ man2/select.2	2004-10-07 08:38:24.000000000 -0400
@@ -170,7 +170,7 @@
 .IR sigmask ,
 avoiding the race.)
 Since Linux today does not have a
-.IR pselect ()
+.BR pselect ()
 system call, the current glibc2 routine still contains this race.
 .SS "The timeout"
 The time structures involved are defined in
@@ -291,6 +291,18 @@
     return 0;
 }
 .fi
+.SH BUGS
+.B pselect
+is currently emulated with a user-space wrapper that has a race condition.
+For reliable (and more portable) signal trapping, use the self-pipe trick,
+in which a signal handler writes to a pipe whose other end is read by the
+main loop.
+
+.B select
+and
+.B pselect
+permit blocking file descriptors in the fd_sets, even though
+there is no valid reason for a program to do this.
 .SH "CONFORMING TO"
 4.4BSD (the
 .B select
@@ -315,6 +327,39 @@
 .I fd
 to be a valid file descriptor.
 
+When
+.B select
+indicates that a file descriptor is ready, this is only a strong hint,
+not a guarantee, that a read or write is possible without blocking.
+For this reason, the associated file descriptors must always be in
+non-blocking mode (see
+.BR fcntl (2))
+in a correct program.  Reasons why the I/O could block include:
+.TP
+(i)
+Another process may have performed I/O on the
+.I fd
+in the meantime.
+.TP
+(ii)
+Some needed kernel buffer space may have been consumed for reasons
+totally unrelated to this I/O.
+.TP
+(iii)
+Since 2.4.x, Linux has overlapped UDP checksum verification with
+copying to user-space.  If a UDP packet arrives,
+.B select
+will indicate that data is ready, but during the read, if the checksum is
+bad, the packet will disappear and (if no subsequent packet with a
+valid checksum is waiting) the read will indicate that no data is available.
+.PP
+In general, it is legal for
+.B select
+to make some optimistic assumptions, subject to later verification by the
+subsequent I/O, as long as this does not result in a busy-loop where
+.B select
+is stuck thinking data is ready when it is not.
+
 Concerning the types involved, the classical situation is that
 the two fields of a struct timeval are longs (as shown above),
 and the struct is defined in
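
For reference, here is a minimal sketch of the self-pipe trick named in the
BUGS text, combined with the non-blocking reads that the NOTES text asks for.
It is an illustration only and not part of the patch. All names in it
(set_nonblocking, on_signal, selfpipe_init, selfpipe_wait) are hypothetical,
and it assumes a single-threaded program on a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper names; nothing here comes from the patch itself. */
static int selfpipe[2] = { -1, -1 };   /* [0] = read end, [1] = write end */

/* Put a descriptor into non-blocking mode via fcntl(2). */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Signal handler: uses only async-signal-safe calls and preserves errno. */
static void on_signal(int signo)
{
    int saved = errno;
    unsigned char b = (unsigned char)signo;
    /* If the pipe is full, write() fails with EAGAIN; a wakeup is
     * already pending in that case, so the failure is ignored. */
    ssize_t n = write(selfpipe[1], &b, 1);
    (void)n;
    errno = saved;
}

/* Create the pipe and install the handler for signo.  Returns 0 or -1. */
static int selfpipe_init(int signo)
{
    struct sigaction sa;

    if (pipe(selfpipe) < 0)
        return -1;
    if (set_nonblocking(selfpipe[0]) < 0 || set_nonblocking(selfpipe[1]) < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Wait up to timeout_ms for a signal.  Returns the number of signal
 * bytes drained from the pipe (0 on timeout), or -1 on error. */
static int selfpipe_wait(int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    unsigned char buf[64];
    int drained = 0;

    assert(selfpipe[0] >= 0);          /* selfpipe_init() must run first */
    FD_ZERO(&rfds);
    FD_SET(selfpipe[0], &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* EINTR means a handler ran during select(); fall through and drain. */
    if (select(selfpipe[0] + 1, &rfds, NULL, NULL, &tv) < 0 && errno != EINTR)
        return -1;

    /* Readiness is only a hint.  The descriptor is non-blocking, so
     * reading until EAGAIN can never hang, whatever select() reported. */
    for (;;) {
        ssize_t n = read(selfpipe[0], buf, sizeof buf);
        if (n > 0) {
            drained += (int)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
        break;                         /* EAGAIN (pipe empty) or EOF */
    }
    return drained;
}
```

A main loop would call selfpipe_init() once for each signal of interest, add
its own (also non-blocking) descriptors to the fd_set next to selfpipe[0],
and treat a positive return from selfpipe_wait() as "one or more signals
arrived".  The same read-until-EAGAIN pattern applies to any descriptor
that select reports as ready, including a UDP socket.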
