public inbox for linux-kernel@vger.kernel.org
From: Denys Vlasenko <vda.linux@googlemail.com>
To: linux-kernel@vger.kernel.org
Subject: O_NONBLOCK is broken
Date: Tue, 14 Aug 2007 12:41:43 +0100
Message-ID: <200708141241.44021.vda.linux@googlemail.com>

Hi folks,

I apologize for the provocative subject. O_NONBLOCK in Linux is not broken.
It is working as designed. But the design itself suffers from a flaw.

Suppose I want to write a block of data to stdout without blocking.
I will do the classic thing:

int fl = fcntl(1, F_GETFL, 0);
fcntl(1, F_SETFL, fl | O_NONBLOCK);
r = write(1, buf, size);
fcntl(1, F_SETFL, fl); /* restore ASAP! */

The problem is that the O_NONBLOCK flag is attached not to the file
*descriptor* but to the "open file description" described in the fcntl
man page:

"Each open file description has certain associated status flags, initialized 
by open(2) and possibly modified by fcntl(2).  Duplicated file descriptors  
(made with dup(), fcntl(F_DUPFD), fork(), etc.) refer to the same open file 
description, and thus share the same file status flags."

We don't know whether our stdout descriptor #1 is shared with anyone or not,
and if we were started from a shell, it typically is. That's why we try to
restore the flags ASAP.
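
To make the sharing concrete, here is a minimal sketch that uses a pipe and
dup() in a single process to stand in for two processes sharing stdout. The
function names are made up for illustration. It checks that O_NONBLOCK set
through one descriptor is visible through the duplicate, while the per-fd
FD_CLOEXEC flag is not:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a flag set through one descriptor is
 * visible through its dup()ed copy, 0 if it is not, -1 on error.
 * use_fd_flags == 0 tests O_NONBLOCK via F_SETFL/F_GETFL (file status flags);
 * use_fd_flags != 0 tests FD_CLOEXEC via F_SETFD/F_GETFD (per-fd flags). */
static int flag_is_shared_by_dup(int use_fd_flags)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    int result = -1;
    int copy = dup(p[1]);   /* second descriptor, same open file description */
    if (copy >= 0) {
        if (use_fd_flags) {
            if (fcntl(p[1], F_SETFD, FD_CLOEXEC) == 0) {
                int seen = fcntl(copy, F_GETFD);
                if (seen >= 0)
                    result = (seen & FD_CLOEXEC) != 0;
            }
        } else {
            int fl = fcntl(p[1], F_GETFL);
            if (fl >= 0 && fcntl(p[1], F_SETFL, fl | O_NONBLOCK) == 0) {
                int seen = fcntl(copy, F_GETFL);
                if (seen >= 0)
                    result = (seen & O_NONBLOCK) != 0;
            }
        }
        close(copy);
    }
    close(p[0]);
    close(p[1]);
    return result;
}

int nonblock_is_shared_by_dup(void) { return flag_is_shared_by_dup(0); }
int cloexec_is_shared_by_dup(void)  { return flag_is_shared_by_dup(1); }
```

The same sharing applies to descriptors inherited across fork(), which is
the shell case described above.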

But "ASAP" isn't soon enough. Between setting and clearing O_NONBLOCK,
another process that shares fd #1 with us may well be affected
by the file suddenly becoming O_NONBLOCK under its feet.

Worse, another process can run the same
    fcntl(1, F_SETFL, fl | O_NONBLOCK);
    ...
    fcntl(1, F_SETFL, fl);
sequence. Its F_GETFL can return flags with O_NONBLOCK already set (because
of us), and then its second F_SETFL will "restore" O_NONBLOCK permanently,
which is not what was intended!
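
The interleaving can be replayed deterministically. This is only a sketch:
it runs both sequences in one process, with two dup()ed pipe descriptors
standing in for the two processes, and the function name is invented for
the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the shared description ends up with O_NONBLOCK set,
 * 0 if it ends up blocking, -1 on setup error. */
int interleaved_restore_leaves_nonblock(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    int a = p[1];    /* "our" descriptor */
    int b = dup(a);  /* stands in for the other process's descriptor */
    if (b < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    int fl_a = fcntl(a, F_GETFL);          /* we save the flags: blocking */
    fcntl(a, F_SETFL, fl_a | O_NONBLOCK);  /* we set O_NONBLOCK */
    int fl_b = fcntl(b, F_GETFL);          /* other saves: O_NONBLOCK already on */
    fcntl(b, F_SETFL, fl_b | O_NONBLOCK);  /* other sets O_NONBLOCK (no change) */
    fcntl(a, F_SETFL, fl_a);               /* we restore: blocking again */
    fcntl(b, F_SETFL, fl_b);               /* other "restores": O_NONBLOCK back on */

    int final_flags = fcntl(a, F_GETFL);
    close(b);
    close(p[0]);
    close(p[1]);
    return final_flags < 0 ? -1 : (final_flags & O_NONBLOCK) != 0;
}
```

Both sides behaved "correctly" by their own lights, yet the file is left
non-blocking after both have finished.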

Another failure mode is that the process can be killed by a signal
between the two fcntl calls, leaving the file in O_NONBLOCK mode.
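
A small sketch of this failure mode, assuming a pipe shared across fork()
in place of a real stdout. The child sets O_NONBLOCK and is then killed
before it can restore the flags; the parent checks what it is left with.
The function name is hypothetical.

```c
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent still sees O_NONBLOCK after the child died,
 * 0 if not, -1 on error. */
int nonblock_survives_killed_child(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: first half of the classic sequence... */
        int fl = fcntl(p[1], F_GETFL);
        fcntl(p[1], F_SETFL, fl | O_NONBLOCK);
        raise(SIGKILL);  /* ...and killed before the restoring fcntl */
        _exit(1);        /* not reached */
    }

    int result = -1;
    int status;
    if (waitpid(pid, &status, 0) == pid) {
        int fl = fcntl(p[1], F_GETFL);
        if (fl >= 0)
            result = (fl & O_NONBLOCK) != 0;
    }
    close(p[0]);
    close(p[1]);
    return result;
}
```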

This isn't a theoretical problem. It actually happens not so rarely,
for example, with pagers.

Possible solutions:

a) Introduce a *per-fd* flag, so that one can use
   fcntl(1, F_SETFD, fdflag | O_NONBLOCK) instead of
   fcntl(1, F_SETFL, flflag | O_NONBLOCK).
   Currently there is only one per-fd flag: FD_CLOEXEC, with a value of 1.
   O_NONBLOCK is 04000 (octal) on x86; the exact value is
   architecture-dependent.

b) Make recv(fd, buf, size, flags) and send(fd, buf, size, flags)
   work with non-socket fds too, for flags==0 or flags==MSG_DONTWAIT.
   (It's OK to fail with "socket operation on non-socket" for other
   values of flags.)

Both things are non-standard, but portable programs can test for errors
and fall back to the "standard" UNIX way of doing it.
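
As a rough sketch of what such a fallback could look like for option (b):
the helper below (the name write_nonblock is made up) tries
send(MSG_DONTWAIT) first and, if the fd is not a socket (ENOTSOCK, which is
what current kernels return for pipes, ttys and files), falls back to the
fcntl sequence from the top of this mail. The fallback path still has all
the problems described above; it is only there for compatibility.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write without blocking, touching the shared
 * file status flags only when send(MSG_DONTWAIT) is not available. */
ssize_t write_nonblock(int fd, const void *buf, size_t size)
{
    /* Preferred path: per-call non-blocking, no shared state is changed. */
    ssize_t r = send(fd, buf, size, MSG_DONTWAIT);
    if (r >= 0 || errno != ENOTSOCK)
        return r;

    /* Fallback: the "standard" (and racy) way. */
    int fl = fcntl(fd, F_GETFL);
    if (fl < 0)
        return -1;
    if (fl & O_NONBLOCK)
        return write(fd, buf, size);  /* already non-blocking, leave it */

    if (fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;
    r = write(fd, buf, size);
    int saved_errno = errno;          /* keep write()'s errno across fcntl */
    fcntl(fd, F_SETFL, fl);           /* restore ASAP */
    errno = saved_errno;
    return r;
}
```

With proposal (b) implemented, the first send() would succeed on any fd and
the fallback would never run. As usual, a result of -1 with errno set to
EAGAIN or EWOULDBLOCK means the write would have blocked.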


P.S. Hmm, the fcntl F_GETFL/F_SETFL interface itself seems to be racy:

    int fl = fcntl(fd, F_GETFL, 0);
    /* other process can muck with file flags here */
    fcntl(fd, F_SETFL, fl | SOME_BITS);

How can I *atomically* add or remove bits from file flags?
--
vda

