All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
To: David Laight <David.Laight@ACULAB.COM>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH] net: unix: non blocking recvmsg() should not return -EINTR
Date: Thu, 27 Mar 2014 12:40:49 +0000	[thread overview]
Message-ID: <87zjkb7r1q.fsf@sable.mobileactivedefense.com> (raw)
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D0F6EA4FF@AcuExch.aculab.com> (David Laight's message of "Thu, 27 Mar 2014 09:36:30 +0000")

David Laight <David.Laight@ACULAB.COM> writes:
> From: Rainer Weikusat 
>> Rainer Weikusat <rw@sable> writes:
>> 
>> [...]
>> 
>> > The underlying problem would seem to be that a O_NONBLOCK call might
>> > actually block forever in case a blocking receiver sits on the lock and
>> > no data is ever received.
>> 
>> ... except that this probably cannot happen because O_NONBLOCK is a file
>> status flag and not a file descriptor flag.
>> 
>> NB: I've neither tested nor checked this.
>
> While dup() gives a second fd referring to the same kernel 'file'
> doing open("/dev/fd/4", ...) traditionally gives you an additional
> 'file' referring to the same vnode.
> For real files the file offset is in the 'file' structure, and
> I think the O_NONBLOCK flag is in the same place.
> Which means that is possible (but maybe not that usual or sensible)
> for a process to try a non-blocking read on a socket while another
> process is blocked in the read code.
>
> The same would be true for writes, and for writes to a datagram
> socket it might even make sense.
>
> In any case I expect EAGAIN to mean 'there is no data to read'
> not 'something happened and I didn't bother to look for data'.

The problem is really that the non-blocking thread shouldn't be
interruptible and hence, should never return an EINTR error because of
this. As shown elsewhere, this is not only a theoretical concern but
actually real bug, as a read-call made while the socket is non-blocking
may actually stop executing forever if a prior, blocking read call is
already blocked. When assuming that the non-blocking call should execute
in favour of a blocking call which is actually blocked, this could be
regarded as a 'priority inversion' problem. OTOH, there is not 'perfect'
solution, IOW, one which doesn't involve the non-blocking read to give
up without trying 'immediately before' it had succeeded had it tried.

The 'return EAGAIN in an EINTR situation' is really a lame attempt at
hiding the real problem. A slightly better idea would be that the
non-blocking call should use trylock and return EAGAIN if this didn't
succeed. This would at least prevent it from becoming blocked for an
indefinite time.

A possible improvement would be to record if the thread currently
holding the lock made a blocking or a non-blocking call and use a
non-interruptible wait for the latter case since the lock ought to
become free soon. Problem with this: Another blocking reader could
appear while the non-blocking one is waiting an grab the lock instead.
This could presumably be preventend, but I doubt it'll be worth the
effort for something which seems to be a corner case.

Lastly, the non-blocking read could wait for a bounded time and give up
afterwards. Which turns this into a 'tuning' problem because there's no
good way to determine the 'right' bounded time.

Considering all of this, the trylock-approach seems best to me. OTOH,
I'm find with any behaviour which does not restore the original 'lost
wakeup' bug and considering that "it is a standard procedure to harras
people who are so careless to contribute bug fixes to Linux until the
cow comes home and they'd better be quiet about that!", as Mr Eric
"/dev/null" Dumasomething explained, I certainly don't plan to turn this
conviction of mine into a proper patch.

  reply	other threads:[~2014-03-27 12:41 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-26  1:42 [PATCH] net: unix: non blocking recvmsg() should not return -EINTR Eric Dumazet
2014-03-26 13:17 ` Rainer Weikusat
2014-03-26 13:57   ` Rainer Weikusat
2014-03-26 14:09   ` Eric Dumazet
2014-03-26 14:25     ` Rainer Weikusat
2014-03-26 15:00 ` David Laight
2014-03-26 15:13   ` Rainer Weikusat
2014-03-26 15:25     ` Eric Dumazet
2014-03-26 15:33       ` Eric Dumazet
2014-03-26 19:46       ` Rainer Weikusat
2014-03-26 21:04         ` Rainer Weikusat
2014-03-27  9:36           ` David Laight
2014-03-27 12:40             ` Rainer Weikusat [this message]
2014-03-27 12:49               ` Jamal Hadi Salim
2014-03-27 13:02                 ` Rainer Weikusat
2014-03-27 12:53               ` David Laight
2014-03-27 13:29                 ` Rainer Weikusat
2014-03-26 21:05         ` David Miller
2014-03-26 21:21           ` Rainer Weikusat
2014-03-26 21:44             ` Eric Dumazet
2014-03-26 22:06               ` Rainer Weikusat
2014-03-26 22:35                 ` Eric Dumazet
2014-03-26 22:51                   ` Rainer Weikusat
2014-03-26 22:58                     ` Eric Dumazet
2014-03-28 20:35             ` Rainer Weikusat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zjkb7r1q.fsf@sable.mobileactivedefense.com \
    --to=rweikusat@mobileactivedefense.com \
    --cc=David.Laight@ACULAB.COM \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.