netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] sock: do not set sk_err in sock_dequeue_err_skb
@ 2016-11-03 22:24 Soheil Hassas Yeganeh
  2016-11-03 23:10 ` Hannes Frederic Sowa
  2016-11-08  1:29 ` David Miller
  0 siblings, 2 replies; 5+ messages in thread
From: Soheil Hassas Yeganeh @ 2016-11-03 22:24 UTC (permalink / raw)
  To: davem, netdev; +Cc: edumazet, willemb, ncardwell, Soheil Hassas Yeganeh

From: Soheil Hassas Yeganeh <soheil@google.com>

Do not set sk_err when dequeuing errors from the error queue.
Doing so results in:
a) Bugs: By overwriting existing sk_err values, it possibly
   hides legitimate errors. It is also incorrect when local
   errors are queued with ip_local_error. That happens in the
   context of a system call, which already returns the error
   code.
b) Inconsistent behavior: When there are pending errors on
   the error queue, sk_err is sometimes 0 (e.g., for
   the first timestamp on the error queue) and sometimes
   set to an error code (after dequeuing the first
   timestamp).
c) Suboptimality: Setting sk_err to ENOMSG on simple
   TX timestamps can abort parallel reads and writes.

Removing this line doesn't break userspace. This is because
userspace code cannot rely on sk_err for detecting whether
there is something on the error queue. Except for ICMP messages
received for UDP and RAW, sk_err is not set at enqueue time,
and as a result sk_err can be 0 while there are plenty of
errors on the error queue.

For ICMP packets in UDP and RAW, sk_err is set when they are
enqueued on the error queue, but that does not result in aborting
reads and writes. For such cases, sk_err is only readable via
getsockopt(SO_ERROR) which will reset the value of sk_err on
its own. More importantly, prior to this patch,
recvmsg(MSG_ERRQUEUE) has a race on setting sk_err (i.e.,
sk_err is set by sock_dequeue_err_skb without atomic ops or
locks) which can store 0 in sk_err even when we have ICMP
messages pending. Removing this line from sock_dequeue_err_skb
eliminates that race.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
---
 net/core/skbuff.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 1e3e008..0b2a6e9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3725,7 +3725,6 @@ struct sk_buff *sock_dequeue_err_skb(struct sock *sk)
 		err = SKB_EXT_ERR(skb_next)->ee.ee_errno;
 	spin_unlock_irqrestore(&q->lock, flags);
 
-	sk->sk_err = err;
 	if (err)
 		sk->sk_error_report(sk);
 
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-11-08  1:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-03 22:24 [PATCH net-next] sock: do not set sk_err in sock_dequeue_err_skb Soheil Hassas Yeganeh
2016-11-03 23:10 ` Hannes Frederic Sowa
2016-11-04 19:26   ` Willem de Bruijn
2016-11-07  5:18     ` Andy Lutomirski
2016-11-08  1:29 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).