From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from shards.monkeyblade.net ([184.105.139.130]:33186 "EHLO
	shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751938AbeD3Pig (ORCPT );
	Mon, 30 Apr 2018 11:38:36 -0400
Date: Mon, 30 Apr 2018 11:38:34 -0400 (EDT)
Message-Id: <20180430.113834.1760530542793231849.davem@davemloft.net>
To: soheil.kdev@gmail.com
Cc: netdev@vger.kernel.org, ycheng@google.com, ncardwell@google.com,
	edumazet@google.com, willemb@google.com, soheil@google.com
Subject: Re: [PATCH V2 net-next 1/2] tcp: send in-queue bytes in cmsg upon read
From: David Miller
In-Reply-To: <20180427185733.36855-1-soheil.kdev@gmail.com>
References: <20180427185733.36855-1-soheil.kdev@gmail.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: netdev-owner@vger.kernel.org
List-ID:

From: Soheil Hassas Yeganeh
Date: Fri, 27 Apr 2018 14:57:32 -0400

> Since the socket lock is not held when calculating the size of
> receive queue, TCP_INQ is a hint. For example, it can overestimate
> the queue size by one byte, if FIN is received.

I think it is even worse than that.

If another application comes in and does a recvmsg() in parallel with
these calculations, you could even report a negative value.

These READ_ONCE() make it look like some of these issues are being
addressed but they are not.

You could freeze the values just by taking sk->sk_lock.slock, but I
don't know if that cost is considered acceptable or not.

Another idea is to sample both values in a loop, similar to a sequence
lock sequence:

again:
	tmp1 = A;
	tmp2 = B;
	barrier();
	tmp3 = A;
	if (tmp1 != tmp3)
		goto again;

But the current state of affairs is not going to work well.
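
For illustration only, the sampling loop suggested above could look roughly like this in userspace C11. This is a sketch, not kernel code and not what the patch does: struct counters, its fields a and b, and sample_diff() are hypothetical stand-ins for the two values called A and B, and an acquire fence stands in for the kernel's barrier(), which is only a compiler barrier.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values called A and B above. */
struct counters {
	_Atomic uint32_t a;	/* assumed: advanced by the writer side */
	_Atomic uint32_t b;	/* assumed: advanced by a parallel reader */
};

/*
 * Sample A and B without taking a lock. If A changed while B was being
 * read, the pair may be inconsistent, so sample again.
 */
static uint32_t sample_diff(struct counters *c)
{
	uint32_t tmp1, tmp2, tmp3;

	do {
		tmp1 = atomic_load_explicit(&c->a, memory_order_acquire);
		tmp2 = atomic_load_explicit(&c->b, memory_order_acquire);
		/*
		 * Stand-in for barrier(): keep the re-read of A ordered
		 * after the read of B. Stronger than a compiler barrier.
		 */
		atomic_thread_fence(memory_order_acquire);
		tmp3 = atomic_load_explicit(&c->a, memory_order_relaxed);
	} while (tmp1 != tmp3);

	/* Unsigned difference, well defined modulo 2^32. */
	return tmp1 - tmp2;
}
```

Assuming A only ever advances and B never runs ahead of A, a B read while A held steady cannot exceed the sampled A, which is what would rule out the negative result. Whether that assumption holds for the real fields, and what happens on wraparound, would still need checking.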