BPF List
From: Daniel Borkmann <daniel@iogearbox.net>
To: John Fastabend <john.fastabend@gmail.com>,
	Lingpeng Chen <forrest0579@gmail.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH] bpf/sockmap: read psock ingress_msg before sk_receive_queue
Date: Wed, 8 Jan 2020 19:17:49 +0100
Message-ID: <e40286e9-107c-4af9-e596-4af426408eca@iogearbox.net>
In-Reply-To: <5e161913342f2_67ea2afd262665bc1c@john-XPS-13-9370.notmuch>

On 1/8/20 7:01 PM, John Fastabend wrote:
> Daniel Borkmann wrote:
>> On Wed, Jan 08, 2020 at 12:57:08PM +0800, Lingpeng Chen wrote:
>>> Right now in tcp_bpf_recvmsg, the socket reads data from sk_receive_queue
>>> first if it is not empty, and from psock->ingress_msg otherwise. If a FIN
>>> packet arrives while there is still data in psock->ingress_msg, the data
>>> in psock->ingress_msg will be purged. This always happens on requests to
>>> an HTTP/1.0 server such as Python's SimpleHTTPServer, since the server
>>> sends a FIN packet after the data is sent out.
>>>
>>> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
>>> Reported-by: Arika Chen <eaglesora@gmail.com>
>>> Suggested-by: Arika Chen <eaglesora@gmail.com>
>>> Signed-off-by: Lingpeng Chen <forrest0579@gmail.com>
>>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>>> ---
>>>   net/ipv4/tcp_bpf.c | 7 ++++---
>>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>>> index e38705165ac9..f7e902868fce 100644
>>> --- a/net/ipv4/tcp_bpf.c
>>> +++ b/net/ipv4/tcp_bpf.c
>>> @@ -123,12 +123,13 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
>>>   
>>>   	if (unlikely(flags & MSG_ERRQUEUE))
>>>   		return inet_recv_error(sk, msg, len, addr_len);
>>
>> Shouldn't we also move the error queue handling below the psock test as
>> well and let tcp_recvmsg() natively do it in case of !psock?
>>
> 
> You mean the MSG_ERRQUEUE flag handling? If the user sets MSG_ERRQUEUE,
> they expect to receive any queued errors. It would be wrong to return
> psock data in that case, even if a psock is attached and has data on its
> queue.
> 
>   MSG_ERRQUEUE (since Linux 2.2)
>    This flag specifies that queued errors should be received from the socket
>    error queue.  The error is passed in an ancillary message with a type
>    dependent on the protocol (for IPv4 IP_RECVERR).  The user should supply
>    a buffer of sufficient size. See cmsg(3) and ip(7) for more information.
>    The payload of the original packet that caused the error is passed as
>    normal data via msg_iovec. The original destination address of the
>    datagram that caused the error is supplied via msg_name.
> 
> I believe it needs to be where it is.

I meant that it should have looked as follows (i.e. moving both checks below
the psock test) ...

	psock = sk_psock_get(sk);
	if (unlikely(!psock))
		return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
	if (unlikely(flags & MSG_ERRQUEUE))
		return inet_recv_error(sk, msg, len, addr_len);
	if (!skb_queue_empty(&sk->sk_receive_queue) && [...]

... since, when no psock is attached, tcp_recvmsg() already handles
MSG_ERRQUEUE internally.
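
To make the proposed ordering easier to reason about, here is a minimal
userspace sketch of the dispatch decision only. It is not kernel code: the
enum, the struct, and pick_path() are hypothetical names invented for
illustration, and the four booleans stand in for the results of
sk_psock_get(), the MSG_ERRQUEUE flag test, skb_queue_empty() and
sk_psock_queue_empty().

```c
#include <stdbool.h>

/* Hypothetical labels for the path a receive call would take. */
enum recv_path {
	PATH_TCP_RECVMSG,	/* fall back to tcp_recvmsg() */
	PATH_ERRQUEUE,		/* inet_recv_error() */
	PATH_PSOCK,		/* read from psock->ingress_msg */
};

/* Hypothetical snapshot of the state the checks look at. */
struct recv_state {
	bool has_psock;		/* sk_psock_get(sk) != NULL */
	bool want_errqueue;	/* flags & MSG_ERRQUEUE */
	bool rcvq_empty;	/* skb_queue_empty(&sk->sk_receive_queue) */
	bool psockq_empty;	/* sk_psock_queue_empty(psock) */
};

/* Model of the check order suggested above, psock test first. */
static enum recv_path pick_path(const struct recv_state *s)
{
	/* Detached: tcp_recvmsg() deals with MSG_ERRQUEUE on its own. */
	if (!s->has_psock)
		return PATH_TCP_RECVMSG;
	/* Attached: the error queue still wins over any psock data. */
	if (s->want_errqueue)
		return PATH_ERRQUEUE;
	/* Only skip the psock when it has nothing queued. */
	if (!s->rcvq_empty && s->psockq_empty)
		return PATH_TCP_RECVMSG;
	return PATH_PSOCK;
}
```

In this model a caller passing MSG_ERRQUEUE never lands on PATH_PSOCK,
whether or not a psock is attached, which is the property John's reply
asks to preserve.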

>>> -	if (!skb_queue_empty(&sk->sk_receive_queue))
>>> -		return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
>>>   
>>>   	psock = sk_psock_get(sk);
>>>   	if (unlikely(!psock))
>>>   		return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
>>> +	if (!skb_queue_empty(&sk->sk_receive_queue) &&
>>> +	    sk_psock_queue_empty(psock))
>>> +		return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
>>>   	lock_sock(sk);
>>>   msg_bytes_ready:
>>>   	copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags);
>>> @@ -139,7 +140,7 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
>>>   		timeo = sock_rcvtimeo(sk, nonblock);
>>>   		data = tcp_bpf_wait_data(sk, psock, flags, timeo, &err);
>>>   		if (data) {
>>> -			if (skb_queue_empty(&sk->sk_receive_queue))
>>> +			if (!sk_psock_queue_empty(psock))
>>>   				goto msg_bytes_ready;
>>>   			release_sock(sk);
>>>   			sk_psock_put(sk, psock);
>>> -- 
>>> 2.17.1
>>>
> 
> 
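
For reference, the change to the fast-path condition in the quoted patch can
be modelled as two predicates. This is a simplified sketch under the
assumption that a psock is attached; old_takes_tcp_path() and
new_takes_tcp_path() are hypothetical helpers, not functions from
net/ipv4/tcp_bpf.c.

```c
#include <stdbool.h>

/*
 * Before the patch: any skb on sk_receive_queue sends the caller to
 * tcp_recvmsg(), regardless of what sits in psock->ingress_msg.
 */
static bool old_takes_tcp_path(bool rcvq_empty, bool psockq_empty)
{
	(void)psockq_empty;	/* not consulted by the old check */
	return !rcvq_empty;
}

/*
 * After the patch: tcp_recvmsg() is used only when sk_receive_queue has
 * data and the psock ingress queue is empty.
 */
static bool new_takes_tcp_path(bool rcvq_empty, bool psockq_empty)
{
	return !rcvq_empty && psockq_empty;
}
```

The case from the commit message, a FIN on sk_receive_queue while
psock->ingress_msg still holds data, corresponds to rcvq_empty = false and
psockq_empty = false. The old predicate returns true there, the new one
returns false, so the pending psock data is read first.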

