netdev.vger.kernel.org archive mirror
* [bug report] one possible out-of-order issue in sockmap
@ 2022-09-24  7:59 liujian (CE)
  2022-09-25 18:25 ` Cong Wang
  0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-24  7:59 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
	yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
	Paolo Abeni
  Cc: netdev, bpf@vger.kernel.org

Hello,

I ran into an scp failure here. I analyzed the code, and the cause may be as follows:

Since commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg before
sk_receive_queue"), if we use sockops (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to add sockets to a sockmap,
and do not enable the strparser and verdict programs, an out-of-order
problem may occur in the following sequence.

client SK                                   server SK
--------------------------------------------------------------------------
tcp_rcv_synsent_state_process
  tcp_finish_connect
    tcp_init_transfer
      tcp_set_state(sk, TCP_ESTABLISHED);
      // insert SK to sockmap
    wake up waiter
    tcp_send_ack

tcp_bpf_sendmsg(msgA)
// msgA goes through the TCP stack
                                            tcp_rcv_state_process
                                              tcp_init_transfer
                                                //insert SK to sockmap
                                              tcp_set_state(sk,
                                                     TCP_ESTABLISHED)
                                              wake up waiter
tcp_bpf_sendmsg(msgB)
// msgB goes through sockmap
                                              tcp_bpf_recvmsg
                                                //msgB, out-of-order
                                              tcp_bpf_recvmsg
                                                //msgA, out-of-order


Even if msgA arrives earlier than msgB (the common case), tcp_bpf_recvmsg reads from the psock queue first, so msgB is delivered before msgA.
In the worst case, msgA has to wait in the protocol stack until serverSK changes to TCP_ESTABLISHED, so msgA may reach the serverSK receive queue later than msgB.
If msgA arrives before msgB, 

If the ACK packets of the TCP three-way handshake are dropped for a period of time, the out-of-order (OOO) problem is easy to reproduce:

iptables -A INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP
...
iptables -D INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP
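
As a rough illustration of the ordering problem described above, here is a toy userspace model (not kernel code). It assumes only what the report states: the receive path drains the psock ingress queue before the TCP receive queue. All names (toy_sock, toy_recvmsg, and so on) are hypothetical and exist only for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of one receiving socket; this is NOT kernel code. */
#define QCAP 8

struct msg_queue {
	const char *msgs[QCAP];
	size_t head, tail;
};

struct toy_sock {
	struct msg_queue ingress_msg;      /* models the psock ingress queue */
	struct msg_queue sk_receive_queue; /* models the TCP receive queue */
};

static void toy_sock_init(struct toy_sock *sk)
{
	memset(sk, 0, sizeof(*sk));
}

/* Append a message; returns -1 when the queue is full. */
static int queue_push(struct msg_queue *q, const char *m)
{
	if (q->tail == QCAP)
		return -1;
	q->msgs[q->tail++] = m;
	return 0;
}

/* Remove the oldest message; returns NULL when the queue is empty. */
static const char *queue_pop(struct msg_queue *q)
{
	if (q->head == q->tail)
		return NULL;
	return q->msgs[q->head++];
}

/* Drain the psock queue first, then fall back to the TCP receive queue. */
static const char *toy_recvmsg(struct toy_sock *sk)
{
	const char *m = queue_pop(&sk->ingress_msg);

	return m ? m : queue_pop(&sk->sk_receive_queue);
}
```

In this model, pushing msgA onto sk_receive_queue and then msgB onto ingress_msg makes toy_recvmsg() return msgB first and msgA second, which mirrors the reordering in the diagram.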

Best Wishes
Liu Jian


* Re: [bug report] one possible out-of-order issue in sockmap
  2022-09-24  7:59 [bug report] one possible out-of-order issue in sockmap liujian (CE)
@ 2022-09-25 18:25 ` Cong Wang
  2022-09-26  1:34   ` liujian (CE)
  0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2022-09-25 18:25 UTC (permalink / raw)
  To: liujian (CE)
  Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
	yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
	Paolo Abeni, netdev, bpf@vger.kernel.org

On Sat, Sep 24, 2022 at 07:59:15AM +0000, liujian (CE) wrote:
> Hello,
> 
> I ran into an scp failure here. I analyzed the code, and the cause may be as follows:
> 
> Since commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg before
> sk_receive_queue"), if we use sockops (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
> and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to add sockets to a sockmap,
> and do not enable the strparser and verdict programs, an out-of-order
> problem may occur in the following sequence.
> 
> client SK                                   server SK
> --------------------------------------------------------------------------
> tcp_rcv_synsent_state_process
>   tcp_finish_connect
>     tcp_init_transfer
>       tcp_set_state(sk, TCP_ESTABLISHED);
>       // insert SK to sockmap
>     wake up waiter
>     tcp_send_ack
> 
> tcp_bpf_sendmsg(msgA)
> // msgA will go tcp stack
>                                             tcp_rcv_state_process
>                                               tcp_init_transfer
>                                                 //insert SK to sockmap
>                                               tcp_set_state(sk,
>                                                      TCP_ESTABLISHED)
>                                               wake up waiter

Here, after the socket is inserted into a sockmap, its ->sk_data_ready() has
already been replaced with sk_psock_verdict_data_ready(), so shouldn't msgA go
to sockmap, not the TCP stack?



* RE: [bug report] one possible out-of-order issue in sockmap
  2022-09-25 18:25 ` Cong Wang
@ 2022-09-26  1:34   ` liujian (CE)
  2022-09-26 21:16     ` John Fastabend
  0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-26  1:34 UTC (permalink / raw)
  To: Cong Wang
  Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
	yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
	Paolo Abeni, netdev, bpf@vger.kernel.org



> Here after the socket is inserted to a sockmap, its ->sk_data_ready() is
> already replaced with sk_psock_verdict_data_ready(), so msgA should go to
> sockmap, not TCP stack?
> 
It goes through the TCP stack. Here I only enable the BPF_SK_MSG_VERDICT program type:
bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name sock_ops_map pinned /sys/fs/bpf/sock_ops_map
bpftool prog attach pinned /sys/fs/bpf/bpf_redir msg_verdict pinned /sys/fs/bpf/sock_ops_map

The call trace looks like this:
tcp_bpf_sendmsg
-- tcp_bpf_send_verdict
---- sk_psock_msg_verdict // did not find serverSK, returns __SK_PASS
---- tcp_bpf_push
------ do_tcp_sendpages // goes to the TCP stack

After this, serverSK is inserted into a sockmap, but msgA is already in the TCP stack.
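
The send-side decision in the call trace above can be sketched with a small userspace model. This is only an illustration of the behavior described in this thread: if the peer socket is not yet in the map, the message is passed to the TCP stack, and once the peer is present it is redirected. The map keyed by port and every toy_* name are assumptions made for the sketch, not the kernel's data structures or the contents of bpf_redir.o.

```c
#include <stddef.h>

/* Toy model of the msg_verdict outcome; this is NOT kernel or BPF code. */
enum toy_verdict {
	TOY_PASS,     /* message continues through the TCP stack */
	TOY_REDIRECT  /* message is queued directly on the peer's psock */
};

#define TOY_MAP_CAP 16

struct toy_sockmap {
	unsigned short ports[TOY_MAP_CAP]; /* hypothetical key: peer port */
	size_t n;
};

/* Returns 1 if the port is present in the map, 0 otherwise. */
static int toy_sockmap_has(const struct toy_sockmap *m, unsigned short port)
{
	size_t i;

	for (i = 0; i < m->n; i++)
		if (m->ports[i] == port)
			return 1;
	return 0;
}

/* Adds a port to the map; returns -1 when the map is full. */
static int toy_sockmap_add(struct toy_sockmap *m, unsigned short port)
{
	if (m->n == TOY_MAP_CAP)
		return -1;
	m->ports[m->n++] = port;
	return 0;
}

/* The verdict depends only on whether the peer is already in the map. */
static enum toy_verdict toy_msg_verdict(const struct toy_sockmap *m,
					unsigned short peer_port)
{
	return toy_sockmap_has(m, peer_port) ? TOY_REDIRECT : TOY_PASS;
}
```

With an empty map, the verdict for port 5006 is TOY_PASS (the msgA case). After toy_sockmap_add() inserts 5006, the verdict becomes TOY_REDIRECT (the msgB case), so two consecutive messages on one connection take different paths.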



* RE: [bug report] one possible out-of-order issue in sockmap
  2022-09-26  1:34   ` liujian (CE)
@ 2022-09-26 21:16     ` John Fastabend
  2022-09-27  2:15       ` liujian (CE)
  0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2022-09-26 21:16 UTC (permalink / raw)
  To: liujian (CE), Cong Wang
  Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
	yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
	Paolo Abeni, netdev, bpf@vger.kernel.org

liujian (CE) wrote:
> It is TCP stack.  Here I only enable BPF_SK_MSG_VERDICT type.
> bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name sock_ops_map pinned /sys/fs/bpf/sock_ops_map
> bpftool prog attach pinned /sys/fs/bpf/bpf_redir msg_verdict pinned /sys/fs/bpf/sock_ops_map

Is the sender using FAST_OPEN by any chance? We know this bug exists
in this case. Fix tbd.
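
For anyone wanting to rule this out on a test machine, one quick check is the net.ipv4.tcp_fastopen sysctl, a bitmask in which 0x1 enables client support and 0x2 enables server support. The helper below is a minimal sketch that reads such a value from a file; the function and macro names are local to this example. Note that the sysctl only says whether the kernel allows Fast Open; the sender still has to request it (for example with MSG_FASTOPEN) for it to be used.

```c
#include <stdio.h>

/* Local names for the two low bits of net.ipv4.tcp_fastopen. */
#define TFO_CLIENT_BIT 0x1
#define TFO_SERVER_BIT 0x2

/*
 * Read a decimal bitmask from the given file, normally
 * /proc/sys/net/ipv4/tcp_fastopen. Returns -1 if the file
 * cannot be opened or does not start with a number.
 */
static int read_tfo_mask(const char *path)
{
	FILE *f = fopen(path, "r");
	int mask;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &mask) != 1)
		mask = -1;
	fclose(f);
	return mask;
}
```

A caller would pass "/proc/sys/net/ipv4/tcp_fastopen" and test the result against TFO_CLIENT_BIT on the sending host.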


* RE: [bug report] one possible out-of-order issue in sockmap
  2022-09-26 21:16     ` John Fastabend
@ 2022-09-27  2:15       ` liujian (CE)
  2022-09-28 18:31         ` John Fastabend
  0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-27  2:15 UTC (permalink / raw)
  To: John Fastabend, Cong Wang
  Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
	dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
	bpf@vger.kernel.org



> Is the sender using FAST_OPEN by any chance? We know this bug exists in
> this case. Fix tbd.

FAST_OPEN is not used.
The following test programs can be used to reproduce the OOO problem.
However, I have not been able to construct a case that reliably reproduces the worst-case scenario described in the report (msgA arriving later than msgB).

tcp_server.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int server_port = 5006;
int main(int argc, char *argv[])
{
	int serverSocket;
	struct sockaddr_in server_addr;
	struct sockaddr_in clientAddr;
	socklen_t addr_len;
	int client;
	char buffer[200];
	int iDataNum;
	int optbuf, ret;

	if (argc != 2) {
		return -1;
	}

	server_port =  atoi(argv[1]);
	if( server_port<1025 || server_port>65535 )
	{
		return -1;
	}

	if((serverSocket = socket(AF_INET, SOCK_STREAM, 0)) < 0)
	{
		perror("socket");
		return 1;
	}
	optbuf = 1;
	ret = setsockopt(serverSocket, SOL_SOCKET, SO_REUSEADDR, &optbuf, sizeof(int));
	if (ret != 0)
		perror("reuseaddr failed");
	memset(&server_addr, 0, sizeof(server_addr));
	server_addr.sin_family = AF_INET;
	server_addr.sin_port = htons(server_port);
	server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
	if(bind(serverSocket, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0)
	{
		perror("bind");
		return 1;
	}
	if(listen(serverSocket, 5) < 0)
	{
		perror("listen");
		return 1;
	}
	while(1)
	{
		addr_len = sizeof(clientAddr);
		client = accept(serverSocket, (struct sockaddr*)&clientAddr, &addr_len);
		if(client < 0)
		{
			perror("accept");
			continue;
		}
		printf("wait until the two msgs of client are sent...\n");
		sleep(5);
		while(1)
		{
			printf("recvmsg:");
			/* leave room for the terminating NUL; buffer is only 200 bytes */
			iDataNum = recv(client, buffer, sizeof(buffer) - 1, 0);
			if(iDataNum <= 0)
			{
				/* error, or the peer closed the connection */
				if (iDataNum < 0)
					perror("recv");
				break;
			}
			buffer[iDataNum] = '\0';
			printf("%s\n", buffer);
			sleep(2);
		}
		close(client);
	}
	close(serverSocket);
	return 0;
}



tcp_client.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int server_port = 5006;
int main(int argc, char *argv[])
{
	int clientSocket;
	struct sockaddr_in serverAddr;
	struct sockaddr_in clientAddr;
	char sendbuf[4096];
	int ret;
	int client_port;

	if (argc != 3) {
		printf("client [sport] [dport]\n");
		return -1;
	}

	client_port =  atoi(argv[1]);
	if(client_port<1025 || client_port>65535 )
	{
		return -1;
	}

	server_port =  atoi(argv[2]);
	if( server_port<1025 || server_port>65535 )
	{
		return -1;
	}

	if((clientSocket = socket(AF_INET, SOCK_STREAM, 0)) < 0)
	{
		perror("socket");
		return 1;
	}
	memset(&clientAddr, 0, sizeof(clientAddr));
	clientAddr.sin_family = AF_INET;
	clientAddr.sin_port = htons(client_port);
	clientAddr.sin_addr.s_addr = htonl(INADDR_ANY);
	if(bind(clientSocket, (struct sockaddr *)&clientAddr, sizeof(clientAddr)) < 0)
	{
		perror("bind");
		return 1;
	}
	memset(&serverAddr, 0, sizeof(serverAddr));
	serverAddr.sin_family = AF_INET;
	serverAddr.sin_port = htons(server_port);
	serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
	/* Drop pure ACKs to the server so it stays out of TCP_ESTABLISHED.
	 * Note: the rule hard-codes port 5006, so run the server on 5006. */
	system("iptables -A INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP");
	if(connect(clientSocket, (struct sockaddr *)&serverAddr, sizeof(serverAddr)) < 0)
	{
		perror("connect");
		return 1;
	}

	/* sendbuf is zeroed first, so strlen() below sees exactly 10 bytes */
	memset(sendbuf, 0, sizeof(sendbuf));
	memcpy(sendbuf, "AAAAAAAAAA", 10);
	ret = send(clientSocket, sendbuf, strlen(sendbuf), 0);
	if (ret <= 0) {
		perror("send");
		return -1;
	}
	printf("finish send A\n");
	/* Let the ACKs through again so the server reaches TCP_ESTABLISHED */
	system("iptables -D INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP");
	sleep(2); // wait for the server socket to be inserted into the sockmap
	printf("start send b\n");
	memcpy(sendbuf, "bbbbbbbbbb", 10);
	ret = send(clientSocket, sendbuf, strlen(sendbuf), 0);
	if (ret <= 0) {
		perror("send");
		return -1;
	}

	sleep(10);
	close(clientSocket);
	return 0;
}

[root@localhost sockmap_test]# ./server 5006
wait until the two msgs of client are sent...
recvmsg:bbbbbbbbbb
recvmsg:AAAAAAAAAA
^C


* RE: [bug report] one possible out-of-order issue in sockmap
  2022-09-27  2:15       ` liujian (CE)
@ 2022-09-28 18:31         ` John Fastabend
  2022-11-26  7:12           ` liujian (CE)
  0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2022-09-28 18:31 UTC (permalink / raw)
  To: liujian (CE), John Fastabend, Cong Wang
  Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
	dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
	bpf@vger.kernel.org

liujian (CE) wrote:
> > Is the sender using FAST_OPEN by any chance? We know this bug exists in
> > this case. Fix tbd.
> 
> FAST_OPEN is not used.

OK, thanks for the reproducer. I'll take a look this afternoon.


* RE: [bug report] one possible out-of-order issue in sockmap
  2022-09-28 18:31         ` John Fastabend
@ 2022-11-26  7:12           ` liujian (CE)
  0 siblings, 0 replies; 7+ messages in thread
From: liujian (CE) @ 2022-11-26  7:12 UTC (permalink / raw)
  To: John Fastabend, Cong Wang
  Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
	dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
	bpf@vger.kernel.org



> > > Is the sender using FAST_OPEN by any chance? We know this bug exists
> > > in this case. Fix tbd.
> >
> > FAST_OPEN is not used.
> 
> OK thanks for the reproducer I'll take a look this afternoon.

Hi John and everyone, could you take another look at this one?
If there is anything you need me to test, please let me know.
