* [bug report] one possible out-of-order issue in sockmap
@ 2022-09-24 7:59 liujian (CE)
2022-09-25 18:25 ` Cong Wang
0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-24 7:59 UTC
To: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
Paolo Abeni
Cc: netdev, bpf@vger.kernel.org
Hello,
I ran into an scp failure. After analyzing the code, I think the cause may be as follows:
Since commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg before
sk_receive_queue"), if we use sockops (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to add sockets to a sockmap, and
do not enable the strparser and verdict programs, messages may be received
out of order in the following sequence.
client SK                                   server SK
--------------------------------------------------------------------------
tcp_rcv_synsent_state_process
  tcp_finish_connect
    tcp_init_transfer
      tcp_set_state(sk, TCP_ESTABLISHED);
      // insert SK to sockmap
    wake up waiter
    tcp_send_ack

tcp_bpf_sendmsg(msgA)
// msgA will go tcp stack
                                            tcp_rcv_state_process
                                              tcp_init_transfer
                                                // insert SK to sockmap
                                              tcp_set_state(sk,
                                                     TCP_ESTABLISHED)
                                              wake up waiter
tcp_bpf_sendmsg(msgB)
// msgB go sockmap
                                            tcp_bpf_recvmsg
                                              // msgB, out-of-order
                                            tcp_bpf_recvmsg
                                              // msgA, out-of-order
Even if msgA arrives earlier than msgB (which is the usual case), tcp_bpf_recvmsg reads from the psock queue first, so msgB is returned before msgA.
In the worst case, msgA has to wait in the protocol stack until serverSK changes to TCP_ESTABLISHED, so it may reach the serverSK receive queue later than msgB.
If msgA arrives before msgB,
If the ACK packets of the TCP three-way handshake are dropped for a period of time, the OOO problem is easy to reproduce:
iptables -A INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP
...
iptables -D INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP
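The receive ordering described above can be illustrated with a small userspace model. This is only a sketch and not kernel code: struct toy_sock, its two queues, and every toy_* function are hypothetical stand-ins for psock->ingress_msg, sk->sk_receive_queue, and tcp_bpf_recvmsg, and the only assumption carried over from the report is that the psock queue is drained before the TCP receive queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy userspace model; all names below are hypothetical, not kernel APIs. */
enum { TOY_QCAP = 8 };

struct toy_queue {
    const char *items[TOY_QCAP];
    size_t head, tail;
};

struct toy_sock {
    struct toy_queue ingress_msg;   /* stands in for psock->ingress_msg */
    struct toy_queue receive_queue; /* stands in for sk->sk_receive_queue */
};

static void toy_push(struct toy_queue *q, const char *m)
{
    assert(q->tail < TOY_QCAP);
    q->items[q->tail++] = m;
}

static const char *toy_pop(struct toy_queue *q)
{
    return q->head < q->tail ? q->items[q->head++] : NULL;
}

/* Assumed read order: psock queue first, then the TCP receive queue. */
static const char *toy_recvmsg(struct toy_sock *sk)
{
    const char *m = toy_pop(&sk->ingress_msg);
    return m ? m : toy_pop(&sk->receive_queue);
}

/* msgA was sent first but went through the TCP stack; msgB was sent
 * second but was redirected to the psock queue. Returns 1 if the
 * receiver sees msgB before msgA. */
int toy_scenario_reorders(void)
{
    struct toy_sock sk = {0};

    toy_push(&sk.receive_queue, "msgA");
    toy_push(&sk.ingress_msg, "msgB");

    const char *first = toy_recvmsg(&sk);
    const char *second = toy_recvmsg(&sk);

    return strcmp(first, "msgB") == 0 && strcmp(second, "msgA") == 0;
}
```

In this model the reordering does not depend on arrival time at all: as long as both queues are non-empty when the receiver reads, the message on the psock queue wins.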
Best Wishes
Liu Jian
* Re: [bug report] one possible out-of-order issue in sockmap
2022-09-24 7:59 [bug report] one possible out-of-order issue in sockmap liujian (CE)
@ 2022-09-25 18:25 ` Cong Wang
2022-09-26 1:34 ` liujian (CE)
0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2022-09-25 18:25 UTC
To: liujian (CE)
Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
Paolo Abeni, netdev, bpf@vger.kernel.org
On Sat, Sep 24, 2022 at 07:59:15AM +0000, liujian (CE) wrote:
> [...]
> tcp_bpf_sendmsg(msgA)
> // msgA will go tcp stack
>                                             tcp_rcv_state_process
>                                               tcp_init_transfer
>                                                 // insert SK to sockmap
>                                               tcp_set_state(sk,
>                                                      TCP_ESTABLISHED)
>                                               wake up waiter
Here after the socket is inserted to a sockmap, its ->sk_data_ready() is
already replaced with sk_psock_verdict_data_ready(), so msgA should go to
sockmap, not TCP stack?
> tcp_bpf_sendmsg(msgB)
> // msgB go sockmap
> [...]
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-25 18:25 ` Cong Wang
@ 2022-09-26 1:34 ` liujian (CE)
2022-09-26 21:16 ` John Fastabend
0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-26 1:34 UTC
To: Cong Wang
Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
Paolo Abeni, netdev, bpf@vger.kernel.org
> -----Original Message-----
> From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> Sent: Monday, September 26, 2022 2:26 AM
> To: liujian (CE) <liujian56@huawei.com>
> Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>; davem
> <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> Subject: Re: [bug report] one possible out-of-order issue in sockmap
>
> [...]
> Here after the socket is inserted to a sockmap, its ->sk_data_ready() is
> already replaced with sk_psock_verdict_data_ready(), so msgA should go to
> sockmap, not TCP stack?
>
It goes to the TCP stack. Here I only enable the BPF_SK_MSG_VERDICT type.
bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name sock_ops_map pinned /sys/fs/bpf/sock_ops_map
bpftool prog attach pinned /sys/fs/bpf/bpf_redir msg_verdict pinned /sys/fs/bpf/sock_ops_map
The call trace looks like this:
tcp_bpf_sendmsg
--tcp_bpf_send_verdict
----sk_psock_msg_verdict // did not find serverSK, return __SK_PASS
----tcp_bpf_push
------do_tcp_sendpages // go to TCP stack
After this, serverSK is inserted into the sockmap, but msgA is already travelling through the TCP stack.
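The send-side race in the call trace above can be sketched with a toy userspace model. None of this is kernel or BPF code: struct toy_sockmap, the integer key (here simply the server port), and the toy_* functions are hypothetical, and the model assumes only what the message states, namely that a send goes to the TCP stack when the peer socket is not yet found in the map and is redirected once it is.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; all names below are hypothetical, not kernel APIs. */
enum toy_path { TOY_PATH_TCP_STACK, TOY_PATH_SOCKMAP };

enum { TOY_MAP_CAP = 4 };

struct toy_sockmap {
    int keys[TOY_MAP_CAP]; /* stand-in for the sockets held by the map */
    size_t count;
};

static int toy_map_contains(const struct toy_sockmap *map, int key)
{
    for (size_t i = 0; i < map->count; i++)
        if (map->keys[i] == key)
            return 1;
    return 0;
}

static void toy_map_insert(struct toy_sockmap *map, int key)
{
    assert(map->count < TOY_MAP_CAP);
    map->keys[map->count++] = key;
}

/* Peer found: redirect via sockmap. Peer missing: fall through to TCP. */
static enum toy_path toy_send_verdict(const struct toy_sockmap *map, int peer_key)
{
    return toy_map_contains(map, peer_key) ? TOY_PATH_SOCKMAP
                                           : TOY_PATH_TCP_STACK;
}

/* Replays the report: msgA is sent before the server socket is in the
 * map, msgB after. Returns 1 if the two messages take different paths. */
int toy_replay_send_race(void)
{
    struct toy_sockmap map = {0};

    enum toy_path a = toy_send_verdict(&map, 5006); /* msgA */
    toy_map_insert(&map, 5006);                     /* server established */
    enum toy_path b = toy_send_verdict(&map, 5006); /* msgB */

    return a == TOY_PATH_TCP_STACK && b == TOY_PATH_SOCKMAP;
}
```

Two messages on the same connection ending up on different delivery paths is the precondition for the reordering; the receive side then decides which one the application sees first.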
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-26 1:34 ` liujian (CE)
@ 2022-09-26 21:16 ` John Fastabend
2022-09-27 2:15 ` liujian (CE)
0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2022-09-26 21:16 UTC
To: liujian (CE), Cong Wang
Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
Paolo Abeni, netdev, bpf@vger.kernel.org
liujian (CE) wrote:
> [...]
> It is TCP stack. Here I only enable BPF_SK_MSG_VERDICT type.
> bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name sock_ops_map pinned /sys/fs/bpf/sock_ops_map
> bpftool prog attach pinned /sys/fs/bpf/bpf_redir msg_verdict pinned /sys/fs/bpf/sock_ops_map
Is the sender using FAST_OPEN by any chance? We know this bug exists in
this case. Fix tbd.
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-26 21:16 ` John Fastabend
@ 2022-09-27 2:15 ` liujian (CE)
2022-09-28 18:31 ` John Fastabend
0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-27 2:15 UTC
To: John Fastabend, Cong Wang
Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
bpf@vger.kernel.org
> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Tuesday, September 27, 2022 5:16 AM
> To: liujian (CE) <liujian56@huawei.com>; Cong Wang
> <xiyou.wangcong@gmail.com>
> Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>; davem
> <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> Subject: RE: [bug report] one possible out-of-order issue in sockmap
>
> [...]
> Is the sender using FAST_OPEN by any chance? We know this bug exists in
> this case. Fix tbd.
FAST_OPEN is not used.
The following test programs can be used to reproduce the OOO problem. However, I have not been able to construct a case that reliably triggers the worst-case scenario described in the report (msgA arriving later than msgB).
tcp_server.c

#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int server_port = 5006;

int main(int argc, char *argv[])
{
    int serverSocket;
    struct sockaddr_in server_addr;
    struct sockaddr_in clientAddr;
    socklen_t addr_len = sizeof(clientAddr);
    int client;
    char buffer[200];
    int iDataNum;
    int optbuf, ret;

    if (argc != 2)
        return -1;
    server_port = atoi(argv[1]);
    if (server_port < 1025 || server_port > 65535)
        return -1;

    if ((serverSocket = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket");
        return 1;
    }
    optbuf = 1;
    ret = setsockopt(serverSocket, SOL_SOCKET, SO_REUSEADDR, &optbuf, sizeof(int));
    if (ret != 0)
        perror("reuseaddr failed");

    bzero(&server_addr, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(server_port);
    server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(serverSocket, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
        perror("bind");
        return 1;
    }
    if (listen(serverSocket, 5) < 0) {
        perror("listen");
        return 1;
    }

    while (1) {
        client = accept(serverSocket, (struct sockaddr *)&clientAddr, &addr_len);
        if (client < 0) {
            perror("accept");
            continue;
        }
        printf("wait until the two msgs of client are sent...\n");
        sleep(5);
        while (1) {
            printf("recvmsg:");
            buffer[0] = '\0';
            /* leave room for the terminating NUL (the buffer is 200 bytes) */
            iDataNum = recv(client, buffer, sizeof(buffer) - 1, 0);
            if (iDataNum < 0) {
                perror("recv null");
                continue;
            }
            if (iDataNum == 0)  /* peer closed the connection */
                break;
            buffer[iDataNum] = '\0';
            printf("%s\n", buffer);
            sleep(2);
        }
        close(client);
    }
    close(serverSocket);
    return 0;
}

tcp_client.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int server_port = 5006;

int main(int argc, char *argv[])
{
    int clientSocket;
    struct sockaddr_in serverAddr;
    struct sockaddr_in clientAddr;
    char sendbuf[4096];
    int ret;
    int client_port;

    if (argc != 3) {
        printf("client [sport] [dport]\n");
        return -1;
    }
    client_port = atoi(argv[1]);
    if (client_port < 1025 || client_port > 65535)
        return -1;
    server_port = atoi(argv[2]);
    if (server_port < 1025 || server_port > 65535)
        return -1;

    if ((clientSocket = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket");
        return 1;
    }
    bzero(&clientAddr, sizeof(clientAddr));
    clientAddr.sin_family = AF_INET;
    clientAddr.sin_port = htons(client_port);
    clientAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(clientSocket, (struct sockaddr *)&clientAddr, sizeof(clientAddr)) < 0) {
        perror("bind");
        return 1;
    }

    bzero(&serverAddr, sizeof(serverAddr));
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(server_port);
    serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");

    /* Drop the final handshake ACK; note the rule hard-codes dport 5006. */
    system("iptables -A INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP");
    if (connect(clientSocket, (struct sockaddr *)&serverAddr, sizeof(serverAddr)) < 0) {
        perror("connect");
        return 1;
    }

    memset(sendbuf, 0, sizeof(sendbuf));
    memcpy(sendbuf, "AAAAAAAAAAA", 10);  /* sends 10 bytes of 'A' */
    ret = send(clientSocket, sendbuf, strlen(sendbuf), 0);
    if (ret <= 0) {
        perror("send fail");
        return -1;
    }
    printf("finish send A\n");

    system("iptables -D INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP");
    sleep(2); // wait serversk insert to sockmap

    printf("start send b\n");
    memcpy(sendbuf, "bbbbbbbbbbbbb", 10);  /* overwrites with 10 bytes of 'b' */
    ret = send(clientSocket, sendbuf, strlen(sendbuf), 0);
    if (ret <= 0) {
        perror("send fail");
        return -1;
    }
    sleep(10);
    close(clientSocket);
    return 0;
}

[root@localhost sockmap_test]# ./server 5006
wait until the two msgs of client are sent...
recvmsg:bbbbbbbbbb
recvmsg:AAAAAAAAAA
^C
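Instead of reading the server output by eye, the reordering could be detected programmatically. The helper below is only a sketch under the assumption that the client sends a run of 'A' bytes followed by a run of 'b' bytes, as in the reproducer; toy_stream_in_order is a hypothetical name and is not part of the posted test programs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every 'A' byte in the received
 * stream precedes every 'b' byte, 0 if an 'A' shows up after a 'b'.
 * Other byte values are ignored. */
int toy_stream_in_order(const char *buf, size_t len)
{
    int seen_b = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 'b')
            seen_b = 1;
        else if (buf[i] == 'A' && seen_b)
            return 0; /* data sent first arrived after data sent second */
    }
    return 1;
}
```

A server built on the reproducer could append each recv() result to one buffer and call this helper on the accumulated bytes; with the output shown above (ten 'b' bytes followed by ten 'A' bytes) it would return 0.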
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-27 2:15 ` liujian (CE)
@ 2022-09-28 18:31 ` John Fastabend
2022-11-26 7:12 ` liujian (CE)
0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2022-09-28 18:31 UTC
To: liujian (CE), John Fastabend, Cong Wang
Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
bpf@vger.kernel.org
liujian (CE) wrote:
> [...]
> > Is the sender using FAST_OPEN by any chance? We know this bug exists in
> > this case. Fix tbd.
>
> FAST_OPEN is not used.
OK, thanks for the reproducer. I'll take a look this afternoon.
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-28 18:31 ` John Fastabend
@ 2022-11-26 7:12 ` liujian (CE)
0 siblings, 0 replies; 7+ messages in thread
From: liujian (CE) @ 2022-11-26 7:12 UTC
To: John Fastabend, Cong Wang
Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
bpf@vger.kernel.org
> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Thursday, September 29, 2022 2:31 AM
> To: liujian (CE) <liujian56@huawei.com>; John Fastabend
> <john.fastabend@gmail.com>; Cong Wang <xiyou.wangcong@gmail.com>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>; Eric Dumazet
> <edumazet@google.com>; davem <davem@davemloft.net>;
> yoshfuji@linux-ipv6.org; dsahern@kernel.org; Jakub Kicinski
> <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; netdev
> <netdev@vger.kernel.org>; bpf@vger.kernel.org
> Subject: RE: [bug report] one possible out-of-order issue in sockmap
>
> [...]
> > FAST_OPEN is not used.
>
> OK thanks for the reproducer I'll take a look this afternoon.
Hey John and everyone, could you take another look at this one? If there is anything you need me to test, please let me know.
end of thread
Thread overview: 7+ messages
2022-09-24 7:59 [bug report] one possible out-of-order issue in sockmap liujian (CE)
2022-09-25 18:25 ` Cong Wang
2022-09-26 1:34 ` liujian (CE)
2022-09-26 21:16 ` John Fastabend
2022-09-27 2:15 ` liujian (CE)
2022-09-28 18:31 ` John Fastabend
2022-11-26 7:12 ` liujian (CE)