* [bug report] one possible out-of-order issue in sockmap
@ 2022-09-24 7:59 liujian (CE)
2022-09-25 18:25 ` Cong Wang
0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-24 7:59 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
Paolo Abeni
Cc: netdev, bpf@vger.kernel.org
Hello,
I ran into an scp failure here. I analyzed the code, and the cause may be as follows:
Since commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg before
sk_receive_queue"), if we use sockops (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to enable sockmap on the sockets,
and do not enable the strparser and verdict programs, the out-of-order
problem may occur in the following sequence.
client SK                                server SK
--------------------------------------------------------------------------
tcp_rcv_synsent_state_process
tcp_finish_connect
tcp_init_transfer
tcp_set_state(sk, TCP_ESTABLISHED);
// insert SK to sockmap
wake up waiter
tcp_send_ack

tcp_bpf_sendmsg(msgA)
// msgA will go tcp stack
                                         tcp_rcv_state_process
                                         tcp_init_transfer
                                         // insert SK to sockmap
                                         tcp_set_state(sk, TCP_ESTABLISHED)
                                         wake up waiter
tcp_bpf_sendmsg(msgB)
// msgB go sockmap
                                         tcp_bpf_recvmsg
                                         // msgB, out-of-order
                                         tcp_bpf_recvmsg
                                         // msgA, out-of-order
Even if msgA arrives earlier than msgB (the common case), tcp_bpf_recvmsg reads messages from the psock queue first.
In the worst case, msgA has to wait in the protocol stack until serverSK changes to TCP_ESTABLISHED, so msgA may arrive at the serverSK receive queue later than msgB.
If msgA before than msgB,
If the ACK packets of the TCP three-way handshake are dropped for a period of time, the OOO problem is easy to reproduce.
iptables -A INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP
...
iptables -D INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP
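The ordering effect can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct sock_model`, `struct msg_queue`, `queue_push`, `queue_pop` and `model_recv` are hypothetical names made up for the illustration. The only behaviour taken from the analysis above is that the receive path reads the psock ingress queue before the regular receive queue.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN 8

/* Hypothetical FIFO of message labels; stands in for a kernel queue. */
struct msg_queue {
    const char *msgs[QLEN];
    size_t head, tail;
};

/* Hypothetical model of a socket that has been added to a sockmap. */
struct sock_model {
    struct msg_queue ingress_msg;   /* models the psock ingress_msg queue */
    struct msg_queue receive_queue; /* models sk_receive_queue */
};

static void queue_push(struct msg_queue *q, const char *msg)
{
    assert(q->tail < QLEN); /* toy queue: no wrap-around */
    q->msgs[q->tail++] = msg;
}

static const char *queue_pop(struct msg_queue *q)
{
    return q->head < q->tail ? q->msgs[q->head++] : NULL;
}

/* Read order assumed by the model: psock queue first, then the TCP queue. */
static const char *model_recv(struct sock_model *sk)
{
    const char *msg = queue_pop(&sk->ingress_msg);

    return msg ? msg : queue_pop(&sk->receive_queue);
}
```

With a zero-initialised `struct sock_model`, pushing msgA onto `receive_queue` and then msgB onto `ingress_msg` makes the first `model_recv()` call return msgB and the second return msgA, that is, the reverse of the order in which they were sent.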
Best Wishes
Liu Jian
* Re: [bug report] one possible out-of-order issue in sockmap
2022-09-24 7:59 [bug report] one possible out-of-order issue in sockmap liujian (CE)
@ 2022-09-25 18:25 ` Cong Wang
2022-09-26 1:34 ` liujian (CE)
0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2022-09-25 18:25 UTC (permalink / raw)
To: liujian (CE)
Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
Paolo Abeni, netdev, bpf@vger.kernel.org
On Sat, Sep 24, 2022 at 07:59:15AM +0000, liujian (CE) wrote:
> Hello,
>
> I had a scp failure problem here. I analyze the code, and the reasons may be as follows:
>
> From commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg before
> sk_receive_queue", if we use sockops (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
> and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to enable socket's sockmap
> function, and don't enable strparse and verdict function, the out-of-order
> problem may occur in the following process.
>
> client SK                                server SK
> --------------------------------------------------------------------------
> tcp_rcv_synsent_state_process
> tcp_finish_connect
> tcp_init_transfer
> tcp_set_state(sk, TCP_ESTABLISHED);
> // insert SK to sockmap
> wake up waiter
> tcp_send_ack
>
> tcp_bpf_sendmsg(msgA)
> // msgA will go tcp stack
>                                          tcp_rcv_state_process
>                                          tcp_init_transfer
>                                          // insert SK to sockmap
>                                          tcp_set_state(sk, TCP_ESTABLISHED)
>                                          wake up waiter
Here, after the socket is inserted into a sockmap, its ->sk_data_ready() is
already replaced with sk_psock_verdict_data_ready(), so msgA should go
to sockmap, not the TCP stack?
> tcp_bpf_sendmsg(msgB)
> // msgB go sockmap
>                                          tcp_bpf_recvmsg
>                                          // msgB, out-of-order
>                                          tcp_bpf_recvmsg
>                                          // msgA, out-of-order
>
>
> Even if msgA arrives earlier than msgB (in most cases), tcp_bpf_recvmsg receives msg from the psock queue first.
> The worst case is that msgA waits for serverSK to change to TCP_ESTABLISHED in the protocol stack. msgA may arrive at the serverSK receive queue later than msgB.
> If msgA befor than msgB,
>
> If the ACK packets of the three-way TCP handshake are dropped for a period of time, the OOO problem is easily reproduced.
>
> iptables -A INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP
> ...
> iptables -D INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP
>
> Best Wishes
> Liu Jian
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-25 18:25 ` Cong Wang
@ 2022-09-26 1:34 ` liujian (CE)
2022-09-26 21:16 ` John Fastabend
0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-26 1:34 UTC (permalink / raw)
To: Cong Wang
Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
Paolo Abeni, netdev, bpf@vger.kernel.org
> -----Original Message-----
> From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> Sent: Monday, September 26, 2022 2:26 AM
> To: liujian (CE) <liujian56@huawei.com>
> Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>; davem
> <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> Subject: Re: [bug report] one possible out-of-order issue in sockmap
>
> On Sat, Sep 24, 2022 at 07:59:15AM +0000, liujian (CE) wrote:
> > Hello,
> >
> > I had a scp failure problem here. I analyze the code, and the reasons may
> be as follows:
> >
> > From commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg
> before
> > sk_receive_queue", if we use sockops
> > (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
> > and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to enable socket's
> sockmap
> > function, and don't enable strparse and verdict function, the
> > out-of-order problem may occur in the following process.
> >
> > client SK server SK
> > ----------------------------------------------------------------------
> > ----
> > tcp_rcv_synsent_state_process
> > tcp_finish_connect
> > tcp_init_transfer
> > tcp_set_state(sk, TCP_ESTABLISHED);
> > // insert SK to sockmap
> > wake up waitter
> > tcp_send_ack
> >
> > tcp_bpf_sendmsg(msgA)
> > // msgA will go tcp stack
> > tcp_rcv_state_process
> > tcp_init_transfer
> > //insert SK to sockmap
> > tcp_set_state(sk,
> > TCP_ESTABLISHED)
> > wake up waitter
>
> Here after the socket is inserted to a sockmap, its ->sk_data_ready() is
> already replaced with sk_psock_verdict_data_ready(), so msgA should go to
> sockmap, not TCP stack?
>
It goes to the TCP stack. Here I only enable the BPF_SK_MSG_VERDICT program type:
bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name sock_ops_map pinned /sys/fs/bpf/sock_ops_map
bpftool prog attach pinned /sys/fs/bpf/bpf_redir msg_verdict pinned /sys/fs/bpf/sock_ops_map
The call trace looks like this:
tcp_bpf_sendmsg
--tcp_bpf_send_verdict
---- sk_psock_msg_verdict // did not find serverSK, return __SK_PASS
---- tcp_bpf_push
------ do_tcp_sendpages // go to TCP stack
After this, serverSK is inserted into the sockmap, but msgA is already on its way through the TCP stack.
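The routing decision in the trace above can be sketched as a tiny user-space model. It is an illustration under assumptions, not the kernel or BPF code: `enum msg_path`, `struct sockmap_model` and `model_send` are hypothetical names, and the model assumes, as in the trace, that a message is passed to the TCP stack whenever the peer socket is not yet in the map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a sent message ends up in this toy model. */
enum msg_path {
    PATH_TCP_STACK,    /* verdict passed: data goes through the TCP stack */
    PATH_PEER_INGRESS, /* verdict redirected: data is queued on the peer psock */
};

/* Hypothetical stand-in for the sockmap: only tracks whether the peer is present. */
struct sockmap_model {
    bool peer_present;
};

/* Route one message according to the current map state. */
static enum msg_path model_send(const struct sockmap_model *map)
{
    assert(map != NULL);
    return map->peer_present ? PATH_PEER_INGRESS : PATH_TCP_STACK;
}
```

Calling `model_send()` once before and once after setting `peer_present` gives the two different paths taken by msgA and msgB on the same connection.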
> > tcp_bpf_sendmsg(msgB)
> > // msgB go sockmap
> > tcp_bpf_recvmsg
> > //msgB, out-of-order
> > tcp_bpf_recvmsg
> > //msgA, out-of-order
> >
> >
> > Even if msgA arrives earlier than msgB (in most cases), tcp_bpf_recvmsg
> receives msg from the psock queue first.
> > The worst case is that msgA waits for serverSK to change to
> TCP_ESTABLISHED in the protocol stack. msgA may arrive at the serverSK
> receive queue later than msgB.
> > If msgA befor than msgB,
> >
> > If the ACK packets of the three-way TCP handshake are dropped for a
> period of time, the OOO problem is easily reproduced.
> >
> > iptables -A INPUT -p tcp -m tcp --dport 5006 --tcp-flags
> > SYN,RST,ACK,FIN ACK -j DROP ...
> > iptables -D INPUT -p tcp -m tcp --dport 5006 --tcp-flags
> > SYN,RST,ACK,FIN ACK -j DROP
> >
> > Best Wishes
> > Liu Jian
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-26 1:34 ` liujian (CE)
@ 2022-09-26 21:16 ` John Fastabend
2022-09-27 2:15 ` liujian (CE)
0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2022-09-26 21:16 UTC (permalink / raw)
To: liujian (CE), Cong Wang
Cc: John Fastabend, Jakub Sitnicki, Eric Dumazet, davem,
yoshfuji@linux-ipv6.org, dsahern@kernel.org, Jakub Kicinski,
Paolo Abeni, netdev, bpf@vger.kernel.org
liujian (CE) wrote:
>
>
> > -----Original Message-----
> > From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> > Sent: Monday, September 26, 2022 2:26 AM
> > To: liujian (CE) <liujian56@huawei.com>
> > Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> > <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>; davem
> > <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> > Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> > netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> > Subject: Re: [bug report] one possible out-of-order issue in sockmap
> >
> > On Sat, Sep 24, 2022 at 07:59:15AM +0000, liujian (CE) wrote:
> > > Hello,
> > >
> > > I had a scp failure problem here. I analyze the code, and the reasons may
> > be as follows:
> > >
> > > From commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg
> > before
> > > sk_receive_queue", if we use sockops
> > > (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
> > > and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to enable socket's
> > sockmap
> > > function, and don't enable strparse and verdict function, the
> > > out-of-order problem may occur in the following process.
> > >
> > > client SK server SK
> > > ----------------------------------------------------------------------
> > > ----
> > > tcp_rcv_synsent_state_process
> > > tcp_finish_connect
> > > tcp_init_transfer
> > > tcp_set_state(sk, TCP_ESTABLISHED);
> > > // insert SK to sockmap
> > > wake up waitter
> > > tcp_send_ack
> > >
> > > tcp_bpf_sendmsg(msgA)
> > > // msgA will go tcp stack
> > > tcp_rcv_state_process
> > > tcp_init_transfer
> > > //insert SK to sockmap
> > > tcp_set_state(sk,
> > > TCP_ESTABLISHED)
> > > wake up waitter
> >
> > Here after the socket is inserted to a sockmap, its ->sk_data_ready() is
> > already replaced with sk_psock_verdict_data_ready(), so msgA should go to
> > sockmap, not TCP stack?
> >
> It is TCP stack. Here I only enable BPF_SK_MSG_VERDICT type.
> bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name sock_ops_map pinned /sys/fs/bpf/sock_ops_map
> bpftool prog attach pinned /sys/fs/bpf/bpf_redir msg_verdict pinned /sys/fs/bpf/sock_ops_map
Is the sender using FAST_OPEN by any chance? We know this bug exists
in this case. Fix tbd.
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-26 21:16 ` John Fastabend
@ 2022-09-27 2:15 ` liujian (CE)
2022-09-28 18:31 ` John Fastabend
0 siblings, 1 reply; 7+ messages in thread
From: liujian (CE) @ 2022-09-27 2:15 UTC (permalink / raw)
To: John Fastabend, Cong Wang
Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
bpf@vger.kernel.org
> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Tuesday, September 27, 2022 5:16 AM
> To: liujian (CE) <liujian56@huawei.com>; Cong Wang
> <xiyou.wangcong@gmail.com>
> Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>; davem
> <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> Subject: RE: [bug report] one possible out-of-order issue in sockmap
>
> liujian (CE) wrote:
> >
> >
> > > -----Original Message-----
> > > From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> > > Sent: Monday, September 26, 2022 2:26 AM
> > > To: liujian (CE) <liujian56@huawei.com>
> > > Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> > > <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>;
> davem
> > > <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> > > Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> > > netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> > > Subject: Re: [bug report] one possible out-of-order issue in sockmap
> > >
> > > On Sat, Sep 24, 2022 at 07:59:15AM +0000, liujian (CE) wrote:
> > > > Hello,
> > > >
> > > > I had a scp failure problem here. I analyze the code, and the
> > > > reasons may
> > > be as follows:
> > > >
> > > > From commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg
> > > before
> > > > sk_receive_queue", if we use sockops
> > > > (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
> > > > and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to enable socket's
> > > sockmap
> > > > function, and don't enable strparse and verdict function, the
> > > > out-of-order problem may occur in the following process.
> > > >
> > > > client SK server SK
> > > > ------------------------------------------------------------------
> > > > ----
> > > > ----
> > > > tcp_rcv_synsent_state_process
> > > > tcp_finish_connect
> > > > tcp_init_transfer
> > > > tcp_set_state(sk, TCP_ESTABLISHED);
> > > > // insert SK to sockmap
> > > > wake up waitter
> > > > tcp_send_ack
> > > >
> > > > tcp_bpf_sendmsg(msgA)
> > > > // msgA will go tcp stack
> > > > tcp_rcv_state_process
> > > > tcp_init_transfer
> > > > //insert SK to sockmap
> > > > tcp_set_state(sk,
> > > > TCP_ESTABLISHED)
> > > > wake up waitter
> > >
> > > Here after the socket is inserted to a sockmap, its
> > > ->sk_data_ready() is already replaced with
> > > sk_psock_verdict_data_ready(), so msgA should go to sockmap, not TCP
> stack?
> > >
> > It is TCP stack. Here I only enable BPF_SK_MSG_VERDICT type.
> > bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name
> > sock_ops_map pinned /sys/fs/bpf/sock_ops_map bpftool prog attach
> > pinned /sys/fs/bpf/bpf_redir msg_verdict pinned
> > /sys/fs/bpf/sock_ops_map
>
> Is the sender using FAST_OPEN by any chance? We know this bug exists in
> this case. Fix tbd.
FAST_OPEN is not used.
The following test programs can be used to reproduce the OOO problem.
However, for the worst-case scenario described in the report (msgA arriving later than msgB), I have not been able to construct a case that reproduces it reliably.
tcp_server.c
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int server_port = 5006;

int main(int argc, char *argv[])
{
    int serverSocket;
    struct sockaddr_in server_addr;
    struct sockaddr_in clientAddr;
    socklen_t addr_len;
    int client;
    char buffer[200];
    int iDataNum;
    int optbuf, ret;

    if (argc != 2) {
        return -1;
    }
    server_port = atoi(argv[1]);
    if (server_port < 1025 || server_port > 65535) {
        return -1;
    }
    if ((serverSocket = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket");
        return 1;
    }
    optbuf = 1;
    ret = setsockopt(serverSocket, SOL_SOCKET, SO_REUSEADDR, &optbuf, sizeof(int));
    if (ret != 0)
        perror("reuseaddr failed");
    bzero(&server_addr, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(server_port);
    server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(serverSocket, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
        perror("bind");
        return 1;
    }
    if (listen(serverSocket, 5) < 0) {
        perror("listen");
        return 1;
    }
    while (1) {
        addr_len = sizeof(clientAddr);
        client = accept(serverSocket, (struct sockaddr *)&clientAddr, &addr_len);
        if (client < 0) {
            perror("accept");
            continue;
        }
        /* Delay reading so that both client messages are queued first. */
        printf("wait until the two msgs of client are sent...\n");
        sleep(5);
        while (1) {
            printf("recvmsg:");
            /* Leave room for the terminating NUL byte. */
            iDataNum = recv(client, buffer, sizeof(buffer) - 1, 0);
            if (iDataNum < 0) {
                perror("recv");
                break;
            }
            if (iDataNum == 0) /* peer closed the connection */
                break;
            buffer[iDataNum] = '\0';
            printf("%s\n", buffer);
            sleep(2);
        }
        close(client);
    }
    close(serverSocket);
    return 0;
}
tcp_client.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int server_port = 5006;

int main(int argc, char *argv[])
{
    int clientSocket;
    struct sockaddr_in serverAddr;
    struct sockaddr_in clientAddr;
    char sendbuf[4096];
    int ret;
    int client_port;

    if (argc != 3) {
        printf("client [sport] [dport]\n");
        return -1;
    }
    client_port = atoi(argv[1]);
    if (client_port < 1025 || client_port > 65535) {
        return -1;
    }
    server_port = atoi(argv[2]);
    if (server_port < 1025 || server_port > 65535) {
        return -1;
    }
    if ((clientSocket = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket");
        return 1;
    }
    /* Bind to a fixed source port. */
    bzero(&clientAddr, sizeof(clientAddr));
    clientAddr.sin_family = AF_INET;
    clientAddr.sin_port = htons(client_port);
    clientAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(clientSocket, (struct sockaddr *)&clientAddr, sizeof(clientAddr)) < 0) {
        perror("bind");
        return 1;
    }
    bzero(&serverAddr, sizeof(serverAddr));
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(server_port);
    serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
    /* Drop pure ACKs to the server. The rule is hard-coded to dport 5006. */
    system("iptables -A INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP");
    if (connect(clientSocket, (struct sockaddr *)&serverAddr, sizeof(serverAddr)) < 0) {
        perror("connect");
        return 1;
    }
    /* msgA: sent while the server has not yet seen the handshake ACK. */
    memset(sendbuf, 0, sizeof(sendbuf));
    memcpy(sendbuf, "AAAAAAAAAAA", 10);
    ret = send(clientSocket, sendbuf, strlen(sendbuf), 0);
    if (ret <= 0) {
        perror("send");
        return -1;
    }
    printf("finish send A\n");
    system("iptables -D INPUT -p tcp -m tcp --dport 5006 --tcp-flags SYN,RST,ACK,FIN ACK -j DROP");
    sleep(2); /* wait for the server socket to be inserted into the sockmap */
    printf("start send b\n");
    /* msgB: sent after the server socket is in the sockmap. */
    memcpy(sendbuf, "bbbbbbbbbbbbb", 10);
    ret = send(clientSocket, sendbuf, strlen(sendbuf), 0);
    if (ret <= 0) {
        perror("send");
        return -1;
    }
    sleep(10);
    close(clientSocket);
    return 0;
}
[root@localhost sockmap_test]# ./server 5006
wait until the two msgs of client are sent...
recvmsg:bbbbbbbbbb
recvmsg:AAAAAAAAAA
^C
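The client above hard-codes port 5006 in its iptables rules even though the destination port is taken from the command line. A minimal sketch of building the same rule for an arbitrary port follows; `format_drop_rule` is a hypothetical helper, not part of the reproducer, and only the rule text itself comes from the commands used above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the DROP rule used by the reproducer.
 * action is 'A' to append the rule or 'D' to delete it.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int format_drop_rule(char *buf, size_t len, char action, int dport)
{
    int n;

    assert(action == 'A' || action == 'D');
    n = snprintf(buf, len,
                 "iptables -%c INPUT -p tcp -m tcp --dport %d "
                 "--tcp-flags SYN,RST,ACK,FIN ACK -j DROP",
                 action, dport);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The formatted string can then be passed to `system()` in place of the literal command, so the rule follows whatever port the server was started on.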
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-27 2:15 ` liujian (CE)
@ 2022-09-28 18:31 ` John Fastabend
2022-11-26 7:12 ` liujian (CE)
0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2022-09-28 18:31 UTC (permalink / raw)
To: liujian (CE), John Fastabend, Cong Wang
Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
bpf@vger.kernel.org
liujian (CE) wrote:
>
>
> > -----Original Message-----
> > From: John Fastabend [mailto:john.fastabend@gmail.com]
> > Sent: Tuesday, September 27, 2022 5:16 AM
> > To: liujian (CE) <liujian56@huawei.com>; Cong Wang
> > <xiyou.wangcong@gmail.com>
> > Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> > <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>; davem
> > <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> > Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> > netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> > Subject: RE: [bug report] one possible out-of-order issue in sockmap
> >
> > liujian (CE) wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> > > > Sent: Monday, September 26, 2022 2:26 AM
> > > > To: liujian (CE) <liujian56@huawei.com>
> > > > Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> > > > <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>;
> > davem
> > > > <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> > > > Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> > > > netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> > > > Subject: Re: [bug report] one possible out-of-order issue in sockmap
> > > >
> > > > On Sat, Sep 24, 2022 at 07:59:15AM +0000, liujian (CE) wrote:
> > > > > Hello,
> > > > >
> > > > > I had a scp failure problem here. I analyze the code, and the
> > > > > reasons may
> > > > be as follows:
> > > > >
> > > > > From commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg
> > > > before
> > > > > sk_receive_queue", if we use sockops
> > > > > (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
> > > > > and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to enable socket's
> > > > sockmap
> > > > > function, and don't enable strparse and verdict function, the
> > > > > out-of-order problem may occur in the following process.
> > > > >
> > > > > client SK server SK
> > > > > ------------------------------------------------------------------
> > > > > ----
> > > > > ----
> > > > > tcp_rcv_synsent_state_process
> > > > > tcp_finish_connect
> > > > > tcp_init_transfer
> > > > > tcp_set_state(sk, TCP_ESTABLISHED);
> > > > > // insert SK to sockmap
> > > > > wake up waitter
> > > > > tcp_send_ack
> > > > >
> > > > > tcp_bpf_sendmsg(msgA)
> > > > > // msgA will go tcp stack
> > > > > tcp_rcv_state_process
> > > > > tcp_init_transfer
> > > > > //insert SK to sockmap
> > > > > tcp_set_state(sk,
> > > > > TCP_ESTABLISHED)
> > > > > wake up waitter
> > > >
> > > > Here after the socket is inserted to a sockmap, its
> > > > ->sk_data_ready() is already replaced with
> > > > sk_psock_verdict_data_ready(), so msgA should go to sockmap, not TCP
> > stack?
> > > >
> > > It is TCP stack. Here I only enable BPF_SK_MSG_VERDICT type.
> > > bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name
> > > sock_ops_map pinned /sys/fs/bpf/sock_ops_map bpftool prog attach
> > > pinned /sys/fs/bpf/bpf_redir msg_verdict pinned
> > > /sys/fs/bpf/sock_ops_map
> >
> > Is the sender using FAST_OPEN by any chance? We know this bug exists in
> > this case. Fix tbd.
>
> FAST_OPEN is not used.
OK thanks for the reproducer I'll take a look this afternoon.
* RE: [bug report] one possible out-of-order issue in sockmap
2022-09-28 18:31 ` John Fastabend
@ 2022-11-26 7:12 ` liujian (CE)
0 siblings, 0 replies; 7+ messages in thread
From: liujian (CE) @ 2022-11-26 7:12 UTC (permalink / raw)
To: John Fastabend, Cong Wang
Cc: Jakub Sitnicki, Eric Dumazet, davem, yoshfuji@linux-ipv6.org,
dsahern@kernel.org, Jakub Kicinski, Paolo Abeni, netdev,
bpf@vger.kernel.org
> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Thursday, September 29, 2022 2:31 AM
> To: liujian (CE) <liujian56@huawei.com>; John Fastabend
> <john.fastabend@gmail.com>; Cong Wang <xiyou.wangcong@gmail.com>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>; Eric Dumazet
> <edumazet@google.com>; davem <davem@davemloft.net>;
> yoshfuji@linux-ipv6.org; dsahern@kernel.org; Jakub Kicinski
> <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; netdev
> <netdev@vger.kernel.org>; bpf@vger.kernel.org
> Subject: RE: [bug report] one possible out-of-order issue in sockmap
>
> liujian (CE) wrote:
> >
> >
> > > -----Original Message-----
> > > From: John Fastabend [mailto:john.fastabend@gmail.com]
> > > Sent: Tuesday, September 27, 2022 5:16 AM
> > > To: liujian (CE) <liujian56@huawei.com>; Cong Wang
> > > <xiyou.wangcong@gmail.com>
> > > Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> > > <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>;
> davem
> > > <davem@davemloft.net>; yoshfuji@linux-ipv6.org; dsahern@kernel.org;
> > > Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> > > netdev <netdev@vger.kernel.org>; bpf@vger.kernel.org
> > > Subject: RE: [bug report] one possible out-of-order issue in sockmap
> > >
> > > liujian (CE) wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Cong Wang [mailto:xiyou.wangcong@gmail.com]
> > > > > Sent: Monday, September 26, 2022 2:26 AM
> > > > > To: liujian (CE) <liujian56@huawei.com>
> > > > > Cc: John Fastabend <john.fastabend@gmail.com>; Jakub Sitnicki
> > > > > <jakub@cloudflare.com>; Eric Dumazet <edumazet@google.com>;
> > > davem
> > > > > <davem@davemloft.net>; yoshfuji@linux-ipv6.org;
> > > > > dsahern@kernel.org; Jakub Kicinski <kuba@kernel.org>; Paolo
> > > > > Abeni <pabeni@redhat.com>; netdev <netdev@vger.kernel.org>;
> > > > > bpf@vger.kernel.org
> > > > > Subject: Re: [bug report] one possible out-of-order issue in
> > > > > sockmap
> > > > >
> > > > > On Sat, Sep 24, 2022 at 07:59:15AM +0000, liujian (CE) wrote:
> > > > > > Hello,
> > > > > >
> > > > > > I had a scp failure problem here. I analyze the code, and the
> > > > > > reasons may
> > > > > be as follows:
> > > > > >
> > > > > > From commit e7a5f1f1cd00 ("bpf/sockmap: Read psock ingress_msg
> > > > > before
> > > > > > sk_receive_queue", if we use sockops
> > > > > > (BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB
> > > > > > and BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB) to enable socket's
> > > > > sockmap
> > > > > > function, and don't enable strparse and verdict function, the
> > > > > > out-of-order problem may occur in the following process.
> > > > > >
> > > > > > client SK server SK
> > > > > > --------------------------------------------------------------
> > > > > > ----
> > > > > > ----
> > > > > > ----
> > > > > > tcp_rcv_synsent_state_process
> > > > > > tcp_finish_connect
> > > > > > tcp_init_transfer
> > > > > > tcp_set_state(sk, TCP_ESTABLISHED);
> > > > > > // insert SK to sockmap
> > > > > > wake up waitter
> > > > > > tcp_send_ack
> > > > > >
> > > > > > tcp_bpf_sendmsg(msgA)
> > > > > > // msgA will go tcp stack
> > > > > > tcp_rcv_state_process
> > > > > > tcp_init_transfer
> > > > > > //insert SK to sockmap
> > > > > > tcp_set_state(sk,
> > > > > > TCP_ESTABLISHED)
> > > > > > wake up waitter
> > > > >
> > > > > Here after the socket is inserted to a sockmap, its
> > > > > ->sk_data_ready() is already replaced with
> > > > > sk_psock_verdict_data_ready(), so msgA should go to sockmap, not
> > > > > TCP
> > > stack?
> > > > >
> > > > It is TCP stack. Here I only enable BPF_SK_MSG_VERDICT type.
> > > > bpftool prog load bpf_redir.o /sys/fs/bpf/bpf_redir map name
> > > > sock_ops_map pinned /sys/fs/bpf/sock_ops_map bpftool prog attach
> > > > pinned /sys/fs/bpf/bpf_redir msg_verdict pinned
> > > > /sys/fs/bpf/sock_ops_map
> > >
> > > Is the sender using FAST_OPEN by any chance? We know this bug exists
> > > in this case. Fix tbd.
> >
> > FAST_OPEN is not used.
>
> OK thanks for the reproducer I'll take a look this afternoon.
Hi John and everyone, could you take another look at this one?
If there is anything you need me to test, please let me know.
end of thread, other threads:[~2022-11-26 7:12 UTC | newest]
Thread overview: 7+ messages
2022-09-24 7:59 [bug report] one possible out-of-order issue in sockmap liujian (CE)
2022-09-25 18:25 ` Cong Wang
2022-09-26 1:34 ` liujian (CE)
2022-09-26 21:16 ` John Fastabend
2022-09-27 2:15 ` liujian (CE)
2022-09-28 18:31 ` John Fastabend
2022-11-26 7:12 ` liujian (CE)