netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [ISSUE] suspicious sock leak
@ 2022-11-12  9:01 mingkun bian
  2022-11-13 10:22 ` mingkun bian
  0 siblings, 1 reply; 5+ messages in thread
From: mingkun bian @ 2022-11-12  9:01 UTC (permalink / raw)
  To: netdev

Hi,
    I found a problem that a sock whose state is ESTABLISHED is not
freed to slab cache by __sock_free.
    The test scenario is as follows:

    1. A HTTP Server,I insert a node to ebpf
map(BPF_MAP_TYPE_LRU_HASH) by BPF_MAP_UPDATE_ELEM when receiving a
"HTTP GET" request in user application.
    ebpf map is:
    key: cookie(getsockopt(fd, SOL_SOCKET, SO_COOKIE, &cookie, &optlen))
    value: saddr sport daddr dport cookie...

    2. I delete the corresponding ebpf map node by "kprobe __sk_free"
in ebpf as following, bpf_map_delete_elem keeps returning 0.

    SEC("kprobe/__sk_free")
    int bpf_prog_destroy_sock(struct pt_regs *ctx)
    {
        struct sock *sk;
        __u64 cookie;
       struct  tcp_infos *value;

       sk = (struct sock *) PT_REGS_PARM1(ctx);
       bpf_probe_read(&cookie, sizeof(sk->__sk_common.skc_cookie),
&sk->__sk_common.skc_cookie);
       value = bpf_map_lookup_elem(&bpfmap, &cookie);
       if (value) {
           if (bpf_map_delete_elem(&bpfmap, &cookie) != 0) {
               debugmsg("delete failed\n");
           }
       }
    }

   3. Sending pressure "HTTP GET" requests to HTTP Server for a while,
 then stop to send and close the HTTP Server, then wait a long time,
we can not see any tcpinfo by "netstat -anp", then error occurs:
    We can see some node which is not deleted int ebpf map by "bpftool
map dump id **", it seems like "sock leak", but the sockstat's
inuse(cat /proc/net/sockstat) does not increase quickly.

4. I did some more experiments by ebpf kprobe, I find that a
sock(state is ESTABLISHED, HTTP server recv a "HTTP GET" requset) does
not come in __sock_free, but the same sock will be reused by another
tcp connection(the most frequent is "127.0.0.1") after a while.
   What I doubt is that why a new tcp connection can resue a old sock
while the old sock does not come in __sk_free.

Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ISSUE] suspicious sock leak
  2022-11-12  9:01 [ISSUE] suspicious sock leak mingkun bian
@ 2022-11-13 10:22 ` mingkun bian
  2022-11-14  1:25   ` Cong Wang
  0 siblings, 1 reply; 5+ messages in thread
From: mingkun bian @ 2022-11-13 10:22 UTC (permalink / raw)
  To: netdev

Hi,

bpf map1:
key: cookie
value: addr daddr sport dport cookie sock*

bpf map2:
key: sock*
value: addr daddr sport dport cookie sock*

1. Recv a "HTTP GET" request in user applicatoin
map1.insert(cookie, value)
map2.insert(sock*, value)

1. kprobe inet_csk_destroy_sock:
sk->sk_wmem_queued is 0
sk->sk_wmem_alloc is 4201
sk->sk_refcnt is 2
sk->__sk_common.skc_cookie is 173585924
saddr daddr sport dport is 192.168.10.x 80

2. kprobe __sk_free
can not find the "saddr daddr sport dport 192.168.10.x 80" in kprobe __sk_free

3. kprobe __sk_free
after a while, "kprobe __sk_free" find the "saddr daddr sport dport
127.0.0.1 xx"' info
value = map2.find(sock*)
value1 = map1.find(sock->cookie)
if (value) {
    map2.delete(sock) //print value info, find "saddr daddr sport
dport" is "192.168.10.x 80“, and value->cookie is 173585924, which is
the same as "192.168.10.x 80" 's cookie.
}

if (value1) {
    map1.delete(sock->cookie)
}

Here is my test flow, commented lines represents that  sock of ”saddr
daddr sport dport 192.168.10.x 80“ does not come in  __sk_free, but it
is reused by ” saddr daddr sport dport 127.0.0.1 xx"


mingkun bian <bianmingkun@gmail.com> 于2022年11月12日周六 17:01写道:
>
> Hi,
>     I found a problem that a sock whose state is ESTABLISHED is not
> freed to slab cache by __sock_free.
>     The test scenario is as follows:
>
>     1. A HTTP Server,I insert a node to ebpf
> map(BPF_MAP_TYPE_LRU_HASH) by BPF_MAP_UPDATE_ELEM when receiving a
> "HTTP GET" request in user application.
>     ebpf map is:
>     key: cookie(getsockopt(fd, SOL_SOCKET, SO_COOKIE, &cookie, &optlen))
>     value: saddr sport daddr dport cookie...
>
>     2. I delete the corresponding ebpf map node by "kprobe __sk_free"
> in ebpf as following, bpf_map_delete_elem keeps returning 0.
>
>     SEC("kprobe/__sk_free")
>     int bpf_prog_destroy_sock(struct pt_regs *ctx)
>     {
>         struct sock *sk;
>         __u64 cookie;
>        struct  tcp_infos *value;
>
>        sk = (struct sock *) PT_REGS_PARM1(ctx);
>        bpf_probe_read(&cookie, sizeof(sk->__sk_common.skc_cookie),
> &sk->__sk_common.skc_cookie);
>        value = bpf_map_lookup_elem(&bpfmap, &cookie);
>        if (value) {
>            if (bpf_map_delete_elem(&bpfmap, &cookie) != 0) {
>                debugmsg("delete failed\n");
>            }
>        }
>     }
>
>    3. Sending pressure "HTTP GET" requests to HTTP Server for a while,
>  then stop to send and close the HTTP Server, then wait a long time,
> we can not see any tcpinfo by "netstat -anp", then error occurs:
>     We can see some node which is not deleted int ebpf map by "bpftool
> map dump id **", it seems like "sock leak", but the sockstat's
> inuse(cat /proc/net/sockstat) does not increase quickly.
>
> 4. I did some more experiments by ebpf kprobe, I find that a
> sock(state is ESTABLISHED, HTTP server recv a "HTTP GET" requset) does
> not come in __sock_free, but the same sock will be reused by another
> tcp connection(the most frequent is "127.0.0.1") after a while.
>    What I doubt is that why a new tcp connection can resue a old sock
> while the old sock does not come in __sk_free.
>
> Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ISSUE] suspicious sock leak
  2022-11-13 10:22 ` mingkun bian
@ 2022-11-14  1:25   ` Cong Wang
  2022-11-14  5:39     ` mingkun bian
  0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2022-11-14  1:25 UTC (permalink / raw)
  To: mingkun bian; +Cc: netdev

On Sun, Nov 13, 2022 at 06:22:22PM +0800, mingkun bian wrote:
> Hi,
> 
> bpf map1:
> key: cookie
> value: addr daddr sport dport cookie sock*
> 
> bpf map2:
> key: sock*
> value: addr daddr sport dport cookie sock*

So none of them is sockmap? Why not use sockmap which takes care
of sock refcnt for you?

> 
> 1. Recv a "HTTP GET" request in user applicatoin
> map1.insert(cookie, value)
> map2.insert(sock*, value)
> 
> 1. kprobe inet_csk_destroy_sock:
> sk->sk_wmem_queued is 0
> sk->sk_wmem_alloc is 4201
> sk->sk_refcnt is 2
> sk->__sk_common.skc_cookie is 173585924
> saddr daddr sport dport is 192.168.10.x 80
> 
> 2. kprobe __sk_free
> can not find the "saddr daddr sport dport 192.168.10.x 80" in kprobe __sk_free
> 
> 3. kprobe __sk_free
> after a while, "kprobe __sk_free" find the "saddr daddr sport dport
> 127.0.0.1 xx"' info
> value = map2.find(sock*)
> value1 = map1.find(sock->cookie)
> if (value) {
>     map2.delete(sock) //print value info, find "saddr daddr sport
> dport" is "192.168.10.x 80“, and value->cookie is 173585924, which is
> the same as "192.168.10.x 80" 's cookie.
> }
> 
> if (value1) {
>     map1.delete(sock->cookie)
> }
> 
> Here is my test flow, commented lines represents that  sock of ”saddr
> daddr sport dport 192.168.10.x 80“ does not come in  __sk_free, but it
> is reused by ” saddr daddr sport dport 127.0.0.1 xx"

I don't see this is a problem yet, the struct sock may be still referenced
by the kernel even after you close its corresponding struct socket from
user-space. And TCP sockets have timewait too, so...

I suggest you try sockmap to store sockets instead.

Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ISSUE] suspicious sock leak
  2022-11-14  1:25   ` Cong Wang
@ 2022-11-14  5:39     ` mingkun bian
  2023-07-17  2:35       ` mingkun bian
  0 siblings, 1 reply; 5+ messages in thread
From: mingkun bian @ 2022-11-14  5:39 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev

Cong Wang <xiyou.wangcong@gmail.com> 于2022年11月14日周一 09:25写道:
>
> On Sun, Nov 13, 2022 at 06:22:22PM +0800, mingkun bian wrote:
> > Hi,
> >
> > bpf map1:
> > key: cookie
> > value: addr daddr sport dport cookie sock*
> >
> > bpf map2:
> > key: sock*
> > value: addr daddr sport dport cookie sock*
>
> So none of them is sockmap? Why not use sockmap which takes care
> of sock refcnt for you?
>
> >
> > 1. Recv a "HTTP GET" request in user applicatoin
> > map1.insert(cookie, value)
> > map2.insert(sock*, value)
> >
> > 1. kprobe inet_csk_destroy_sock:
> > sk->sk_wmem_queued is 0
> > sk->sk_wmem_alloc is 4201
> > sk->sk_refcnt is 2
> > sk->__sk_common.skc_cookie is 173585924
> > saddr daddr sport dport is 192.168.10.x 80
> >
> > 2. kprobe __sk_free
> > can not find the "saddr daddr sport dport 192.168.10.x 80" in kprobe __sk_free
> >
> > 3. kprobe __sk_free
> > after a while, "kprobe __sk_free" find the "saddr daddr sport dport
> > 127.0.0.1 xx"' info
> > value = map2.find(sock*)
> > value1 = map1.find(sock->cookie)
> > if (value) {
> >     map2.delete(sock) //print value info, find "saddr daddr sport
> > dport" is "192.168.10.x 80“, and value->cookie is 173585924, which is
> > the same as "192.168.10.x 80" 's cookie.
> > }
> >
> > if (value1) {
> >     map1.delete(sock->cookie)
> > }
> >
> > Here is my test flow, commented lines represents that  sock of ”saddr
> > daddr sport dport 192.168.10.x 80“ does not come in  __sk_free, but it
> > is reused by ” saddr daddr sport dport 127.0.0.1 xx"
>
> I don't see this is a problem yet, the struct sock may be still referenced
> by the kernel even after you close its corresponding struct socket from
> user-space. And TCP sockets have timewait too, so...
>
> I suggest you try sockmap to store sockets instead.
>
> Thanks.

Hi,

I do not use sockmap in this scenario.

Traffic model is about 20Gbps external traffic and 80Gbps lo traffic,
only external traffic can insert bpf map.
The old sock will be reused only if the old sock exec "__sock_free"
whether  referenced or not by the kernel, but my test result is not
so.
And TIME_WAIT state still release the sock immediately, then create
tcp_timewait_sock instead of sock in function 'tcp_time_wait'.

My kernel is 4.18.0.

Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [ISSUE] suspicious sock leak
  2022-11-14  5:39     ` mingkun bian
@ 2023-07-17  2:35       ` mingkun bian
  0 siblings, 0 replies; 5+ messages in thread
From: mingkun bian @ 2023-07-17  2:35 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev

Hi,
    I have use cookie as key instead of sock*, then use lru map, it works ok.

    But now I basically confirmed the reason, There is a very small
probability that kprobe will fail come into kprobe handler in another
experiment about tcp timer.

    ftrace:
    trace_timer_expire_entry、trace_timer_expire_exit of call_timer_fn.
    trace_timer_start of debug_activate
    trace_timer_cancel of debug_deactivate

    Then I kprobe "call_timer_fn", there will be a small probability
that the timer has been executed, but the kprobe has not been
executed.
    There is a high probability that the stack will appear when the
kprobe hander(kprobe_ftrace_handler)  is interrupted by a hard
interrupt(irq_exit).


nginx-10064 [000] d.s. 154816.275735: timer_cancel: timer=00000000fb259de9
nginx-10064 [000] d.s. 154816.275743: <stack trace>
=> run_timer_softirq
=> __do_softirq
=> irq_exit
=> smp_apic_timer_interrupt
=> apic_timer_interrupt
=> trace_call_bpf
=> kprobe_perf_func
=> kprobe_ftrace_handler
=> ftrace_ops_assist_func
=> 0xffffffffc078a0bf
=> tcp_send_fin
=> tcp_close
=> inet_release
=> __sock_release
=> sock_close
=> __fput
=> task_work_run
=> exit_to_usermode_loop
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
nginx-10064 [000] ..s. 154816.275744: timer_expire_entry:
timer=00000000fb259de9 function=tcp_write_timer now=4449486051
nginx-10064 [000] ..s. 154816.275747: <stack trace>
=> call_timer_fn
=> run_timer_softirq
=> __do_softirq
=> irq_exit
=> smp_apic_timer_interrupt
=> apic_timer_interrupt
=> trace_call_bpf
=> kprobe_perf_func
=> kprobe_ftrace_handler
=> ftrace_ops_assist_func
=> 0xffffffffc078a0bf
=> tcp_send_fin
=> tcp_close
=> inet_release
=> __sock_release
=> sock_close
=> __fput
=> task_work_run
=> exit_to_usermode_loop
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
nginx-10064 [000] ..s. 154816.275748: timer_expire_exit: timer=00000000fb259de9
nginx-10064 [000] ..s. 154816.275750: <stack trace>
=> call_timer_fn
=> run_timer_softirq
=> __do_softirq
=> irq_exit
=> smp_apic_timer_interrupt
=> apic_timer_interrupt
=> trace_call_bpf
=> kprobe_perf_func
=> kprobe_ftrace_handler
=> ftrace_ops_assist_func
=> 0xffffffffc078a0bf
=> tcp_send_fin
=> tcp_close
=> inet_release
=> __sock_release
=> sock_close
=> __fput
=> task_work_run
=> exit_to_usermode_loop
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe

    Thanks.




On Mon, 14 Nov 2022 at 13:39, mingkun bian <bianmingkun@gmail.com> wrote:
>
> Cong Wang <xiyou.wangcong@gmail.com> 于2022年11月14日周一 09:25写道:
> >
> > On Sun, Nov 13, 2022 at 06:22:22PM +0800, mingkun bian wrote:
> > > Hi,
> > >
> > > bpf map1:
> > > key: cookie
> > > value: addr daddr sport dport cookie sock*
> > >
> > > bpf map2:
> > > key: sock*
> > > value: addr daddr sport dport cookie sock*
> >
> > So none of them is sockmap? Why not use sockmap which takes care
> > of sock refcnt for you?
> >
> > >
> > > 1. Recv a "HTTP GET" request in user applicatoin
> > > map1.insert(cookie, value)
> > > map2.insert(sock*, value)
> > >
> > > 1. kprobe inet_csk_destroy_sock:
> > > sk->sk_wmem_queued is 0
> > > sk->sk_wmem_alloc is 4201
> > > sk->sk_refcnt is 2
> > > sk->__sk_common.skc_cookie is 173585924
> > > saddr daddr sport dport is 192.168.10.x 80
> > >
> > > 2. kprobe __sk_free
> > > can not find the "saddr daddr sport dport 192.168.10.x 80" in kprobe __sk_free
> > >
> > > 3. kprobe __sk_free
> > > after a while, "kprobe __sk_free" find the "saddr daddr sport dport
> > > 127.0.0.1 xx"' info
> > > value = map2.find(sock*)
> > > value1 = map1.find(sock->cookie)
> > > if (value) {
> > >     map2.delete(sock) //print value info, find "saddr daddr sport
> > > dport" is "192.168.10.x 80“, and value->cookie is 173585924, which is
> > > the same as "192.168.10.x 80" 's cookie.
> > > }
> > >
> > > if (value1) {
> > >     map1.delete(sock->cookie)
> > > }
> > >
> > > Here is my test flow, commented lines represents that  sock of ”saddr
> > > daddr sport dport 192.168.10.x 80“ does not come in  __sk_free, but it
> > > is reused by ” saddr daddr sport dport 127.0.0.1 xx"
> >
> > I don't see this is a problem yet, the struct sock may be still referenced
> > by the kernel even after you close its corresponding struct socket from
> > user-space. And TCP sockets have timewait too, so...
> >
> > I suggest you try sockmap to store sockets instead.
> >
> > Thanks.
>
> Hi,
>
> I do not use sockmap in this scenario.
>
> Traffic model is about 20Gbps external traffic and 80Gbps lo traffic,
> only external traffic can insert bpf map.
> The old sock will be reused only if the old sock exec "__sock_free"
> whether  referenced or not by the kernel, but my test result is not
> so.
> And TIME_WAIT state still release the sock immediately, then create
> tcp_timewait_sock instead of sock in function 'tcp_time_wait'.
>
> My kernel is 4.18.0.
>
> Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-07-17  2:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-12  9:01 [ISSUE] suspicious sock leak mingkun bian
2022-11-13 10:22 ` mingkun bian
2022-11-14  1:25   ` Cong Wang
2022-11-14  5:39     ` mingkun bian
2023-07-17  2:35       ` mingkun bian

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).