netdev.vger.kernel.org archive mirror
From: Xu Kuohai <xukuohai@huaweicloud.com>
To: John Fastabend <john.fastabend@gmail.com>,
	Xu Kuohai <xukuohai@huaweicloud.com>,
	Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Jakub Sitnicki <jakub@cloudflare.com>,
	"David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Daniel Borkmann <daniel@iogearbox.net>
Subject: Re: [PATCH bpf] bpf, sockmap: Fix map type error in sock_map_del_link
Date: Thu, 3 Aug 2023 10:47:32 +0800
Message-ID: <0cc7eb75-f339-3aeb-016f-dc4094bdf600@huaweicloud.com>
In-Reply-To: <64c8806537c3a_a427920846@john.notmuch>

On 8/1/2023 11:47 AM, John Fastabend wrote:
> Xu Kuohai wrote:
>> On 8/1/2023 9:19 AM, Martin KaFai Lau wrote:
>>> On 7/28/23 3:56 AM, Xu Kuohai wrote:
>>>> sock_map_del_link() operates on both SOCKMAP and SOCKHASH, although
>>>> both types have member named "progs", the offset of "progs" member in
>>>> these two types is different, so "progs" should be accessed with the
>>>> real map type.
>>>
>>> The patch makes sense to me. Can a test be written to trigger it?
>>>
>>
>> Thank you for the review. I have a messy prog that triggers memleak
>> caused by this issue. I'll try to simplify it to a test.
>>
>>> John, please review.
>>>
>>>
>>> .
>>
>>
> 
> Thanks good catch. One thing I don't see any tests for is deleting a
> socket from a sockmap and then trying to use it? My guess is almost
> no one deletes sockets from a map except on sock close. Maybe to
> reproduce,
> 
>   1. connect a bunch of sockets to sockhash with verdict prog
>   2. remove the sockets
>   3. remove the sockhash
>   4. that should leak the bpf ref cnt so we could check for the
>      prog still existing?
> 

I tried this and found no bpf prog leaks. The debugging shows that
the stream_parser and stream_verdict progs are released as follows:

sock_map_unref

   sock_map_del_link

     struct bpf_stab *stab = container_of(map, struct bpf_stab, map);

     if (psock->saved_data_ready && stab->progs.stream_parser)
       strp_stop = true; // (1) not executed, since stab->progs.stream_parser
                         //     is actually shtab->progs.msg_parser, which is
                         //     NULL, so the if condition is false.

     if (psock->saved_data_ready && stab->progs.stream_verdict)
       verdict_stop = true;  // (2) executed, so verdict_stop is set to true

     if (strp_stop) // (3) condition is false since strp_stop is false
       sk_psock_stop_strp(sk, psock)

     if (verdict_stop) // (4) condition passes, so stream_verdict prog refcnt
                       //     is released by sk_psock_stop_verdict
       sk_psock_stop_verdict(sk, psock)
         psock_set_prog(&psock->progs.stream_verdict, NULL)
           bpf_prog_put // (5) release refcnt of stream_verdict prog


   sk_psock_put
       sk_psock_drop(sk, psock)
         sk_psock_stop_strp(sk, psock)
           psock_set_prog(&psock->progs.stream_parser, NULL)
             bpf_prog_put // (6) release refcnt of stream_parser prog
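
To illustrate why step (1) reads the wrong member, here is a minimal
userspace sketch. The struct definitions are reduced mock-ups, not the
kernel's: the field order is an assumption chosen to reproduce the
aliasing described above, and MAP_SOCKMAP, MAP_SOCKHASH, map_progs() and
progs_offset_skew() are hypothetical names. On a typical LP64 build the
skew is one pointer, so the bpf_stab view of progs.stream_parser lands on
the bpf_shtab progs.msg_parser slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum { MAP_SOCKMAP = 1, MAP_SOCKHASH = 2 };  /* hypothetical type tags */

struct bpf_prog;                             /* opaque in this sketch */
struct bpf_map { int map_type; };            /* heavily reduced mock-up */

struct sk_psock_progs {
	struct bpf_prog *msg_parser;
	struct bpf_prog *stream_parser;
	struct bpf_prog *stream_verdict;
	struct bpf_prog *skb_verdict;
};

/* Mock SOCKMAP container: one pointer between map and progs. */
struct bpf_stab {
	struct bpf_map map;
	void **sks;
	struct sk_psock_progs progs;
};

/* Mock SOCKHASH container: extra fields push progs further out. */
struct bpf_shtab {
	struct bpf_map map;
	void *buckets;
	uint32_t buckets_num;
	uint32_t elem_size;
	struct sk_psock_progs progs;
};

/* Byte distance between where each container type places progs. */
long progs_offset_skew(void)
{
	return (long)offsetof(struct bpf_shtab, progs) -
	       (long)offsetof(struct bpf_stab, progs);
}

/* Pick the container type from the map type before touching progs. */
struct sk_psock_progs *map_progs(struct bpf_map *map)
{
	switch (map->map_type) {
	case MAP_SOCKMAP:
		return &container_of(map, struct bpf_stab, map)->progs;
	case MAP_SOCKHASH:
		return &container_of(map, struct bpf_shtab, map)->progs;
	default:
		return NULL;
	}
}
```

A non-zero progs_offset_skew() means an unconditional
container_of(map, struct bpf_stab, map) on a SOCKHASH map reads shifted
members, while map_progs() returns the right address for either type.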



However, this issue triggers a WARNING in strp_done:

sock_map_unref
   sock_map_del_link

     struct bpf_stab *stab = container_of(map, struct bpf_stab, map);

     if (psock->saved_data_ready && stab->progs.stream_verdict)
       verdict_stop = true;  // (1) verdict_stop is set to true


     if (verdict_stop) // (2) condition passes
       sk_psock_stop_verdict(sk, psock)
         psock_set_prog(&psock->progs.stream_verdict, NULL)
           bpf_prog_put
         psock->saved_data_ready = NULL;  // (3) psock->saved_data_ready is
                                          //     set to NULL

   sk_psock_put
       sk_psock_drop(sk, psock)

         sk_psock_stop_strp(sk, psock)

           if (!psock->saved_data_ready) return; // (4) sk_psock_stop_strp returns

           strp_stop(&psock->strp) // (5) so strp_stop is never called
             strp->stopped = 1; // (6) so strp->stopped is **NOT** set to 1

         sk_psock_destroy
           sk_psock_done_strp
             strp_done
               WARN_ON(!strp->stopped); // (7) WARNING triggered
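
The sequence above can be replayed with a small userspace model. This is
a sketch of the trace only, not kernel code: struct psock_model and all
model_* functions are hypothetical, and strp_done is assumed to run
unconditionally, as in the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct psock_model {
	void (*saved_data_ready)(void); /* non-NULL while callbacks are hooked */
	bool has_stream_parser;
	bool has_stream_verdict;
	bool strp_stopped;              /* stands in for strp->stopped */
};

static void dummy_data_ready(void) { }

/* Socket with both a stream_parser and a stream_verdict prog attached. */
void model_init(struct psock_model *p)
{
	p->saved_data_ready = dummy_data_ready;
	p->has_stream_parser = true;
	p->has_stream_verdict = true;
	p->strp_stopped = false;
}

/* Models sk_psock_stop_verdict: drops the prog, clears saved_data_ready. */
void model_stop_verdict(struct psock_model *p)
{
	p->has_stream_verdict = false;
	if (!p->saved_data_ready)
		return;
	p->saved_data_ready = NULL;
}

/* Models sk_psock_stop_strp: returns early once saved_data_ready is NULL. */
void model_stop_strp(struct psock_model *p)
{
	p->has_stream_parser = false;
	if (!p->saved_data_ready)
		return;                 /* strp is never marked stopped */
	p->saved_data_ready = NULL;
	p->strp_stopped = true;
}

/* Replays del_link with the given flags, then drop and destroy.
 * Returns true when the modelled WARN_ON(!strp->stopped) would fire. */
bool model_unref_warns(bool strp_stop, bool verdict_stop)
{
	struct psock_model p;

	model_init(&p);
	if (strp_stop)                  /* sock_map_del_link */
		model_stop_strp(&p);
	if (verdict_stop)
		model_stop_verdict(&p);
	if (p.has_stream_parser)        /* sk_psock_drop */
		model_stop_strp(&p);
	else if (p.has_stream_verdict)
		model_stop_verdict(&p);
	return !p.strp_stopped;         /* strp_done */
}
```

In this model, model_unref_warns(false, true) corresponds to the flags
computed through the wrong map type, and model_unref_warns(true, true) to
the flags computed through the real one.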


Now I'm convinced the memleak I observed was caused by strp_done not
being called. I'll send a test for it.


> Reviewed-by: John Fastabend <john.fastabend@gmail.com>
> 
> 
> .


Thread overview: 6+ messages
2023-07-28 10:56 [PATCH bpf] bpf, sockmap: Fix map type error in sock_map_del_link Xu Kuohai
2023-08-01  1:19 ` Martin KaFai Lau
2023-08-01  2:24   ` Xu Kuohai
2023-08-01  3:47     ` John Fastabend
2023-08-03  2:47       ` Xu Kuohai [this message]
2023-08-05 11:41 ` Jakub Sitnicki
