netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: John Fastabend <john.fastabend@gmail.com>,
	 Kuniyuki Iwashima <kuniyu@amazon.com>,
	 xrivendell7@gmail.com
Cc: alexander@mihalicyn.com,  bpf@vger.kernel.org,
	 daan.j.demeyer@gmail.com,  davem@davemloft.net,
	 dhowells@redhat.com,  edumazet@google.com,
	 john.fastabend@gmail.com,  kuba@kernel.org,  kuniyu@amazon.com,
	 linux-kernel@vger.kernel.org,  netdev@vger.kernel.org,
	 pabeni@redhat.com
Subject: Re: memory leak in unix_create1/copy_process/security_prepare_creds
Date: Tue, 19 Dec 2023 12:29:43 -0800	[thread overview]
Message-ID: <6581fd3754b79_95e63208f@john.notmuch> (raw)
In-Reply-To: <6581f509a56ea_90e25208c7@john.notmuch>

John Fastabend wrote:
> Kuniyuki Iwashima wrote:
> > From: xingwei lee <xrivendell7@gmail.com>
> > Date: Tue, 19 Dec 2023 17:12:25 +0800
> > > Hello I found a bug in net/af_unix in the lastest upstream linux
> > > 6.7.rc5 and comfired in lastest net/net-next/bpf/bpf-next tree.
> > > Titled "TITLE: memory leak in unix_create1” and I also upload the
> > > repro.c and repro.txt.
> > > 
> > > If you fix this issue, please add the following tag to the commit:
> > > Reported-by: xingwei Lee <xrivendell7@gmail.com>
> > 
> > Thanks for reporting!
> > 
> > It seems 8866730aed510 forgot to add sock_put().
> > I've confirmed that the diff below silenced kmemleak but will check
> > more before posting a patch.
> 
> Did it really silence the memleak?

Yes reverting the patch fixed the issue for me.

> 
> > 
> > ---8<---
> > diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> > index 7ea7c3a0d0d0..32daba9e7f8b 100644
> > --- a/net/unix/unix_bpf.c
> > +++ b/net/unix/unix_bpf.c
> > @@ -164,6 +164,7 @@ int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool r
> >  	if (restore) {
> >  		sk->sk_write_space = psock->saved_write_space;
> >  		sock_replace_proto(sk, psock->sk_proto);
> > +		sock_put(psock->sk_pair);
> >  		return 0;
> 
> The reason the sock_put is not in this routine but in the sk_psock_destory
> is because we need to wait a RCU grace period for any pending queued
> BPF sends to also be flushed.

So we need a different fix here. I'll look into it.

> 
> >  	}
> >  
> > ---8<---
> > 
> > Thanks!
> > 
> > 
> 
> I'm also trying to understand how this adds up to
> unix_stream_bpf_update_proto() issue. The reproduce has a map_create
> followed by two map_delete() calls. I can't see how the unix socket
> ever got added to the BPF map and the deletes should be empty?
> 
> > > 
> > > lastest net tree: 979e90173af8d2f52f671d988189aab98c6d1be6
> > > Kernel config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=8c4e4700f1727d30
> > > 
> > > in the lastest net tree, the crash like:
> > > Linux syzkaller 6.7.0-rc5-00172-g979e90173af8 #4 SMP PREEMPT_DYNAMIC
> > > Tue Dec 19 11:03:58 HKT 2023 x86_4
> > > 
> > > TITLE: memory leak in security_prepare_creds
> > >    [<ffffffff8129291a>] copy_process+0x6aa/0x25c0 kernel/fork.c:2366
> > >    [<ffffffff812949db>] kernel_clone+0x11b/0x690 kernel/fork.c:2907
> > >    [<ffffffff81294fcc>] __do_sys_clone+0x7c/0xb0 kernel/fork.c:3050
> > >    [<ffffffff84b70dcf>] do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > >    [<ffffffff84b70dcf>] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:83
> > >    [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0x6b
> 
> ...
> 
> > > uint64_t r[1] = {0xffffffffffffffff};
> > > 
> > > void execute_one(void) {
> > >  intptr_t res = 0;
> > >  syscall(__NR_socketpair, /*domain=*/1ul, /*type=*/1ul, /*proto=*/0,
> > >          /*fds=*/0x20000000ul);
> > >  *(uint32_t*)0x200000c0 = 0x12;
> > >  *(uint32_t*)0x200000c4 = 2;
> > >  *(uint32_t*)0x200000c8 = 4;
> > >  *(uint32_t*)0x200000cc = 1;
> > >  *(uint32_t*)0x200000d0 = 0;
> > >  *(uint32_t*)0x200000d4 = -1;
> > >  *(uint32_t*)0x200000d8 = 0;
> 
> 
> > >  memset((void*)0x200000dc, 0, 16);
> > >  *(uint32_t*)0x200000ec = 0;
> > >  *(uint32_t*)0x200000f0 = -1;
> > >  *(uint32_t*)0x200000f4 = 0;
> > >  *(uint32_t*)0x200000f8 = 0;
> > >  *(uint32_t*)0x200000fc = 0;
> > >  *(uint64_t*)0x20000100 = 0;
> > >  res = syscall(__NR_bpf, /*cmd=*/0ul, /*arg=*/0x200000c0ul, /*size=*/0x48ul);
> 
> mapfd = map_create( bpf_attr { SOCKHASH, 1 entry, 0 flags, ...} )
> 
> > >  if (res != -1) r[0] = res;
> > >  *(uint32_t*)0x200003c0 = r[0];
> > >  *(uint64_t*)0x200003c8 = 0x20000040;
> > >  *(uint64_t*)0x200003d0 = 0x20000000;
> > >  *(uint64_t*)0x200003d8 = 0;
> > >  syscall(__NR_bpf, /*cmd=*/2ul, /*arg=*/0x200003c0ul, /*size=*/0x20ul);
> 
> map_delete(mapfd, key=0x20000040, value=0x20000000, flags = 0)
> 
> > >  *(uint32_t*)0x200003c0 = r[0];
> > >  *(uint64_t*)0x200003c8 = 0x20000040;
> > >  *(uint64_t*)0x200003d0 = 0x20000000;
> > >  *(uint64_t*)0x200003d8 = 0;
> > >  syscall(__NR_bpf, /*cmd=*/2ul, /*arg=*/0x200003c0ul, /*size=*/0x20ul);
> 
> map_delete(mapfd, key=0x20000040, value=0x20000000, flags = 0)

cmd=2 is update so this splat makes more sense.

> 
> so same as repro.txt below makes sense. But, if the sockets are
> never added to the sockhash then we never touched the proto from
> BPF side. And both of these deletes should return errors.
> 
> > > }
> > > int main(void) {
> > >  syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul,
> > >          /*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> > >  syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul, /*prot=*/7ul,
> > >          /*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> > >  syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul,
> > >          /*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> > >  setup_leak();
> > >  loop();
> > >  return 0;
> > > }
> > > 
> > > 
> > > 
> > > =* repro.txt =*
> > > socketpair(0x1, 0x1, 0x0, &(0x7f0000000000))
> > > r0 = bpf$MAP_CREATE(0x0, &(0x7f00000000c0)=@base={0x12, 0x2, 0x4, 0x1}, 0x48)
> > > bpf$MAP_DELETE_ELEM(0x2, &(0x7f00000003c0)={r0, &(0x7f0000000040),
> > > 0x20000000}, 0x20)
> > > bpf$MAP_DELETE_ELEM(0x2, &(0x7f00000003c0)={r0, &(0x7f0000000040),
> > > 0x20000000}, 0x20)
> 
> So not making sense to me how we got to blaming the proto delete
> logic here. It doesn't look like we ever added the psock and
> configured the proto?

Not sure why these is delete cmd=0x2.

#define ___BPF_FUNC_MAPPER(FN, ctx...)                  \
        FN(unspec, 0, ##ctx)                            \
        FN(map_lookup_elem, 1, ##ctx)                   \            
        FN(map_update_elem, 2, ##ctx)                   \


> 
> Did a bisect really blame the mentioned patch? I can likely try here
> as well.

Seems so.

> 
> Thanks,
> John



  reply	other threads:[~2023-12-19 20:29 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-19  9:12 memory leak in unix_create1/copy_process/security_prepare_creds xingwei lee
2023-12-19 15:50 ` Kuniyuki Iwashima
2023-12-19 19:54   ` John Fastabend
2023-12-19 20:29     ` John Fastabend [this message]
2023-12-19 22:06       ` John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6581fd3754b79_95e63208f@john.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=alexander@mihalicyn.com \
    --cc=bpf@vger.kernel.org \
    --cc=daan.j.demeyer@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=kuniyu@amazon.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=xrivendell7@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).