netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Kuniyuki Iwashima <kuniyu@amazon.com>,  xrivendell7@gmail.com
Cc: alexander@mihalicyn.com,  bpf@vger.kernel.org,
	 daan.j.demeyer@gmail.com,  davem@davemloft.net,
	 dhowells@redhat.com,  edumazet@google.com,
	 john.fastabend@gmail.com,  kuba@kernel.org,  kuniyu@amazon.com,
	 linux-kernel@vger.kernel.org,  netdev@vger.kernel.org,
	 pabeni@redhat.com
Subject: Re: memory leak in unix_create1/copy_process/security_prepare_creds
Date: Tue, 19 Dec 2023 11:54:49 -0800	[thread overview]
Message-ID: <6581f509a56ea_90e25208c7@john.notmuch> (raw)
In-Reply-To: <20231219155057.12716-1-kuniyu@amazon.com>

Kuniyuki Iwashima wrote:
> From: xingwei lee <xrivendell7@gmail.com>
> Date: Tue, 19 Dec 2023 17:12:25 +0800
> > Hello I found a bug in net/af_unix in the lastest upstream linux
> > 6.7.rc5 and comfired in lastest net/net-next/bpf/bpf-next tree.
> > Titled "TITLE: memory leak in unix_create1” and I also upload the
> > repro.c and repro.txt.
> > 
> > If you fix this issue, please add the following tag to the commit:
> > Reported-by: xingwei Lee <xrivendell7@gmail.com>
> 
> Thanks for reporting!
> 
> It seems 8866730aed510 forgot to add sock_put().
> I've confirmed that the diff below silenced kmemleak but will check
> more before posting a patch.

Did it really silence the memleak?

> 
> ---8<---
> diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> index 7ea7c3a0d0d0..32daba9e7f8b 100644
> --- a/net/unix/unix_bpf.c
> +++ b/net/unix/unix_bpf.c
> @@ -164,6 +164,7 @@ int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool r
>  	if (restore) {
>  		sk->sk_write_space = psock->saved_write_space;
>  		sock_replace_proto(sk, psock->sk_proto);
> +		sock_put(psock->sk_pair);
>  		return 0;

The reason the sock_put is not in this routine but in the sk_psock_destory
is because we need to wait a RCU grace period for any pending queued
BPF sends to also be flushed.

>  	}
>  
> ---8<---
> 
> Thanks!
> 
> 

I'm also trying to understand how this adds up to
unix_stream_bpf_update_proto() issue. The reproduce has a map_create
followed by two map_delete() calls. I can't see how the unix socket
ever got added to the BPF map and the deletes should be empty?

> > 
> > lastest net tree: 979e90173af8d2f52f671d988189aab98c6d1be6
> > Kernel config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=8c4e4700f1727d30
> > 
> > in the lastest net tree, the crash like:
> > Linux syzkaller 6.7.0-rc5-00172-g979e90173af8 #4 SMP PREEMPT_DYNAMIC
> > Tue Dec 19 11:03:58 HKT 2023 x86_4
> > 
> > TITLE: memory leak in security_prepare_creds
> >    [<ffffffff8129291a>] copy_process+0x6aa/0x25c0 kernel/fork.c:2366
> >    [<ffffffff812949db>] kernel_clone+0x11b/0x690 kernel/fork.c:2907
> >    [<ffffffff81294fcc>] __do_sys_clone+0x7c/0xb0 kernel/fork.c:3050
> >    [<ffffffff84b70dcf>] do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >    [<ffffffff84b70dcf>] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:83
> >    [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0x6b

...

> > uint64_t r[1] = {0xffffffffffffffff};
> > 
> > void execute_one(void) {
> >  intptr_t res = 0;
> >  syscall(__NR_socketpair, /*domain=*/1ul, /*type=*/1ul, /*proto=*/0,
> >          /*fds=*/0x20000000ul);
> >  *(uint32_t*)0x200000c0 = 0x12;
> >  *(uint32_t*)0x200000c4 = 2;
> >  *(uint32_t*)0x200000c8 = 4;
> >  *(uint32_t*)0x200000cc = 1;
> >  *(uint32_t*)0x200000d0 = 0;
> >  *(uint32_t*)0x200000d4 = -1;
> >  *(uint32_t*)0x200000d8 = 0;


> >  memset((void*)0x200000dc, 0, 16);
> >  *(uint32_t*)0x200000ec = 0;
> >  *(uint32_t*)0x200000f0 = -1;
> >  *(uint32_t*)0x200000f4 = 0;
> >  *(uint32_t*)0x200000f8 = 0;
> >  *(uint32_t*)0x200000fc = 0;
> >  *(uint64_t*)0x20000100 = 0;
> >  res = syscall(__NR_bpf, /*cmd=*/0ul, /*arg=*/0x200000c0ul, /*size=*/0x48ul);

mapfd = map_create( bpf_attr { SOCKHASH, 1 entry, 0 flags, ...} )

> >  if (res != -1) r[0] = res;
> >  *(uint32_t*)0x200003c0 = r[0];
> >  *(uint64_t*)0x200003c8 = 0x20000040;
> >  *(uint64_t*)0x200003d0 = 0x20000000;
> >  *(uint64_t*)0x200003d8 = 0;
> >  syscall(__NR_bpf, /*cmd=*/2ul, /*arg=*/0x200003c0ul, /*size=*/0x20ul);

map_delete(mapfd, key=0x20000040, value=0x20000000, flags = 0)

> >  *(uint32_t*)0x200003c0 = r[0];
> >  *(uint64_t*)0x200003c8 = 0x20000040;
> >  *(uint64_t*)0x200003d0 = 0x20000000;
> >  *(uint64_t*)0x200003d8 = 0;
> >  syscall(__NR_bpf, /*cmd=*/2ul, /*arg=*/0x200003c0ul, /*size=*/0x20ul);

map_delete(mapfd, key=0x20000040, value=0x20000000, flags = 0)

so same as repro.txt below makes sense. But, if the sockets are
never added to the sockhash then we never touched the proto from
BPF side. And both of these deletes should return errors.

> > }
> > int main(void) {
> >  syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul,
> >          /*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> >  syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul, /*prot=*/7ul,
> >          /*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> >  syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul,
> >          /*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> >  setup_leak();
> >  loop();
> >  return 0;
> > }
> > 
> > 
> > 
> > =* repro.txt =*
> > socketpair(0x1, 0x1, 0x0, &(0x7f0000000000))
> > r0 = bpf$MAP_CREATE(0x0, &(0x7f00000000c0)=@base={0x12, 0x2, 0x4, 0x1}, 0x48)
> > bpf$MAP_DELETE_ELEM(0x2, &(0x7f00000003c0)={r0, &(0x7f0000000040),
> > 0x20000000}, 0x20)
> > bpf$MAP_DELETE_ELEM(0x2, &(0x7f00000003c0)={r0, &(0x7f0000000040),
> > 0x20000000}, 0x20)

So not making sense to me how we got to blaming the proto delete
logic here. It doesn't look like we ever added the psock and
configured the proto?

Did a bisect really blame the mentioned patch? I can likely try here
as well.

Thanks,
John

  reply	other threads:[~2023-12-19 19:54 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-19  9:12 memory leak in unix_create1/copy_process/security_prepare_creds xingwei lee
2023-12-19 15:50 ` Kuniyuki Iwashima
2023-12-19 19:54   ` John Fastabend [this message]
2023-12-19 20:29     ` John Fastabend
2023-12-19 22:06       ` John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6581f509a56ea_90e25208c7@john.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=alexander@mihalicyn.com \
    --cc=bpf@vger.kernel.org \
    --cc=daan.j.demeyer@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=kuniyu@amazon.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=xrivendell7@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).