From: Kees Cook <keescook@chromium.org>
To: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>,
Will Drewry <wad@chromium.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
netdev <netdev@vger.kernel.org>,
bpf@vger.kernel.org, alaaemadhossney.ae@gmail.com,
syzkaller@googlegroups.com, Jann Horn <jannh@google.com>,
Tycho Andersen <tycho@tycho.pizza>,
Sargun Dhillon <sargun@sargun.me>,
Christian Brauner <christian.brauner@ubuntu.com>
Subject: Re: memory leak in do_seccomp
Date: Sat, 31 Jul 2021 20:25:58 -0700 [thread overview]
Message-ID: <202107311901.8CDF235F65@keescook> (raw)
In-Reply-To: <CADVatmPShADZ0F133eS3KjeKj1ZjTNAQfy_QOoJVBan02wuR+Q@mail.gmail.com>
On Sat, Jul 31, 2021 at 08:20:29PM +0100, Sudip Mukherjee wrote:
> Hi All,
>
> We had been running syzkaller on v5.10.y and a "memory leak in
> do_seccomp" was being reported on it. I got some time to check that
> today and have managed to get a syzkaller
> reproducer. I dont have a C reproducer which I can share but I can use
> the syz-reproducer to reproduce this with next-20210730.
> The old report on v5.10.y is at
> https://elisa-builder-00.iol.unh.edu/syzkaller/report?id=f6ddd3b592f00e95f9cbd2e74f70a5b04b015c6f
Thanks for the details!
Is this the same as what syzbot saw here (with a C reproducer)?
https://syzkaller.appspot.com/bug?id=2809bb0ac77ad9aa3f4afe42d6a610aba594a987
I can't figure out what happened with the "Patch testing request" that
was made; there's no link?
>
> BUG: memory leak
> unreferenced object 0xffff888019282c00 (size 512):
> comm "syz-executor.1", pid 7389, jiffies 4294761829 (age 17.841s)
> hex dump (first 32 bytes):
> 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<00000000762c0963>] do_seccomp+0x2d5/0x27d0
Can you run "./scripts/faddr2line do_seccomp+0x2d5/0x27d0" for this? I
expect it'll be:
sfilter = kzalloc(sizeof(*sfilter), GFP_KERNEL | __GFP_NOWARN);
> [<0000000006e512d1>] do_syscall_64+0x3b/0x90
> [<0000000094ae9ff8>] entry_SYSCALL_64_after_hwframe+0x44/0xae
The "size 512" in your v5.10.y report is from seccomp_prepare_filter()
(noted above). seccomp_prepare_filter() cleans up its error paths.
>
> BUG: memory leak
> unreferenced object 0xffffc900006b5000 (size 4096):
> comm "syz-executor.1", pid 7389, jiffies 4294761829 (age 17.841s)
> hex dump (first 32 bytes):
> 01 00 00 00 00 00 00 00 00 00 00 00 05 00 00 00 ................
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<00000000854901e5>] __vmalloc_node_range+0x550/0x9a0
> [<000000002686628f>] __vmalloc_node+0xb5/0x100
> [<0000000004cbd298>] bpf_prog_alloc_no_stats+0x38/0x350
> [<0000000009149728>] bpf_prog_alloc+0x24/0x170
> [<000000000fe7f1e7>] bpf_prog_create_from_user+0xad/0x2e0
> [<000000000c70eb02>] do_seccomp+0x325/0x27d0
> [<0000000006e512d1>] do_syscall_64+0x3b/0x90
> [<0000000094ae9ff8>] entry_SYSCALL_64_after_hwframe+0x44/0xae
Again, I'm curious about where do_seccomp+0x325/0x27d0 is for this, but
the matching one in v5.10 shows:
ret = bpf_prog_create_from_user(&sfilter->prog, fprog,
seccomp_check_filter, save_orig);
This and everything remaining below else has bpf_prog_create_from_user()
in the allocation path.
>
> BUG: memory leak
> unreferenced object 0xffff888026eb1000 (size 2048):
> comm "syz-executor.1", pid 7389, jiffies 4294761829 (age 17.842s)
> hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<0000000072de7240>] bpf_prog_alloc_no_stats+0xeb/0x350
> [<0000000009149728>] bpf_prog_alloc+0x24/0x170
> [<000000000fe7f1e7>] bpf_prog_create_from_user+0xad/0x2e0
> [<000000000c70eb02>] do_seccomp+0x325/0x27d0
> [<0000000006e512d1>] do_syscall_64+0x3b/0x90
> [<0000000094ae9ff8>] entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> BUG: memory leak
> unreferenced object 0xffff888014dddac0 (size 16):
> comm "syz-executor.1", pid 7389, jiffies 4294761829 (age 17.842s)
> hex dump (first 16 bytes):
> 01 00 ca 08 80 88 ff ff c8 ef df 14 80 88 ff ff ................
These are two kernel pointers:
0xffff888008ca0001 (unaligned by 1 byte?!)
0xffff888014dfefc8
Ah, no, this is from:
struct sock_fprog_kern {
u16 len;
struct sock_filter *filter;
};
The "ca 08 80 88 ff ff" bytes are uninitialized padding. ;) "len" has
a value of 1 (which matches the syzkaller reproducer args below of a
single BPF instruction).
fp->orig_prog = kmalloc(sizeof(*fkprog), GFP_KERNEL);
if (!fp->orig_prog)
return -ENOMEM;
fkprog = fp->orig_prog;
fkprog->len = fprog->len;
...
fkprog->filter = kmemdup(fp->insns, fsize,
GFP_KERNEL | __GFP_NOWARN);
> backtrace:
> [<00000000c5d4ed93>] bpf_prog_store_orig_filter+0x7b/0x1e0
> [<000000007cb21c2a>] bpf_prog_create_from_user+0x1c6/0x2e0
> [<000000000c70eb02>] do_seccomp+0x325/0x27d0
> [<0000000006e512d1>] do_syscall_64+0x3b/0x90
> [<0000000094ae9ff8>] entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> BUG: memory leak
> unreferenced object 0xffff888014dfefc8 (size 8):
> comm "syz-executor.1", pid 7389, jiffies 4294761829 (age 17.842s)
> hex dump (first 8 bytes):
> 06 00 00 00 ff ff ff 7f ........
This contains a userspace (likely stack) pointer, and is referenced
by the second pointer above. (i.e. kmemdup() above, but how have the
contents become a user stack pointer?)
> backtrace:
> [<00000000ee5550f8>] kmemdup+0x23/0x50
> [<00000000f1acd067>] bpf_prog_store_orig_filter+0x103/0x1e0
> [<000000007cb21c2a>] bpf_prog_create_from_user+0x1c6/0x2e0
> [<000000000c70eb02>] do_seccomp+0x325/0x27d0
> [<0000000006e512d1>] do_syscall_64+0x3b/0x90
> [<0000000094ae9ff8>] entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Not sure if this has been already reported or not, but I will be happy
> to test if you have a fix for this.
I was suspecting a missing error path free near bpf_prepare_filter()
as called by bpf_prog_create_from_user() here:
/* bpf_prepare_filter() already takes care of freeing
* memory in case something goes wrong.
*/
fp = bpf_prepare_filter(fp, trans);
if (IS_ERR(fp))
return PTR_ERR(fp);
Since only seccomp and af_packet use bpf_prog_create_from_user(),
and af_packet sets neither a "trans" callback nor save_orig. But if
"trans" fails (due to some BPF instructions seccomp doesn't support),
I'd expect this leak to be detected more often.
bpf_prepare_filter() is documented as cleaning up allocations on failure,
though I notice its cleanup differs from bpf_prog_create_from_user()'s,
which uses __bpf_prog_free() instead of __bfp_prog_release(). But
that should only make a difference for orig_prog getting freed,
and bpf_prog_store_orig_filter() should already be freeing that on
failures too.
Similarly, bpf_migrate_filter() cleanups up on failure too, so this
doesn't seem to be it:
if (!fp->jited)
fp = bpf_migrate_filter(fp);
return fp;
So, I'm going to assume the missing free is somehow related to
process management, since I see the Syzkaller reproducer mentions
SECCOMP_SET_MODE_FILTER_LISTENER, fork(), and ptrace(). :)
Quoting from the v5.10.y report:
> # {Threaded:true Collide:true Repeat:true RepeatTimes:0 Procs:8 Slowdown:1 Sandbox:none Fault:false FaultCall:-1 FaultNth:0 Leak:true NetInjection:true NetDevices:true NetReset:true Cgroups:true BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false USB:true VhciInjection:false Wifi:false IEEE802154:false Sysctl:true UseTmpDir:true HandleSegv:true Repro:false Trace:false}
> seccomp$SECCOMP_SET_MODE_FILTER_LISTENER(0x1, 0x0, &(0x7f0000000000)={0x1, &(0x7f0000000040)=[{0x6, 0x0, 0x0, 0x7fffffff}]})
0x1 is SECCOMP_SET_MODE_FILTER
0x0 is empty flags
{0x6, 0x0, 0x0, 0x7fffffff} is
BPF_STMT(BPF_RET, SECCOMP_RET_ALLOW | 0xffff)
For "SECCOMP_SET_MODE_FILTER_LISTENER", defined here:
https://github.com/google/syzkaller/blob/master/sys/linux/seccomp.txt#L15
I was expecting flags to include SECCOMP_FILTER_FLAG_NEW_LISTENER:
seccomp$SECCOMP_SET_MODE_FILTER_LISTENER(
op const[SECCOMP_SET_MODE_FILTER],
flags flags[seccomp_flags_listener],
arg ptr[in, sock_fprog]) fd_seccomp (breaks_returns)
For the flags:
seccomp_flags_listener = SECCOMP_FILTER_FLAG_NEW_LISTENER,
SECCOMP_FILTER_FLAG_LOG_LISTENER,
SECCOMP_FILTER_FLAG_SPEC_ALLOW_LISTENER
which is:
SECCOMP_FILTER_FLAG_LOG_LISTENER = 10
SECCOMP_FILTER_FLAG_NEW_LISTENER = 8
SECCOMP_FILTER_FLAG_SPEC_ALLOW = 4
SECCOMP_FILTER_FLAG_SPEC_ALLOW_LISTENER = 12
How is flags 0 above? (Maybe I don't understand the syzkaller reproducer
meaning fully?)
> r0 = fork()
> ptrace(0x10, r0)
0x10 is PTRACE_ATTACH
My best guess is there is some LISTENER refcount state we can get into
where all the processes die, but a reference is left alive.
-Kees
--
Kees Cook
next prev parent reply other threads:[~2021-08-01 3:30 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-31 19:20 memory leak in do_seccomp Sudip Mukherjee
2021-08-01 3:25 ` Kees Cook [this message]
2021-08-01 21:10 ` Sudip Mukherjee
-- strict thread matches above, loose matches on Subject: below --
2020-08-11 17:06 syzbot
2020-08-31 3:50 ` syzbot
2020-08-31 23:25 ` Kees Cook
2020-09-01 0:09 ` Tycho Andersen
2020-09-01 1:14 ` Tycho Andersen
2020-09-01 10:07 ` Christian Brauner
2020-09-01 15:08 ` Kees Cook
2020-09-01 15:26 ` Tycho Andersen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=202107311901.8CDF235F65@keescook \
--to=keescook@chromium.org \
--cc=alaaemadhossney.ae@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=christian.brauner@ubuntu.com \
--cc=daniel@iogearbox.net \
--cc=jannh@google.com \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kpsingh@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=netdev@vger.kernel.org \
--cc=sargun@sargun.me \
--cc=songliubraving@fb.com \
--cc=sudipm.mukherjee@gmail.com \
--cc=syzkaller@googlegroups.com \
--cc=tycho@tycho.pizza \
--cc=wad@chromium.org \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.