All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	syzkaller-bugs@googlegroups.com
Subject: Re: proc_flush_task oops
Date: Tue, 19 Dec 2017 19:54:24 -0600	[thread overview]
Message-ID: <87mv2e17vz.fsf@xmission.com> (raw)
In-Reply-To: <878tdy5r5t.fsf@xmission.com> (Eric W. Biederman's message of "Tue, 19 Dec 2017 15:44:30 -0600")

ebiederm@xmission.com (Eric W. Biederman) writes:

> Dave Jones <davej@codemonkey.org.uk> writes:
>
>> On Tue, Dec 19, 2017 at 12:27:30PM -0600, Eric W. Biederman wrote:
>>  > Dave Jones <davej@codemonkey.org.uk> writes:
>>  > 
>>  > > On Mon, Dec 18, 2017 at 03:50:52PM -0800, Linus Torvalds wrote:
>>  > >
>>  > >  > But I don't see what would have changed in this area recently.
>>  > >  > 
>>  > >  > Do you end up saving the seeds that cause crashes? Is this
>>  > >  > reproducible? (Other than seeing it twoce, of course)
>>  > >
>>  > > Only clue so far, is every time I'm able to trigger it, the last thing
>>  > > the child process that triggers it did, was an execveat.
>>  > 
>>  > Is there any chance the excveat might be called from a child thread?
>>
>> If trinity choose one of the exec syscalls, it forks off an extra child
>> to do it in, on the off-chance that it succeeds, and we never return.
>> https://github.com/kernelslacker/trinity/blob/master/syscall.c#L139
>
> 	extrapid = fork();
> 	if (extrapid == 0) {
> 		/* grand-child */
> 		char childname[]="trinity-subchild";
> 		prctl(PR_SET_NAME, (unsigned long) &childname);
>
> 		__do_syscall(rec, GOING_AWAY);
> 		/* if this was for eg. an successful execve, we should never get here.
> 		 * if it failed though... */
> 		_exit(EXIT_SUCCESS);
> 	}
>
> That is interesting.
>
>
> So the system call sequence is a fork which just succeeded and than an
> exec.  That reduces the possibilities quite a lot.
>
> With pids there was a recent change that just replaced the pid hash
> table and the pid bitmap with and idr.  It changes the locking somewhat
> and probably changes the timing so that might be the culprit.
>
> I am trying to figure out if there is an interface that would let
> ns_last_pid for a pid namespace be accessed before the first pid is
> allocated and I am not seeing it.  It does not appear to be possible
> to mount a proc for a pid namespace you are not currently in.
>
> *Scratches my head*  I am not seeing anything obvious.

Can you try this patch as you reproduce this issue?

diff --git a/kernel/pid.c b/kernel/pid.c
index b13b624e2c49..df9e5d4d8f83 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -210,6 +210,7 @@ struct pid *alloc_pid(struct pid_namespace *ns)
                goto out_unlock;
        for ( ; upid >= pid->numbers; --upid) {
                /* Make the PID visible to find_pid_ns. */
+               WARN_ON(!upid->ns->proc_mnt);
                idr_replace(&upid->ns->idr, pid, upid->nr);
                upid->ns->pid_allocated++;
        }


If the warning triggers it means the bug is in alloc_pid and somehow
something has gotten past the is_child_reaper check.

If the warning does not trigger it means something is stomping proc_mnt.

In the entire kernel there are exactly two assignments to proc_mnt.
- kmem_cache_zalloc in create_pid_namespace.
- In pid_ns_prepare_proc where proc_mnt is set to a non-zero value.

On the 29th of Nov syzkaller also hit this and gave me this reproducer
that I can't figure out heads or tails of.

#{Threaded:true Collide:true Repeat:true Procs:8 Sandbox:namespace Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false Repro:false}
mmap(&(0x7f0000000000/0xfff000)=nil, 0xfff000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
perf_event_open(&(0x7f000025c000)={0x2, 0x78, 0x3e3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xf72, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0xffffffffffffffff, 0xffffffffffffffff, 0x0)
socket$inet6_dccp(0xa, 0x6, 0x0)
unshare(0x20000400)
sendmsg$unix(0xffffffffffffffff, &(0x7f0000001000-0x38)={&(0x7f0000239000-0x8)=@abs={0x0, 0x0, 0x0}, 0x8, &(0x7f0000008000)=[], 0x0, &(0x7f0000001000-0x10)=[@rights={0x200, 0x1, 0x1, [0xffffffffffffffff]}], 0x1, 0x0}, 0x0)
process_vm_writev(0x0, &(0x7f0000699000-0x70)=[{&(0x7f00006a5000)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x4c}, {&(0x7f00007b9000-0x54)="", 0x0}, {&(0x7f00004f3000)="", 0x0}, {&(0x7f00002e3000-0xd6)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0xd6}, {&(0x7f0000f2e000)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x52}, {&(0x7f00008e5000-0x10)="00000000000000000000000000000000", 0x10}, {&(0x7f0000a3a000)="", 0x0}], 0x7, &(0x7f0000d05000)=[{&(0x7f0000d64000)="", 0x0}, {&(0x7f0000062000-0x93)="", 0x0}, {&(0x7f0000a16000-0x7e)="000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x7e}, {&(0x7f00003dc000-0x9a)="", 0x0}, {&(0x7f0000fe3000-0xc7)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0xc7}], 0x5, 0x0)
pselect6(0x40, &(0x7f0000cc9000-0x40)={0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x20000, 0x0}, &(0x7f0000cc9000-0x40)={0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, &(0x7f00000de000-0x40)={0xffffffffffffffe1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1}, &(0x7f00008e6000-0x10)={0x0, 0x989680}, &(0x7f0000205000-0x10)={&(0x7f00006e4000-0x8)={0x0}, 0x8})
clone(0x20900, &(0x7f0000a94000-0x1)="6f", &(0x7f00002b8000-0x4)=0x0, &(0x7f000029e000)=0x0, &(0x7f00006fe000)="")
ioctl$KVM_ENABLE_CAP_CPU(0xffffffffffffffff, 0x4068aea3, &(0x7f0000e48000)={0x7b, 0x0, [0x1, 0x1, 0x800, 0x1], [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0]})
epoll_ctl$EPOLL_CTL_DEL(0xffffffffffffffff, 0x2, 0xffffffffffffffff)

#{Threaded:true Collide:true Repeat:true Procs:8 Sandbox:namespace Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false Repro:false}
mmap(&(0x7f0000000000/0xfff000)=nil, 0xfff000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
perf_event_open(&(0x7f000025c000)={0x2, 0x78, 0x3e3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xf72, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0xffffffffffffffff, 0xffffffffffffffff, 0x0)
socket$inet6_dccp(0xa, 0x6, 0x0)
unshare(0x20000400)
sendmsg$unix(0xffffffffffffffff, &(0x7f0000001000-0x38)={&(0x7f0000239000-0x8)=@abs={0x0, 0x0, 0x0}, 0x8, &(0x7f0000008000)=[], 0x0, &(0x7f0000001000-0x10)=[@rights={0x200, 0x1, 0x1, [0xffffffffffffffff]}], 0x1, 0x0}, 0x0)
process_vm_writev(0x0, &(0x7f0000699000-0x70)=[{&(0x7f00006a5000)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x4c}, {&(0x7f00007b9000-0x54)="", 0x0}, {&(0x7f00004f3000)="", 0x0}, {&(0x7f00002e3000-0xd6)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0xd6}, {&(0x7f0000f2e000)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x52}, {&(0x7f00008e5000-0x10)="00000000000000000000000000000000", 0x10}, {&(0x7f0000a3a000)="", 0x0}], 0x7, &(0x7f0000d05000)=[{&(0x7f0000d64000)="", 0x0}, {&(0x7f0000062000-0x93)="", 0x0}, {&(0x7f0000a16000-0x7e)="000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0x7e}, {&(0x7f00003dc000-0x9a)="", 0x0}, {&(0x7f0000fe3000-0xc7)="00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000", 0xc7}], 0x5, 0x0)
pselect6(0x40, &(0x7f0000cc9000-0x40)={0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x20000, 0x0}, &(0x7f0000cc9000-0x40)={0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, &(0x7f00000de000-0x40)={0xffffffffffffffe1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1}, &(0x7f00008e6000-0x10)={0x0, 0x989680}, &(0x7f0000205000-0x10)={&(0x7f00006e4000-0x8)={0x0}, 0x8})
clone(0x20900, &(0x7f0000a94000-0x1)="6f", &(0x7f00002b8000-0x4)=0x0, &(0x7f000029e000)=0x0, &(0x7f00006fe000)="")
ioctl$KVM_ENABLE_CAP_CPU(0xffffffffffffffff, 0x4068aea3, &(0x7f0000e48000)={0x7b, 0x0, [0x1, 0x1, 0x800, 0x1], [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0]})
epoll_ctl$EPOLL_CTL_DEL(0xffffffffffffffff, 0x2, 0xffffffffffffffff)


Eric

  reply	other threads:[~2017-12-20  1:54 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-18 21:44 proc_flush_task oops Dave Jones
2017-12-18 22:15 ` Al Viro
2017-12-18 23:10   ` Dave Jones
2017-12-18 23:50     ` Linus Torvalds
2017-12-19  1:22       ` Dave Jones
2017-12-19  3:39       ` Dave Jones
2017-12-19 10:49         ` Tetsuo Handa
2017-12-19 18:25           ` Eric W. Biederman
2017-12-19 18:27         ` Eric W. Biederman
2017-12-19 19:30           ` Dave Jones
2017-12-19 21:44             ` Eric W. Biederman
2017-12-20  1:54               ` Eric W. Biederman [this message]
2017-12-20  5:28                 ` Dave Jones
2017-12-20 18:25                   ` Eric W. Biederman
2017-12-21  3:16                     ` Dave Jones
2017-12-21  8:26                       ` Eric W. Biederman
2017-12-21 10:38                         ` Alexey Dobriyan
2017-12-21 14:25                           ` Dave Jones
2017-12-21 16:41                             ` Eric W. Biederman
2017-12-21 22:00                           ` Dave Jones
2017-12-22  1:31                             ` Eric W. Biederman
2017-12-22  3:35                               ` Dave Jones
2017-12-22  7:58                                 ` Eric W. Biederman
2017-12-22 10:13                                   ` Alexey Dobriyan
2017-12-22 14:41                                     ` Eric W. Biederman
2017-12-22 16:11                                       ` [TEST PATCH] pid: fix allocating pid 2 for init (was Re: proc_flush_task oops) Alexey Dobriyan
2017-12-24  3:12                                         ` Eric W. Biederman
2017-12-24  3:16                                           ` [PATCH] pid: Handle failure to allocate the first pid in a pid namespace Eric W. Biederman
2017-12-20  8:00                 ` proc_flush_task oops Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87mv2e17vz.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=davej@codemonkey.org.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.