* v4.14-rc{4,7} null pointer dereference in event_sched_out()
@ 2017-10-30 16:23 Mark Rutland
2017-11-15 18:00 ` Will Deacon
0 siblings, 1 reply; 4+ messages in thread
From: Mark Rutland @ 2017-10-30 16:23 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, will.deacon, peterz, mingo
Hi,
As a heads-up, while fuzzing arm64 v4.14-rc{4,7} with Syzkaller, I hit a
KASAN splat in event_sched_out():
[ 133.225742] ==================================================================
[ 133.229374] BUG: KASAN: null-ptr-deref in event_sched_out.isra.47+0x428/0x580
[ 133.230843] Read of size 4 at addr 0000000000000178 by task syz-executor0/6905
[ 133.233151]
[ 133.233664] CPU: 0 PID: 6905 Comm: syz-executor0 Not tainted 4.14.0-rc7-dirty #4
[ 133.235750] Hardware name: linux,dummy-virt (DT)
[ 133.236598] Call trace:
[ 133.237081] [<ffff20000808fef8>] dump_backtrace+0x0/0x658
[ 133.238073] [<ffff200008090570>] show_stack+0x20/0x30
[ 133.239002] [<ffff2000091c22ec>] dump_stack+0xd0/0x124
[ 133.239947] [<ffff200008349d1c>] kasan_report+0x104/0x310
[ 133.240940] [<ffff2000083483f8>] __asan_load4+0x58/0xb0
[ 133.242262] [<ffff200008271138>] event_sched_out.isra.47+0x428/0x580
[ 133.243686] [<ffff2000082712c8>] __perf_remove_from_context+0x38/0xe0
[ 133.244948] [<ffff200008265cf8>] event_function_call+0x1c8/0x258
[ 133.246197] [<ffff20000826ad04>] perf_remove_from_context+0x54/0xf0
[ 133.247514] [<ffff20000827f188>] SyS_perf_event_open+0x1528/0x18e0
[ 133.248831] Exception stack(0xffff800038c5fec0 to 0xffff800038c60000)
[ 133.250199] fec0: 0000000020b12f88 0000000000001af8 00000000ffffffff 0000000000000008
[ 133.251843] fee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 133.253503] ff00: 00000000000000f1 0000000000000000 0000000000405850 00000000003d0f00
[ 133.255132] ff20: 0000ffff94514f60 00000000004ae890 0000000000000027 0000000000000001
[ 133.256756] ff40: 0000000000000000 0000000000826000 0000000000000000 00000000004c0158
[ 133.258392] ff60: 00000000ffffffff 0000000020b12f88 0000000000001af8 000000000046d290
[ 133.260006] ff80: 00000000004aaba8 0000000000473af8 0000ffffe5360da0 0000000000000000
[ 133.261629] ffa0: 0000ffff94514f60 0000ffff94514640 00000000004020fc 0000ffff94514640
[ 133.263253] ffc0: 000000000042d034 00000000a0000000 0000000020b12f88 00000000000000f1
[ 133.264886] ffe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 133.266535] [<ffff200008084170>] el0_svc_naked+0x24/0x28
[ 133.267648] ==================================================================
... which is triggered by the Syzkaller repro program at the end of this
email. I haven't yet come up with a C reproducer; sorry.
The PC seems to be the load of cpuctx->active_oncpu at the end of the
function, so it looks like cpuctx is NULL.
The system has (homogeneous) armv8_pmuv3, breakpoint, and software PMUs.
I initially hit this on v4.14-rc4, and can reproduce the issue on
v4.14-rc7. I haven't tried any other kernels yet.
I'll continue digging, unless someone else has already solved this.
Thanks,
Mark.
Syzkaller reproducer
---->8----
# {Threaded:true Collide:true Repeat:true Procs:1 Sandbox:none Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false Repro:false}
mmap(&(0x7f0000000000/0xd3f000)=nil, 0xd3f000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
r0 = perf_event_open(&(0x7f0000d15000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x9, 0x34, 0x0, 0x8001, 0x0, 0x0, 0x0, 0x0, 0x0, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x8000000, 0x0}, 0x0, 0xffffffff, 0xffffffffffffffff, 0x0)
mmap(&(0x7f0000d3f000/0x1000)=nil, 0x1000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
r1 = perf_event_open(&(0x7f0000d15000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3bd4, 0x0, 0x0, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xffffffff80000001, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0xffffffff, r0, 0x0)
add_key(&(0x7f00004fe000)="6465616400", &(0x7f0000d41000)={0x73, 0x79, 0x7a, 0x0, 0x0}, &(0x7f0000509000-0x19)="", 0x0, 0xfffffffffffffffd)
syz_emit_ethernet(0x32, &(0x7f0000b86000-0x36)={@remote={[0xbb, 0xbb, 0xbb, 0xbb, 0xbb], 0x0}, @remote={[0xbb, 0xbb, 0xbb, 0xbb, 0xbb], 0x0}, [], {{0x200000000080a, @arp=@generic={0x322, 0x8edf, 0x6, 0x0, 0xfffffffffffffffe, @empty=[0x0, 0x0, 0x0, 0x0, 0x0, 0x0], "", @random="dc1ce39913fb", "34f3b689bb48f1e976f2bf1b7cf2243b"}}}})
dup3(r1, r1, 0x80000)
r2 = gettid()
ioctl$sock_FIOSETOWN(0xffffffffffffffff, 0x8901, &(0x7f0000d40000)=0x0)
socket$inet6_udp(0xa, 0x2, 0x0)
perf_event_open(&(0x7f0000b13000-0x78)={0x0, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x30, 0x0, 0x2, 0x0, 0x0, 0x94, 0x0, 0x0, 0x0, 0x0, 0x0, 0x10001, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, r0, 0x0)
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: v4.14-rc{4,7} null pointer dereference in event_sched_out()
2017-10-30 16:23 v4.14-rc{4,7} null pointer dereference in event_sched_out() Mark Rutland
@ 2017-11-15 18:00 ` Will Deacon
2017-11-24 18:10 ` Mark Rutland
0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2017-11-15 18:00 UTC (permalink / raw)
To: Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, peterz, mingo
On Mon, Oct 30, 2017 at 04:23:15PM +0000, Mark Rutland wrote:
> As a heads-up, while fuzzing arm64 v4.14-rc{4,7} with Syzkaller, I hit a
> KASAN splat in event_sched_out():
>
> [ 133.225742] ==================================================================
> [ 133.229374] BUG: KASAN: null-ptr-deref in event_sched_out.isra.47+0x428/0x580
> [ 133.230843] Read of size 4 at addr 0000000000000178 by task syz-executor0/6905
> [ 133.233151]
> [ 133.233664] CPU: 0 PID: 6905 Comm: syz-executor0 Not tainted 4.14.0-rc7-dirty #4
> [ 133.235750] Hardware name: linux,dummy-virt (DT)
> [ 133.236598] Call trace:
> [ 133.237081] [<ffff20000808fef8>] dump_backtrace+0x0/0x658
> [ 133.238073] [<ffff200008090570>] show_stack+0x20/0x30
> [ 133.239002] [<ffff2000091c22ec>] dump_stack+0xd0/0x124
> [ 133.239947] [<ffff200008349d1c>] kasan_report+0x104/0x310
> [ 133.240940] [<ffff2000083483f8>] __asan_load4+0x58/0xb0
> [ 133.242262] [<ffff200008271138>] event_sched_out.isra.47+0x428/0x580
> [ 133.243686] [<ffff2000082712c8>] __perf_remove_from_context+0x38/0xe0
> [ 133.244948] [<ffff200008265cf8>] event_function_call+0x1c8/0x258
> [ 133.246197] [<ffff20000826ad04>] perf_remove_from_context+0x54/0xf0
> [ 133.247514] [<ffff20000827f188>] SyS_perf_event_open+0x1528/0x18e0
> [ 133.248831] Exception stack(0xffff800038c5fec0 to 0xffff800038c60000)
> [ 133.250199] fec0: 0000000020b12f88 0000000000001af8 00000000ffffffff 0000000000000008
> [ 133.251843] fee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 133.253503] ff00: 00000000000000f1 0000000000000000 0000000000405850 00000000003d0f00
> [ 133.255132] ff20: 0000ffff94514f60 00000000004ae890 0000000000000027 0000000000000001
> [ 133.256756] ff40: 0000000000000000 0000000000826000 0000000000000000 00000000004c0158
> [ 133.258392] ff60: 00000000ffffffff 0000000020b12f88 0000000000001af8 000000000046d290
> [ 133.260006] ff80: 00000000004aaba8 0000000000473af8 0000ffffe5360da0 0000000000000000
> [ 133.261629] ffa0: 0000ffff94514f60 0000ffff94514640 00000000004020fc 0000ffff94514640
> [ 133.263253] ffc0: 000000000042d034 00000000a0000000 0000000020b12f88 00000000000000f1
> [ 133.264886] ffe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 133.266535] [<ffff200008084170>] el0_svc_naked+0x24/0x28
> [ 133.267648] ==================================================================
>
> ... which is triggered by the Syzkaller repro program at the end of this
> email. I haven't yet come up with a C reproducer; sorry.
>
> The PC seems to be the load of cpuctx->active_oncpu at the end of the
> function, so it looks like cpuctx is NULL.
>
> The system has (homogeneous) armv8_pmuv3, breakpoint, and software PMUs.
>
> I initially hit this on v4.14-rc4, and can reproduce the issue on
> v4.14-rc7. I haven't tried any other kernels yet.
>
> I'll continue digging, unless someone else has already solved this.
Did you get anywhere with this?
Will
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: v4.14-rc{4,7} null pointer dereference in event_sched_out()
2017-11-15 18:00 ` Will Deacon
@ 2017-11-24 18:10 ` Mark Rutland
2017-11-24 18:16 ` Mark Rutland
0 siblings, 1 reply; 4+ messages in thread
From: Mark Rutland @ 2017-11-24 18:10 UTC (permalink / raw)
To: Will Deacon; +Cc: linux-kernel, linux-arm-kernel, peterz, mingo
On Wed, Nov 15, 2017 at 06:00:20PM +0000, Will Deacon wrote:
> On Mon, Oct 30, 2017 at 04:23:15PM +0000, Mark Rutland wrote:
> > As a heads-up, while fuzzing arm64 v4.14-rc{4,7} with Syzkaller, I hit a
> > KASAN splat in event_sched_out():
> >
> > [ 133.225742] ==================================================================
> > [ 133.229374] BUG: KASAN: null-ptr-deref in event_sched_out.isra.47+0x428/0x580
> > [ 133.230843] Read of size 4 at addr 0000000000000178 by task syz-executor0/6905
> > [ 133.233151]
> > [ 133.233664] CPU: 0 PID: 6905 Comm: syz-executor0 Not tainted 4.14.0-rc7-dirty #4
> > [ 133.235750] Hardware name: linux,dummy-virt (DT)
> > [ 133.236598] Call trace:
> > [ 133.237081] [<ffff20000808fef8>] dump_backtrace+0x0/0x658
> > [ 133.238073] [<ffff200008090570>] show_stack+0x20/0x30
> > [ 133.239002] [<ffff2000091c22ec>] dump_stack+0xd0/0x124
> > [ 133.239947] [<ffff200008349d1c>] kasan_report+0x104/0x310
> > [ 133.240940] [<ffff2000083483f8>] __asan_load4+0x58/0xb0
> > [ 133.242262] [<ffff200008271138>] event_sched_out.isra.47+0x428/0x580
> > [ 133.243686] [<ffff2000082712c8>] __perf_remove_from_context+0x38/0xe0
> > [ 133.244948] [<ffff200008265cf8>] event_function_call+0x1c8/0x258
> > [ 133.246197] [<ffff20000826ad04>] perf_remove_from_context+0x54/0xf0
> > [ 133.247514] [<ffff20000827f188>] SyS_perf_event_open+0x1528/0x18e0
> > [ 133.248831] Exception stack(0xffff800038c5fec0 to 0xffff800038c60000)
> > [ 133.250199] fec0: 0000000020b12f88 0000000000001af8 00000000ffffffff 0000000000000008
> > [ 133.251843] fee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 133.253503] ff00: 00000000000000f1 0000000000000000 0000000000405850 00000000003d0f00
> > [ 133.255132] ff20: 0000ffff94514f60 00000000004ae890 0000000000000027 0000000000000001
> > [ 133.256756] ff40: 0000000000000000 0000000000826000 0000000000000000 00000000004c0158
> > [ 133.258392] ff60: 00000000ffffffff 0000000020b12f88 0000000000001af8 000000000046d290
> > [ 133.260006] ff80: 00000000004aaba8 0000000000473af8 0000ffffe5360da0 0000000000000000
> > [ 133.261629] ffa0: 0000ffff94514f60 0000ffff94514640 00000000004020fc 0000ffff94514640
> > [ 133.263253] ffc0: 000000000042d034 00000000a0000000 0000000020b12f88 00000000000000f1
> > [ 133.264886] ffe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 133.266535] [<ffff200008084170>] el0_svc_naked+0x24/0x28
> > [ 133.267648] ==================================================================
> >
> > ... which is triggered by the Syzkaller repro program at the end of this
> > email. I haven't yet come up with a C reproducer; sorry.
> >
> > The PC seems to be the load of cpuctx->active_oncpu at the end of the
> > function, so it looks like cpuctx is NULL.
> >
> > The system has (homogeneous) armv8_pmuv3, breakpoint, and software PMUs.
> >
> > I initially hit this on v4.14-rc4, and can reproduce the issue on
> > v4.14-rc7. I haven't tried any other kernels yet.
> >
> > I'll continue digging, unless someone else has already solved this.
>
> Did you get anywhere with this?
I got a *bit* further, but I haven't figured out the underlying issue
yet.
I minimized the reproducer down to the following:
----
# {Threaded:true Collide:true Repeat:true Procs:1 Sandbox:none Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false Repro:false}
r2 = gettid()
mmap(&(0x7f0000000000/0xd3f000)=nil, 0xd3f000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
r0 = perf_event_open(&(0x7f0000d15000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x9, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, 0xffffffffffffffff, 0x0)
mmap(&(0x7f0000d3f000/0x1000)=nil, 0x1000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
r1 = perf_event_open(&(0x7f0000d15000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, r0, 0x0)
dup3(0, 0, 0)
perf_event_open(&(0x7f0000b13000-0x78)={0x0, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, r0, 0x0)
----
Note: the dup3() is an expensive NOP (since oldfd == newfd), but I think
it's triggering an interesting scheduling pattern, since thus far I
haven't managed to trigger the bug without it.
That creates a perf_cpu_clock event, adds another to that group, and
adds a HW event to that same group. In parallel.
Sometimes at the point the HW event is added, the leading SW event is in
PERF_EVENT_STATE_INACTIVE, but the follower SW event is in
PERF_EVENT_STATE_ACTIVE. The context both are held in is inactive, so
the follower event's state makes no sense.
I added a dump to event_sched_out() that catches this:
[ 35.995144] Uh-oh:
[ 35.995144] event ffff800039a1f880
[ 35.995144] event->state 1
[ 35.995144] event->cpu -1
[ 35.995144] pmu ffff20000a3b2600 (perf_cpu_clock, AKA (null))
[ 35.995144] leader ffff800039a1a480
[ 35.995144] leader->state -1
[ 35.995144] pmu ffff20000a3b2600 (perf_cpu_clock, AKA (null))
[ 35.995144] ctx ffff80003932e180, pmu ffff20000a3b2600 (perf_cpu_clock AKA (null))
I'll try to dig into this a bit more next week.
I can't reproduce this with Syzkaller running in a single thread, nor
with some multi-threaded tests I wrote in C, so I guess there's a subtle
race I'm not managing to hit.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: v4.14-rc{4,7} null pointer dereference in event_sched_out()
2017-11-24 18:10 ` Mark Rutland
@ 2017-11-24 18:16 ` Mark Rutland
0 siblings, 0 replies; 4+ messages in thread
From: Mark Rutland @ 2017-11-24 18:16 UTC (permalink / raw)
To: Will Deacon; +Cc: peterz, mingo, linux-kernel, linux-arm-kernel
On Fri, Nov 24, 2017 at 06:10:56PM +0000, Mark Rutland wrote:
> On Wed, Nov 15, 2017 at 06:00:20PM +0000, Will Deacon wrote:
> > On Mon, Oct 30, 2017 at 04:23:15PM +0000, Mark Rutland wrote:
> > > As a heads-up, while fuzzing arm64 v4.14-rc{4,7} with Syzkaller, I hit a
> > > KASAN splat in event_sched_out():
> > Did you get anywhere with this?
>
> I got a *bit* further, but I haven't figured out the underlying issue
> yet.
Forgot to mention, the above all applies to a vanilla v4.14 arm64
kernel; defconfig + KASAN_INLINE.
Thanks,
Mark.
>
> I minimized the reproducer down to the following:
>
> ----
> # {Threaded:true Collide:true Repeat:true Procs:1 Sandbox:none Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true HandleSegv:true WaitRepeat:true Debug:false Repro:false}
>
> r2 = gettid()
> mmap(&(0x7f0000000000/0xd3f000)=nil, 0xd3f000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
> r0 = perf_event_open(&(0x7f0000d15000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x9, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, 0xffffffffffffffff, 0x0)
> mmap(&(0x7f0000d3f000/0x1000)=nil, 0x1000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
> r1 = perf_event_open(&(0x7f0000d15000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, r0, 0x0)
> dup3(0, 0, 0)
> perf_event_open(&(0x7f0000b13000-0x78)={0x0, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, r2, 0xffffffff, r0, 0x0)
> ----
>
> Note: the dup3() is an expensive NOP (since oldfd == newfd), but I think
> it's triggering an interesting scheduling pattern, since thus far I
> haven't managed to trigger the bug without it.
>
> That creates a perf_cpu_clock event, adds another to that group, and
> adds a HW event to that same group. In parallel.
>
> Sometimes at the point the HW event is added, the leading SW event is in
> PERF_EVENT_STATE_INACTIVE, but the follower SW event is in
> PERF_EVENT_STATE_ACTIVE. The context both are held in is inactive, so
> the follower event's state makes no sense.
>
> I added a dump to event_sched_out() that catches this:
>
> [ 35.995144] Uh-oh:
> [ 35.995144] event ffff800039a1f880
> [ 35.995144] event->state 1
> [ 35.995144] event->cpu -1
> [ 35.995144] pmu ffff20000a3b2600 (perf_cpu_clock, AKA (null))
> [ 35.995144] leader ffff800039a1a480
> [ 35.995144] leader->state -1
> [ 35.995144] pmu ffff20000a3b2600 (perf_cpu_clock, AKA (null))
> [ 35.995144] ctx ffff80003932e180, pmu ffff20000a3b2600 (perf_cpu_clock AKA (null))
>
> I'll try to dig into this a bit more next week.
>
> I can't reproduce this with Syzkaller running in a single thread, nor
> with some multi-threaded tests I wrote in C, so I guess there's a subtle
> race I'm not managing to hit.
>
> Thanks,
> Mark.
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-11-24 18:16 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-30 16:23 v4.14-rc{4,7} null pointer dereference in event_sched_out() Mark Rutland
2017-11-15 18:00 ` Will Deacon
2017-11-24 18:10 ` Mark Rutland
2017-11-24 18:16 ` Mark Rutland
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox