From: mathieu.desnoyers@efficios.com (Mathieu Desnoyers)
To: linux-arm-kernel@lists.infradead.org
Subject: arm64/v4.16-rc1: KASAN: use-after-free Read in finish_task_switch
Date: Wed, 14 Feb 2018 18:53:44 +0000 (UTC)
Message-ID: <254787533.21950.1518634424009.JavaMail.zimbra@efficios.com>
In-Reply-To: <20180214165131.o25r3hhrtrjk3ejq@lakrids.cambridge.arm.com>
----- On Feb 14, 2018, at 11:51 AM, Mark Rutland mark.rutland at arm.com wrote:
> On Wed, Feb 14, 2018 at 03:07:41PM +0000, Will Deacon wrote:
>> Hi Mark,
>
> Hi Will,
>
>> Cheers for the report. These things tend to be a pain to debug, but I've had
>> a go.
>
> Thanks for taking a look!
>
>> On Wed, Feb 14, 2018 at 12:02:54PM +0000, Mark Rutland wrote:
>> The interesting thing here is on the exit path:
>>
>> > Freed by task 10882:
>> > save_stack mm/kasan/kasan.c:447 [inline]
>> > set_track mm/kasan/kasan.c:459 [inline]
>> > __kasan_slab_free+0x114/0x220 mm/kasan/kasan.c:520
>> > kasan_slab_free+0x10/0x18 mm/kasan/kasan.c:527
>> > slab_free_hook mm/slub.c:1393 [inline]
>> > slab_free_freelist_hook mm/slub.c:1414 [inline]
>> > slab_free mm/slub.c:2968 [inline]
>> > kmem_cache_free+0x88/0x270 mm/slub.c:2990
>> > __mmdrop+0x164/0x248 kernel/fork.c:604
>>
>> ^^ This should never run, because there's an mmgrab() about 8 lines above
>> the mmput() in exit_mm.
>>
>> > mmdrop+0x50/0x60 kernel/fork.c:615
>> > __mmput kernel/fork.c:981 [inline]
>> > mmput+0x270/0x338 kernel/fork.c:992
>> > exit_mm kernel/exit.c:544 [inline]
>>
>> Looking at exit_mm:
>>
>>     mmgrab(mm);
>>     BUG_ON(mm != current->active_mm);
>>     /* more a memory barrier than a real lock */
>>     task_lock(current);
>>     current->mm = NULL;
>>     up_read(&mm->mmap_sem);
>>     enter_lazy_tlb(mm, current);
>>     task_unlock(current);
>>     mm_update_next_owner(mm);
>>     mmput(mm);
>>
>> Then the comment already rings some alarm bells: our spin_lock (as used
>> by task_lock) has ACQUIRE semantics, so the mmgrab (which is unordered
>> due to being an atomic_inc) can be reordered with respect to the assignment
>> of NULL to current->mm.
>>
>> If the exit()ing task had recently migrated from another CPU, then that
>> CPU could concurrently run context_switch() and take this path:
>>
>>     if (!prev->mm) {
>>         prev->active_mm = NULL;
>>         rq->prev_mm = oldmm;
>>     }
>
> IIUC, on the prior context_switch, next->mm == NULL, so we set
> next->active_mm to prev->mm.
>
> Then, in this context_switch we set oldmm = prev->active_mm (where prev
> is next from the prior context switch).
>
> ... right?
>
>> which then means finish_task_switch will call mmdrop():
>>
>>     struct mm_struct *mm = rq->prev_mm;
>>     [...]
>>     if (mm) {
>>         membarrier_mm_sync_core_before_usermode(mm);
>>         mmdrop(mm);
>>     }
>
> ... then here we use what was prev->active_mm in the most recent context
> switch.
>
> So AFAICT, we're never concurrently accessing a task_struct::mm field
> here, only prev::{mm,active_mm} while prev is current...
>
> [...]
>
>> diff --git a/kernel/exit.c b/kernel/exit.c
>> index 995453d9fb55..f91e8d56b03f 100644
>> --- a/kernel/exit.c
>> +++ b/kernel/exit.c
>> @@ -534,8 +534,9 @@ static void exit_mm(void)
>>      }
>>      mmgrab(mm);
>>      BUG_ON(mm != current->active_mm);
>> -    /* more a memory barrier than a real lock */
>>      task_lock(current);
>> +    /* Ensure we've grabbed the mm before setting current->mm to NULL */
>> +    smp_mb__after_spin_lock();
>>      current->mm = NULL;
>
> ... and thus I don't follow why we would need to order these with
> anything more than a compiler barrier (if we're preemptible here).
>
> What have I completely misunderstood? ;)

The compiler barrier would not change anything, because task_lock()
already implies a compiler barrier (provided by the arch spin lock
inline asm memory clobber). So compiler-wise, it cannot move the
mmgrab(mm) after the store "current->mm = NULL".

However, given the scenario involves multiple CPUs (one doing exit_mm(),
the other doing context switch), the actual order of perceived load/store
can be shuffled. And AFAIU nothing prevents the CPU from ordering the
atomic_inc() done by mmgrab(mm) _after_ the store to current->mm.
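
To make the distinction between a compiler barrier and a CPU barrier
concrete, here is a rough userspace sketch using C11 atomics. It is an
analogy only, not kernel code: struct mm_sketch, struct task_sketch and
both function names are made up for illustration, the relaxed
fetch-add stands in for the atomic_inc() in mmgrab(), and
atomic_thread_fence() stands in for a full smp_mb()-style barrier.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and task_struct. */
struct mm_sketch {
    atomic_int mm_count;
};

struct task_sketch {
    struct mm_sketch *_Atomic mm;
};

/*
 * Compiler barrier only: the increment and the store stay in program
 * order in the generated code, but the language (and a weakly ordered
 * CPU) still allows another thread to observe the store first.
 */
static void clear_mm_compiler_barrier(struct task_sketch *t,
                                      struct mm_sketch *mm)
{
    /* like mmgrab(): atomic increment with no ordering guarantee */
    atomic_fetch_add_explicit(&mm->mm_count, 1, memory_order_relaxed);
    /* constrains the compiler only, emits no fence instruction */
    atomic_signal_fence(memory_order_seq_cst);
    /* like current->mm = NULL */
    atomic_store_explicit(&t->mm, NULL, memory_order_relaxed);
}

/*
 * Full fence between the two accesses: the increment is ordered before
 * the store for observers that use suitable ordering on their side.
 */
static void clear_mm_full_fence(struct task_sketch *t,
                                struct mm_sketch *mm)
{
    atomic_fetch_add_explicit(&mm->mm_count, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    atomic_store_explicit(&t->mm, NULL, memory_order_relaxed);
}
```

Single-threaded, both variants behave identically; the difference only
shows up in what a second CPU is allowed to observe.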

I wonder if we should not simply add a smp_mb__after_atomic() into
mmgrab() instead? I see that e.g. futex.c does:

static inline void futex_get_mm(union futex_key *key)
{
        mmgrab(key->private.mm);
        /*
         * Ensure futex_get_mm() implies a full barrier such that
         * get_futex_key() implies a full barrier. This is relied upon
         * as smp_mb(); (B), see the ordering comment above.
         */
        smp_mb__after_atomic();
}

It could prevent nasty subtle bugs in other mmgrab() users.
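
As a rough illustration of the shape of that idea, again in userspace
C11 rather than kernel code: struct mm_ref and both helper names below
are hypothetical, and atomic_thread_fence() is only an approximation of
smp_mb__after_atomic(). The point is that the barrier lives inside the
grab helper, so callers get it without having to remember it.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the mm_count part of mm_struct. */
struct mm_ref {
    atomic_int mm_count;
};

/* Analogue of mmgrab() as discussed above: an unordered increment. */
static inline void grab_relaxed(struct mm_ref *mm)
{
    atomic_fetch_add_explicit(&mm->mm_count, 1, memory_order_relaxed);
}

/*
 * Analogue of the suggestion: the grab itself is followed by a full
 * fence, so every caller gets the ordering without adding its own.
 */
static inline void grab_with_full_barrier(struct mm_ref *mm)
{
    grab_relaxed(mm);
    atomic_thread_fence(memory_order_seq_cst);
}
```

The trade-off in this sketch is that every caller pays for the fence,
including those that do not need the ordering.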

Thoughts?

Thanks,

Mathieu

>
> Thanks,
> Mark.
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com