[RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated

* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
@ 2017-12-04 16:23 Vinayak Menon
  2017-12-04 16:55 ` Will Deacon
  0 siblings, 1 reply; 9+ messages in thread
From: Vinayak Menon @ 2017-12-04 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

We have observed a case where a read from a wrong physical address
results in a bus error. It happens soon after TTBR0 is set to the
saved ttbr by uaccess_ttbr0_enable, and it is always seen in the
exit path of a task.

exception
__arch_copy_from_user
__copy_from_user
probe_kernel_read
get_freepointer_safe
slab_alloc_node
slab_alloc
kmem_cache_alloc
kmem_cache_zalloc
fill_pool
__debug_object_init
debug_object_init
rcuhead_fixup_activate
debug_object_fixup
debug_object_activate
debug_rcu_head_queue
__call_rcu
ep_remove
eventpoll_release_file
__fput
____fput
task_work_run
do_exit

The mm has been released and the pgd is freed, but probe_kernel_read
invoked from slub results in a call to __arch_copy_from_user. On
entry to __arch_copy_from_user, when SW PAN is enabled, this results
in a stale value being written to ttbr0. Maybe a speculative fetch
afterwards results in the invalid physical address access.

Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
---

I have not tested this patch to see if it fixes the problem.
Sending it early for comments.

 arch/arm64/include/asm/mmu_context.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 3257895a..48a3f04 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -221,7 +221,16 @@ static inline void __switch_mm(struct mm_struct *next)
 		update_saved_ttbr0(tsk, next);
 }
 
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+static inline void deactivate_mm(struct task_struct *tsk, struct mm_struct *mm)
+{
+	if (system_uses_ttbr0_pan())
+		task_thread_info(tsk)->ttbr0 = __pa_symbol(empty_zero_page);
+}
+#else
 #define deactivate_mm(tsk,mm)	do { } while (0)
+#endif
+
 #define activate_mm(prev,next)	switch_mm(prev, next, current)
 
 void verify_cpu_asid_bits(void);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
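
For readers unfamiliar with SW PAN, the mechanism the report describes can
be sketched as a small user-space model. This is only an illustration, not
kernel code: the *_model names and the RESERVED_TTBR0 value are hypothetical
stand-ins for thread_info->ttbr0, the TTBR0_EL1 register and
__pa_symbol(empty_zero_page).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical physical address, standing in for __pa_symbol(empty_zero_page). */
#define RESERVED_TTBR0 0x1000ULL

struct thread_info_model {
	uint64_t saved_ttbr0;	/* models thread_info->ttbr0 */
};

struct cpu_model {
	uint64_t ttbr0_el1;	/* models the hardware TTBR0_EL1 register */
};

/* On uaccess entry, SW PAN installs whatever value was saved for the task. */
static void uaccess_ttbr0_enable_model(struct cpu_model *cpu,
				       const struct thread_info_model *ti)
{
	cpu->ttbr0_el1 = ti->saved_ttbr0;
}

/* On uaccess exit, TTBR0 points back at the reserved (empty) table. */
static void uaccess_ttbr0_disable_model(struct cpu_model *cpu)
{
	cpu->ttbr0_el1 = RESERVED_TTBR0;
}

/* What the RFC proposes: forget the saved value when the mm is deactivated. */
static void deactivate_mm_model(struct thread_info_model *ti)
{
	ti->saved_ttbr0 = RESERVED_TTBR0;
}
```

In this model, if the saved value were ever stale, the enable step would
install it unchanged; after deactivate_mm_model() the enable step can only
install the reserved value.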


* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
  2017-12-04 16:23 [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated Vinayak Menon
@ 2017-12-04 16:55 ` Will Deacon
  2017-12-04 17:30   ` Mark Rutland
  2017-12-04 18:00   ` Mark Rutland
  0 siblings, 2 replies; 9+ messages in thread
From: Will Deacon @ 2017-12-04 16:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 04, 2017 at 09:53:26PM +0530, Vinayak Menon wrote:
> A case is observed where a wrong physical address is read,
> resulting in a bus error and that happens soon after TTBR0 is
> set to the saved ttbr by uaccess_ttbr0_enable. This is always
> seen to happen in the exit path of the task.
> 
> exception
> __arch_copy_from_user
> __copy_from_user
> probe_kernel_read
> get_freepointer_safe
> slab_alloc_node
> slab_alloc
> kmem_cache_alloc
> kmem_cache_zalloc
> fill_pool
> __debug_object_init
> debug_object_init
> rcuhead_fixup_activate
> debug_object_fixup
> debug_object_activate
> debug_rcu_head_queue
> __call_rcu
> ep_remove
> eventpoll_release_file
> __fput
> ____fput
> task_work_run
> do_exit
> 
> The mm has been released and the pgd is freed, but probe_kernel_read
> invoked from slub results in call to __arch_copy_from_user. At the
> entry to __arch_copy_from_user, when SW PAN is enabled, this results
> in stale value being set to ttbr0. Maybe a speculative fetch afterwards
> is resulting in invalid physical address access.
> 
> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
> ---
> 
> I have not tested this patch to see if it fixes the problem.
> Sending it early for comments.

I wonder whether it would be better to avoid restoring the user TTBR0 if
KERNEL_DS is set. We could do the same thing for PAN. Do we ever access
user addresses under KERNEL_DS?

Will
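
The suggestion above can be sketched as a user-space model. It is a rough
illustration only, assuming a per-task address limit and a saved TTBR0
value; the enum, the struct and uaccess_enable_checked() are hypothetical
names, not existing kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical physical address of the reserved (empty) table. */
#define RESERVED_TTBR0 0x1000ULL

/* Models the task's address limit (USER_DS vs KERNEL_DS). */
enum addr_limit_model { USER_DS_MODEL, KERNEL_DS_MODEL };

struct task_model {
	enum addr_limit_model addr_limit;
	uint64_t saved_ttbr0;	/* models thread_info->ttbr0 */
};

/*
 * Restore the user TTBR0 only when the task may legitimately touch user
 * addresses. Returns true if the register was switched.
 */
static bool uaccess_enable_checked(uint64_t *hw_ttbr0,
				   const struct task_model *t)
{
	if (t->addr_limit == KERNEL_DS_MODEL)
		return false;	/* leave the reserved table installed */

	*hw_ttbr0 = t->saved_ttbr0;
	return true;
}
```

With this shape, a probe_kernel_read()-style access made under KERNEL_DS
would never install the saved value, stale or not.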


* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
  2017-12-04 16:55 ` Will Deacon
@ 2017-12-04 17:30   ` Mark Rutland
  2017-12-04 18:00   ` Mark Rutland
  1 sibling, 0 replies; 9+ messages in thread
From: Mark Rutland @ 2017-12-04 17:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 04, 2017 at 04:55:33PM +0000, Will Deacon wrote:
> On Mon, Dec 04, 2017 at 09:53:26PM +0530, Vinayak Menon wrote:
> > A case is observed where a wrong physical address is read,
> > resulting in a bus error and that happens soon after TTBR0 is
> > set to the saved ttbr by uaccess_ttbr0_enable. This is always
> > seen to happen in the exit path of the task.

> > The mm has been released and the pgd is freed, but probe_kernel_read
> > invoked from slub results in call to __arch_copy_from_user. At the
> > entry to __arch_copy_from_user, when SW PAN is enabled, this results
> > in stale value being set to ttbr0. Maybe a speculative fetch afterwards
> > is resulting in invalid physical address access.

> I wonder whether it would be better to avoid restoring the user TTBR0 if
> KERNEL_DS is set. We could do the same thing for PAN. Do we ever access
> user addresses under KERNEL_DS?

I believe we assume that we don't.

IIUC, with PAN+UAO, when we have KERNEL_DS set, any uaccess to a user
address would fault.

Thanks,
Mark.
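
The PAN+UAO interaction described above can be written down as a small
truth-table model. This is a simplification for illustration: it assumes
the uaccess routines use the unprivileged load/store instructions, that
UAO makes those instructions behave as privileged accesses, and that PAN
makes privileged accesses to user-accessible pages fault.
uaccess_faults() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * pan:          PSTATE.PAN is set
 * uao:          PSTATE.UAO is set (as it is under KERNEL_DS)
 * page_is_user: the target page is user-accessible
 *
 * Returns true if the uaccess load/store would take a permission fault.
 */
static bool uaccess_faults(bool pan, bool uao, bool page_is_user)
{
	bool privileged = uao;	/* unprivileged insns act privileged under UAO */

	if (privileged)
		return pan && page_is_user;	/* PAN blocks user pages */

	return !page_is_user;	/* unprivileged access cannot touch kernel pages */
}
```

Under this model, uaccess_faults(true, true, true) is true, which matches
the statement that a KERNEL_DS uaccess to a user address would fault.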


* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
  2017-12-04 16:55 ` Will Deacon
  2017-12-04 17:30   ` Mark Rutland
@ 2017-12-04 18:00   ` Mark Rutland
  2017-12-05  5:00     ` Vinayak Menon
  1 sibling, 1 reply; 9+ messages in thread
From: Mark Rutland @ 2017-12-04 18:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 04, 2017 at 04:55:33PM +0000, Will Deacon wrote:
> On Mon, Dec 04, 2017 at 09:53:26PM +0530, Vinayak Menon wrote:
> > A case is observed where a wrong physical address is read,
> > resulting in a bus error and that happens soon after TTBR0 is
> > set to the saved ttbr by uaccess_ttbr0_enable. This is always
> > seen to happen in the exit path of the task.
> > 
> > exception
> > __arch_copy_from_user
> > __copy_from_user
> > probe_kernel_read
> > get_freepointer_safe
> > slab_alloc_node
> > slab_alloc
> > kmem_cache_alloc
> > kmem_cache_zalloc
> > fill_pool
> > __debug_object_init
> > debug_object_init
> > rcuhead_fixup_activate
> > debug_object_fixup
> > debug_object_activate
> > debug_rcu_head_queue
> > __call_rcu
> > ep_remove
> > eventpoll_release_file
> > __fput
> > ____fput
> > task_work_run
> > do_exit
> > 
> > The mm has been released and the pgd is freed, but probe_kernel_read
> > invoked from slub results in call to __arch_copy_from_user. At the
> > entry to __arch_copy_from_user, when SW PAN is enabled, this results
> > in stale value being set to ttbr0. Maybe a speculative fetch afterwards
> > is resulting in invalid physical address access.
> > 
> > Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
> > ---
> > 
> > I have not tested this patch to see if it fixes the problem.
> > Sending it early for comments.
> 
> I wonder whether it would be better to avoid restoring the user TTBR0 if
> KERNEL_DS is set. 

I think the problem here is that switch_mm() avoids updating the saved ttbr
value when the next mm is init_mm.

If we fixed that up to use the empty zero page (as we write to the real
ttbr0 in this case), I think that solves the problem. Though I agree we
should also avoid restoring the user TTBR for KERNEL_DS uaccess calls.

Example below.

Thanks,
Mark.

---->8----
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 3257895a9b5e..ef3567ce80b3 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -174,10 +174,15 @@ enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
 static inline void update_saved_ttbr0(struct task_struct *tsk,
                                      struct mm_struct *mm)
 {
+       u64 ttbr;
+
        if (system_uses_ttbr0_pan()) {
-               BUG_ON(mm->pgd == swapper_pg_dir);
-               task_thread_info(tsk)->ttbr0 =
-                       virt_to_phys(mm->pgd) | ASID(mm) << 48;
+               if (mm == &init_mm)
+                       ttbr = __pa_symbol(empty_zero_page);
+               else
+                       ttbr = virt_to_phys(mm->pgd) | ASID(mm) << 48;
+
+               task_thread_info(tsk)->ttbr0 = ttbr;
        }
 }
 #else
@@ -214,11 +219,9 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
         * Update the saved TTBR0_EL1 of the scheduled-in task as the previous
         * value may have not been initialised yet (activate_mm caller) or the
         * ASID has changed since the last run (following the context switch
-        * of another thread of the same process). Avoid setting the reserved
-        * TTBR0_EL1 to swapper_pg_dir (init_mm; e.g. via idle_task_exit).
+        * of another thread of the same process).
         */
-       if (next != &init_mm)
-               update_saved_ttbr0(tsk, next);
+       update_saved_ttbr0(tsk, next);
 }
 
 #define deactivate_mm(tsk,mm)  do { } while (0)
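
The value selection in the diff above can be tried out in user space with
a small model. It is a sketch under stated assumptions: struct mm_model,
init_mm_model, RESERVED_TTBR0 and saved_ttbr0_for() are hypothetical
stand-ins for mm_struct, init_mm, __pa_symbol(empty_zero_page) and
update_saved_ttbr0().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical physical address, standing in for __pa_symbol(empty_zero_page). */
#define RESERVED_TTBR0 0x1000ULL

struct mm_model {
	uint64_t pgd_phys;	/* models virt_to_phys(mm->pgd) */
	uint16_t asid;		/* models ASID(mm) */
};

/* Models init_mm; only its address matters here. */
static struct mm_model init_mm_model;

/* Compute the value that would be stored in thread_info->ttbr0. */
static uint64_t saved_ttbr0_for(const struct mm_model *mm)
{
	if (mm == &init_mm_model)
		return RESERVED_TTBR0;	/* never expose swapper_pg_dir via TTBR0 */

	/* The ASID lives in the top 16 bits of the register value. */
	return mm->pgd_phys | ((uint64_t)mm->asid << 48);
}
```

For example, a pgd at 0x80000000 with ASID 0x12 yields
0x0012000080000000, while init_mm_model always yields the reserved value.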


* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
  2017-12-04 18:00   ` Mark Rutland
@ 2017-12-05  5:00     ` Vinayak Menon
  2017-12-05 11:06       ` Mark Rutland
  0 siblings, 1 reply; 9+ messages in thread
From: Vinayak Menon @ 2017-12-05  5:00 UTC (permalink / raw)
  To: linux-arm-kernel


On 12/4/2017 11:30 PM, Mark Rutland wrote:
> On Mon, Dec 04, 2017 at 04:55:33PM +0000, Will Deacon wrote:
>> On Mon, Dec 04, 2017 at 09:53:26PM +0530, Vinayak Menon wrote:
>>> A case is observed where a wrong physical address is read,
>>> resulting in a bus error and that happens soon after TTBR0 is
>>> set to the saved ttbr by uaccess_ttbr0_enable. This is always
>>> seen to happen in the exit path of the task.
>>>
>>> exception
>>> __arch_copy_from_user
>>> __copy_from_user
>>> probe_kernel_read
>>> get_freepointer_safe
>>> slab_alloc_node
>>> slab_alloc
>>> kmem_cache_alloc
>>> kmem_cache_zalloc
>>> fill_pool
>>> __debug_object_init
>>> debug_object_init
>>> rcuhead_fixup_activate
>>> debug_object_fixup
>>> debug_object_activate
>>> debug_rcu_head_queue
>>> __call_rcu
>>> ep_remove
>>> eventpoll_release_file
>>> __fput
>>> ____fput
>>> task_work_run
>>> do_exit
>>>
>>> The mm has been released and the pgd is freed, but probe_kernel_read
>>> invoked from slub results in call to __arch_copy_from_user. At the
>>> entry to __arch_copy_from_user, when SW PAN is enabled, this results
>>> in stale value being set to ttbr0. Maybe a speculative fetch afterwards
>>> is resulting in invalid physical address access.
>>>
>>> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
>>> ---
>>>
>>> I have not tested this patch to see if it fixes the problem.
>>> Sending it early for comments.
>> I wonder whether it would be better to avoid restoring the user TTBR0 if
>> KERNEL_DS is set. 
> I think the problem here is that switch_mm() avoids updating the saved ttbr
> value when the next mm is init_mm.

For this switch to happen, the schedule() in do_task_dead at the end of
do_exit() needs to be called, right? The issue is happening soon after
exit_mm (probably from exit_files).

>
> If we fixed that up to use the empty zero page (as we write to the real
> ttbr0 in this case), I think that solves the problem. Though I agree we
> should also avoid restoring the user TTBR for KERNEL_DS uaccess calls.
>
> Example below.
>
> Thanks,
> Mark.
>
> ---->8----
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index 3257895a9b5e..ef3567ce80b3 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -174,10 +174,15 @@ enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
>  static inline void update_saved_ttbr0(struct task_struct *tsk,
>                                       struct mm_struct *mm)
>  {
> +       u64 ttbr;
> +
>         if (system_uses_ttbr0_pan()) {
> -               BUG_ON(mm->pgd == swapper_pg_dir);
> -               task_thread_info(tsk)->ttbr0 =
> -                       virt_to_phys(mm->pgd) | ASID(mm) << 48;
> +               if (mm == &init_mm)
> +                       ttbr = __pa_symbol(empty_zero_page);
> +               else
> +                       ttbr = virt_to_phys(mm->pgd) | ASID(mm) << 48;
> +
> +               task_thread_info(tsk)->ttbr0 = ttbr;
>         }
>  }
>  #else
> @@ -214,11 +219,9 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
>          * Update the saved TTBR0_EL1 of the scheduled-in task as the previous
>          * value may have not been initialised yet (activate_mm caller) or the
>          * ASID has changed since the last run (following the context switch
> -        * of another thread of the same process). Avoid setting the reserved
> -        * TTBR0_EL1 to swapper_pg_dir (init_mm; e.g. via idle_task_exit).
> +        * of another thread of the same process).
>          */
> -       if (next != &init_mm)
> -               update_saved_ttbr0(tsk, next);
> +       update_saved_ttbr0(tsk, next);
>  }
>  
>  #define deactivate_mm(tsk,mm)  do { } while (0)


* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
  2017-12-05  5:00     ` Vinayak Menon
@ 2017-12-05 11:06       ` Mark Rutland
  2017-12-05 14:55         ` Will Deacon
  2017-12-05 16:37         ` Mark Rutland
  0 siblings, 2 replies; 9+ messages in thread
From: Mark Rutland @ 2017-12-05 11:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 05, 2017 at 10:30:40AM +0530, Vinayak Menon wrote:
> On 12/4/2017 11:30 PM, Mark Rutland wrote:
> > On Mon, Dec 04, 2017 at 04:55:33PM +0000, Will Deacon wrote:
> >> On Mon, Dec 04, 2017 at 09:53:26PM +0530, Vinayak Menon wrote:
> >>> A case is observed where a wrong physical address is read,
> >>> resulting in a bus error and that happens soon after TTBR0 is
> >>> set to the saved ttbr by uaccess_ttbr0_enable. This is always
> >>> seen to happen in the exit path of the task.
> >>>
> >>> exception
> >>> __arch_copy_from_user
> >>> __copy_from_user
> >>> probe_kernel_read
> >>> get_freepointer_safe
> >>> slab_alloc_node
> >>> slab_alloc
> >>> kmem_cache_alloc
> >>> kmem_cache_zalloc
> >>> fill_pool
> >>> __debug_object_init
> >>> debug_object_init
> >>> rcuhead_fixup_activate
> >>> debug_object_fixup
> >>> debug_object_activate
> >>> debug_rcu_head_queue
> >>> __call_rcu
> >>> ep_remove
> >>> eventpoll_release_file
> >>> __fput
> >>> ____fput
> >>> task_work_run
> >>> do_exit
> >>>
> >>> The mm has been released and the pgd is freed, but probe_kernel_read
> >>> invoked from slub results in call to __arch_copy_from_user. At the
> >>> entry to __arch_copy_from_user, when SW PAN is enabled, this results
> >>> in stale value being set to ttbr0. Maybe a speculative fetch afterwards
> >>> is resulting in invalid physical address access.

> > I think the problem here is that switch_mm() avoids updating the saved ttbr
> > value when the next mm is init_mm.

> For this switch to happen, the schedule() in do_task_dead at the end
> of do_exit() need to be called, right ?  The issue is happening soon
> after exit_mm (probably from exit_files).

I'd assumed that we'd switch_mm() away from the task's mm prior to the
final mmput(). Otherwise, I can't see why we don't have issues in the
non SW PAN case (as that would leave the HW TTBR0 stale).

However, I can't see exactly where we do that, so I'll go digging.
Something doesn't seem quite right.

Do you have a reproducer for the issue?

Thanks,
Mark.


* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
  2017-12-05 11:06       ` Mark Rutland
@ 2017-12-05 14:55         ` Will Deacon
  2017-12-05 15:56           ` Vinayak Menon
  2017-12-05 16:37         ` Mark Rutland
  1 sibling, 1 reply; 9+ messages in thread
From: Will Deacon @ 2017-12-05 14:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 05, 2017 at 11:06:20AM +0000, Mark Rutland wrote:
> On Tue, Dec 05, 2017 at 10:30:40AM +0530, Vinayak Menon wrote:
> > On 12/4/2017 11:30 PM, Mark Rutland wrote:
> > > On Mon, Dec 04, 2017 at 04:55:33PM +0000, Will Deacon wrote:
> > >> On Mon, Dec 04, 2017 at 09:53:26PM +0530, Vinayak Menon wrote:
> > >>> A case is observed where a wrong physical address is read,
> > >>> resulting in a bus error and that happens soon after TTBR0 is
> > >>> set to the saved ttbr by uaccess_ttbr0_enable. This is always
> > >>> seen to happen in the exit path of the task.
> > >>>
> > >>> exception
> > >>> __arch_copy_from_user
> > >>> __copy_from_user
> > >>> probe_kernel_read
> > >>> get_freepointer_safe
> > >>> slab_alloc_node
> > >>> slab_alloc
> > >>> kmem_cache_alloc
> > >>> kmem_cache_zalloc
> > >>> fill_pool
> > >>> __debug_object_init
> > >>> debug_object_init
> > >>> rcuhead_fixup_activate
> > >>> debug_object_fixup
> > >>> debug_object_activate
> > >>> debug_rcu_head_queue
> > >>> __call_rcu
> > >>> ep_remove
> > >>> eventpoll_release_file
> > >>> __fput
> > >>> ____fput
> > >>> task_work_run
> > >>> do_exit
> > >>>
> > >>> The mm has been released and the pgd is freed, but probe_kernel_read
> > >>> invoked from slub results in call to __arch_copy_from_user. At the
> > >>> entry to __arch_copy_from_user, when SW PAN is enabled, this results
> > >>> in stale value being set to ttbr0. Maybe a speculative fetch afterwards
> > >>> is resulting in invalid physical address access.
> 
> > > I think the problem here is that switch_mm() avoids updating the saved ttbr
> > > value when the next mm is init_mm.
> 
> > For this switch to happen, the schedule() in do_task_dead at the end
> > of do_exit() need to be called, right ?  The issue is happening soon
> > after exit_mm (probably from exit_files).
> 
> I'd assumed that we'd switch_mm() away from the task's mm prior to the
> final mmput(). Otherwise, I can't see why we don't have issues in the
> non SW PAN case (as that would leave the HW TTBR0 stale).
> 
> However, I can't see exactly where we do that, so I'll go digging.
> Something doesn't seem quite right.
> 
> Do you have a reproducer for the issue?

I'd be very interested in that, or just more details about how this was
observed. What was the workload? Kernel version? Hardware? .config? Do you
know for sure that it was a page table walk that triggered the abort?

In the report above, Vinayak claims that "The mm has been released and the
pgd is freed" but that really shouldn't happen in the do_exit path. We free
the other levels of page table in free_pgtables, but deliberately keep the
mm and the pgd around until we've switched away in finish_task_switch.

I'm quite prepared to believe that the ttbr0 stashing by the SW PAN code
isn't bulletproof, but I'm struggling to see how the backtrace above
can happen.

Will
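
The lifetime rule described above can be illustrated with a simplified
reference-count model. It is a sketch, not the kernel's implementation:
it assumes mm_users counts users of the address space, mm_count counts
references to the mm structure itself (with all users together holding
one), and that the exiting task keeps a lazy-TLB reference until it has
been switched out. All *_model names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct mm_model {
	int mm_users;			/* users of the address space */
	int mm_count;			/* references to the mm itself */
	bool lower_tables_freed;	/* models free_pgtables() having run */
	bool pgd_freed;			/* models the pgd being released */
};

/* Dropping the last structure reference frees the pgd. */
static void mmdrop_model(struct mm_model *mm)
{
	if (--mm->mm_count == 0)
		mm->pgd_freed = true;
}

/* Dropping the last user tears down the mappings, then one mm_count ref. */
static void mmput_model(struct mm_model *mm)
{
	if (--mm->mm_users == 0) {
		mm->lower_tables_freed = true;
		mmdrop_model(mm);
	}
}

/* The exiting task keeps the mm as its active_mm, holding an extra ref. */
static void exit_mm_model(struct mm_model *mm)
{
	mm->mm_count++;
	mmput_model(mm);
}

/* Only after switching away is the lazy reference finally dropped. */
static void finish_task_switch_model(struct mm_model *mm)
{
	mmdrop_model(mm);
}
```

Starting from mm_users = 1 and mm_count = 1, exit_mm_model() frees the
lower-level tables but leaves the pgd alone; the pgd goes away only in
finish_task_switch_model().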


* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
  2017-12-05 14:55         ` Will Deacon
@ 2017-12-05 15:56           ` Vinayak Menon
  0 siblings, 0 replies; 9+ messages in thread
From: Vinayak Menon @ 2017-12-05 15:56 UTC (permalink / raw)
  To: linux-arm-kernel


On 12/5/2017 8:25 PM, Will Deacon wrote:
> On Tue, Dec 05, 2017 at 11:06:20AM +0000, Mark Rutland wrote:
>> On Tue, Dec 05, 2017 at 10:30:40AM +0530, Vinayak Menon wrote:
>>> On 12/4/2017 11:30 PM, Mark Rutland wrote:
>>>> On Mon, Dec 04, 2017 at 04:55:33PM +0000, Will Deacon wrote:
>>>>> On Mon, Dec 04, 2017 at 09:53:26PM +0530, Vinayak Menon wrote:
>>>>>> A case is observed where a wrong physical address is read,
>>>>>> resulting in a bus error and that happens soon after TTBR0 is
>>>>>> set to the saved ttbr by uaccess_ttbr0_enable. This is always
>>>>>> seen to happen in the exit path of the task.
>>>>>>
>>>>>> exception
>>>>>> __arch_copy_from_user
>>>>>> __copy_from_user
>>>>>> probe_kernel_read
>>>>>> get_freepointer_safe
>>>>>> slab_alloc_node
>>>>>> slab_alloc
>>>>>> kmem_cache_alloc
>>>>>> kmem_cache_zalloc
>>>>>> fill_pool
>>>>>> __debug_object_init
>>>>>> debug_object_init
>>>>>> rcuhead_fixup_activate
>>>>>> debug_object_fixup
>>>>>> debug_object_activate
>>>>>> debug_rcu_head_queue
>>>>>> __call_rcu
>>>>>> ep_remove
>>>>>> eventpoll_release_file
>>>>>> __fput
>>>>>> ____fput
>>>>>> task_work_run
>>>>>> do_exit
>>>>>>
>>>>>> The mm has been released and the pgd is freed, but probe_kernel_read
>>>>>> invoked from slub results in call to __arch_copy_from_user. At the
>>>>>> entry to __arch_copy_from_user, when SW PAN is enabled, this results
>>>>>> in stale value being set to ttbr0. Maybe a speculative fetch afterwards
>>>>>> is resulting in invalid physical address access.
>>>> I think the problem here is that switch_mm() avoids updating the saved ttbr
>>>> value when the next mm is init_mm.
>>> For this switch to happen, the schedule() in do_task_dead at the end
>>> of do_exit() need to be called, right ?  The issue is happening soon
>>> after exit_mm (probably from exit_files).
>> I'd assumed that we'd switch_mm() away from the task's mm prior to the
>> final mmput(). Otherwise, I can't see why we don't have issues in the
>> non SW PAN case (as that would leave the HW TTBR0 stale).
>>
>> However, I can't see exactly where we do that, so I'll go digging.
>> Something doesn't seem quite right.
>>
>> Do you have a reproducer for the issue?
> I'd be very interested in that, or just more details about how this was
> observed. What was the workload? Kernel version? Hardware? .config? Do you
> know for sure that it was a page table walk that triggered the abort?
>
> In the report above, Vinayak claims that "The mm has been released and the
> pgd is freed" but that really shouldn't happen in the do_exit path. We free
> the other levels of page table in free_pgtables, but deliberately keep the
> mm and the pgd around until we've switched away in finish_task_switch.
>
> I'm quite prepared to believe that the ttbr0 stashing by the SW PAN code
> isn't bulletproof, but I'm struggling to see how the backtrace above
> can happen.

The issue was reported on a 3.18 kernel. The hardware configuration is an
A53 octa-core. The test which reproduces this is a reboot test, which just
boots up Android and then reboots, in a loop. It may be reproducing this
problem because reboot causes the tasks to be killed (do_exit). It's not
very easy to reproduce the problem. The issue is not reproducible when
CONFIG_ARM64_SW_TTBR0_PAN is disabled.

What I have looked at is the coredumps collected when the problem happens.
The issue is that one of the cores tries to access a physical address which
is invalid. Interestingly, there is no mapping for this physical address in
the page tables. Every time the issue happens, the core which issues the
wrong address is found to be in the path above, a few instructions after the
TTBR0 write inside uaccess_ttbr0_enable (and the rest of the callstack is
also consistent: debug_object_init->kmem_cache_alloc->probe_kernel_read).

We are not sure that a page table walk resulted in this. That was just a
guess: that a speculative access would have caused a table walk with an
invalid TTBR0. As per the coredumps, tsk->mm is made NULL. The rest was an
assumption, that the mm is released and the pgd is freed. I assumed
do_exit->exit_mmap->mmput->__mmdrop will do that. But let me go and check
whether I can figure out from the dumps if that has really happened. I
remember doing it, but let me confirm. Let me know if you want me to collect
any other info from the dumps.

Thanks,
Vinayak


* [RFC PATCH] arm64: deactivate saved ttbr when mm is deactivated
  2017-12-05 11:06       ` Mark Rutland
  2017-12-05 14:55         ` Will Deacon
@ 2017-12-05 16:37         ` Mark Rutland
  1 sibling, 0 replies; 9+ messages in thread
From: Mark Rutland @ 2017-12-05 16:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 05, 2017 at 11:06:20AM +0000, Mark Rutland wrote:
> On Tue, Dec 05, 2017 at 10:30:40AM +0530, Vinayak Menon wrote:
> > On 12/4/2017 11:30 PM, Mark Rutland wrote:
> > > On Mon, Dec 04, 2017 at 04:55:33PM +0000, Will Deacon wrote:
> > >> On Mon, Dec 04, 2017 at 09:53:26PM +0530, Vinayak Menon wrote:
> > >>> A case is observed where a wrong physical address is read,
> > >>> resulting in a bus error and that happens soon after TTBR0 is
> > >>> set to the saved ttbr by uaccess_ttbr0_enable. This is always
> > >>> seen to happen in the exit path of the task.
> > >>>
> > >>> exception
> > >>> __arch_copy_from_user
> > >>> __copy_from_user
> > >>> probe_kernel_read
> > >>> get_freepointer_safe
> > >>> slab_alloc_node
> > >>> slab_alloc
> > >>> kmem_cache_alloc
> > >>> kmem_cache_zalloc
> > >>> fill_pool
> > >>> __debug_object_init
> > >>> debug_object_init
> > >>> rcuhead_fixup_activate
> > >>> debug_object_fixup
> > >>> debug_object_activate
> > >>> debug_rcu_head_queue
> > >>> __call_rcu
> > >>> ep_remove
> > >>> eventpoll_release_file
> > >>> __fput
> > >>> ____fput
> > >>> task_work_run
> > >>> do_exit
> > >>>
> > >>> The mm has been released and the pgd is freed, but probe_kernel_read
> > >>> invoked from slub results in call to __arch_copy_from_user. At the
> > >>> entry to __arch_copy_from_user, when SW PAN is enabled, this results
> > >>> in stale value being set to ttbr0. Maybe a speculative fetch afterwards
> > >>> is resulting in invalid physical address access.
> 
> > > I think the problem here is that switch_mm() avoids updating the saved ttbr
> > > value when the next mm is init_mm.
> 
> > For this switch to happen, the schedule() in do_task_dead at the end
> > of do_exit() need to be called, right ?  The issue is happening soon
> > after exit_mm (probably from exit_files).
> 
> I'd assumed that we'd switch_mm() away from the task's mm prior to the
> final mmput(). Otherwise, I can't see why we don't have issues in the
> non SW PAN case (as that would leave the HW TTBR0 stale).
> 
> However, I can't see exactly where we do that, so I'll go digging.
> Something doesn't seem quite right.

AFAICT, we rely on finish_task_switch() to do the final drop of the mm,
*after* we've switched away from the task. So while the task is
installed, the mm (and associated pgd) should still be live, and not
freed.

So my patch shouldn't be necessary.

I'm afraid I don't have any other theory as to what's going on here.

Thanks,
Mark.

