* [PATCH] ARM: LPAE: Invalidate the TLB for module addresses during translation fault @ 2012-02-23 17:59 Catalin Marinas 2012-02-23 19:50 ` Russell King - ARM Linux 0 siblings, 1 reply; 6+ messages in thread From: Catalin Marinas @ 2012-02-23 17:59 UTC (permalink / raw) To: linux-arm-kernel During the free_pgtables() call all user and modules/pkmap entries are removed. If a translation fault for the modules/pkmap area occurs before we switched away from the current pgd, do_translation_fault() would copy the init_mm pud into the user pud. There is a small window between pud_clear() and pmd_free_tlb() in free_pmd_range() where the pud entry was cleared but the TLB has not been invalidated yet and the CPU may have cached the original (valid) pud entry in the TLB. A scenario like below would get stuck in continuous prefetch abort: 1. Current process exiting. The modules pmd entries not populated 2. exit_mmap() -> ... -> pmd_free_tlb() 3. pud_clear() for the 1GB pud containing user stack and modules (no TLB invalidation yet) 4. Interrupt -> module interrupt routine 5. Level 2 (pmd) translation fault occurs when executing the module interrupt routine. The CPU previously cached (TLB) the old valid pud value for the modules area, so we don't get a level 1 translation fault 6. do_translation fault() copies the pud_k into the pud 7. Linux returns to the faulty instruction. Goes back to 5 At point 7, since the CPU still has the old pud value, it goes back to point 5 and never gets out of this loop. With this patch, the stale pud TLB entry is invalidated after point 6 and the subsequent prefetch abort doesn't occur. Reported-by: Tony Thompson <Anthony.Thompson@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> --- arch/arm/mm/fault.c | 13 +++++++++++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index bb7eac3..d9aacf0 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -444,8 +444,16 @@ do_translation_fault(unsigned long addr, unsigned int fsr, if (pud_none(*pud_k)) goto bad_area; - if (!pud_present(*pud)) + if (!pud_present(*pud)) { set_pud(pud, *pud_k); + /* + * There is a small window during free_pgtables() where the + * user *pud entry is 0 but the TLB has not been invalidated + * and we get a level 2 (pmd) translation fault caused by the + * intermediate TLB caching of the old level 1 (pud) entry. + */ + flush_tlb_kernel_page(addr); + } pmd = pmd_offset(pud, addr); pmd_k = pmd_offset(pud_k, addr); @@ -468,8 +476,9 @@ do_translation_fault(unsigned long addr, unsigned int fsr, #endif if (pmd_none(pmd_k[index])) goto bad_area; + if (!pmd_present(pmd[index])) + copy_pmd(pmd, pmd_k); - copy_pmd(pmd, pmd_k); return 0; bad_area: ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH] ARM: LPAE: Invalidate the TLB for module addresses during translation fault 2012-02-23 17:59 [PATCH] ARM: LPAE: Invalidate the TLB for module addresses during translation fault Catalin Marinas @ 2012-02-23 19:50 ` Russell King - ARM Linux 2012-02-23 22:13 ` Catalin Marinas 0 siblings, 1 reply; 6+ messages in thread From: Russell King - ARM Linux @ 2012-02-23 19:50 UTC (permalink / raw) To: linux-arm-kernel On Thu, Feb 23, 2012 at 05:59:41PM +0000, Catalin Marinas wrote: > During the free_pgtables() call all user and modules/pkmap entries are > removed. Err, no. free_pgtables() should only ever touch up to TASK_SIZE, which is the maximum size of userspace. Moreover, because TASK_SIZE does not cover an entire pud, free_pgtables() should not be removing the pud table. > There is a small window between pud_clear() and pmd_free_tlb() in > free_pmd_range() where the pud entry was cleared but the TLB has not > been invalidated yet and the CPU may have cached the original (valid) > pud entry in the TLB. A scenario like below would get stuck in > continuous prefetch abort: > > 1. Current process exiting. The modules pmd entries not populated > 2. exit_mmap() -> ... -> pmd_free_tlb() > 3. pud_clear() for the 1GB pud containing user stack and modules (no TLB > invalidation yet) This is where things are wrong. Because the request should not be requesting that the entire pud is being cleared, it should not be removed from the pgd. So, there's some debugging to be done here. ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] ARM: LPAE: Invalidate the TLB for module addresses during translation fault 2012-02-23 19:50 ` Russell King - ARM Linux @ 2012-02-23 22:13 ` Catalin Marinas 2012-02-24 9:58 ` Catalin Marinas 0 siblings, 1 reply; 6+ messages in thread From: Catalin Marinas @ 2012-02-23 22:13 UTC (permalink / raw) To: linux-arm-kernel On Thu, Feb 23, 2012 at 07:50:03PM +0000, Russell King - ARM Linux wrote: > On Thu, Feb 23, 2012 at 05:59:41PM +0000, Catalin Marinas wrote: > > During the free_pgtables() call all user and modules/pkmap entries are > > removed. > > Err, no. free_pgtables() should only ever touch up to TASK_SIZE, which > is the maximum size of userspace. Moreover, because TASK_SIZE does not > cover an entire pud, free_pgtables() should not be removing the pud > table. Are you sure? exit_mmap() calls free_pgtables() with ceiling == 0. If free_pmd_range() is called for a range in the top 1GB of the task address space (which includes modules), it also calls pud_clear() because ceiling is 0. > > There is a small window between pud_clear() and pmd_free_tlb() in > > free_pmd_range() where the pud entry was cleared but the TLB has not > > been invalidated yet and the CPU may have cached the original (valid) > > pud entry in the TLB. A scenario like below would get stuck in > > continuous prefetch abort: > > > > 1. Current process exiting. The modules pmd entries not populated > > 2. exit_mmap() -> ... -> pmd_free_tlb() > > 3. pud_clear() for the 1GB pud containing user stack and modules (no TLB > > invalidation yet) > > This is where things are wrong. Because the request should not be > requesting that the entire pud is being cleared, it should not be > removed from the pgd. Actually the pgd and pud are the same. free_pgtables() removes the whole pmd (in free_pmd_range()) and clears the higher pud entry that includes the modules mapping. -- Catalin ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] ARM: LPAE: Invalidate the TLB for module addresses during translation fault 2012-02-23 22:13 ` Catalin Marinas @ 2012-02-24 9:58 ` Catalin Marinas 2012-02-24 10:12 ` Russell King - ARM Linux 0 siblings, 1 reply; 6+ messages in thread From: Catalin Marinas @ 2012-02-24 9:58 UTC (permalink / raw) To: linux-arm-kernel On Thu, Feb 23, 2012 at 10:13:43PM +0000, Catalin Marinas wrote: > On Thu, Feb 23, 2012 at 07:50:03PM +0000, Russell King - ARM Linux wrote: > > On Thu, Feb 23, 2012 at 05:59:41PM +0000, Catalin Marinas wrote: > > > During the free_pgtables() call all user and modules/pkmap entries are > > > removed. > > > > Err, no. free_pgtables() should only ever touch up to TASK_SIZE, which > > is the maximum size of userspace. Moreover, because TASK_SIZE does not > > cover an entire pud, free_pgtables() should not be removing the pud > > table. > > Are you sure? exit_mmap() calls free_pgtables() with ceiling == 0. If > free_pmd_range() is called for a range in the top 1GB of the task > address space (which includes modules), it also calls pud_clear() > because ceiling is 0. BTW, an alternative patch but which needs wider acknowledgement as it touches generic code (I can post it to linux-mm): diff --git a/mm/mmap.c b/mm/mmap.c index 3f758c7..5e5c8a8 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1866,7 +1866,7 @@ static void unmap_region(struct mm_struct *mm, unmap_vmas(&tlb, vma, start, end, &nr_accounted, NULL); vm_unacct_memory(nr_accounted); free_pgtables(&tlb, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS, - next ? next->vm_start : 0); + next ? next->vm_start : TASK_SIZE); tlb_finish_mmu(&tlb, start, end); } @@ -2241,7 +2241,7 @@ void exit_mmap(struct mm_struct *mm) end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL); vm_unacct_memory(nr_accounted); - free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, 0); + free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, TASK_SIZE); tlb_finish_mmu(&tlb, 0, end); /* -- Catalin ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH] ARM: LPAE: Invalidate the TLB for module addresses during translation fault 2012-02-24 9:58 ` Catalin Marinas @ 2012-02-24 10:12 ` Russell King - ARM Linux 2012-02-24 11:39 ` Catalin Marinas 0 siblings, 1 reply; 6+ messages in thread From: Russell King - ARM Linux @ 2012-02-24 10:12 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 24, 2012 at 09:58:06AM +0000, Catalin Marinas wrote: > On Thu, Feb 23, 2012 at 10:13:43PM +0000, Catalin Marinas wrote: > > On Thu, Feb 23, 2012 at 07:50:03PM +0000, Russell King - ARM Linux wrote: > > > On Thu, Feb 23, 2012 at 05:59:41PM +0000, Catalin Marinas wrote: > > > > During the free_pgtables() call all user and modules/pkmap entries are > > > > removed. > > > > > > Err, no. free_pgtables() should only ever touch up to TASK_SIZE, which > > > is the maximum size of userspace. Moreover, because TASK_SIZE does not > > > cover an entire pud, free_pgtables() should not be removing the pud > > > table. > > > > Are you sure? exit_mmap() calls free_pgtables() with ceiling == 0. If > > free_pmd_range() is called for a range in the top 1GB of the task > > address space (which includes modules), it also calls pud_clear() > > because ceiling is 0. > > BTW, an alternative patch but which needs wider acknowledgement as it > touches generic code (I can post it to linux-mm): I think that's the right solution. > diff --git a/mm/mmap.c b/mm/mmap.c > index 3f758c7..5e5c8a8 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1866,7 +1866,7 @@ static void unmap_region(struct mm_struct *mm, > unmap_vmas(&tlb, vma, start, end, &nr_accounted, NULL); > vm_unacct_memory(nr_accounted); > free_pgtables(&tlb, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS, > - next ? next->vm_start : 0); > + next ? next->vm_start : TASK_SIZE); > tlb_finish_mmu(&tlb, start, end); > } > > @@ -2241,7 +2241,7 @@ void exit_mmap(struct mm_struct *mm) > end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL); > vm_unacct_memory(nr_accounted); > > - free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, 0); > + free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, TASK_SIZE); > tlb_finish_mmu(&tlb, 0, end); > > /* > > -- > Catalin ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] ARM: LPAE: Invalidate the TLB for module addresses during translation fault 2012-02-24 10:12 ` Russell King - ARM Linux @ 2012-02-24 11:39 ` Catalin Marinas 0 siblings, 0 replies; 6+ messages in thread From: Catalin Marinas @ 2012-02-24 11:39 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 24, 2012 at 10:12:00AM +0000, Russell King - ARM Linux wrote: > On Fri, Feb 24, 2012 at 09:58:06AM +0000, Catalin Marinas wrote: > > On Thu, Feb 23, 2012 at 10:13:43PM +0000, Catalin Marinas wrote: > > > On Thu, Feb 23, 2012 at 07:50:03PM +0000, Russell King - ARM Linux wrote: > > > > On Thu, Feb 23, 2012 at 05:59:41PM +0000, Catalin Marinas wrote: > > > > > During the free_pgtables() call all user and modules/pkmap entries are > > > > > removed. > > > > > > > > Err, no. free_pgtables() should only ever touch up to TASK_SIZE, which > > > > is the maximum size of userspace. Moreover, because TASK_SIZE does not > > > > cover an entire pud, free_pgtables() should not be removing the pud > > > > table. > > > > > > Are you sure? exit_mmap() calls free_pgtables() with ceiling == 0. If > > > free_pmd_range() is called for a range in the top 1GB of the task > > > address space (which includes modules), it also calls pud_clear() > > > because ceiling is 0. > > > > BTW, an alternative patch but which needs wider acknowledgement as it > > touches generic code (I can post it to linux-mm): > > I think that's the right solution. Something similar is needed in shift_arg_pages() when calling free_pgd_range(). But my concern - are there other architectures that rely on free_pgtables() to remove page tables corresponding to vmas beyond TASK_SIZE (e.g. vsyscall)? -- Catalin ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-02-24 11:39 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-02-23 17:59 [PATCH] ARM: LPAE: Invalidate the TLB for module addresses during translation fault Catalin Marinas 2012-02-23 19:50 ` Russell King - ARM Linux 2012-02-23 22:13 ` Catalin Marinas 2012-02-24 9:58 ` Catalin Marinas 2012-02-24 10:12 ` Russell King - ARM Linux 2012-02-24 11:39 ` Catalin Marinas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).