[PATCH] mm: don't allow fault_around

public inbox for stable@vger.kernel.org

* [PATCH] mm: don't allow fault_around_bytes to be 0
       [not found] <53D07E96.5000006@oracle.com>
@ 2014-07-28  7:43 ` Andrey Ryabinin
  2014-07-28  7:47   ` Andrey Ryabinin
  2014-07-28  9:36   ` Kirill A. Shutemov
  0 siblings, 2 replies; 8+ messages in thread
From: Andrey Ryabinin @ 2014-07-28  7:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Andi Kleen, Matthew Wilcox, Dave Hansen,
	Alexander Viro, Dave Chinner, Ning Qu, linux-mm, linux-fsdevel,
	linux-kernel, Dave Jones, Andrey Ryabinin, stable,
	Kirill A. Shutemov, Mel Gorman, Rik van Riel,
	Konstantin Khlebnikov, Hugh Dickins

Sasha Levin triggered use-after-free when fuzzing using trinity and the KASAN
patchset:

	AddressSanitizer: use after free in do_read_fault.isra.40+0x3c2/0x510 at addr ffff88048a733110
	page:ffffea001229ccc0 count:0 mapcount:0 mapping:          (null) index:0x0
	page flags: 0xafffff80008000(tail)
	page dumped because: kasan error
	CPU: 6 PID: 9262 Comm: trinity-c104 Not tainted 3.16.0-rc6-next-20140723-sasha-00047-g289342b-dirty #929
	 00000000000000fb 0000000000000000 ffffea001229ccc0 ffff88038ac0fb78
	 ffffffffa5e40903 ffff88038ac0fc48 ffff88038ac0fc38 ffffffffa142acfc
	 0000000000000001 ffff880509ff5aa8 ffff88038ac10038 ffff88038ac0fbb0
	Call Trace:
	dump_stack (lib/dump_stack.c:52)
	kasan_report_error (mm/kasan/report.c:98 mm/kasan/report.c:166)
	? debug_smp_processor_id (lib/smp_processor_id.c:57)
	? preempt_count_sub (kernel/sched/core.c:2606)
	? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
	? do_read_fault.isra.40 (mm/memory.c:2784 mm/memory.c:2849 mm/memory.c:2898)
	__asan_load8 (mm/kasan/kasan.c:364)
	? do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
	do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
	? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
	? __pte_alloc (mm/memory.c:598)
	handle_mm_fault (mm/memory.c:3092 mm/memory.c:3225 mm/memory.c:3345 mm/memory.c:3374)
	? pud_huge (./arch/x86/include/asm/paravirt.h:611 arch/x86/mm/hugetlbpage.c:76)
	__get_user_pages (mm/gup.c:286 mm/gup.c:478)
	__mlock_vma_pages_range (mm/mlock.c:262)
	__mm_populate (mm/mlock.c:710)
	SyS_remap_file_pages (mm/mmap.c:2653 mm/mmap.c:2593)
	tracesys (arch/x86/kernel/entry_64.S:541)
	Read of size 8 by thread T9262:
	Memory state around the buggy address:
	 ffff88048a732e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 ffff88048a732f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 ffff88048a732f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 ffff88048a733000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 ffff88048a733080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	>ffff88048a733100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	                         ^
	 ffff88048a733180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 ffff88048a733200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 ffff88048a733280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 ffff88048a733300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
	 ffff88048a733380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb


It looks like the pte pointer is invalid in do_fault_around().
This could happen if fault_around_bytes is set to 0.
fault_around_pages() and fault_around_mask() call rounddown_pow_of_two(fault_around_bytes).
The result of rounddown_pow_of_two() is undefined if the parameter is 0
(in my environment it returns 0x8000000000000000).
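
A minimal userspace sketch (not kernel code) of where that value can come from.
It assumes the non-constant path of rounddown_pow_of_two(n) is effectively
1UL << (fls_long(n) - 1), and it imitates x86-64 by masking the shift count to
6 bits. The names fls64_model() and rounddown_pow_of_two_model() are
hypothetical, and the masking is an assumption about the hardware, not
something C guarantees:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for fls_long(): 1-based index of the highest set bit,
 * or 0 when no bit is set. */
static int fls64_model(uint64_t x)
{
	int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Model of 1UL << (fls_long(n) - 1). For n == 0 the shift count would be -1,
 * which is undefined in C. The "& 63" is an ASSUMPTION that imitates x86-64,
 * where only the low 6 bits of a 64-bit shift count are used. */
static uint64_t rounddown_pow_of_two_model(uint64_t n)
{
	unsigned int shift = (unsigned int)(fls64_model(n) - 1) & 63u;

	assert(shift < 64);	/* keeps the C shift itself well defined */
	return (uint64_t)1 << shift;
}
```

Under these assumptions, rounddown_pow_of_two_model(0) yields
0x8000000000000000, while any nonzero input yields the expected power of two.
Other architectures or compilers may produce a different value for 0.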

One way to fix this would be to return 0 from fault_around_pages() if fault_around_bytes == 0,
however this would add extra code on the fault path.

So let's just forbid setting fault_around_bytes to zero.
Fault around is not used if fault_around_pages() <= 1, so anyone who doesn't want to use
it can set fault_around_bytes to any value in the range [1, 2*PAGE_SIZE - 1]
instead of 0.
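
To check the range claim, here is a small userspace sketch of the pre-patch
fault_around_pages() arithmetic. It assumes 4 KiB pages; MODEL_PAGE_SIZE and
the *_model() names are hypothetical stand-ins, not kernel symbols:

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Largest power of two <= n; only defined for n > 0. */
static uint64_t rounddown_pow_of_two_model(uint64_t n)
{
	uint64_t p = 1;

	assert(n != 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Mirrors the pre-patch fault_around_pages(): round down, then divide. */
static uint64_t fault_around_pages_model(uint64_t fault_around_bytes)
{
	return rounddown_pow_of_two_model(fault_around_bytes) / MODEL_PAGE_SIZE;
}
```

With these assumptions every value from 1 to 2*PAGE_SIZE - 1 gives 0 or 1
pages, so fault-around stays disabled, and 2*PAGE_SIZE is the first value
that gives 2 pages.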

Fixes: a9b0f861("mm: nominate faultaround area in bytes rather than page order")
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: <stable@vger.kernel.org> # 3.15.x
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
---
 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7e8d820..5927c42 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2784,7 +2784,7 @@ static int fault_around_bytes_get(void *data, u64 *val)
 
 static int fault_around_bytes_set(void *data, u64 val)
 {
-	if (val / PAGE_SIZE > PTRS_PER_PTE)
+	if (!val || val / PAGE_SIZE > PTRS_PER_PTE)
 		return -EINVAL;
 	fault_around_bytes = val;
 	return 0;
-- 
1.8.5.5



* Re: [PATCH] mm: don't allow fault_around_bytes to be 0
  2014-07-28  7:43 ` [PATCH] mm: don't allow fault_around_bytes to be 0 Andrey Ryabinin
@ 2014-07-28  7:47   ` Andrey Ryabinin
  2014-07-28  9:36   ` Kirill A. Shutemov
  1 sibling, 0 replies; 8+ messages in thread
From: Andrey Ryabinin @ 2014-07-28  7:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Andi Kleen, Matthew Wilcox, Dave Hansen,
	Alexander Viro, Dave Chinner, Ning Qu, linux-mm, linux-fsdevel,
	linux-kernel, Dave Jones, stable, Kirill A. Shutemov, Mel Gorman,
	Rik van Riel, Konstantin Khlebnikov, Hugh Dickins, Sasha Levin

On 07/28/14 11:43, Andrey Ryabinin wrote:
> Sasha Levin triggered use-after-free when fuzzing using trinity and the KASAN
> patchset:
> 
> 	AddressSanitizer: use after free in do_read_fault.isra.40+0x3c2/0x510 at addr ffff88048a733110
> 	page:ffffea001229ccc0 count:0 mapcount:0 mapping:          (null) index:0x0
> 	page flags: 0xafffff80008000(tail)
> 	page dumped because: kasan error
> 	CPU: 6 PID: 9262 Comm: trinity-c104 Not tainted 3.16.0-rc6-next-20140723-sasha-00047-g289342b-dirty #929
> 	 00000000000000fb 0000000000000000 ffffea001229ccc0 ffff88038ac0fb78
> 	 ffffffffa5e40903 ffff88038ac0fc48 ffff88038ac0fc38 ffffffffa142acfc
> 	 0000000000000001 ffff880509ff5aa8 ffff88038ac10038 ffff88038ac0fbb0
> 	Call Trace:
> 	dump_stack (lib/dump_stack.c:52)
> 	kasan_report_error (mm/kasan/report.c:98 mm/kasan/report.c:166)
> 	? debug_smp_processor_id (lib/smp_processor_id.c:57)
> 	? preempt_count_sub (kernel/sched/core.c:2606)
> 	? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
> 	? do_read_fault.isra.40 (mm/memory.c:2784 mm/memory.c:2849 mm/memory.c:2898)
> 	__asan_load8 (mm/kasan/kasan.c:364)
> 	? do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
> 	do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
> 	? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
> 	? __pte_alloc (mm/memory.c:598)
> 	handle_mm_fault (mm/memory.c:3092 mm/memory.c:3225 mm/memory.c:3345 mm/memory.c:3374)
> 	? pud_huge (./arch/x86/include/asm/paravirt.h:611 arch/x86/mm/hugetlbpage.c:76)
> 	__get_user_pages (mm/gup.c:286 mm/gup.c:478)
> 	__mlock_vma_pages_range (mm/mlock.c:262)
> 	__mm_populate (mm/mlock.c:710)
> 	SyS_remap_file_pages (mm/mmap.c:2653 mm/mmap.c:2593)
> 	tracesys (arch/x86/kernel/entry_64.S:541)
> 	Read of size 8 by thread T9262:
> 	Memory state around the buggy address:
> 	 ffff88048a732e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a732f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a732f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	>ffff88048a733100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	                         ^
> 	 ffff88048a733180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 
> 
> It looks like the pte pointer is invalid in do_fault_around().
> This could happen if fault_around_bytes is set to 0.
> fault_around_pages() and fault_around_mask() call rounddown_pow_of_two(fault_around_bytes).
> The result of rounddown_pow_of_two() is undefined if the parameter is 0
> (in my environment it returns 0x8000000000000000).
>
> One way to fix this would be to return 0 from fault_around_pages() if fault_around_bytes == 0,
> however this would add extra code on the fault path.
>
> So let's just forbid setting fault_around_bytes to zero.
> Fault around is not used if fault_around_pages() <= 1, so anyone who doesn't want to use
> it can set fault_around_bytes to any value in the range [1, 2*PAGE_SIZE - 1]
> instead of 0.
> 
> Fixes: a9b0f861("mm: nominate faultaround area in bytes rather than page order")
> Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Cc: <stable@vger.kernel.org> # 3.15.x
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Dave Jones <davej@redhat.com>
> Cc: Konstantin Khlebnikov <koct9i@gmail.com>
> Cc: Hugh Dickins <hughd@google.com>
> ---
>  mm/memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 7e8d820..5927c42 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2784,7 +2784,7 @@ static int fault_around_bytes_get(void *data, u64 *val)
>  
>  static int fault_around_bytes_set(void *data, u64 val)
>  {
> -	if (val / PAGE_SIZE > PTRS_PER_PTE)
> +	if (!val || val / PAGE_SIZE > PTRS_PER_PTE)
>  		return -EINVAL;
>  	fault_around_bytes = val;
>  	return 0;
> 

Adding Sasha to cc.


* Re: [PATCH] mm: don't allow fault_around_bytes to be 0
  2014-07-28  7:43 ` [PATCH] mm: don't allow fault_around_bytes to be 0 Andrey Ryabinin
  2014-07-28  7:47   ` Andrey Ryabinin
@ 2014-07-28  9:36   ` Kirill A. Shutemov
  2014-07-28 10:27     ` Andrey Ryabinin
  2014-07-28 15:26     ` Dave Hansen
  1 sibling, 2 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2014-07-28  9:36 UTC (permalink / raw)
  To: Andrey Ryabinin, Sasha Levin
  Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Matthew Wilcox,
	Dave Hansen, Alexander Viro, Dave Chinner, Ning Qu, linux-mm,
	linux-fsdevel, linux-kernel, Dave Jones, stable,
	Kirill A. Shutemov, Mel Gorman, Rik van Riel,
	Konstantin Khlebnikov, Hugh Dickins

On Mon, Jul 28, 2014 at 11:43:20AM +0400, Andrey Ryabinin wrote:
> Sasha Levin triggered use-after-free when fuzzing using trinity and the KASAN
> patchset:
> 
> 	AddressSanitizer: use after free in do_read_fault.isra.40+0x3c2/0x510 at addr ffff88048a733110
> 	page:ffffea001229ccc0 count:0 mapcount:0 mapping:          (null) index:0x0
> 	page flags: 0xafffff80008000(tail)
> 	page dumped because: kasan error
> 	CPU: 6 PID: 9262 Comm: trinity-c104 Not tainted 3.16.0-rc6-next-20140723-sasha-00047-g289342b-dirty #929
> 	 00000000000000fb 0000000000000000 ffffea001229ccc0 ffff88038ac0fb78
> 	 ffffffffa5e40903 ffff88038ac0fc48 ffff88038ac0fc38 ffffffffa142acfc
> 	 0000000000000001 ffff880509ff5aa8 ffff88038ac10038 ffff88038ac0fbb0
> 	Call Trace:
> 	dump_stack (lib/dump_stack.c:52)
> 	kasan_report_error (mm/kasan/report.c:98 mm/kasan/report.c:166)
> 	? debug_smp_processor_id (lib/smp_processor_id.c:57)
> 	? preempt_count_sub (kernel/sched/core.c:2606)
> 	? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
> 	? do_read_fault.isra.40 (mm/memory.c:2784 mm/memory.c:2849 mm/memory.c:2898)
> 	__asan_load8 (mm/kasan/kasan.c:364)
> 	? do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
> 	do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
> 	? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
> 	? __pte_alloc (mm/memory.c:598)
> 	handle_mm_fault (mm/memory.c:3092 mm/memory.c:3225 mm/memory.c:3345 mm/memory.c:3374)
> 	? pud_huge (./arch/x86/include/asm/paravirt.h:611 arch/x86/mm/hugetlbpage.c:76)
> 	__get_user_pages (mm/gup.c:286 mm/gup.c:478)
> 	__mlock_vma_pages_range (mm/mlock.c:262)
> 	__mm_populate (mm/mlock.c:710)
> 	SyS_remap_file_pages (mm/mmap.c:2653 mm/mmap.c:2593)
> 	tracesys (arch/x86/kernel/entry_64.S:541)
> 	Read of size 8 by thread T9262:
> 	Memory state around the buggy address:
> 	 ffff88048a732e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a732f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a732f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	>ffff88048a733100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	                         ^
> 	 ffff88048a733180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 	 ffff88048a733380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 
> 
> It looks like the pte pointer is invalid in do_fault_around().
> This could happen if fault_around_bytes is set to 0.
> fault_around_pages() and fault_around_mask() call rounddown_pow_of_two(fault_around_bytes).
> The result of rounddown_pow_of_two() is undefined if the parameter is 0
> (in my environment it returns 0x8000000000000000).

Ouch. Good catch!

Although, I'm not convinced that it caused the issue. Sasha, did you touch the
debugfs handle?

> One way to fix this would be to return 0 from fault_around_pages() if fault_around_bytes == 0,
> however this would add extra code on the fault path.
>
> So let's just forbid setting fault_around_bytes to zero.
> Fault around is not used if fault_around_pages() <= 1, so anyone who doesn't want to use
> it can set fault_around_bytes to any value in the range [1, 2*PAGE_SIZE - 1]
> instead of 0.

From the user's point of view, 0 is perfectly fine. What about the untested patch
below?

Another option: get rid of the debugfs interface, so fault_around_pages() and
fault_around_mask() will always be known at compile time.

There's another problem with the debugfs handle: we don't have serialization
between fault_around_bytes_set() and do_fault_around(). It can end
badly if fault_around_bytes is changed under do_fault_around()...

I don't think it's worth adding serialization to the hot path to protect
against a debug interface.
Any thoughts?

From 2932fbcefe4ec21c046348e21981149ecce5d161 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Mon, 28 Jul 2014 12:16:49 +0300
Subject: [PATCH] mm, debugfs: workaround undefined behaviour of
 rounddown_pow_of_two(0)

The result of rounddown_pow_of_two(0) is not defined. It can cause a bug if
the user sets fault_around_bytes to 0 via the debugfs interface.

Let's set fault_around_bytes to PAGE_SIZE if the user tries to set it to
something below PAGE_SIZE.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7e8d8205b610..2d8fa7a7b0ee 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2786,7 +2786,8 @@ static int fault_around_bytes_set(void *data, u64 val)
 {
 	if (val / PAGE_SIZE > PTRS_PER_PTE)
 		return -EINVAL;
-	fault_around_bytes = val;
+	/* rounddown_pow_of_two(0) is not defined */
+	fault_around_bytes = max(val, PAGE_SIZE);
 	return 0;
 }
 DEFINE_SIMPLE_ATTRIBUTE(fault_around_bytes_fops,
-- 
 Kirill A. Shutemov
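
For readers who want to poke at the clamping behaviour outside the kernel, a
hedged userspace sketch of the setter from the patch above follows. It assumes
4 KiB pages and 512 PTEs per page table (the x86-64 values); the MODEL_*
macros and *_model names are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE    4096u	/* assumption: 4 KiB pages */
#define MODEL_PTRS_PER_PTE 512u		/* assumption: x86-64 value */

static uint64_t fault_around_bytes_model = 65536;

/* Same checks as the patched setter: reject oversized values, and clamp
 * anything below one page (including 0) up to one page. */
static int fault_around_bytes_set_model(uint64_t val)
{
	if (val / MODEL_PAGE_SIZE > MODEL_PTRS_PER_PTE)
		return -EINVAL;
	fault_around_bytes_model = val > MODEL_PAGE_SIZE ? val : MODEL_PAGE_SIZE;
	/* invariant the fault path relies on: never 0 */
	assert(fault_around_bytes_model >= MODEL_PAGE_SIZE);
	return 0;
}
```

In this model, writing 0 succeeds and stores PAGE_SIZE, so the user-visible
"0 disables fault-around" behaviour is kept while rounddown_pow_of_two() never
sees 0.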


* Re: [PATCH] mm: don't allow fault_around_bytes to be 0
  2014-07-28  9:36   ` Kirill A. Shutemov
@ 2014-07-28 10:27     ` Andrey Ryabinin
  2014-07-28 10:52       ` Kirill A. Shutemov
                         ` (2 more replies)
  2014-07-28 15:26     ` Dave Hansen
  1 sibling, 3 replies; 8+ messages in thread
From: Andrey Ryabinin @ 2014-07-28 10:27 UTC (permalink / raw)
  To: Kirill A. Shutemov, Sasha Levin
  Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Matthew Wilcox,
	Dave Hansen, Alexander Viro, Dave Chinner, Ning Qu, linux-mm,
	linux-fsdevel, linux-kernel, Dave Jones, stable,
	Kirill A. Shutemov, Mel Gorman, Rik van Riel,
	Konstantin Khlebnikov, Hugh Dickins

On 07/28/14 13:36, Kirill A. Shutemov wrote:
> On Mon, Jul 28, 2014 at 11:43:20AM +0400, Andrey Ryabinin wrote:
>> Sasha Levin triggered use-after-free when fuzzing using trinity and the KASAN
>> patchset:
>>
>> 	AddressSanitizer: use after free in do_read_fault.isra.40+0x3c2/0x510 at addr ffff88048a733110
>> 	page:ffffea001229ccc0 count:0 mapcount:0 mapping:          (null) index:0x0
>> 	page flags: 0xafffff80008000(tail)
>> 	page dumped because: kasan error
>> 	CPU: 6 PID: 9262 Comm: trinity-c104 Not tainted 3.16.0-rc6-next-20140723-sasha-00047-g289342b-dirty #929
>> 	 00000000000000fb 0000000000000000 ffffea001229ccc0 ffff88038ac0fb78
>> 	 ffffffffa5e40903 ffff88038ac0fc48 ffff88038ac0fc38 ffffffffa142acfc
>> 	 0000000000000001 ffff880509ff5aa8 ffff88038ac10038 ffff88038ac0fbb0
>> 	Call Trace:
>> 	dump_stack (lib/dump_stack.c:52)
>> 	kasan_report_error (mm/kasan/report.c:98 mm/kasan/report.c:166)
>> 	? debug_smp_processor_id (lib/smp_processor_id.c:57)
>> 	? preempt_count_sub (kernel/sched/core.c:2606)
>> 	? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
>> 	? do_read_fault.isra.40 (mm/memory.c:2784 mm/memory.c:2849 mm/memory.c:2898)
>> 	__asan_load8 (mm/kasan/kasan.c:364)
>> 	? do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
>> 	do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
>> 	? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
>> 	? __pte_alloc (mm/memory.c:598)
>> 	handle_mm_fault (mm/memory.c:3092 mm/memory.c:3225 mm/memory.c:3345 mm/memory.c:3374)
>> 	? pud_huge (./arch/x86/include/asm/paravirt.h:611 arch/x86/mm/hugetlbpage.c:76)
>> 	__get_user_pages (mm/gup.c:286 mm/gup.c:478)
>> 	__mlock_vma_pages_range (mm/mlock.c:262)
>> 	__mm_populate (mm/mlock.c:710)
>> 	SyS_remap_file_pages (mm/mmap.c:2653 mm/mmap.c:2593)
>> 	tracesys (arch/x86/kernel/entry_64.S:541)
>> 	Read of size 8 by thread T9262:
>> 	Memory state around the buggy address:
>> 	 ffff88048a732e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	 ffff88048a732f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	 ffff88048a732f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	 ffff88048a733000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	 ffff88048a733080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	>ffff88048a733100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	                         ^
>> 	 ffff88048a733180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	 ffff88048a733200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	 ffff88048a733280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	 ffff88048a733300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> 	 ffff88048a733380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>
>>
>> It looks like the pte pointer is invalid in do_fault_around().
>> This could happen if fault_around_bytes is set to 0.
>> fault_around_pages() and fault_around_mask() call rounddown_pow_of_two(fault_around_bytes).
>> The result of rounddown_pow_of_two() is undefined if the parameter is 0
>> (in my environment it returns 0x8000000000000000).
> 
> Ouch. Good catch!
> 
> Although, I'm not convinced that it caused the issue. Sasha, did you touch the
> debugfs handle?
> 

I suppose trinity could change it, no? I've got the very same spew after setting fault_around_bytes to 0.

>> One way to fix this would be to return 0 from fault_around_pages() if fault_around_bytes == 0,
>> however this would add extra code on the fault path.
>>
>> So let's just forbid setting fault_around_bytes to zero.
>> Fault around is not used if fault_around_pages() <= 1, so anyone who doesn't want to use
>> it can set fault_around_bytes to any value in the range [1, 2*PAGE_SIZE - 1]
>> instead of 0.
> 
> From the user's point of view, 0 is perfectly fine. What about the untested patch
> below?
> 

If we are not going to get rid of the debugfs interface, I would rather keep
fault_around_bytes always rounded down, as in the following patch:


From f41b7777b29f06dc62f80526e5617cae82a38709 Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <a.ryabinin@samsung.com>
Date: Mon, 28 Jul 2014 13:46:10 +0400
Subject: [PATCH] mm: debugfs: move rounddown_pow_of_two() out from do_fault
 path

do_fault_around expects fault_around_bytes rounded down to nearest
page order. Instead of calling rounddown_pow_of_two every time
in fault_around_pages()/fault_around_mask() we could do round down
when user changes fault_around_bytes via debugfs interface.

This also fixes the bug when the user sets fault_around_bytes to 0.
The result of rounddown_pow_of_two(0) is not defined, therefore
fault_around_bytes == 0 doesn't work without this patch.

Let's set fault_around_bytes to PAGE_SIZE if the user sets it to something
less than PAGE_SIZE.

Fixes: a9b0f861("mm: nominate faultaround area in bytes rather than page order")
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: <stable@vger.kernel.org> # 3.15.x
---
 mm/memory.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7e8d820..e0c6fd6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2758,20 +2758,16 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
 	update_mmu_cache(vma, address, pte);
 }

-static unsigned long fault_around_bytes = 65536;
+static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);

-/*
- * fault_around_pages() and fault_around_mask() round down fault_around_bytes
- * to nearest page order. It's what do_fault_around() expects to see.
- */
 static inline unsigned long fault_around_pages(void)
 {
-	return rounddown_pow_of_two(fault_around_bytes) / PAGE_SIZE;
+	return fault_around_bytes >> PAGE_SHIFT;
 }

 static inline unsigned long fault_around_mask(void)
 {
-	return ~(rounddown_pow_of_two(fault_around_bytes) - 1) & PAGE_MASK;
+	return ~(fault_around_bytes - 1) & PAGE_MASK;
 }


@@ -2782,11 +2778,18 @@ static int fault_around_bytes_get(void *data, u64 *val)
 	return 0;
 }

+/*
+ * fault_around_pages() and fault_around_mask() expects fault_around_bytes
+ * rounded down to nearest page order. It's what do_fault_around() expects to see.
+ */
 static int fault_around_bytes_set(void *data, u64 val)
 {
 	if (val / PAGE_SIZE > PTRS_PER_PTE)
 		return -EINVAL;
-	fault_around_bytes = val;
+	if (val > PAGE_SIZE)
+		fault_around_bytes = rounddown_pow_of_two(val);
+	else
+		fault_around_bytes = PAGE_SIZE; /* rounddown_pow_of_two(0) is undefined */
 	return 0;
 }
 DEFINE_SIMPLE_ATTRIBUTE(fault_around_bytes_fops,
-- 
1.8.5.5
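
The idea of the patch above (round once in the setter, keep the fault path to
a shift and a mask) can be tried in userspace with the sketch below. It
assumes PAGE_SHIFT of 12 and 512 PTEs per page table; all MODEL_* macros and
*_model names are hypothetical and only mirror the kernel symbols:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT   12			/* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE    ((uint64_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK    (~(MODEL_PAGE_SIZE - 1))
#define MODEL_PTRS_PER_PTE 512u			/* assumption: x86-64 value */

static uint64_t fault_around_bytes_model = 65536; /* already a power of two */

/* Largest power of two <= n; only defined for n > 0. */
static uint64_t rounddown_pow_of_two_model(uint64_t n)
{
	uint64_t p = 1;

	assert(n != 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Setter does the rounding, so the stored value is always a power of two
 * that is at least one page. */
static int fault_around_bytes_set_model(uint64_t val)
{
	if (val / MODEL_PAGE_SIZE > MODEL_PTRS_PER_PTE)
		return -EINVAL;
	if (val > MODEL_PAGE_SIZE)
		fault_around_bytes_model = rounddown_pow_of_two_model(val);
	else
		fault_around_bytes_model = MODEL_PAGE_SIZE;
	return 0;
}

/* Fault-path helpers no longer round anything. */
static uint64_t fault_around_pages_model(void)
{
	return fault_around_bytes_model >> MODEL_PAGE_SHIFT;
}

static uint64_t fault_around_mask_model(void)
{
	return ~(fault_around_bytes_model - 1) & MODEL_PAGE_MASK;
}
```

In this model, setting 0 stores PAGE_SIZE (1 page, mask equal to PAGE_MASK),
and setting 3 pages stores 2 pages, which is the same result the old helpers
computed on every fault.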



* Re: [PATCH] mm: don't allow fault_around_bytes to be 0
  2014-07-28 10:27     ` Andrey Ryabinin
@ 2014-07-28 10:52       ` Kirill A. Shutemov
  2014-07-28 12:32       ` Sasha Levin
  2014-07-28 22:43       ` David Rientjes
  2 siblings, 0 replies; 8+ messages in thread
From: Kirill A. Shutemov @ 2014-07-28 10:52 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Sasha Levin, Andrew Morton, Linus Torvalds, Andi Kleen,
	Matthew Wilcox, Dave Hansen, Alexander Viro, Dave Chinner,
	Ning Qu, linux-mm, linux-fsdevel, linux-kernel, Dave Jones,
	stable, Kirill A. Shutemov, Mel Gorman, Rik van Riel,
	Konstantin Khlebnikov, Hugh Dickins

On Mon, Jul 28, 2014 at 02:27:37PM +0400, Andrey Ryabinin wrote:
> On 07/28/14 13:36, Kirill A. Shutemov wrote:
> > On Mon, Jul 28, 2014 at 11:43:20AM +0400, Andrey Ryabinin wrote:
> >> Sasha Levin triggered use-after-free when fuzzing using trinity and the KASAN
> >> patchset:
> >>
> >> 	AddressSanitizer: use after free in do_read_fault.isra.40+0x3c2/0x510 at addr ffff88048a733110
> >> 	page:ffffea001229ccc0 count:0 mapcount:0 mapping:          (null) index:0x0
> >> 	page flags: 0xafffff80008000(tail)
> >> 	page dumped because: kasan error
> >> 	CPU: 6 PID: 9262 Comm: trinity-c104 Not tainted 3.16.0-rc6-next-20140723-sasha-00047-g289342b-dirty #929
> >> 	 00000000000000fb 0000000000000000 ffffea001229ccc0 ffff88038ac0fb78
> >> 	 ffffffffa5e40903 ffff88038ac0fc48 ffff88038ac0fc38 ffffffffa142acfc
> >> 	 0000000000000001 ffff880509ff5aa8 ffff88038ac10038 ffff88038ac0fbb0
> >> 	Call Trace:
> >> 	dump_stack (lib/dump_stack.c:52)
> >> 	kasan_report_error (mm/kasan/report.c:98 mm/kasan/report.c:166)
> >> 	? debug_smp_processor_id (lib/smp_processor_id.c:57)
> >> 	? preempt_count_sub (kernel/sched/core.c:2606)
> >> 	? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
> >> 	? do_read_fault.isra.40 (mm/memory.c:2784 mm/memory.c:2849 mm/memory.c:2898)
> >> 	__asan_load8 (mm/kasan/kasan.c:364)
> >> 	? do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
> >> 	do_read_fault.isra.40 (mm/memory.c:2864 mm/memory.c:2898)
> >> 	? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
> >> 	? __pte_alloc (mm/memory.c:598)
> >> 	handle_mm_fault (mm/memory.c:3092 mm/memory.c:3225 mm/memory.c:3345 mm/memory.c:3374)
> >> 	? pud_huge (./arch/x86/include/asm/paravirt.h:611 arch/x86/mm/hugetlbpage.c:76)
> >> 	__get_user_pages (mm/gup.c:286 mm/gup.c:478)
> >> 	__mlock_vma_pages_range (mm/mlock.c:262)
> >> 	__mm_populate (mm/mlock.c:710)
> >> 	SyS_remap_file_pages (mm/mmap.c:2653 mm/mmap.c:2593)
> >> 	tracesys (arch/x86/kernel/entry_64.S:541)
> >> 	Read of size 8 by thread T9262:
> >> 	Memory state around the buggy address:
> >> 	 ffff88048a732e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	 ffff88048a732f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	 ffff88048a732f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	 ffff88048a733000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	 ffff88048a733080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	>ffff88048a733100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	                         ^
> >> 	 ffff88048a733180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	 ffff88048a733200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	 ffff88048a733280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	 ffff88048a733300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> 	 ffff88048a733380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>
> >>
> >> It looks like the pte pointer is invalid in do_fault_around().
> >> This could happen if fault_around_bytes is set to 0.
> >> fault_around_pages() and fault_around_mask() call rounddown_pow_of_two(fault_around_bytes).
> >> The result of rounddown_pow_of_two() is undefined if the parameter is 0
> >> (in my environment it returns 0x8000000000000000).
> > 
> > Ouch. Good catch!
> > 
> > Although, I'm not convinced that it caused the issue. Sasha, did you touch the
> > debugfs handle?
> > 
> 
> I suppose trinity could change it, no? I've got the very same spew after setting fault_around_bytes to 0.
> 
> >> One way to fix this would be to return 0 from fault_around_pages() if fault_around_bytes == 0,
> >> however this would add extra code on the fault path.
> >>
> >> So let's just forbid setting fault_around_bytes to zero.
> >> Fault around is not used if fault_around_pages() <= 1, so anyone who doesn't want to use
> >> it can set fault_around_bytes to any value in the range [1, 2*PAGE_SIZE - 1]
> >> instead of 0.
> > 
> > From the user's point of view, 0 is perfectly fine. What about the untested patch
> > below?
> > 
> 
> If we are not going to get rid of the debugfs interface, I would rather keep
> fault_around_bytes always rounded down, as in the following patch:
> 
> 
> From f41b7777b29f06dc62f80526e5617cae82a38709 Mon Sep 17 00:00:00 2001
> From: Andrey Ryabinin <a.ryabinin@samsung.com>
> Date: Mon, 28 Jul 2014 13:46:10 +0400
> Subject: [PATCH] mm: debugfs: move rounddown_pow_of_two() out from do_fault
>  path
> 
> do_fault_around expects fault_around_bytes rounded down to nearest
> page order. Instead of calling rounddown_pow_of_two every time
> in fault_around_pages()/fault_around_mask() we could do round down
> when user changes fault_around_bytes via debugfs interface.
> 
> This also fixes the bug when the user sets fault_around_bytes to 0.
> The result of rounddown_pow_of_two(0) is not defined, therefore
> fault_around_bytes == 0 doesn't work without this patch.
> 
> Let's set fault_around_bytes to PAGE_SIZE if the user sets it to something
> less than PAGE_SIZE.
> 
> Fixes: a9b0f861("mm: nominate faultaround area in bytes rather than page order")
> Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
> Cc: <stable@vger.kernel.org> # 3.15.x
> ---
>  mm/memory.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 7e8d820..e0c6fd6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2758,20 +2758,16 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
>  	update_mmu_cache(vma, address, pte);
>  }
> 
> -static unsigned long fault_around_bytes = 65536;
> +static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);

This looks weird, but okay...

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov


* Re: [PATCH] mm: don't allow fault_around_bytes to be 0
  2014-07-28 10:27     ` Andrey Ryabinin
  2014-07-28 10:52       ` Kirill A. Shutemov
@ 2014-07-28 12:32       ` Sasha Levin
  2014-07-28 22:43       ` David Rientjes
  2 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2014-07-28 12:32 UTC (permalink / raw)
  To: Andrey Ryabinin, Kirill A. Shutemov
  Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Matthew Wilcox,
	Dave Hansen, Alexander Viro, Dave Chinner, Ning Qu, linux-mm,
	linux-fsdevel, linux-kernel, Dave Jones, stable,
	Kirill A. Shutemov, Mel Gorman, Rik van Riel,
	Konstantin Khlebnikov, Hugh Dickins

On 07/28/2014 06:27 AM, Andrey Ryabinin wrote:
>> Although, I'm not convinced that it caused the issue. Sasha, did you touch the
>> > debugfs handle?
>> > 
> I suppose trinity could change it, no? I've got the very same spew after setting fault_around_bytes to 0.

Not on purpose, but as Andrey said - it's very possible that trinity did.


Thanks,
Sasha


* Re: [PATCH] mm: don't allow fault_around_bytes to be 0
  2014-07-28  9:36   ` Kirill A. Shutemov
  2014-07-28 10:27     ` Andrey Ryabinin
@ 2014-07-28 15:26     ` Dave Hansen
  1 sibling, 0 replies; 8+ messages in thread
From: Dave Hansen @ 2014-07-28 15:26 UTC (permalink / raw)
  To: Kirill A. Shutemov, Andrey Ryabinin, Sasha Levin
  Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Matthew Wilcox,
	Alexander Viro, Dave Chinner, Ning Qu, linux-mm, linux-fsdevel,
	linux-kernel, Dave Jones, stable, Kirill A. Shutemov, Mel Gorman,
	Rik van Riel, Konstantin Khlebnikov, Hugh Dickins

On 07/28/2014 02:36 AM, Kirill A. Shutemov wrote:
> +++ b/mm/memory.c
> @@ -2786,7 +2786,8 @@ static int fault_around_bytes_set(void *data, u64 val)
>  {
>  	if (val / PAGE_SIZE > PTRS_PER_PTE)
>  		return -EINVAL;
> -	fault_around_bytes = val;
> +	/* rounddown_pow_of_two(0) is not defined */
> +	fault_around_bytes = max(val, PAGE_SIZE);
>  	return 0;
>  }

It's also possible to race and have fault_around_bytes change between
when fault_around_mask() and fault_around_pages() are called so that
they don't match any more.  The min()/max() in do_fault_around() should
keep this from doing anything _too_ nasty, but it's worth thinking about
at least.

The safest thing to do might be to use an ACCESS_ONCE() at the beginning
of do_fault_around() for fault_around_bytes and generate
fault_around_mask() from the ACCESS_ONCE() result.
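
A rough userspace sketch of that idea: read the tunable once, then derive both
the page count and the mask from the same copy so they cannot disagree. The
volatile read only approximates what ACCESS_ONCE() does in the kernel, the
stored value is assumed to already be a nonzero power of two, and
struct fault_around_window and fault_around_snapshot() are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define MODEL_PAGE_MASK  (~(((uint64_t)1 << MODEL_PAGE_SHIFT) - 1))

/* Stand-in for the debugfs tunable; assumed to be a nonzero power of two. */
static uint64_t fault_around_bytes_model = 65536;

struct fault_around_window {
	uint64_t pages;
	uint64_t mask;
};

/* Read the tunable exactly once through a volatile pointer, then compute
 * both derived values from that single snapshot. */
static struct fault_around_window fault_around_snapshot(void)
{
	uint64_t bytes = *(volatile uint64_t *)&fault_around_bytes_model;
	struct fault_around_window w;

	assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
	w.pages = bytes >> MODEL_PAGE_SHIFT;
	w.mask = ~(bytes - 1) & MODEL_PAGE_MASK;
	return w;
}
```

A concurrent writer can still change the tunable right after the snapshot, but
the pages and mask used for one fault then always describe the same window.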


* Re: [PATCH] mm: don't allow fault_around_bytes to be 0
  2014-07-28 10:27     ` Andrey Ryabinin
  2014-07-28 10:52       ` Kirill A. Shutemov
  2014-07-28 12:32       ` Sasha Levin
@ 2014-07-28 22:43       ` David Rientjes
  2 siblings, 0 replies; 8+ messages in thread
From: David Rientjes @ 2014-07-28 22:43 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Kirill A. Shutemov, Sasha Levin, Andrew Morton, Linus Torvalds,
	Andi Kleen, Matthew Wilcox, Dave Hansen, Alexander Viro,
	Dave Chinner, Ning Qu, linux-mm, linux-fsdevel, linux-kernel,
	Dave Jones, stable, Kirill A. Shutemov, Mel Gorman, Rik van Riel,
	Konstantin Khlebnikov, Hugh Dickins

On Mon, 28 Jul 2014, Andrey Ryabinin wrote:

> do_fault_around expects fault_around_bytes rounded down to nearest
> page order. Instead of calling rounddown_pow_of_two every time
> in fault_around_pages()/fault_around_mask() we could do round down
> when user changes fault_around_bytes via debugfs interface.
> 

If you're going to optimize this, it seems like fault_around_bytes would 
benefit from being __read_mostly.

