* mm: NULL ptr deref in balance_dirty_pages_ratelimited
@ 2014-02-25 19:32 Sasha Levin
  2014-02-26 7:15 ` Bob Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Sasha Levin @ 2014-02-25 19:32 UTC
To: linux-mm@kvack.org; +Cc: Andrew Morton, LKML

Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next kernel,
I've stumbled on the following spew:

[ 232.869443] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 232.870230] IP: [<mm/page-writeback.c:1612>] balance_dirty_pages_ratelimited+0x1e/0x150
[ 232.870230] PGD 586e1d067 PUD 586e1e067 PMD 0
[ 232.870230] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 232.870230] Dumping ftrace buffer:
[ 232.870230] (ftrace buffer empty)
[ 232.870230] Modules linked in:
[ 232.870230] CPU: 36 PID: 9707 Comm: trinity-c36 Tainted: G W 3.14.0-rc4-next-20140225-sasha-00010-ga117461 #42
[ 232.870230] task: ffff880586dfb000 ti: ffff880586e34000 task.ti: ffff880586e34000
[ 232.870230] RIP: 0010:[<mm/page-writeback.c:1612>] [<mm/page-writeback.c:1612>] balance_dirty_pages_ratelimited+0x1e/0x150
[ 232.870230] RSP: 0000:ffff880586e35c58 EFLAGS: 00010282
[ 232.870230] RAX: 0000000000000000 RBX: ffff880582831361 RCX: 0000000000000007
[ 232.870230] RDX: 0000000000000007 RSI: ffff880586dfbcc0 RDI: ffff880582831361
[ 232.870230] RBP: ffff880586e35c78 R08: 0000000000000000 R09: 0000000000000000
[ 232.870230] R10: 0000000000000001 R11: 0000000000000001 R12: 00007f58007ee000
[ 232.870230] R13: ffff880c8d6d4f70 R14: 0000000000000200 R15: ffff880c8dcce710
[ 232.870230] FS: 00007f58018bb700(0000) GS:ffff880c8e800000(0000) knlGS:0000000000000000
[ 232.870230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 232.870230] CR2: 0000000000000020 CR3: 0000000586e1c000 CR4: 00000000000006e0
[ 232.870230] Stack:
[ 232.870230] ffff880586e35c78 ffff880586e33400 00007f58007ee000 ffff880c8d6d4f70
[ 232.870230] ffff880586e35cd8 ffffffff8127d241 0000000000000001 0000000000000001
[ 232.870230] 0000000000000000 ffffea0032337080 0000000080000000 ffff880586e33400
[ 232.870230] Call Trace:
[ 232.870230] [<mm/memory.c:3467>] do_shared_fault+0x1a1/0x1f0
[ 232.870230] [<mm/memory.c:3487>] handle_pte_fault+0xc8/0x230
[ 232.870230] [<arch/x86/include/asm/preempt.h:98>] ? delay_tsc+0xea/0x110
[ 232.870230] [<mm/memory.c:3770>] __handle_mm_fault+0x36e/0x3a0
[ 232.870230] [<include/linux/rcupdate.h:829>] ? rcu_read_unlock+0x5d/0x60
[ 232.870230] [<include/linux/memcontrol.h:148>] handle_mm_fault+0x10b/0x1b0
[ 232.870230] [<arch/x86/mm/fault.c:1147>] ? __do_page_fault+0x2e2/0x590
[ 232.870230] [<arch/x86/mm/fault.c:1214>] __do_page_fault+0x551/0x590
[ 232.870230] [<kernel/sched/cputime.c:681>] ? vtime_account_user+0x91/0xa0
[ 232.870230] [<arch/x86/include/asm/atomic.h:26>] ? context_tracking_user_exit+0xa8/0x1c0
[ 232.870230] [<arch/x86/include/asm/preempt.h:98>] ? _raw_spin_unlock+0x30/0x50
[ 232.870230] [<kernel/sched/cputime.c:681>] ? vtime_account_user+0x91/0xa0
[ 232.870230] [<arch/x86/include/asm/atomic.h:26>] ? context_tracking_user_exit+0xa8/0x1c0
[ 232.870230] [<arch/x86/include/asm/atomic.h:26>] do_page_fault+0x3d/0x70
[ 232.870230] [<arch/x86/kernel/kvm.c:263>] do_async_page_fault+0x35/0x100
[ 232.870230] [<arch/x86/kernel/entry_64.S:1496>] async_page_fault+0x28/0x30
[ 232.870230] Code: 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 48 89 fb 48 8b 87 50 01 00 00 <f6> 40 20 01 0f 85 18 01 00 00 65 48 8b 14 25 40 da 00 00 44 8b
[ 232.870230] RIP [<mm/page-writeback.c:1612>] balance_dirty_pages_ratelimited+0x1e/0x150
[ 232.870230] RSP <ffff880586e35c58>
[ 232.870230] CR2: 0000000000000020

Thanks,
Sasha

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: mm: NULL ptr deref in balance_dirty_pages_ratelimited
  2014-02-25 19:32 mm: NULL ptr deref in balance_dirty_pages_ratelimited Sasha Levin
@ 2014-02-26 7:15 ` Bob Liu
  2014-02-26 14:09 ` Kirill A. Shutemov
  0 siblings, 1 reply; 7+ messages in thread
From: Bob Liu @ 2014-02-26 7:15 UTC
To: Sasha Levin; +Cc: linux-mm@kvack.org, Andrew Morton, LKML

On Wed, Feb 26, 2014 at 3:32 AM, Sasha Levin <sasha.levin@oracle.com> wrote:
> [ 232.869443] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
> [ 232.870230] IP: [<mm/page-writeback.c:1612>] balance_dirty_pages_ratelimited+0x1e/0x150
> [...]
> [ 232.870230] Call Trace:
> [ 232.870230] [<mm/memory.c:3467>] do_shared_fault+0x1a1/0x1f0
> [ 232.870230] [<mm/memory.c:3487>] handle_pte_fault+0xc8/0x230
> [...]

Could you please test the patch below? I think it may fix this issue.

diff --git a/mm/memory.c b/mm/memory.c
index 548d97e..90cea22 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3419,6 +3419,7 @@ static int do_shared_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		pgoff_t pgoff, unsigned int flags, pte_t orig_pte)
 {
 	struct page *fault_page;
+	struct address_space *mapping;
 	spinlock_t *ptl;
 	pte_t *pte;
 	int dirtied = 0;
@@ -3454,13 +3455,14 @@ static int do_shared_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	if (set_page_dirty(fault_page))
 		dirtied = 1;
+	mapping = fault_page->mapping;
 	unlock_page(fault_page);
-	if ((dirtied || vma->vm_ops->page_mkwrite) && fault_page->mapping) {
+	if ((dirtied || vma->vm_ops->page_mkwrite) && mapping) {
 		/*
 		 * Some device drivers do not set page.mapping but still
 		 * dirty their pages
 		 */
-		balance_dirty_pages_ratelimited(fault_page->mapping);
+		balance_dirty_pages_ratelimited(mapping);
 	}
 
 	/* file_update_time outside page_lock */
* Re: mm: NULL ptr deref in balance_dirty_pages_ratelimited
  2014-02-26 7:15 ` Bob Liu
@ 2014-02-26 14:09 ` Kirill A. Shutemov
  2014-02-26 14:48 ` Bob Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2014-02-26 14:09 UTC
To: Bob Liu; +Cc: Sasha Levin, linux-mm@kvack.org, Andrew Morton, LKML

On Wed, Feb 26, 2014 at 03:15:07PM +0800, Bob Liu wrote:
> On Wed, Feb 26, 2014 at 3:32 AM, Sasha Levin <sasha.levin@oracle.com> wrote:
> > [ 232.869443] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
> > [ 232.870230] IP: [<mm/page-writeback.c:1612>] balance_dirty_pages_ratelimited+0x1e/0x150
> > [...]
>
> Could you please test the patch below? I think it may fix this issue.

What stops the compiler from transforming this back into the unpatched code?
Do you rely on unlock_page() to have a compiler barrier?

> @@ -3454,13 +3455,14 @@ static int do_shared_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>
>  	if (set_page_dirty(fault_page))
>  		dirtied = 1;
> +	mapping = fault_page->mapping;
>  	unlock_page(fault_page);
> -	if ((dirtied || vma->vm_ops->page_mkwrite) && fault_page->mapping) {
> +	if ((dirtied || vma->vm_ops->page_mkwrite) && mapping) {
>  		/*
>  		 * Some device drivers do not set page.mapping but still
>  		 * dirty their pages
>  		 */
> -		balance_dirty_pages_ratelimited(fault_page->mapping);
> +		balance_dirty_pages_ratelimited(mapping);
>  	}
> [...]

--
 Kirill A. Shutemov
* Re: mm: NULL ptr deref in balance_dirty_pages_ratelimited
  2014-02-26 14:09 ` Kirill A. Shutemov
@ 2014-02-26 14:48 ` Bob Liu
  2014-02-26 15:20 ` Kirill A. Shutemov
  0 siblings, 1 reply; 7+ messages in thread
From: Bob Liu @ 2014-02-26 14:48 UTC
To: Kirill A. Shutemov; +Cc: Sasha Levin, linux-mm@kvack.org, Andrew Morton, LKML

On Wed, Feb 26, 2014 at 10:09 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> On Wed, Feb 26, 2014 at 03:15:07PM +0800, Bob Liu wrote:
>> [...]
>> Could you please test the patch below? I think it may fix this issue.
>
> What stops the compiler from transforming this back into the unpatched code?

Sorry, my mistake. I'll format a proper patch later.

> Do you rely on unlock_page() to have a compiler barrier?
>

Before your commit, mapping was a local variable that was assigned before
unlock_page():

	struct address_space *mapping = page->mapping;
	unlock_page(dirty_page);
	put_page(dirty_page);
	if ((dirtied || page_mkwrite) && mapping) {

I'm afraid that now "fault_page->mapping" might be changed to NULL after the check in
"if ((dirtied || vma->vm_ops->page_mkwrite) && fault_page->mapping) {"
and then be passed down as balance_dirty_pages_ratelimited(NULL).

>> [...]

--
Regards,
--Bob
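[Editorial sketch, not part of the original thread.] The window Bob describes can be modelled in plain userspace C. Everything below is a hypothetical stand-in (struct page_sketch, struct mapping_sketch, throttle_stub, truncate_stub), not kernel code, and the "concurrent" truncation is simulated by a callback so that the outcome is deterministic. The first function follows the pattern of the proposed patch, the second the pattern it replaces.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct mapping_sketch {
	int throttle_calls;	/* counts calls to the throttling stub */
};

struct page_sketch {
	struct mapping_sketch *mapping;
	int locked;
};

/* Stand-in for balance_dirty_pages_ratelimited(); must not be given NULL. */
static void throttle_stub(struct mapping_sketch *mapping)
{
	mapping->throttle_calls++;
}

/* Simulates another task truncating the page once the lock is dropped. */
static void truncate_stub(struct page_sketch *page)
{
	page->mapping = NULL;
}

/*
 * Pattern from the proposed patch: read page->mapping once while the page
 * is still locked and use only the local copy afterwards.
 * Returns 1 if throttling ran, 0 if there was no mapping.
 */
static int snapshot_before_unlock(struct page_sketch *page,
				  void (*racer)(struct page_sketch *))
{
	struct mapping_sketch *mapping = page->mapping;

	page->locked = 0;		/* stand-in for unlock_page() */
	if (mapping) {
		racer(page);		/* the field changes, the copy does not */
		throttle_stub(mapping);
		return 1;
	}
	return 0;
}

/*
 * Pattern being fixed: the field is read twice after the unlock, so the
 * value that is checked and the value that is used can differ.
 * Returns -1 where the real code would have passed NULL down.
 */
static int reread_after_unlock(struct page_sketch *page,
			       void (*racer)(struct page_sketch *))
{
	page->locked = 0;		/* stand-in for unlock_page() */
	if (page->mapping) {		/* first read: non-NULL */
		racer(page);
		if (!page->mapping)	/* second read: now NULL */
			return -1;
		throttle_stub(page->mapping);
		return 1;
	}
	return 0;
}
```

With truncate_stub as the racer, the snapshot variant still throttles against the mapping it read under the lock, while the re-read variant reaches the point where NULL would have been passed on.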
* Re: mm: NULL ptr deref in balance_dirty_pages_ratelimited
  2014-02-26 14:48 ` Bob Liu
@ 2014-02-26 15:20 ` Kirill A. Shutemov
  2014-02-26 15:45 ` Paul E. McKenney
  2014-02-26 15:47 ` Peter Zijlstra
  0 siblings, 2 replies; 7+ messages in thread
From: Kirill A. Shutemov @ 2014-02-26 15:20 UTC
To: Bob Liu
Cc: Sasha Levin, linux-mm@kvack.org, Andrew Morton, LKML, Peter Zijlstra, Paul E. McKenney

On Wed, Feb 26, 2014 at 10:48:30PM +0800, Bob Liu wrote:
> > Do you rely on unlock_page() to have a compiler barrier?
> >
>
> Before your commit, mapping was a local variable that was assigned before
> unlock_page():
>	struct address_space *mapping = page->mapping;
>	unlock_page(dirty_page);
>	put_page(dirty_page);
>	if ((dirtied || page_mkwrite) && mapping) {
>
> I'm afraid that now "fault_page->mapping" might be changed to NULL after the check in
> "if ((dirtied || vma->vm_ops->page_mkwrite) && fault_page->mapping) {"
> and then be passed down as balance_dirty_pages_ratelimited(NULL).

I see what you are trying to fix. I wonder if we need to do

	mapping = ACCESS_ONCE(fault_page->mapping);

instead.

The question is whether the compiler on its own can eliminate the intermediate
variable and dereference fault_page->mapping twice, as the code with my patch does.
I ask because smp_mb__after_clear_bit() in unlock_page() does nothing on
some architectures.

> [...]

--
 Kirill A. Shutemov
* Re: mm: NULL ptr deref in balance_dirty_pages_ratelimited
  2014-02-26 15:20 ` Kirill A. Shutemov
@ 2014-02-26 15:45 ` Paul E. McKenney
  2014-02-26 15:47 ` Peter Zijlstra
  1 sibling, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2014-02-26 15:45 UTC
To: Kirill A. Shutemov
Cc: Bob Liu, Sasha Levin, linux-mm@kvack.org, Andrew Morton, LKML, Peter Zijlstra

On Wed, Feb 26, 2014 at 05:20:51PM +0200, Kirill A. Shutemov wrote:
> On Wed, Feb 26, 2014 at 10:48:30PM +0800, Bob Liu wrote:
> > [...]
>
> I see what you try to fix. I wounder if we need to do
>
>	mapping = ACCESS_ONCE(fault_page->mapping);
>
> instead.
>
> The question is if compiler on its own can eliminate intermediate variable
> and dereference fault_page->mapping twice, as code with my patch does.
> I ask because smp_mb__after_clear_bit() in unlock_page() does nothing on
> some architectures.

The compiler is most definitely within its rights to eliminate intermediate
variables if you don't use something like ACCESS_ONCE(). For more info,
see the LWN writeup:

	http://lwn.net/Articles/508991/

							Thanx, Paul

> [...]
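[Editorial sketch, not part of the original thread.] A minimal userspace illustration of the idiom Paul refers to, assuming GCC or Clang (for __typeof__). The macro body follows, to the best of my knowledge, the volatile-cast form that ACCESS_ONCE() had in the kernel at the time; the macro name, struct names and function here are hypothetical.

```c
#include <stddef.h>

/*
 * Volatile-cast read: the compiler must perform exactly one load of x here
 * and may not later re-read x in place of the value that was returned.
 * Modelled on the kernel's ACCESS_ONCE(); the name is hypothetical.
 */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-ins, not the kernel's definitions. */
struct mapping_sketch {
	unsigned long nrpages;
};

struct page_sketch {
	struct mapping_sketch *mapping;
};

/*
 * Check and use the same snapshot of page->mapping. Without the volatile
 * access, the compiler would be allowed to drop the local variable and
 * load page->mapping once for the NULL check and again for the dereference.
 */
static unsigned long nrpages_or_zero(struct page_sketch *page)
{
	struct mapping_sketch *mapping = ACCESS_ONCE_SKETCH(page->mapping);

	return mapping ? mapping->nrpages : 0;
}
```

The volatile access only constrains the compiler. It does not by itself order the load against other CPUs, which is the separate question about unlock_page() raised in the thread.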
* Re: mm: NULL ptr deref in balance_dirty_pages_ratelimited
  2014-02-26 15:20 ` Kirill A. Shutemov
  2014-02-26 15:45 ` Paul E. McKenney
@ 2014-02-26 15:47 ` Peter Zijlstra
  1 sibling, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2014-02-26 15:47 UTC
To: Kirill A. Shutemov
Cc: Bob Liu, Sasha Levin, linux-mm@kvack.org, Andrew Morton, LKML, Paul E. McKenney

On Wed, Feb 26, 2014 at 05:20:51PM +0200, Kirill A. Shutemov wrote:
> On Wed, Feb 26, 2014 at 10:48:30PM +0800, Bob Liu wrote:
> > [...]
>
> I see what you try to fix. I wounder if we need to do
>
>	mapping = ACCESS_ONCE(fault_page->mapping);
>
> instead.
>
> The question is if compiler on its own can eliminate intermediate variable
> and dereference fault_page->mapping twice, as code with my patch does.
> I ask because smp_mb__after_clear_bit() in unlock_page() does nothing on
> some architectures.

That's a bug, and I have patches for that. That said, this is only ia64
and sparc32. ia64 has an actual full memory barrier in there, very much
including a compiler fence. And sparc32 atomics do too.

In general, any atomic RMW op also implies a compiler fence. This
includes clear_bit().

That said, unlock_page() should have RELEASE semantics; this too
enforces that the read of page->mapping stays before the unlock_page().
The second usage of mapping may leak into the locked region, but it may
not be re-read afterwards.
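[Editorial sketch, not part of the original thread.] Peter's RELEASE argument can be illustrated with C11 atomics. This is only an analogy under stated assumptions: the kernel's unlock_page() is not implemented with <stdatomic.h>, and the struct and function names below are hypothetical. The sketch also assumes that writers change the mapping field only while holding the lock.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a page with a lock bit; not the kernel layout. */
struct page_sketch {
	void *mapping;		/* assumed to be written only under the lock */
	atomic_int locked;	/* 1 = locked, 0 = unlocked */
};

/*
 * Read the field inside the critical section, then unlock with a release
 * store. Loads and stores that precede a release store in program order
 * cannot be moved after it, by the compiler or by the CPU, so the load of
 * page->mapping stays inside the locked region. The caller then works
 * with the returned snapshot and never touches the field again.
 */
static void *snapshot_then_release(struct page_sketch *page)
{
	void *mapping = page->mapping;

	atomic_store_explicit(&page->locked, 0, memory_order_release);
	return mapping;
}
```

A release store is a one-way barrier: later accesses may still move up into the locked region, which matches the remark that the second usage of mapping "may leak into the locked region".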