All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Sasha Levin <sasha.levin@oracle.com>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrey Konovalov <andreyknvl@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: Multiple potential races on vma->vm_flags
Date: Wed, 9 Sep 2015 19:01:29 +0300	[thread overview]
Message-ID: <20150909160129.GA9526@node.dhcp.inet.fi> (raw)
In-Reply-To: <55F04FD4.6060308@suse.cz>

On Wed, Sep 09, 2015 at 05:27:16PM +0200, Vlastimil Babka wrote:
> On 09/07/2015 01:40 PM, Kirill A. Shutemov wrote:
> >On Sun, Sep 06, 2015 at 03:21:05PM -0400, Sasha Levin wrote:
> >>==================================================================
> >>ThreadSanitizer: data-race in munlock_vma_pages_range
> >>
> >>Write of size 8 by thread T378 (K2633, CPU3):
> >>  [<ffffffff81212579>] munlock_vma_pages_range+0x59/0x3e0 mm/mlock.c:425
> >>  [<ffffffff81212ac9>] mlock_fixup+0x1c9/0x280 mm/mlock.c:549
> >>  [<ffffffff81212ccc>] do_mlock+0x14c/0x180 mm/mlock.c:589
> >>  [<     inlined    >] SyS_munlock+0x74/0xb0 SYSC_munlock mm/mlock.c:651
> >>  [<ffffffff812130b4>] SyS_munlock+0x74/0xb0 mm/mlock.c:643
> >>  [<ffffffff81eb352e>] entry_SYSCALL_64_fastpath+0x12/0x71
> >>arch/x86/entry/entry_64.S:186
> >
> >...
> >
> >>Previous read of size 8 by thread T398 (K2623, CPU2):
> >>  [<ffffffff8121d198>] try_to_unmap_one+0x78/0x4f0 mm/rmap.c:1208
> >>  [<     inlined    >] rmap_walk+0x147/0x450 rmap_walk_file mm/rmap.c:1540
> >>  [<ffffffff8121e7b7>] rmap_walk+0x147/0x450 mm/rmap.c:1559
> >>  [<ffffffff8121ef72>] try_to_munlock+0xa2/0xc0 mm/rmap.c:1423
> >>  [<ffffffff81211bb0>] __munlock_isolated_page+0x30/0x60 mm/mlock.c:129
> >>  [<ffffffff81212066>] __munlock_pagevec+0x236/0x3f0 mm/mlock.c:331
> >>  [<ffffffff812128a0>] munlock_vma_pages_range+0x380/0x3e0 mm/mlock.c:476
> >>  [<ffffffff81212ac9>] mlock_fixup+0x1c9/0x280 mm/mlock.c:549
> >>  [<ffffffff81212ccc>] do_mlock+0x14c/0x180 mm/mlock.c:589
> >>  [<     inlined    >] SyS_munlock+0x74/0xb0 SYSC_munlock mm/mlock.c:651
> >>  [<ffffffff812130b4>] SyS_munlock+0x74/0xb0 mm/mlock.c:643
> >>  [<ffffffff81eb352e>] entry_SYSCALL_64_fastpath+0x12/0x71
> >>arch/x86/entry/entry_64.S:186
> >
> >Okay, the detected race is mlock/munlock vs. rmap.
> >
> >On rmap side we check vma->vm_flags in few places without taking
> >vma->vm_mm->mmap_sem. The vma cannot be freed since we hold i_mmap_rwsem
> >or anon_vma_lock, but nothing prevent vma->vm_flags from changing under
> >us.
> >
> >In this particular case, speculative check in beginning of
> >try_to_unmap_one() is fine, since we re-check it under mmap_sem later in
> >the function.
> >
> >False-negative is fine too here, since we will mlock the page in
> >__mm_populate() on mlock side after mlock_fixup().
> >
> >BUT.
> >
> >We *must* have all speculative vm_flags accesses wrapped READ_ONCE() to
> >avoid all compiler trickery, like duplication vm_flags access with
> >inconsistent results.
> 
> Doesn't taking a semaphore, as in try_to_unmap_one(), already imply a
> compiler barrier forcing vm_flags to be re-read?

Yes, but it doesn't prevent compiler from generation multiple reads from
vma->vm_flags and it may blow up if two values doesn't match.

> >I looked only on VM_LOCKED checks, but there are few other flags checked
> >in rmap. All of them must be handled carefully. At least READ_ONCE() is
> >required.
> >
> >Other solution would be to introduce per-vma spinlock to protect
> >vma->vm_flags and probably other vma fields and offload this duty
> >from mmap_sem.
> >But that's much bigger project.
> 
> Sounds like an overkill, unless we find something more serious than this.

May be...

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Sasha Levin <sasha.levin@oracle.com>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrey Konovalov <andreyknvl@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: Multiple potential races on vma->vm_flags
Date: Wed, 9 Sep 2015 19:01:29 +0300	[thread overview]
Message-ID: <20150909160129.GA9526@node.dhcp.inet.fi> (raw)
In-Reply-To: <55F04FD4.6060308@suse.cz>

On Wed, Sep 09, 2015 at 05:27:16PM +0200, Vlastimil Babka wrote:
> On 09/07/2015 01:40 PM, Kirill A. Shutemov wrote:
> >On Sun, Sep 06, 2015 at 03:21:05PM -0400, Sasha Levin wrote:
> >>==================================================================
> >>ThreadSanitizer: data-race in munlock_vma_pages_range
> >>
> >>Write of size 8 by thread T378 (K2633, CPU3):
> >>  [<ffffffff81212579>] munlock_vma_pages_range+0x59/0x3e0 mm/mlock.c:425
> >>  [<ffffffff81212ac9>] mlock_fixup+0x1c9/0x280 mm/mlock.c:549
> >>  [<ffffffff81212ccc>] do_mlock+0x14c/0x180 mm/mlock.c:589
> >>  [<     inlined    >] SyS_munlock+0x74/0xb0 SYSC_munlock mm/mlock.c:651
> >>  [<ffffffff812130b4>] SyS_munlock+0x74/0xb0 mm/mlock.c:643
> >>  [<ffffffff81eb352e>] entry_SYSCALL_64_fastpath+0x12/0x71
> >>arch/x86/entry/entry_64.S:186
> >
> >...
> >
> >>Previous read of size 8 by thread T398 (K2623, CPU2):
> >>  [<ffffffff8121d198>] try_to_unmap_one+0x78/0x4f0 mm/rmap.c:1208
> >>  [<     inlined    >] rmap_walk+0x147/0x450 rmap_walk_file mm/rmap.c:1540
> >>  [<ffffffff8121e7b7>] rmap_walk+0x147/0x450 mm/rmap.c:1559
> >>  [<ffffffff8121ef72>] try_to_munlock+0xa2/0xc0 mm/rmap.c:1423
> >>  [<ffffffff81211bb0>] __munlock_isolated_page+0x30/0x60 mm/mlock.c:129
> >>  [<ffffffff81212066>] __munlock_pagevec+0x236/0x3f0 mm/mlock.c:331
> >>  [<ffffffff812128a0>] munlock_vma_pages_range+0x380/0x3e0 mm/mlock.c:476
> >>  [<ffffffff81212ac9>] mlock_fixup+0x1c9/0x280 mm/mlock.c:549
> >>  [<ffffffff81212ccc>] do_mlock+0x14c/0x180 mm/mlock.c:589
> >>  [<     inlined    >] SyS_munlock+0x74/0xb0 SYSC_munlock mm/mlock.c:651
> >>  [<ffffffff812130b4>] SyS_munlock+0x74/0xb0 mm/mlock.c:643
> >>  [<ffffffff81eb352e>] entry_SYSCALL_64_fastpath+0x12/0x71
> >>arch/x86/entry/entry_64.S:186
> >
> >Okay, the detected race is mlock/munlock vs. rmap.
> >
> >On rmap side we check vma->vm_flags in few places without taking
> >vma->vm_mm->mmap_sem. The vma cannot be freed since we hold i_mmap_rwsem
> >or anon_vma_lock, but nothing prevent vma->vm_flags from changing under
> >us.
> >
> >In this particular case, speculative check in beginning of
> >try_to_unmap_one() is fine, since we re-check it under mmap_sem later in
> >the function.
> >
> >False-negative is fine too here, since we will mlock the page in
> >__mm_populate() on mlock side after mlock_fixup().
> >
> >BUT.
> >
> >We *must* have all speculative vm_flags accesses wrapped READ_ONCE() to
> >avoid all compiler trickery, like duplication vm_flags access with
> >inconsistent results.
> 
> Doesn't taking a semaphore, as in try_to_unmap_one(), already imply a
> compiler barrier forcing vm_flags to be re-read?

Yes, but it doesn't prevent compiler from generation multiple reads from
vma->vm_flags and it may blow up if two values doesn't match.

> >I looked only on VM_LOCKED checks, but there are few other flags checked
> >in rmap. All of them must be handled carefully. At least READ_ONCE() is
> >required.
> >
> >Other solution would be to introduce per-vma spinlock to protect
> >vma->vm_flags and probably other vma fields and offload this duty
> >from mmap_sem.
> >But that's much bigger project.
> 
> Sounds like an overkill, unless we find something more serious than this.

May be...

-- 
 Kirill A. Shutemov

  reply	other threads:[~2015-09-09 16:01 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAAeHK+z8o96YeRF-fQXmoApOKXa0b9pWsQHDeP=5GC_hMTuoDg@mail.gmail.com>
     [not found] ` <55EC9221.4040603@oracle.com>
2015-09-07 11:40   ` Multiple potential races on vma->vm_flags Kirill A. Shutemov
2015-09-07 11:40     ` Kirill A. Shutemov
2015-09-09 15:27     ` Vlastimil Babka
2015-09-09 15:27       ` Vlastimil Babka
2015-09-09 16:01       ` Kirill A. Shutemov [this message]
2015-09-09 16:01         ` Kirill A. Shutemov
2015-09-10  0:58     ` Sasha Levin
2015-09-10  0:58       ` Sasha Levin
2015-09-10  8:36       ` Kirill A. Shutemov
2015-09-10  8:36         ` Kirill A. Shutemov
2015-09-10 13:27         ` Andrey Konovalov
2015-09-10 13:27           ` Andrey Konovalov
2015-09-11 10:39           ` Kirill A. Shutemov
2015-09-11 10:39             ` Kirill A. Shutemov
2015-09-11 15:29             ` Vlastimil Babka
2015-09-11 15:29               ` Vlastimil Babka
2015-09-11 16:08               ` Vlastimil Babka
2015-09-11 16:08                 ` Vlastimil Babka
2015-09-12  1:27             ` Hugh Dickins
2015-09-12  1:27               ` Hugh Dickins
2015-09-14 10:16               ` Kirill A. Shutemov
2015-09-14 10:16                 ` Kirill A. Shutemov
2015-09-15 17:36               ` Sasha Levin
2015-09-15 17:36                 ` Sasha Levin
2015-09-15 19:01                 ` Kirill A. Shutemov
2015-09-15 19:01                   ` Kirill A. Shutemov
2015-09-22 16:47                   ` Andrey Konovalov
2015-09-22 16:47                     ` Andrey Konovalov
2015-09-22 18:54                     ` Hugh Dickins
2015-09-22 18:54                       ` Hugh Dickins
2015-09-22 19:45                       ` Andrey Konovalov
2015-09-22 19:45                         ` Andrey Konovalov
2015-09-23  1:39                         ` Hugh Dickins
2015-09-23  1:39                           ` Hugh Dickins
2015-09-23 11:46                           ` Kirill A. Shutemov
2015-09-23 11:46                             ` Kirill A. Shutemov
2015-09-23 22:58                             ` Davidlohr Bueso
2015-09-23 22:58                               ` Davidlohr Bueso
2015-09-23 13:08                           ` Andrey Konovalov
2015-09-23 13:08                             ` Andrey Konovalov
2015-09-24  0:42                             ` Sasha Levin
2015-09-24  0:42                               ` Sasha Levin
2015-09-25 19:33                               ` Kirill A. Shutemov
2015-09-25 19:33                                 ` Kirill A. Shutemov
2015-10-13 22:38                               ` Hugh Dickins
2015-10-13 22:38                                 ` Hugh Dickins
2015-10-13 22:33                             ` Hugh Dickins
2015-10-13 22:33                               ` Hugh Dickins
2015-10-15 16:58                               ` Andrey Konovalov
2015-10-15 16:58                                 ` Andrey Konovalov
2015-09-23 21:30                 ` Sasha Levin
2015-09-23 21:30                   ` Sasha Levin
2015-09-25 14:26                   ` Oleg Nesterov
2015-09-25 14:26                     ` Oleg Nesterov
2015-09-24 13:11                 ` Oleg Nesterov
2015-09-24 13:11                   ` Oleg Nesterov
2015-09-24 16:27                   ` Sasha Levin
2015-09-24 16:27                     ` Sasha Levin
2015-09-24 17:26                     ` Oleg Nesterov
2015-09-24 17:26                       ` Oleg Nesterov
2015-09-24 18:52                       ` Andrey Ryabinin
2015-09-24 18:52                         ` Andrey Ryabinin
2015-09-24 19:01                         ` Sasha Levin
2015-09-24 19:01                           ` Sasha Levin
2015-09-25 12:41                         ` Oleg Nesterov
2015-09-25 12:41                           ` Oleg Nesterov
2015-09-23 15:34             ` Oleg Nesterov
2015-09-23 15:34               ` Oleg Nesterov
2015-09-23 15:38               ` Oleg Nesterov
2015-09-23 15:38                 ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150909160129.GA9526@node.dhcp.inet.fi \
    --to=kirill@shutemov.name \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@google.com \
    --cc=dvyukov@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    --cc=sasha.levin@oracle.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.