All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrey Konovalov <andreyknvl@google.com>,
	Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Hugh Dickins <hughd@google.com>
Subject: Re: Multiple potential races on vma->vm_flags
Date: Fri, 11 Sep 2015 18:08:39 +0200	[thread overview]
Message-ID: <55F2FC87.6060908@suse.cz> (raw)
In-Reply-To: <55F2F354.1030607@suse.cz>

On 09/11/2015 05:29 PM, Vlastimil Babka wrote:
> On 09/11/2015 12:39 PM, Kirill A. Shutemov wrote:
>> On Thu, Sep 10, 2015 at 03:27:59PM +0200, Andrey Konovalov wrote:
>>> Can a vma be shared among a few mm's?
>>
>> Define "shared".
>>
>> vma can belong only to one process (mm_struct), but it can be accessed
>> from other process like in rmap case below.
>>
>> rmap uses anon_vma_lock for anon vma and i_mmap_rwsem for file vma to make
>> sure that the vma will not disappear under it.
>>
>>> If yes, then taking current->mm->mmap_sem to protect vma is not enough.
>>
>> Depends on what protection you are talking about.
>>
>>> In the first report below both T378 and T398 take
>>> current->mm->mmap_sem at mm/mlock.c:650, but they turn out to be
>>> different locks (the addresses are different).
>>
>> See i_mmap_lock_read() in T398. It will guarantee that vma is there.
>>
>>> In the second report T309 doesn't take any locks at all, since it
>>> assumes that after checking atomic_dec_and_test(&mm->mm_users) the mm
>>> has no other users, but then it does a write to vma.
>>
>> This one is tricky. I *assume* the mm cannot be generally accessible after
>> mm_users drops to zero, but I'm not entirely sure about it.
>> procfs? ptrace?
>>
>> The VMA is still accessible via rmap at this point. And I think it can be
>> a problem:
>>
>> 		CPU0					CPU1
>> exit_mmap()
>>     // mmap_sem is *not* taken
>>     munlock_vma_pages_all()
>>       munlock_vma_pages_range()
>>       						try_to_unmap_one()
>> 						  down_read_trylock(&vma->vm_mm->mmap_sem))
>> 						  !!(vma->vm_flags & VM_LOCKED) == true
>>         vma->vm_flags &= ~VM_LOCKED;
>>         <munlock the page>
>>         						  mlock_vma_page(page);
>> 						  // mlocked pages is leaked.
>>
>> The obvious solution is to take mmap_sem in exit path, but it would cause
>> performance regression.
>>
>> Any comments?
>
> Just so others don't repeat the paths that I already looked at:
>
> - First I thought that try_to_unmap_one() has the page locked and
> munlock_vma_pages_range() will also lock it... but it doesn't.

More precisely, it does (in __munlock_pagevec()), but 
TestClearPageMlocked(page) doesn't happen under that lock.

> - Then I thought that exit_mmap() will revisit the page anyway doing
> actual unmap. It would, if it's the one who has the page mapped, it will
> clear the mlock (see page_remove_rmap()). If it's not the last one, page
> will be left locked. So it won't be completely leaked, but still, it
> will be mlocked when it shouldn't.
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Vlastimil Babka <vbabka@suse.cz>
To: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrey Konovalov <andreyknvl@google.com>,
	Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Hugh Dickins <hughd@google.com>
Subject: Re: Multiple potential races on vma->vm_flags
Date: Fri, 11 Sep 2015 18:08:39 +0200	[thread overview]
Message-ID: <55F2FC87.6060908@suse.cz> (raw)
In-Reply-To: <55F2F354.1030607@suse.cz>

On 09/11/2015 05:29 PM, Vlastimil Babka wrote:
> On 09/11/2015 12:39 PM, Kirill A. Shutemov wrote:
>> On Thu, Sep 10, 2015 at 03:27:59PM +0200, Andrey Konovalov wrote:
>>> Can a vma be shared among a few mm's?
>>
>> Define "shared".
>>
>> vma can belong only to one process (mm_struct), but it can be accessed
>> from other process like in rmap case below.
>>
>> rmap uses anon_vma_lock for anon vma and i_mmap_rwsem for file vma to make
>> sure that the vma will not disappear under it.
>>
>>> If yes, then taking current->mm->mmap_sem to protect vma is not enough.
>>
>> Depends on what protection you are talking about.
>>
>>> In the first report below both T378 and T398 take
>>> current->mm->mmap_sem at mm/mlock.c:650, but they turn out to be
>>> different locks (the addresses are different).
>>
>> See i_mmap_lock_read() in T398. It will guarantee that vma is there.
>>
>>> In the second report T309 doesn't take any locks at all, since it
>>> assumes that after checking atomic_dec_and_test(&mm->mm_users) the mm
>>> has no other users, but then it does a write to vma.
>>
>> This one is tricky. I *assume* the mm cannot be generally accessible after
>> mm_users drops to zero, but I'm not entirely sure about it.
>> procfs? ptrace?
>>
>> The VMA is still accessible via rmap at this point. And I think it can be
>> a problem:
>>
>> 		CPU0					CPU1
>> exit_mmap()
>>     // mmap_sem is *not* taken
>>     munlock_vma_pages_all()
>>       munlock_vma_pages_range()
>>       						try_to_unmap_one()
>> 						  down_read_trylock(&vma->vm_mm->mmap_sem))
>> 						  !!(vma->vm_flags & VM_LOCKED) == true
>>         vma->vm_flags &= ~VM_LOCKED;
>>         <munlock the page>
>>         						  mlock_vma_page(page);
>> 						  // mlocked pages is leaked.
>>
>> The obvious solution is to take mmap_sem in exit path, but it would cause
>> performance regression.
>>
>> Any comments?
>
> Just so others don't repeat the paths that I already looked at:
>
> - First I thought that try_to_unmap_one() has the page locked and
> munlock_vma_pages_range() will also lock it... but it doesn't.

More precisely, it does (in __munlock_pagevec()), but 
TestClearPageMlocked(page) doesn't happen under that lock.

> - Then I thought that exit_mmap() will revisit the page anyway doing
> actual unmap. It would, if it's the one who has the page mapped, it will
> clear the mlock (see page_remove_rmap()). If it's not the last one, page
> will be left locked. So it won't be completely leaked, but still, it
> will be mlocked when it shouldn't.
>


  reply	other threads:[~2015-09-11 16:08 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAAeHK+z8o96YeRF-fQXmoApOKXa0b9pWsQHDeP=5GC_hMTuoDg@mail.gmail.com>
     [not found] ` <55EC9221.4040603@oracle.com>
2015-09-07 11:40   ` Multiple potential races on vma->vm_flags Kirill A. Shutemov
2015-09-07 11:40     ` Kirill A. Shutemov
2015-09-09 15:27     ` Vlastimil Babka
2015-09-09 15:27       ` Vlastimil Babka
2015-09-09 16:01       ` Kirill A. Shutemov
2015-09-09 16:01         ` Kirill A. Shutemov
2015-09-10  0:58     ` Sasha Levin
2015-09-10  0:58       ` Sasha Levin
2015-09-10  8:36       ` Kirill A. Shutemov
2015-09-10  8:36         ` Kirill A. Shutemov
2015-09-10 13:27         ` Andrey Konovalov
2015-09-10 13:27           ` Andrey Konovalov
2015-09-11 10:39           ` Kirill A. Shutemov
2015-09-11 10:39             ` Kirill A. Shutemov
2015-09-11 15:29             ` Vlastimil Babka
2015-09-11 15:29               ` Vlastimil Babka
2015-09-11 16:08               ` Vlastimil Babka [this message]
2015-09-11 16:08                 ` Vlastimil Babka
2015-09-12  1:27             ` Hugh Dickins
2015-09-12  1:27               ` Hugh Dickins
2015-09-14 10:16               ` Kirill A. Shutemov
2015-09-14 10:16                 ` Kirill A. Shutemov
2015-09-15 17:36               ` Sasha Levin
2015-09-15 17:36                 ` Sasha Levin
2015-09-15 19:01                 ` Kirill A. Shutemov
2015-09-15 19:01                   ` Kirill A. Shutemov
2015-09-22 16:47                   ` Andrey Konovalov
2015-09-22 16:47                     ` Andrey Konovalov
2015-09-22 18:54                     ` Hugh Dickins
2015-09-22 18:54                       ` Hugh Dickins
2015-09-22 19:45                       ` Andrey Konovalov
2015-09-22 19:45                         ` Andrey Konovalov
2015-09-23  1:39                         ` Hugh Dickins
2015-09-23  1:39                           ` Hugh Dickins
2015-09-23 11:46                           ` Kirill A. Shutemov
2015-09-23 11:46                             ` Kirill A. Shutemov
2015-09-23 22:58                             ` Davidlohr Bueso
2015-09-23 22:58                               ` Davidlohr Bueso
2015-09-23 13:08                           ` Andrey Konovalov
2015-09-23 13:08                             ` Andrey Konovalov
2015-09-24  0:42                             ` Sasha Levin
2015-09-24  0:42                               ` Sasha Levin
2015-09-25 19:33                               ` Kirill A. Shutemov
2015-09-25 19:33                                 ` Kirill A. Shutemov
2015-10-13 22:38                               ` Hugh Dickins
2015-10-13 22:38                                 ` Hugh Dickins
2015-10-13 22:33                             ` Hugh Dickins
2015-10-13 22:33                               ` Hugh Dickins
2015-10-15 16:58                               ` Andrey Konovalov
2015-10-15 16:58                                 ` Andrey Konovalov
2015-09-23 21:30                 ` Sasha Levin
2015-09-23 21:30                   ` Sasha Levin
2015-09-25 14:26                   ` Oleg Nesterov
2015-09-25 14:26                     ` Oleg Nesterov
2015-09-24 13:11                 ` Oleg Nesterov
2015-09-24 13:11                   ` Oleg Nesterov
2015-09-24 16:27                   ` Sasha Levin
2015-09-24 16:27                     ` Sasha Levin
2015-09-24 17:26                     ` Oleg Nesterov
2015-09-24 17:26                       ` Oleg Nesterov
2015-09-24 18:52                       ` Andrey Ryabinin
2015-09-24 18:52                         ` Andrey Ryabinin
2015-09-24 19:01                         ` Sasha Levin
2015-09-24 19:01                           ` Sasha Levin
2015-09-25 12:41                         ` Oleg Nesterov
2015-09-25 12:41                           ` Oleg Nesterov
2015-09-23 15:34             ` Oleg Nesterov
2015-09-23 15:34               ` Oleg Nesterov
2015-09-23 15:38               ` Oleg Nesterov
2015-09-23 15:38                 ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55F2FC87.6060908@suse.cz \
    --to=vbabka@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@google.com \
    --cc=dvyukov@google.com \
    --cc=hughd@google.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=riel@redhat.com \
    --cc=sasha.levin@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.