linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Jerome Glisse <j.glisse@gmail.com>,
	Hugh Dickins <hughd@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	kirill.shutemov@linux.intel.com, linux-kernel@vger.kernel.org,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: GUP guarantees wrt to userspace mappings redesign
Date: Mon, 2 May 2016 21:03:03 +0300	[thread overview]
Message-ID: <20160502180303.GA26252@node.shutemov.name> (raw)
In-Reply-To: <20160502162211.GA11678@redhat.com>

On Mon, May 02, 2016 at 06:22:11PM +0200, Oleg Nesterov wrote:
> On 05/02, Kirill A. Shutemov wrote:
> >
> > On Mon, May 02, 2016 at 04:15:38PM +0200, Oleg Nesterov wrote:
> > > >
> > > >  - I don't see any check page_count() around __replace_page() in uprobes,
> > > >    so it can easily replace pinned page.
> > >
> > > Why it should? even if it races with get_user_pages_fast()... this doesn't
> > > differ from the case when an application writes to MAP_PRIVATE non-anonymous
> > > region, no?
> >
> > < I know nothing about uprobes or ptrace in general >
> >
> > I think the difference is that the write is initiated by the process
> > itself, but IIUC __replace_page() can be initiated by other process, so
> > it's out of control of the application.
> 
> Yes. Just like gdb can insert a breakpoint into the read-only executable vma.
> 
> > So we have pages pinned by a driver and the driver expects the pinned
> > pages to be mapped into userspace, then __replace_page() kicks in and put
> > different page there -- driver's expectation is broken.
> 
> Yes... but I don't understand the problem space. I mean, I do not know why
> this driver should expect this, how it can be broken, etc.
> 
> I do not even understand why "initiated by other process" can make any
> difference... Unless this driver somehow controls all threads which could
> have this page mapped.

Okay, my understanding is following:

Some drivers (i.e. vfio) rely on get_user_page{,_fast}() to pin the memory
and expect pinned pages to be mapped into userspace until the pin is gone.
This memory is used to communicate between kernel and userspace.

This kinda works unless an application does something that can change page
tables in the area: fork() (due CoW), mremap(), munmap(), MADV_DONTNEED,
etc.

This model is really fragile, but the application has (kinda) control over
the situation: if it's very careful, it can keep the mapping intact.

With __replace_page() situation is different: it can be triggered from
outside of process and therefore out of control of the application.

I don't think there's something to fix on uprobe side. It's part of
debugging interface. Debuggers can be destructive, nothing new there.
I just tried to find the cases why this usage of GUP is not correct...

-- 
 Kirill A. Shutemov

  reply	other threads:[~2016-05-02 18:03 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-28 16:20 [BUG] vfio device assignment regression with THP ref counting redesign Alex Williamson
2016-04-28 18:17 ` Kirill A. Shutemov
2016-04-28 18:58   ` Alex Williamson
2016-04-28 23:21     ` Andrea Arcangeli
2016-04-29  0:44       ` Alex Williamson
2016-04-29  0:51       ` Kirill A. Shutemov
2016-04-29  2:45         ` Alex Williamson
2016-04-29  7:06           ` Kirill A. Shutemov
2016-04-29 15:12             ` Alex Williamson
2016-04-29 16:34             ` Andrea Arcangeli
2016-04-29 22:34               ` Alex Williamson
2016-05-02 10:41               ` Kirill A. Shutemov
2016-05-02 11:15                 ` Jerome Glisse
2016-05-02 12:14                   ` GUP guarantees wrt to userspace mappings redesign Kirill A. Shutemov
2016-05-02 13:39                     ` Jerome Glisse
2016-05-02 15:00                       ` GUP guarantees wrt to userspace mappings Kirill A. Shutemov
2016-05-02 15:22                         ` Jerome Glisse
2016-05-02 16:12                           ` Kirill A. Shutemov
2016-05-02 19:14                             ` Andrea Arcangeli
2016-05-02 19:11                           ` Andrea Arcangeli
2016-05-02 19:02                         ` Andrea Arcangeli
2016-05-02 14:15                     ` GUP guarantees wrt to userspace mappings redesign Oleg Nesterov
2016-05-02 16:21                       ` Kirill A. Shutemov
2016-05-02 16:22                         ` Oleg Nesterov
2016-05-02 18:03                           ` Kirill A. Shutemov [this message]
2016-05-02 17:41                             ` Oleg Nesterov
2016-05-02 18:56                     ` Andrea Arcangeli
2016-05-02 15:23                 ` [BUG] vfio device assignment regression with THP ref counting redesign Andrea Arcangeli
2016-05-02 16:00                   ` Kirill A. Shutemov
2016-05-02 18:03                     ` Andrea Arcangeli
2016-05-05  1:19                       ` Alex Williamson
2016-05-05 14:39                         ` Andrea Arcangeli
2016-05-05 15:09                           ` Andrea Arcangeli
2016-05-05 15:11                           ` Kirill A. Shutemov
2016-05-05 15:24                             ` Andrea Arcangeli
2016-05-06  7:29                               ` Kirill A. Shutemov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160502180303.GA26252@node.shutemov.name \
    --to=kirill@shutemov.name \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=hughd@google.com \
    --cc=j.glisse@gmail.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).