linux-api.vger.kernel.org archive mirror
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Mircea CIRJALIU - MELIU <mcirjaliu@bitdefender.com>
Cc: "Adalbert Lazăr" <alazar@bitdefender.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Alexander Graf" <graf@amazon.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Jerome Glisse" <jglisse@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Mihai Donțu" <mdontu@bitdefender.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Sargun Dhillon" <sargun@sargun.me>,
	"Aleksa Sarai" <cyphar@cyphar.com>,
	"Oleg Nesterov" <oleg@redhat.com>, "Jann Horn" <jannh@google.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Christian Brauner" <christian.brauner@ubuntu.com>
Subject: Re: [RESEND RFC PATCH 0/5] Remote mapping
Date: Fri, 4 Sep 2020 10:39:07 -0300
Message-ID: <20200904133907.GS24045@ziepe.ca>
In-Reply-To: <AM7PR02MB60824E11A81A63A7CE94848CBB2D0@AM7PR02MB6082.eurprd02.prod.outlook.com>

On Fri, Sep 04, 2020 at 01:24:43PM +0000, Mircea CIRJALIU - MELIU wrote:
> > On Fri, Sep 04, 2020 at 02:31:11PM +0300, Adalbert Lazăr wrote:
> > > VMAs obtained by mmap()ing memory access fds mirror the contents of
> > > the remote process address space within the specified range. Pages are
> > > installed in the current process page tables at fault time and removed
> > > by the mmu_interval_notifier invalidate callback. No further memory
> > > management is involved.
> > > On attempts to access a hole, or if a mapping was removed by
> > > PIDFD_MEM_UNMAP, or if the remote process address space was reaped
> > > by OOM, the remote mapping fault handler returns VM_FAULT_SIGBUS.
> > 
> > I still think anything along these lines needs to meet the XPMEM use
> > cases as well; we have to have more general solutions for such MM
> > stuff:
> > 
> > https://gitlab.com/hjelmn/xpmem
> > 
> > However, I think this fundamentally goes in some of the same bad
> > directions as xpmem.
> > 
> > I would much rather see this design copy & clone the VMAs than try to
> > mirror the PTEs inside the VMAs from the remote into a single giant VMA and
> > somehow split/mirror the VMA ops.
> 
> This design was made specifically for virtual machine introspection, where we 
> care more about the contents of the address space, rather than the remote VMAs
> and their vmops. (Right now only anon pages can be mapped, but I guess
> we can extend to pagecache pages as well.) I just used what seemed to be the
> common denominator to all page-related operations: range invalidation.
> This looks like a general solution.

The point is that a VMA is how the MM connects its parts together;
cloning the content of a VMA without the rest of the VMA metadata is
just going to be very fragile in the long run.

Especially if the VMA is presented as a normal VMA with working struct
pages/etc, not a pfn map.

> IMO cloning a VMA in an address space that has a completely different layout
> will present its own set of caveats: What happens if the VMA resizes/splits? 
> Can you replay all the remote VMA vmops on the clone VMA?

The mirror would have to reclone the source VMA every time the source
VMA changes.

> > This is just too weird and fragile to be maintainable over the long term.
> > 
> > For instance, one of the major bugs in things like xpmem was that they are
> > incompatible with get_user_pages(), largely because of this issue.
> 
> We support get_user_pages(), that's how we integrate with KVM.

This seems really sketchy. get_user_pages() is sensitive to the VMA;
what happens when the VMA flags are different, etc.?

Jason

