public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: "Jeff Squyres (jsquyres)"
	<jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Cc: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Shachar Raindel <raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: Status of "ummunot" branch?
Date: Wed, 5 Jun 2013 13:05:29 -0600	[thread overview]
Message-ID: <20130605190529.GA3044@obsidianresearch.com> (raw)
In-Reply-To: <EF66BBEB19BADC41AC8CCF5F684F07FC4F65DF6F-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>

On Wed, Jun 05, 2013 at 06:45:13PM +0000, Jeff Squyres (jsquyres) wrote:

> Hum.  I was under the impression that with today's code (i.e., not ODP), if you
> 
> a = malloc(N);
> ibv_reg_mr(..., a, N, ...);
> free(a);
> 
> (assuming that the memory actually left the process at free)
> 
> Then the relevant kernel verbs driver was notified, and would
> unregister that device.  ...but I'm an MPI guy, not a kernel guy --
> it seems like you're saying that my impression was wrong (which
> doesn't currently matter because we intercept free/sbrk and
> unregister such memory, anyway).

Sadly no, what happens is that once you do ibv_reg_mr that 'HCA
virtual address' is forever tied to the physical memory under the
'process virtual address' *at that moment* forever.

So in the case above, RDMA can continue after the free, and it
continues to hit the same *physical* memory that it always hit, but
due to the free the process has lost access to that memory (the kernel
keeps the physical memory reserved for RDMA purposes until unreg
though).

This is fundamentally why you need to intercept mmap/munmap/sbrk - if
the process's VM mapping is changed through those syscalls then the
HCA's VM and the process VM becomes de-synchronized.

> > 'magically be registered' is the wrong way to think about it - the
> > registration of VA=0x100 is simply kept, and any change to the
> > underlying physical mapping of the VA is synchronized with the HCA.
> 
> What happens if you:
> 
> a = malloc(N * page_size);
> ibv_reg_mr(..., a, N * page_size, ...);
> free(a);
> // incoming RDMA arrives targeted at buffer a

Haggai should comment on this, but my impression/expectation was
you'll get a remote protection fault/

> Or if you:
> 
> a = malloc(N * page_size);
> ibv_reg_mr(..., a, N * page_size, ...);
> free(a);
> a = malloc(N / 2 * page_size);
> // incoming RDMA arrives targeted at buffer a that is of length (N*page_size)

again, I expect a remote protection fault.

Noting of course, both of these cases are only true if the underlying
VM is manipulated in a way that makes the pages unmapped (eg
mmap/munmap, not free)

I would also assume that attempts to RDMA write read only pages
protection fault as well.

> It does seem quite odd, abstractly speaking, that a registration
> would survive a free/re-malloc (which is arguably a "different"
> buffer).

Not at all: the purpose of the registration is to allow access via
RDMA to a portion of the process's address space. The address space
doesn't change, but what it is mapped to can vary.

So - the ODP semantics make much more sense, so much so I'm not sure
we need a ODP flag at all, but that can be discussed when the patches
are proposed...

> That being said, it still seems like MPI needs a registration cache.
> It is several good steps forward if we don't need to intercept
> free/sbrk/whatever, but when MPI_Send(buf, ...) is invoked, we still
> have to check that the entire buf is registered.  If ibv_reg_mr(...,
> 0, 2^64, ...) was supported, that would obviate the entire need for
> registration caches.  That would be wonderful.

Yes, except that this shifts around where the registration overhead
ends up. Basically the HCA driver now has the registration cache you
had in MPI, and all the same overheads still exist. No free lunch
here :(

Haggai: A verb to resize a registration would probably be a helpful
step. MPI could maintain one registration that covers the sbrk
region and one registration that covers the heap, much easier than
searching tables and things.

Also bear in mind that all RDMA access protections will be disabled if
you register the entire process VM, the remote(s) can scribble/read
everything..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2013-06-05 19:05 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-28 17:51 Status of "ummunot" branch? Jeff Squyres (jsquyres)
     [not found] ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F643196-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-05-28 17:52   ` Roland Dreier
     [not found]     ` <CAL1RGDUops1ju6zU=w3vKxcUcLHp6XJFKfBTDr4nm397UkhaYA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-05-28 18:30       ` Jeff Squyres (jsquyres)
2013-05-29  8:53   ` Or Gerlitz
     [not found]     ` <CAJZOPZJc2Dq2jQgRspP_2c1j=4aJ40UxcBEcyiY_mhHPX1ptPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-05-29 22:56       ` Jeff Squyres (jsquyres)
     [not found]         ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F64AAB7-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-05-30  5:09           ` Or Gerlitz
     [not found]             ` <51A6DEEC.40305-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-05-30 15:52               ` Jeff Squyres (jsquyres)
2013-06-04  1:24       ` Jeff Squyres (jsquyres)
     [not found]         ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F657918-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-04  8:37           ` Or Gerlitz
     [not found]             ` <51ADA761.2080107-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-04  9:54               ` Haggai Eran
     [not found]                 ` <51ADB948.5080903-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-04 10:56                   ` Jeff Squyres (jsquyres)
     [not found]                     ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F659155-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-04 11:50                       ` Haggai Eran
     [not found]                         ` <51ADD489.3020902-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-04 17:04                           ` Jason Gunthorpe
     [not found]                             ` <20130604170441.GA13745-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-05  7:09                               ` Haggai Eran
2013-06-04 20:13                           ` Jeff Squyres (jsquyres)
     [not found]                             ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F65AE40-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-05  7:14                               ` Haggai Eran
     [not found]                                 ` <51AEE53C.2090603-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-05 12:45                                   ` Jeff Squyres (jsquyres)
     [not found]                                     ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F65C855-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-05 13:39                                       ` Haggai Eran
     [not found]                                         ` <51AF3FA8.7000900-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-05 16:53                                           ` Jeff Squyres (jsquyres)
     [not found]                                             ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F65D5D3-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-05 17:14                                               ` Jason Gunthorpe
     [not found]                                                 ` <20130605171426.GC30184-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-05 18:10                                                   ` Jeff Squyres (jsquyres)
     [not found]                                                     ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F65DC0D-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-05 18:18                                                       ` Jason Gunthorpe
     [not found]                                                         ` <20130605181853.GB1946-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-05 18:45                                                           ` Jeff Squyres (jsquyres)
     [not found]                                                             ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F65DF6F-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-05 19:05                                                               ` Jason Gunthorpe [this message]
     [not found]                                                                 ` <20130605190529.GA3044-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-06  2:58                                                                   ` Jeff Squyres (jsquyres)
2013-06-06  5:52                                                                   ` Haggai Eran
     [not found]                                                                     ` <51B023B9.9050000-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-06 23:33                                                                       ` Jeff Squyres (jsquyres)
     [not found]                                                                         ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F66B79C-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-07 22:59                                                                           ` Jeff Squyres (jsquyres)
     [not found]                                                                             ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F66E403-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-07 23:57                                                                               ` Jason Gunthorpe
     [not found]                                                                                 ` <20130607235731.GA25942-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-10  9:17                                                                                   ` Liran Liss
2013-06-10 14:49                                                                                   ` Jeff Squyres (jsquyres)
     [not found]                                                                                     ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F676E59-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-10 15:56                                                                                       ` Liran Liss
     [not found]                                                                                         ` <D554B471892C914E90E136467281724DAD695B50-fViJhHBwANKuSA5JZHE7gA@public.gmane.org>
2013-06-12 21:10                                                                                           ` Jeff Squyres (jsquyres)
     [not found]                                                                                             ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F6808D7-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-12 21:17                                                                                               ` Jason Gunthorpe
     [not found]                                                                                                 ` <20130612211742.GA8625-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-14 22:48                                                                                                   ` Jeff Squyres (jsquyres)
2013-06-10 17:26                                                                                       ` Jason Gunthorpe
     [not found]                                                                                         ` <20130610172627.GC2391-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-12 21:18                                                                                           ` Jeff Squyres (jsquyres)
     [not found]                                                                                             ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F680A2B-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-12 21:47                                                                                               ` Jason Gunthorpe
     [not found]                                                                                                 ` <20130612214708.GD8625-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-14 22:53                                                                                                   ` Jeff Squyres (jsquyres)
     [not found]                                                                                                     ` <EF66BBEB19BADC41AC8CCF5F684F07FC4F6886C8-nsZYYkk5h5QQ2GdVW7+PtKBKnGwkPULj@public.gmane.org>
2013-06-14 23:11                                                                                                       ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130605190529.GA3044@obsidianresearch.com \
    --to=jgunthorpe-epgobjl8dl3ta4ec/59zmfatqe2ktcn/@public.gmane.org \
    --cc=haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox