From: Haggai Eran <haggaie@mellanox.com>
To: Stephen Bates <Stephen.Bates@pmcs.com>,
	Sagi Grimberg <sagig@dev.mellanox.co.il>,
	Jason Gunthorpe <jgunthorpe@obsidianresearch.com>,
	Christoph Hellwig <hch@infradead.org>,
	"'Logan Gunthorpe' (logang@deltatee.com)" <logang@deltatee.com>
Cc: Artemy Kovalyov <artemyko@mellanox.com>,
	"dledford@redhat.com" <dledford@redhat.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Leon Romanovsky <leonro@mellanox.com>,
	"sagig@mellanox.com" <sagig@mellanox.com>
Subject: Re: [RFC 0/7] Peer-direct memory
Date: Sun, 21 Feb 2016 11:06:27 +0200
Message-ID: <56C97E13.9090101@mellanox.com>
In-Reply-To: <36F6EBABA23FEF4391AF72944D228901EB70C102@BBYEXM01.pmc-sierra.internal>

On 18/02/2016 16:44, Stephen Bates wrote:
> Sagi
> 
>> CC'ing sbates who played with this stuff at some point...
> 
> Thanks for inviting me to this party Sagi ;-). Here are some comments and responses based on our experiences. Apologies in advance for the list format:
> 
> 1. As it stands in 4.5-rc4, devm_memremap_pages will not work with iomem. Logan (cc'ed here) and I (mostly Logan) developed the ability to do that in an out-of-tree patch for memremap.c. We also developed a simple example driver for a PCIe device that exposes DRAM on the card via a BAR. We used this code to provide some feedback to Dan (e.g. [1]-[3]). At this time we are preparing an RFC to extend devm_memremap_pages for IO memory, and we hope to have that ready soon, but there is no guarantee our approach is acceptable to the community. My hope is that it will be a good starting point for moving forward...

I'd be happy to see your RFC when it is ready. I see in the thread
of [3] that you are using write-combining. Do you think your patchset
will also be suitable for uncacheable memory?

> 2. The two good things about Peer-Direct are that it works and it is here today. That said, I do think an approach based on ZONE_DEVICE is more general and a preferred way to allow IO devices to communicate with each other. The question is: can we find such an approach that is acceptable to the community? As noted in point 1, I hope the coming RFC will initiate a discussion. I have also requested attendance at LSF/MM to discuss this topic (among others).
> 
> 3. As of now the section alignment requirement is somewhat relaxed. I quote from [4]. 
> 
> "I could loosen the restriction a bit to allow one unaligned mapping
> per section.  However, if another mapping request came along that
> tried to map a free part of the section it would fail because the code
> depends on a  "1 dev_pagemap per section" relationship.  Seems an ok
> compromise to me..."
> 
> This is implemented in 4.5-rc4 (see memremap.c line 315).

I don't think that's enough for our purposes. We have devices with
rather small BARs (32MB) and multiple PFs that all need to expose their
BAR to peer-to-peer access. One can expect these PFs to be assigned
adjacent addresses, so several BARs will fall within the same section
and break the "one dev_pagemap per section" rule.

> 4. The out-of-tree patch we did allows one to register the device memory as IO memory. However, we were only concerned with DRAM exposed on the BAR and so were not affected by the "i/o side effects" issues. Someone would need to think about how this applies to IOMEM that does have side effects when accessed.

With this RFC, we take parts of the HCA BAR that were mmapped to a process
(both uncacheable and write-combining) and map them to a peer device
(another HCA). As long as the kernel doesn't do anything else with
these pages, and leaves them to be controlled by the user-space
application and/or the peer device, I don't see a problem with mapping
IO memory with side effects. However, I'm not an expert here, and I'd
be happy to hear what others think about this.

> 5. I concur with Sagi's comment below that one approach we can use to inform 3rd party device drivers about vanishing memory regions is via mmu_notifiers. However, this needs to be fleshed out and tied into the relevant driver(s).
> 
> 6. In full disclosure, my main interest in this ties in to NVM Express devices which can act as DMA masters and expose regions of IOMEM at the same time (via CMBs). I want to be able to tie these devices together with other IO devices (like RDMA NICs, FPGA and GPGPU based offload engines, other NVMe devices and storage adaptors) in a peer-2-peer fashion and may not always have a RDMA device in the mix...

I understand.

Regards,
Haggai
