From: Logan Gunthorpe <logang-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org,
linux-block-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: "Jens Axboe" <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>,
"Benjamin Herrenschmidt"
<benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>,
"Keith Busch"
<keith.busch-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
"Jérôme Glisse" <jglisse-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
"Jason Gunthorpe" <jgg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
"Bjorn Helgaas"
<bhelgaas-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
"Max Gurtovoy" <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
"Christoph Hellwig" <hch-jcswGhMUV9g@public.gmane.org>
Subject: [PATCH 00/11] Copy Offload in NVMe Fabrics with P2P PCI Memory
Date: Thu, 4 Jan 2018 12:01:25 -0700
Message-ID: <20180104190137.7654-1-logang@deltatee.com>
Hello,
This is a continuation of our work to enable using Peer-to-Peer PCI
memory in NVMe fabrics targets. Many thanks go to Christoph Hellwig who
provided valuable feedback to get these patches to where they are today.
The concept here is to use memory that's exposed on a PCI BAR as
data buffers in the NVMe target code such that data can be transferred
from an RDMA NIC to the special memory and then directly to an NVMe
device, avoiding system memory entirely. The upside of this is better
QoS for applications running on the CPU utilizing memory and lower
PCI bandwidth required to the CPU (such that systems could be designed
with fewer lanes connected to the CPU). However, the trade-off is
currently a reduction in overall throughput (largely due to hardware
issues that would certainly improve in the future).
Due to these trade-offs we've designed the system to only enable using
the PCI memory in cases where the NIC, NVMe devices and memory are all
behind the same PCI switch. This will mean many setups that could likely
work well will not be supported so that we can be more confident it
will work and not place any responsibility on the user to understand
their topology. (We chose to go this route based on feedback we
received at the last LSF.) Future work may enable these transfers behind
a fabric of PCI switches or perhaps using a white list of known good
root complexes.
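As an illustration only (this is not code from the series), the "all behind the same PCI switch" rule can be sketched in userspace C. The types and function names below (`toy_dev`, `toy_switch_of`, `toy_all_behind_same_switch`) are hypothetical, and the model assumes a device is "behind" a switch when its nearest switch upstream port is the same as that of every other device.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified PCI topology node; not a kernel structure. */
enum toy_type {
	TOY_ROOT_PORT,
	TOY_UPSTREAM_PORT,	/* upstream port of a PCI switch */
	TOY_DOWNSTREAM_PORT,	/* downstream port of a PCI switch */
	TOY_ENDPOINT,		/* NIC, NVMe device, memory provider */
};

struct toy_dev {
	const struct toy_dev *parent;	/* NULL at the top of the tree */
	enum toy_type type;
};

/* Return the nearest switch upstream port above dev, or NULL if none. */
static const struct toy_dev *toy_switch_of(const struct toy_dev *dev)
{
	for (dev = dev->parent; dev; dev = dev->parent)
		if (dev->type == TOY_UPSTREAM_PORT)
			return dev;
	return NULL;
}

/* True only if every device sits behind one and the same switch. */
static bool toy_all_behind_same_switch(const struct toy_dev *const *devs,
				       size_t n)
{
	const struct toy_dev *sw;
	size_t i;

	if (n == 0)
		return false;

	sw = toy_switch_of(devs[0]);
	if (!sw)
		return false;	/* first device is not behind any switch */

	for (i = 1; i < n; i++)
		if (toy_switch_of(devs[i]) != sw)
			return false;
	return true;
}
```

In this model a NIC and an NVMe device hanging off two downstream ports of one switch pass the check, while a device attached directly to a root port does not.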
In order to enable this functionality, we introduce a few new PCI
functions such that a driver can register P2P memory with the system.
Struct pages are created for this memory using devm_memremap_pages()
and the PCI bus offset is stored in the corresponding pagemap structure.
Another set of functions allows a client driver to create a list of
client devices that will be used in a given P2P transaction and then
use that list to find any P2P memory that is supported by all the
client devices. This list is then also used to selectively disable the
ACS bits for the downstream ports behind these devices.
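The client-list lookup described above might be pictured roughly as follows. This is a userspace sketch under stated assumptions, not the API added by the patches: `toy_node` and `toy_find_p2pmem` are made-up names, and "supported by" is reduced to "shares the same switch identifier".

```c
#include <stddef.h>

/* Hypothetical stand-in for a device; switch_id < 0 means "no switch". */
struct toy_node {
	int switch_id;
};

/*
 * Return the first memory provider that every client can use, or NULL.
 * Assumption: a provider is usable by a client when both sit behind
 * the same switch.
 */
static const struct toy_node *
toy_find_p2pmem(const struct toy_node *providers, size_t nr_providers,
		const struct toy_node *clients, size_t nr_clients)
{
	size_t p, c;

	if (nr_clients == 0)
		return NULL;

	for (p = 0; p < nr_providers; p++) {
		if (providers[p].switch_id < 0)
			continue;

		for (c = 0; c < nr_clients; c++)
			if (clients[c].switch_id != providers[p].switch_id)
				break;

		if (c == nr_clients)	/* no client rejected this provider */
			return &providers[p];
	}
	return NULL;
}
```

A single incompatible client is enough to reject a provider, which matches the "supported by all the client devices" wording.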
In the block layer, we also introduce a P2P request flag to indicate a
given request targets P2P memory as well as a flag for a request queue
to indicate a given queue supports targeting P2P memory. P2P requests
will only be accepted by queues that support it. Also, P2P requests
are marked to not be merged, since a non-homogeneous request would
complicate the DMA mapping requirements.
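A minimal sketch of the two flags and the no-merge rule, again in userspace C with invented names (`TOY_REQ_P2P`, `TOY_QUEUE_P2P`, and so on are assumptions, not the identifiers used in the block layer patch):

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real names and values live in the series. */
#define TOY_REQ_P2P	(1u << 0)	/* request targets P2P memory */
#define TOY_REQ_NOMERGE	(1u << 1)	/* request must not be merged */
#define TOY_QUEUE_P2P	(1u << 0)	/* queue can handle P2P memory */

struct toy_request {
	unsigned int flags;
};

struct toy_queue {
	unsigned int flags;
};

/* Mark a request as P2P; P2P requests are never merge candidates. */
static void toy_mark_p2p(struct toy_request *rq)
{
	rq->flags |= TOY_REQ_P2P | TOY_REQ_NOMERGE;
}

/* A P2P request is only accepted by a queue that advertises support. */
static bool toy_queue_accepts(const struct toy_queue *q,
			      const struct toy_request *rq)
{
	if (rq->flags & TOY_REQ_P2P)
		return (q->flags & TOY_QUEUE_P2P) != 0;
	return true;
}

/* Two requests may merge only if neither is marked no-merge. */
static bool toy_may_merge(const struct toy_request *a,
			  const struct toy_request *b)
{
	return !((a->flags | b->flags) & TOY_REQ_NOMERGE);
}
```

Refusing merges keeps every request homogeneous: it carries either only P2P pages or only regular pages, so a single DMA mapping path applies to the whole request.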
In the PCI NVMe driver, we modify the existing CMB support to utilize
the new PCI P2P memory infrastructure and also add support for P2P
memory in its request queue. When a P2P request is received it uses the
pci_p2pmem_map_sg() function which applies the necessary transformation
to get the correct pci_bus_addr_t for the DMA transactions.
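To make the bus-offset transformation concrete, here is a userspace sketch. It is not the body of pci_p2pmem_map_sg(); the types and the helper `toy_p2pmem_map_sg` are hypothetical, and the sign convention (offset = CPU physical address minus PCI bus address of the BAR) is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_phys_addr_t;	/* CPU physical address */
typedef uint64_t toy_bus_addr_t;	/* address as seen on the PCI bus */

/* Hypothetical stand-in for the per-BAR pagemap data. */
struct toy_pagemap {
	/* Assumed convention: CPU physical start minus PCI bus start. */
	uint64_t bus_offset;
};

/* Hypothetical, flattened scatterlist entry. */
struct toy_sg {
	toy_phys_addr_t phys;
	uint32_t len;
	toy_bus_addr_t dma_addr;
	uint32_t dma_len;
};

/*
 * Fill in the DMA address of each entry by subtracting the stored bus
 * offset. No IOMMU or dma_map_sg() style mapping is modelled here.
 */
static size_t toy_p2pmem_map_sg(const struct toy_pagemap *pgmap,
				struct toy_sg *sg, size_t nents)
{
	size_t i;

	for (i = 0; i < nents; i++) {
		sg[i].dma_addr = sg[i].phys - pgmap->bus_offset;
		sg[i].dma_len = sg[i].len;
	}
	return nents;
}
```

For example, if a BAR appears at CPU physical address 0x2000000000 but at bus address 0x80000000, the offset is 0x1F80000000 and a page at physical 0x2000001000 maps to bus address 0x80001000.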
In the RDMA core, we also adjust rdma_rw_ctx_init() and
rdma_rw_ctx_destroy() to take a flags argument which indicates whether
to use the PCI P2P mapping functions or not.
Finally, in the NVMe fabrics target port we introduce a new
configuration boolean: 'allow_p2pmem'. When set, the port will attempt
to find P2P memory supported by the RDMA NIC and all namespaces. If
supported memory is found, it will be used in all IO transfers. And if
a port is using P2P memory, adding new namespaces that are not supported
by that memory will fail.
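The port behaviour could be modelled roughly like this. Only the 'allow_p2pmem' name comes from the series; `toy_port`, `toy_port_add_ns`, the switch-identifier compatibility test, and the choice of -EINVAL as the error value are all assumptions for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a target port; not the nvmet structure. */
struct toy_port {
	bool allow_p2pmem;	/* mirrors the 'allow_p2pmem' boolean */
	int p2pmem_switch_id;	/* switch of the chosen memory, -1 if none */
};

/*
 * Try to add a namespace whose backing device sits behind ns_switch_id.
 * Assumption: "supported by that memory" means "behind the same switch".
 * Returns 0 on success or a negative errno-style value on failure.
 */
static int toy_port_add_ns(const struct toy_port *port, int ns_switch_id)
{
	/* Ports not using P2P memory accept any namespace. */
	if (!port->allow_p2pmem || port->p2pmem_switch_id < 0)
		return 0;

	if (ns_switch_id != port->p2pmem_switch_id)
		return -EINVAL;	/* would break the all-P2P guarantee */

	return 0;
}
```

Failing the add, rather than silently falling back to system memory, keeps the port's IO path uniform once P2P memory is in use.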
This series is based off of Christoph's v3 series to revamp
dev_pagemap. A git repo of the patches is available here[2].
Logan
Christoph Hellwig (2):
  nvme-pci: clean up CMB initialization
  nvme-pci: clean up SMBSZ bit definitions
Logan Gunthorpe (10):
  pci-p2p: Support peer to peer memory
  pci-p2p: Add sysfs group to display p2pmem stats
  pci-p2p: Add PCI p2pmem dma mappings to adjust the bus offset
  pci-p2p: Clear ACS P2P flags for all client devices
  block: Introduce PCI P2P flags for request and request queue
  IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]()
  nvme-pci: Use PCI p2pmem subsystem to manage the CMB
  nvme-pci: Add support for P2P memory in requests
  nvme-pci: Add a quirk for a pseudo CMB
  nvmet: Optionally use PCI P2P memory
Documentation/ABI/testing/sysfs-bus-pci | 25 +
block/blk-core.c | 3 +
drivers/infiniband/core/rw.c | 22 +-
drivers/infiniband/ulp/isert/ib_isert.c | 5 +-
drivers/infiniband/ulp/srpt/ib_srpt.c | 7 +-
drivers/nvme/host/core.c | 4 +
drivers/nvme/host/nvme.h | 8 +
drivers/nvme/host/pci.c | 164 ++++---
drivers/nvme/target/configfs.c | 29 ++
drivers/nvme/target/core.c | 95 +++-
drivers/nvme/target/io-cmd.c | 3 +
drivers/nvme/target/nvmet.h | 10 +
drivers/nvme/target/rdma.c | 41 +-
drivers/pci/Kconfig | 14 +
drivers/pci/Makefile | 1 +
drivers/pci/p2p.c | 781 ++++++++++++++++++++++++++++++++
include/linux/blk_types.h | 18 +-
include/linux/blkdev.h | 2 +
include/linux/memremap.h | 19 +
include/linux/nvme.h | 22 +-
include/linux/pci-p2p.h | 94 ++++
include/linux/pci.h | 6 +
include/rdma/rw.h | 7 +-
net/sunrpc/xprtrdma/svc_rdma_rw.c | 6 +-
24 files changed, 1291 insertions(+), 95 deletions(-)
create mode 100644 drivers/pci/p2p.c
create mode 100644 include/linux/pci-p2p.h
--
2.11.0