* RE: [PATCHv12 0/3] rdmacg: IB/core: rdma controller support
From: Liran Liss @ 2016-11-04 4:20 UTC (permalink / raw)
To: Leon Romanovsky, Parav Pandit
Cc: Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma, Li Zefan, Johannes Weiner, Doug Ledford,
Christoph Hellwig, Hefty, Sean, Jason Gunthorpe, Haggai Eran,
james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
Or Gerlitz, Matan Barak
In-Reply-To: <20161103180006.GL3617-2ukJVAZIZ/Y@public.gmane.org>
> From: Leon Romanovsky [mailto:leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org]
> We (Tejun, Christoph, Matan and me) had a face-to-face talk during KS/LPC and
> decided that the best way to move forward is to export to user one object
> (global HCA like) only and don't export anything else.
>
> All internal calculations will be based on this percentage.
>
> Once the cgroups users will come with reasonable justification why they need to
> configure different unexposed objects, we will expose them.
A global HCA metric is indeed in the right direction.
However, rethinking this, I think that we should specify the metric in terms of RDMA objects rather than percentage.
Basically, any resource that consumes an IDR is charged.
The reasons are:
- Some HCAs can have a huge amount of resources (millions of objects), of which even a small percentage may consume a considerable amount of kernel memory.
- We follow the same notion as FD limits, which accounts for numerous resource types that consume file objects in the kernel (files, pipes, sockets)
- The namespaces for RDMA resources are large (usually 24 bits). So even large resource counts won't come anywhere close to depleting the namespace. (Compare that to the mere 64K socket port space...)
- The metric measures the actual application usage of resources, rather than a proportion of the resources of a given HCA.
- We can continue to use the cgroup mechanism for charging (just as in the original proposal)
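To make the points above concrete, here is a minimal sketch of what a single absolute object budget could look like. The struct and function names are hypothetical and are not taken from the rdmacg patches; the only assumption carried over from the proposal is that every IDR-consuming object costs one unit, whatever its type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-cgroup state; not the actual rdmacg data structures. */
struct rdma_obj_budget {
	uint32_t max_objects;  /* absolute limit, counted in objects */
	uint32_t used_objects; /* objects currently charged */
};

/*
 * Try to charge one IDR-consuming object. The object type (for example
 * a QP, CQ or MR) is deliberately not a parameter: all types draw from
 * the same budget.
 */
static bool budget_try_charge(struct rdma_obj_budget *b)
{
	if (b->used_objects >= b->max_objects)
		return false; /* over the limit, creation would be refused */
	b->used_objects++;
	return true;
}

/* Release one previously charged object. */
static void budget_uncharge(struct rdma_obj_budget *b)
{
	if (b->used_objects > 0)
		b->used_objects--;
}
```

With a limit of two objects, two charges succeed and a third is refused until one object is released, independent of how many resources the underlying HCA offers.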
I have discussed this matter with Doug and Matan, and it seems like this is the right direction.
--Liran
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCHv12 0/3] rdmacg: IB/core: rdma controller support
From: Leon Romanovsky @ 2016-11-04 4:20 UTC (permalink / raw)
To: Parav Pandit
Cc: Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-rdma, Li Zefan,
Johannes Weiner, Doug Ledford, Christoph Hellwig, Liran Liss,
Hefty, Sean, Jason Gunthorpe, Haggai Eran,
james.l.morris-QHcLZuEGTsvQT0dZR+AlfA, Or Gerlitz, Matan Barak
In-Reply-To: <20161103180006.GL3617-2ukJVAZIZ/Y@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 1308 bytes --]
On Thu, Nov 03, 2016 at 08:00:06PM +0200, Leon Romanovsky wrote:
> On Tue, Nov 01, 2016 at 04:33:23PM +0530, Parav Pandit wrote:
> > So my opinion is:
> > (a) Let cgroup define the current standard objects and new reasonable
> > set of vendor specific objects in future.
> > (b) Add new rdma.percentage parameter so that any new standard object
> > or vendor specific object can be abstracted from average end user and
> > applications which are yet to catch up.
> > I believe this takes care of your point (1), (3), (4)?
>
> We (Tejun, Christoph, Matan and me) had a face-to-face talk during
> KS/LPC and decided that the best way to move forward is to export to
> user one object (global HCA like) only and don't export anything else.
>
> All internal calculations will be based on this percentage.
To simplify things further for users and developers, this global cgroup
object should be based not on a percentage but on an actual number of
object units, where an object unit is any object that consumes an IDR.
The IDR consumers can be of any type. This simplification will give
the cgroup excellent scalability without sacrificing user experience.
>
> Once the cgroups users will come with reasonable justification why they
> need to configure different unexposed objects, we will expose them.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
* Re: [PATCH rdma-core] rxe: Use default dual-license instead of PathScale
From: Leon Romanovsky @ 2016-11-04 4:12 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, monis-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161104004813.GB30318-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 477 bytes --]
On Thu, Nov 03, 2016 at 06:48:13PM -0600, Jason Gunthorpe wrote:
> On Thu, Nov 03, 2016 at 05:49:15PM +0200, Leon Romanovsky wrote:
> > Remove the patent clauses from RXE copyright notice.
> >
> > Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>
> Reviewed-by: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
>
> Thanks for getting this addressed!
Thanks
https://github.com/linux-rdma/rdma-core/pull/33
>
> Jason
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
* Re: RDMA developer gatherings around Kernel Summit and Linux Plumbers in Santa Fe
From: Matan Barak @ 2016-11-04 4:12 UTC (permalink / raw)
To: Liran Liss
Cc: Doug Ledford, Christoph Lameter,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
skc-YOWKrPYUwWM@public.gmane.org,
ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
Jason Gunthorpe,
john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
knut.omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org, Matan Barak
In-Reply-To: <HE1PR0501MB2812B2B88993AC90E0FBD72AB1A30-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
On Thu, Nov 3, 2016 at 11:54 PM, Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> Matan told me that he will advertise a git with the latest patches applied by EOD.
>
>> -----Original Message-----
>> From: Doug Ledford [mailto:dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
>> Sent: Thursday, November 03, 2016 3:41 PM
>> To: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> Cc: skc-YOWKrPYUwWM@public.gmane.org; ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org; Jason Gunthorpe
>> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>; john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org; leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org;
>> Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>; knut.omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org; Matan Barak
>> <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Subject: Re: RDMA developer gatherings around Kernel Summit and Linux
>> Plumbers in Santa Fe
>>
>> On 11/3/16 2:49 PM, Christoph Lameter wrote:
>> >> Saturday sessions 9am till 4pm. 12-1pm Lunchtime
>> >>
>> >> 9am Refine TODO list for consolidated library - Jason Gunthorpe
>> >> 10am Submission process for multi subsystem drivers - Doug Ledford
>> >> 11am Multicast features and gaps - Christoph Lameter
>> >>
>> >> 1pm Licensing carryover - Susan/Christoph
>> >> 2pm Standard network tools, integrating to the regular network
>> stack - Christoph
>> >> 3pm Open Discussion/Reserve Session - TBD
>> >> 4pm Closing Session - TBD
>> >
>> > Ok we have an on going conversation regarding the ioctl and I think
>> > that is of high importance. We tried to find a room for a meeting on
>> > Friday on this but we do not have access to a projector. I would like
>> > to have this issue dealt with first on Saturday and then we can
>> > rearrange times for the other presentations. I could skip some of my
>> > sessions if necessary and we have 2 hours that are pretty flexible at
>> > the end anyways. I hope that is agreeable to everyone?
>> >
>>
>> I'm agreeable with that.
>>
>> --
>> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> GPG Key ID: 0E572FDD
>> Red Hat, Inc.
>> 100 E. Davie St
>> Raleigh, NC 27601 USA
>
I would like to clean the series up a bit and make it bisect-able
before converting it from RFC to actual patches. So it's not just
wrapping with CONFIG_EXPERIMENTAL.
In the meantime, you could review the tree in my github:
https://github.com/matanb10/linux branch: abi_rfc_v5
Regards,
Matan
* Re: [PATCH rdma-core] rxe: Use default dual-license instead of PathScale
From: Jason Gunthorpe @ 2016-11-04 0:48 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, monis-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1478188155-24018-1-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
On Thu, Nov 03, 2016 at 05:49:15PM +0200, Leon Romanovsky wrote:
> Remove the patent clauses from RXE copyright notice.
>
> Signed-off-by: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Reviewed-by: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Thanks for getting this addressed!
Jason
* Re: [PATCH rdma-core 0/8] libpvrdma: userspace library for PVRDMA
From: Jason Gunthorpe @ 2016-11-04 0:46 UTC (permalink / raw)
To: Adit Ranadive
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
On Thu, Nov 03, 2016 at 04:44:29PM -0700, Adit Ranadive wrote:
>
> I have included the shared ABI file here based on the RDMA fix up stuff
> that Jason pointed me to.
I left you some trivial notes on github.
The big item is that the shared ABI file must be byte for byte
identical to the kernel version, and it looks to me like it was
changed?
We still do not have a general solution to the need to add the header
struct in user space but not in kernel space, so you will need to
continue to get your enums from the kernel header but still have a
'copy' with the modified structs.
Does that make sense?
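As an illustration of the "byte for byte identical" requirement, here is a small sketch of a check that compares two header files. The function name is made up for this example, and nothing here is part of the rdma-core build; it only shows what the comparison amounts to.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Compare two files byte for byte.
 * Returns 1 if identical, 0 if they differ, -1 if either cannot be opened.
 * Hypothetical helper for illustration only.
 */
static int files_identical(const char *path_a, const char *path_b)
{
	FILE *a = fopen(path_a, "rb");
	FILE *b = fopen(path_b, "rb");
	int result = 1;

	if (!a || !b) {
		result = -1;
	} else {
		int ca, cb;

		do {
			ca = fgetc(a);
			cb = fgetc(b);
			if (ca != cb) {
				/* Different byte, or one file ended first. */
				result = 0;
				break;
			}
		} while (ca != EOF);
	}

	if (a)
		fclose(a);
	if (b)
		fclose(b);
	return result;
}
```

Even a whitespace-only change in the user space copy makes this return 0, which is the kind of drift being asked about.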
Jason
* Re: [PATCH rdma-core v2 4/4] redhat/spec: build split rpm packages
From: Jason Gunthorpe @ 2016-11-04 0:42 UTC (permalink / raw)
To: Doug Ledford
Cc: Jarod Wilson, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <581B9F91.4050407-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On Thu, Nov 03, 2016 at 02:35:29PM -0600, Doug Ledford wrote:
> >>> +%package -n librdmacm-utils
> >>> +Summary: Examples for the librdmacm library
> >>> +Requires: librdmacm%{?_isa} = %{version}-%{release}
> >>
> >> Why the requires? Shouldn't auto shlib dependencies take care of that?
> >
> > Probably. I think this was another legacy bit copied over from a
> > stand-alone spec file.
>
> Actually, no. When you have a -utils package that goes with a library
> package, standard procedure is to tie them directly like this. The auto
> dependency stuff will allow, say, librdmacm-1.1.17-1 and
> librdmacm-utils-1.1.16-1 to happily satisfy each other since the later
> librdmacm provides all of the sonames and apis that the -utils package
> needs. This is as designed as you want a librdmacm update to not
> trigger a required update of, say, openmpi, unless there is truly a
> change that requires it. But, for the utils that go with the library,
> even though we don't *have* to update them with the library, we want
> that to happen automatically, so the explicit requires makes that happen
> even if librdmacm-utils was excluded from the update command.
Okay, Jarod you will need to send a patch to put this back, because I
applied all the changes discussed in this email when I made the pull
request.
Thanks,
Jason
* [PATCH 8/8] libpvrdma: Add fix up for ABI file
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Use the fix up added by Jason to use the kernel version of pvrdma-abi.h
if it exists.
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
buildlib/RDMA_LinuxHeaders.cmake | 1 +
buildlib/fixup-include/rdma-pvrdma-abi.h | 297 +++++++++++++++++++++++++++++++
providers/pvrdma/pvrdma-abi.h | 297 -------------------------------
providers/pvrdma/pvrdma.h | 2 +-
4 files changed, 299 insertions(+), 298 deletions(-)
create mode 100644 buildlib/fixup-include/rdma-pvrdma-abi.h
delete mode 100644 providers/pvrdma/pvrdma-abi.h
diff --git a/buildlib/RDMA_LinuxHeaders.cmake b/buildlib/RDMA_LinuxHeaders.cmake
index c67b0a6..4689cd1 100644
--- a/buildlib/RDMA_LinuxHeaders.cmake
+++ b/buildlib/RDMA_LinuxHeaders.cmake
@@ -83,3 +83,4 @@ rdma_check_kheader("rdma/ib_user_mad.h" "${DEFAULT_TEST}")
rdma_check_kheader("rdma/rdma_netlink.h" "int main(int argc,const char *argv[]) { return RDMA_NL_IWPM_REMOTE_INFO && RDMA_NL_IWCM; }")
rdma_check_kheader("rdma/rdma_user_cm.h" "${DEFAULT_TEST}")
rdma_check_kheader("rdma/rdma_user_rxe.h" "${DEFAULT_TEST}")
+rdma_check_kheader("rdma/pvrdma-abi.h" "${DEFAULT_TEST}")
diff --git a/buildlib/fixup-include/rdma-pvrdma-abi.h b/buildlib/fixup-include/rdma-pvrdma-abi.h
new file mode 100644
index 0000000..c7a38c5
--- /dev/null
+++ b/buildlib/fixup-include/rdma-pvrdma-abi.h
@@ -0,0 +1,297 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_ABI_H__
+#define __PVRDMA_ABI_H__
+
+#include <infiniband/kern-abi.h>
+
+#define PVRDMA_UVERBS_ABI_VERSION 3
+#define PVRDMA_UAR_HANDLE_MASK 0x00FFFFFF /* Bottom 24 bits. */
+#define PVRDMA_UAR_QP_OFFSET 0 /* QP doorbell offset. */
+#define PVRDMA_UAR_QP_SEND BIT(30) /* Send bit. */
+#define PVRDMA_UAR_QP_RECV BIT(31) /* Recv bit. */
+#define PVRDMA_UAR_CQ_OFFSET 4 /* CQ doorbell offset. */
+#define PVRDMA_UAR_CQ_ARM_SOL BIT(29) /* Arm solicited bit. */
+#define PVRDMA_UAR_CQ_ARM BIT(30) /* Arm bit. */
+#define PVRDMA_UAR_CQ_POLL BIT(31) /* Poll bit. */
+
+enum pvrdma_wr_opcode {
+ PVRDMA_WR_RDMA_WRITE,
+ PVRDMA_WR_RDMA_WRITE_WITH_IMM,
+ PVRDMA_WR_SEND,
+ PVRDMA_WR_SEND_WITH_IMM,
+ PVRDMA_WR_RDMA_READ,
+ PVRDMA_WR_ATOMIC_CMP_AND_SWP,
+ PVRDMA_WR_ATOMIC_FETCH_AND_ADD,
+ PVRDMA_WR_LSO,
+ PVRDMA_WR_SEND_WITH_INV,
+ PVRDMA_WR_RDMA_READ_WITH_INV,
+ PVRDMA_WR_LOCAL_INV,
+ PVRDMA_WR_FAST_REG_MR,
+ PVRDMA_WR_MASKED_ATOMIC_CMP_AND_SWP,
+ PVRDMA_WR_MASKED_ATOMIC_FETCH_AND_ADD,
+ PVRDMA_WR_BIND_MW,
+ PVRDMA_WR_REG_SIG_MR,
+};
+
+enum pvrdma_wc_status {
+ PVRDMA_WC_SUCCESS,
+ PVRDMA_WC_LOC_LEN_ERR,
+ PVRDMA_WC_LOC_QP_OP_ERR,
+ PVRDMA_WC_LOC_EEC_OP_ERR,
+ PVRDMA_WC_LOC_PROT_ERR,
+ PVRDMA_WC_WR_FLUSH_ERR,
+ PVRDMA_WC_MW_BIND_ERR,
+ PVRDMA_WC_BAD_RESP_ERR,
+ PVRDMA_WC_LOC_ACCESS_ERR,
+ PVRDMA_WC_REM_INV_REQ_ERR,
+ PVRDMA_WC_REM_ACCESS_ERR,
+ PVRDMA_WC_REM_OP_ERR,
+ PVRDMA_WC_RETRY_EXC_ERR,
+ PVRDMA_WC_RNR_RETRY_EXC_ERR,
+ PVRDMA_WC_LOC_RDD_VIOL_ERR,
+ PVRDMA_WC_REM_INV_RD_REQ_ERR,
+ PVRDMA_WC_REM_ABORT_ERR,
+ PVRDMA_WC_INV_EECN_ERR,
+ PVRDMA_WC_INV_EEC_STATE_ERR,
+ PVRDMA_WC_FATAL_ERR,
+ PVRDMA_WC_RESP_TIMEOUT_ERR,
+ PVRDMA_WC_GENERAL_ERR,
+};
+
+enum pvrdma_wc_opcode {
+ PVRDMA_WC_SEND,
+ PVRDMA_WC_RDMA_WRITE,
+ PVRDMA_WC_RDMA_READ,
+ PVRDMA_WC_COMP_SWAP,
+ PVRDMA_WC_FETCH_ADD,
+ PVRDMA_WC_BIND_MW,
+ PVRDMA_WC_LSO,
+ PVRDMA_WC_LOCAL_INV,
+ PVRDMA_WC_FAST_REG_MR,
+ PVRDMA_WC_MASKED_COMP_SWAP,
+ PVRDMA_WC_MASKED_FETCH_ADD,
+ PVRDMA_WC_RECV = 1 << 7,
+ PVRDMA_WC_RECV_RDMA_WITH_IMM,
+};
+
+enum pvrdma_wc_flags {
+ PVRDMA_WC_GRH = 1 << 0,
+ PVRDMA_WC_WITH_IMM = 1 << 1,
+ PVRDMA_WC_WITH_INVALIDATE = 1 << 2,
+ PVRDMA_WC_IP_CSUM_OK = 1 << 3,
+ PVRDMA_WC_WITH_SMAC = 1 << 4,
+ PVRDMA_WC_WITH_VLAN = 1 << 5,
+ PVRDMA_WC_FLAGS_MAX = PVRDMA_WC_WITH_VLAN,
+};
+
+struct pvrdma_alloc_ucontext_resp {
+ struct ibv_get_context_resp ibv_resp;
+ __u32 qp_tab_size;
+ __u32 reserved;
+};
+
+struct pvrdma_alloc_pd_resp {
+ struct ibv_alloc_pd_resp ibv_resp;
+ __u32 pdn;
+ __u32 reserved;
+};
+
+struct pvrdma_create_cq {
+ struct ibv_create_cq ibv_cmd;
+ __u64 buf_addr;
+ __u32 buf_size;
+ __u32 reserved;
+};
+
+struct pvrdma_create_cq_resp {
+ struct ibv_create_cq_resp ibv_resp;
+ __u32 cqn;
+ __u32 reserved;
+};
+
+struct pvrdma_resize_cq {
+ struct ibv_resize_cq ibv_cmd;
+ __u64 buf_addr;
+ __u32 buf_size;
+ __u32 reserved;
+};
+
+struct pvrdma_create_srq {
+ struct ibv_create_srq ibv_cmd;
+ __u64 buf_addr;
+};
+
+struct pvrdma_create_srq_resp {
+ struct ibv_create_srq_resp ibv_resp;
+ __u32 srqn;
+ __u32 reserved;
+};
+
+struct pvrdma_create_qp {
+ struct ibv_create_qp ibv_cmd;
+ __u64 rbuf_addr;
+ __u64 sbuf_addr;
+ __u32 rbuf_size;
+ __u32 sbuf_size;
+ __u64 qp_addr;
+};
+
+/* PVRDMA masked atomic compare and swap */
+struct pvrdma_ex_cmp_swap {
+ __u64 swap_val;
+ __u64 compare_val;
+ __u64 swap_mask;
+ __u64 compare_mask;
+};
+
+/* PVRDMA masked atomic fetch and add */
+struct pvrdma_ex_fetch_add {
+ __u64 add_val;
+ __u64 field_boundary;
+};
+
+/* PVRDMA address vector. */
+struct pvrdma_av {
+ __u32 port_pd;
+ __u32 sl_tclass_flowlabel;
+ __u8 dgid[16];
+ __u8 src_path_bits;
+ __u8 gid_index;
+ __u8 stat_rate;
+ __u8 hop_limit;
+ __u8 dmac[6];
+ __u8 reserved[6];
+};
+
+/* PVRDMA scatter/gather entry */
+struct pvrdma_sge {
+ __u64 addr;
+ __u32 length;
+ __u32 lkey;
+};
+
+/* PVRDMA receive queue work request */
+struct pvrdma_rq_wqe_hdr {
+ __u64 wr_id; /* wr id */
+ __u32 num_sge; /* size of s/g array */
+ __u32 total_len; /* reserved */
+};
+/* Use pvrdma_sge (ib_sge) for receive queue s/g array elements. */
+
+/* PVRDMA send queue work request */
+struct pvrdma_sq_wqe_hdr {
+ __u64 wr_id; /* wr id */
+ __u32 num_sge; /* size of s/g array */
+ __u32 total_len; /* reserved */
+ __u32 opcode; /* operation type */
+ __u32 send_flags; /* wr flags */
+ union {
+ __u32 imm_data;
+ __u32 invalidate_rkey;
+ } ex;
+ __u32 reserved;
+ union {
+ struct {
+ __u64 remote_addr;
+ __u32 rkey;
+ __u8 reserved[4];
+ } rdma;
+ struct {
+ __u64 remote_addr;
+ __u64 compare_add;
+ __u64 swap;
+ __u32 rkey;
+ __u32 reserved;
+ } atomic;
+ struct {
+ __u64 remote_addr;
+ __u32 log_arg_sz;
+ __u32 rkey;
+ union {
+ struct pvrdma_ex_cmp_swap cmp_swap;
+ struct pvrdma_ex_fetch_add fetch_add;
+ } wr_data;
+ } masked_atomics;
+ struct {
+ __u64 iova_start;
+ __u64 pl_pdir_dma;
+ __u32 page_shift;
+ __u32 page_list_len;
+ __u32 length;
+ __u32 access_flags;
+ __u32 rkey;
+ } fast_reg;
+ struct {
+ __u32 remote_qpn;
+ __u32 remote_qkey;
+ struct pvrdma_av av;
+ } ud;
+ } wr;
+};
+/* Use pvrdma_sge (ib_sge) for send queue s/g array elements. */
+
+/* Completion queue element. */
+struct pvrdma_cqe {
+ __u64 wr_id;
+ __u64 qp;
+ __u32 opcode;
+ __u32 status;
+ __u32 byte_len;
+ __u32 imm_data;
+ __u32 src_qp;
+ __u32 wc_flags;
+ __u32 vendor_err;
+ __u16 pkey_index;
+ __u16 slid;
+ __u8 sl;
+ __u8 dlid_path_bits;
+ __u8 port_num;
+ __u8 smac[6];
+ __u8 reserved2[7]; /* Pad to next power of 2 (64). */
+};
+
+#endif /* __PVRDMA_ABI_H__ */
diff --git a/providers/pvrdma/pvrdma-abi.h b/providers/pvrdma/pvrdma-abi.h
deleted file mode 100644
index c7a38c5..0000000
--- a/providers/pvrdma/pvrdma-abi.h
+++ /dev/null
@@ -1,297 +0,0 @@
-/*
- * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of EITHER the GNU General Public License
- * version 2 as published by the Free Software Foundation or the BSD
- * 2-Clause License. This program is distributed in the hope that it
- * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
- * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
- * See the GNU General Public License version 2 for more details at
- * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program available in the file COPYING in the main
- * directory of this source tree.
- *
- * The BSD 2-Clause License
- *
- * Redistribution and use in source and binary forms, with or
- * without modification, are permitted provided that the following
- * conditions are met:
- *
- * - Redistributions of source code must retain the above
- * copyright notice, this list of conditions and the following
- * disclaimer.
- *
- * - Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following
- * disclaimer in the documentation and/or other materials
- * provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
- * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
- * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
- * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
- * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
- * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
- * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
- * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
- * OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef __PVRDMA_ABI_H__
-#define __PVRDMA_ABI_H__
-
-#include <infiniband/kern-abi.h>
-
-#define PVRDMA_UVERBS_ABI_VERSION 3
-#define PVRDMA_UAR_HANDLE_MASK 0x00FFFFFF /* Bottom 24 bits. */
-#define PVRDMA_UAR_QP_OFFSET 0 /* QP doorbell offset. */
-#define PVRDMA_UAR_QP_SEND BIT(30) /* Send bit. */
-#define PVRDMA_UAR_QP_RECV BIT(31) /* Recv bit. */
-#define PVRDMA_UAR_CQ_OFFSET 4 /* CQ doorbell offset. */
-#define PVRDMA_UAR_CQ_ARM_SOL BIT(29) /* Arm solicited bit. */
-#define PVRDMA_UAR_CQ_ARM BIT(30) /* Arm bit. */
-#define PVRDMA_UAR_CQ_POLL BIT(31) /* Poll bit. */
-
-enum pvrdma_wr_opcode {
- PVRDMA_WR_RDMA_WRITE,
- PVRDMA_WR_RDMA_WRITE_WITH_IMM,
- PVRDMA_WR_SEND,
- PVRDMA_WR_SEND_WITH_IMM,
- PVRDMA_WR_RDMA_READ,
- PVRDMA_WR_ATOMIC_CMP_AND_SWP,
- PVRDMA_WR_ATOMIC_FETCH_AND_ADD,
- PVRDMA_WR_LSO,
- PVRDMA_WR_SEND_WITH_INV,
- PVRDMA_WR_RDMA_READ_WITH_INV,
- PVRDMA_WR_LOCAL_INV,
- PVRDMA_WR_FAST_REG_MR,
- PVRDMA_WR_MASKED_ATOMIC_CMP_AND_SWP,
- PVRDMA_WR_MASKED_ATOMIC_FETCH_AND_ADD,
- PVRDMA_WR_BIND_MW,
- PVRDMA_WR_REG_SIG_MR,
-};
-
-enum pvrdma_wc_status {
- PVRDMA_WC_SUCCESS,
- PVRDMA_WC_LOC_LEN_ERR,
- PVRDMA_WC_LOC_QP_OP_ERR,
- PVRDMA_WC_LOC_EEC_OP_ERR,
- PVRDMA_WC_LOC_PROT_ERR,
- PVRDMA_WC_WR_FLUSH_ERR,
- PVRDMA_WC_MW_BIND_ERR,
- PVRDMA_WC_BAD_RESP_ERR,
- PVRDMA_WC_LOC_ACCESS_ERR,
- PVRDMA_WC_REM_INV_REQ_ERR,
- PVRDMA_WC_REM_ACCESS_ERR,
- PVRDMA_WC_REM_OP_ERR,
- PVRDMA_WC_RETRY_EXC_ERR,
- PVRDMA_WC_RNR_RETRY_EXC_ERR,
- PVRDMA_WC_LOC_RDD_VIOL_ERR,
- PVRDMA_WC_REM_INV_RD_REQ_ERR,
- PVRDMA_WC_REM_ABORT_ERR,
- PVRDMA_WC_INV_EECN_ERR,
- PVRDMA_WC_INV_EEC_STATE_ERR,
- PVRDMA_WC_FATAL_ERR,
- PVRDMA_WC_RESP_TIMEOUT_ERR,
- PVRDMA_WC_GENERAL_ERR,
-};
-
-enum pvrdma_wc_opcode {
- PVRDMA_WC_SEND,
- PVRDMA_WC_RDMA_WRITE,
- PVRDMA_WC_RDMA_READ,
- PVRDMA_WC_COMP_SWAP,
- PVRDMA_WC_FETCH_ADD,
- PVRDMA_WC_BIND_MW,
- PVRDMA_WC_LSO,
- PVRDMA_WC_LOCAL_INV,
- PVRDMA_WC_FAST_REG_MR,
- PVRDMA_WC_MASKED_COMP_SWAP,
- PVRDMA_WC_MASKED_FETCH_ADD,
- PVRDMA_WC_RECV = 1 << 7,
- PVRDMA_WC_RECV_RDMA_WITH_IMM,
-};
-
-enum pvrdma_wc_flags {
- PVRDMA_WC_GRH = 1 << 0,
- PVRDMA_WC_WITH_IMM = 1 << 1,
- PVRDMA_WC_WITH_INVALIDATE = 1 << 2,
- PVRDMA_WC_IP_CSUM_OK = 1 << 3,
- PVRDMA_WC_WITH_SMAC = 1 << 4,
- PVRDMA_WC_WITH_VLAN = 1 << 5,
- PVRDMA_WC_FLAGS_MAX = PVRDMA_WC_WITH_VLAN,
-};
-
-struct pvrdma_alloc_ucontext_resp {
- struct ibv_get_context_resp ibv_resp;
- __u32 qp_tab_size;
- __u32 reserved;
-};
-
-struct pvrdma_alloc_pd_resp {
- struct ibv_alloc_pd_resp ibv_resp;
- __u32 pdn;
- __u32 reserved;
-};
-
-struct pvrdma_create_cq {
- struct ibv_create_cq ibv_cmd;
- __u64 buf_addr;
- __u32 buf_size;
- __u32 reserved;
-};
-
-struct pvrdma_create_cq_resp {
- struct ibv_create_cq_resp ibv_resp;
- __u32 cqn;
- __u32 reserved;
-};
-
-struct pvrdma_resize_cq {
- struct ibv_resize_cq ibv_cmd;
- __u64 buf_addr;
- __u32 buf_size;
- __u32 reserved;
-};
-
-struct pvrdma_create_srq {
- struct ibv_create_srq ibv_cmd;
- __u64 buf_addr;
-};
-
-struct pvrdma_create_srq_resp {
- struct ibv_create_srq_resp ibv_resp;
- __u32 srqn;
- __u32 reserved;
-};
-
-struct pvrdma_create_qp {
- struct ibv_create_qp ibv_cmd;
- __u64 rbuf_addr;
- __u64 sbuf_addr;
- __u32 rbuf_size;
- __u32 sbuf_size;
- __u64 qp_addr;
-};
-
-/* PVRDMA masked atomic compare and swap */
-struct pvrdma_ex_cmp_swap {
- __u64 swap_val;
- __u64 compare_val;
- __u64 swap_mask;
- __u64 compare_mask;
-};
-
-/* PVRDMA masked atomic fetch and add */
-struct pvrdma_ex_fetch_add {
- __u64 add_val;
- __u64 field_boundary;
-};
-
-/* PVRDMA address vector. */
-struct pvrdma_av {
- __u32 port_pd;
- __u32 sl_tclass_flowlabel;
- __u8 dgid[16];
- __u8 src_path_bits;
- __u8 gid_index;
- __u8 stat_rate;
- __u8 hop_limit;
- __u8 dmac[6];
- __u8 reserved[6];
-};
-
-/* PVRDMA scatter/gather entry */
-struct pvrdma_sge {
- __u64 addr;
- __u32 length;
- __u32 lkey;
-};
-
-/* PVRDMA receive queue work request */
-struct pvrdma_rq_wqe_hdr {
- __u64 wr_id; /* wr id */
- __u32 num_sge; /* size of s/g array */
- __u32 total_len; /* reserved */
-};
-/* Use pvrdma_sge (ib_sge) for receive queue s/g array elements. */
-
-/* PVRDMA send queue work request */
-struct pvrdma_sq_wqe_hdr {
- __u64 wr_id; /* wr id */
- __u32 num_sge; /* size of s/g array */
- __u32 total_len; /* reserved */
- __u32 opcode; /* operation type */
- __u32 send_flags; /* wr flags */
- union {
- __u32 imm_data;
- __u32 invalidate_rkey;
- } ex;
- __u32 reserved;
- union {
- struct {
- __u64 remote_addr;
- __u32 rkey;
- __u8 reserved[4];
- } rdma;
- struct {
- __u64 remote_addr;
- __u64 compare_add;
- __u64 swap;
- __u32 rkey;
- __u32 reserved;
- } atomic;
- struct {
- __u64 remote_addr;
- __u32 log_arg_sz;
- __u32 rkey;
- union {
- struct pvrdma_ex_cmp_swap cmp_swap;
- struct pvrdma_ex_fetch_add fetch_add;
- } wr_data;
- } masked_atomics;
- struct {
- __u64 iova_start;
- __u64 pl_pdir_dma;
- __u32 page_shift;
- __u32 page_list_len;
- __u32 length;
- __u32 access_flags;
- __u32 rkey;
- } fast_reg;
- struct {
- __u32 remote_qpn;
- __u32 remote_qkey;
- struct pvrdma_av av;
- } ud;
- } wr;
-};
-/* Use pvrdma_sge (ib_sge) for send queue s/g array elements. */
-
-/* Completion queue element. */
-struct pvrdma_cqe {
- __u64 wr_id;
- __u64 qp;
- __u32 opcode;
- __u32 status;
- __u32 byte_len;
- __u32 imm_data;
- __u32 src_qp;
- __u32 wc_flags;
- __u32 vendor_err;
- __u16 pkey_index;
- __u16 slid;
- __u8 sl;
- __u8 dlid_path_bits;
- __u8 port_num;
- __u8 smac[6];
- __u8 reserved2[7]; /* Pad to next power of 2 (64). */
-};
-
-#endif /* __PVRDMA_ABI_H__ */
diff --git a/providers/pvrdma/pvrdma.h b/providers/pvrdma/pvrdma.h
index d3df07d..703cb5f 100644
--- a/providers/pvrdma/pvrdma.h
+++ b/providers/pvrdma/pvrdma.h
@@ -55,10 +55,10 @@
#include <netinet/in.h>
#include <sys/mman.h>
#include <infiniband/driver.h>
+#include <rdma/pvrdma-abi.h>
#define BIT(nr) (1UL << (nr))
-#include "pvrdma-abi.h"
#include "pvrdma_ring.h"
#ifndef rmb
--
2.7.4
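The comment on reserved2 in struct pvrdma_cqe says the structure is padded to 64 bytes. A layout like that can be checked at compile time. The sketch below mirrors the fields from the patch under an illustrative name, with <stdint.h> types standing in for __u64, __u32, __u16 and __u8 so that it builds outside the kernel headers. It assumes a common ABI with natural alignment and is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of struct pvrdma_cqe from the patch; name and types are stand-ins. */
struct cqe_mirror {
	uint64_t wr_id;
	uint64_t qp;
	uint32_t opcode;
	uint32_t status;
	uint32_t byte_len;
	uint32_t imm_data;
	uint32_t src_qp;
	uint32_t wc_flags;
	uint32_t vendor_err;
	uint16_t pkey_index;
	uint16_t slid;
	uint8_t sl;
	uint8_t dlid_path_bits;
	uint8_t port_num;
	uint8_t smac[6];
	uint8_t reserved2[7]; /* Pad to next power of 2 (64). */
};

/* 2*8 + 7*4 + 2*2 + 3 + 6 = 57 bytes of fields, plus 7 bytes of padding. */
_Static_assert(offsetof(struct cqe_mirror, reserved2) == 57,
	       "fields before the padding should occupy 57 bytes");
_Static_assert(sizeof(struct cqe_mirror) == 64,
	       "CQE should be exactly 64 bytes");
```

If a field is added or resized without adjusting reserved2, the build fails instead of silently changing the layout shared with the kernel.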
* [PATCH 7/8] libpvrdma: Add to consolidated rdma-core
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Update the build scripts and infrastructure for the pvrdma user library.
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
CMakeLists.txt | 1 +
MAINTAINERS | 6 ++++++
README.md | 1 +
providers/pvrdma/CMakeLists.txt | 6 ++++++
4 files changed, 14 insertions(+)
create mode 100644 providers/pvrdma/CMakeLists.txt
diff --git a/CMakeLists.txt b/CMakeLists.txt
index b3b3ff1..2010265 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -335,6 +335,7 @@ add_subdirectory(providers/mlx5)
add_subdirectory(providers/mthca)
add_subdirectory(providers/nes)
add_subdirectory(providers/ocrdma)
+add_subdirectory(providers/pvrdma)
add_subdirectory(providers/qedr)
add_subdirectory(providers/rxe)
add_subdirectory(providers/rxe/man)
diff --git a/MAINTAINERS b/MAINTAINERS
index d83de10..69ab1f9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -139,6 +139,12 @@ M: Devesh Sharma <Devesh.sharma-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
S: Supported
F: providers/ocrdma/
+PVRDMA USERSPACE PROVIDER (for pvrdma.ko)
+M: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
+L: pv-drivers-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org
+S: Supported
+F: providers/pvrdma/
+
QEDR USERSPACE PROVIDER (for qedr.ko)
M: Ram Amrani <Ram.Amrani-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
M: Ariel Elior <Ariel.Elior-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
diff --git a/README.md b/README.md
index 3a13042..fed8803 100644
--- a/README.md
+++ b/README.md
@@ -25,6 +25,7 @@ is included:
- ib_mthca.ko
- iw_nes.ko
- ocrdma.ko
+ - pvrdma.ko
- qedr.ko
- rdma_rxe.ko
diff --git a/providers/pvrdma/CMakeLists.txt b/providers/pvrdma/CMakeLists.txt
new file mode 100644
index 0000000..8ba9a45
--- /dev/null
+++ b/providers/pvrdma/CMakeLists.txt
@@ -0,0 +1,6 @@
+rdma_provider(pvrdma
+ cq.c
+ pvrdma_main.c
+ qp.c
+ verbs.c
+)
--
2.7.4
* [PATCH 6/8] libpvrdma: Add main library file
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Registers the pvrdma library with libibverbs and allocates the user context.
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
providers/pvrdma/pvrdma_main.c | 214 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 214 insertions(+)
create mode 100644 providers/pvrdma/pvrdma_main.c
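For readers following the buffer handling in this patch: pvrdma_alloc_buf rounds the requested size up to the page size and backs it with an anonymous private mapping. A minimal standalone sketch of that pattern is below. The names demo_buf, demo_align, demo_alloc_buf and demo_free_buf are hypothetical, the rounding helper assumes a power-of-two page size (the align() macro used by the patch is defined elsewhere in the series), and the ibv_dontfork_range()/ibv_dofork_range() calls are left out so the sketch builds without libibverbs.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for struct pvrdma_buf. */
struct demo_buf {
	void *buf;
	size_t length;
};

/* Round size up to a multiple of page_size (assumed to be a power of two). */
static size_t demo_align(size_t size, size_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (size + page_size - 1) & ~(page_size - 1);
}

/* Map a zero-filled anonymous region; return 0 or a positive errno value. */
static int demo_alloc_buf(struct demo_buf *buf, size_t size, size_t page_size)
{
	buf->length = demo_align(size, page_size);
	buf->buf = mmap(NULL, buf->length, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf->buf == MAP_FAILED) {
		int err = errno; /* capture before touching anything else */

		buf->buf = NULL;
		return err;
	}
	return 0;
}

/* Release the mapping created by demo_alloc_buf(). */
static void demo_free_buf(struct demo_buf *buf)
{
	munmap(buf->buf, buf->length);
	buf->buf = NULL;
	buf->length = 0;
}
```

A caller would pass sysconf(_SC_PAGESIZE) as page_size, as the driver init code in this patch does when it fills in dev->page_size. Anonymous mappings are zero-filled by the kernel, so no explicit memset is needed in the sketch.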
diff --git a/providers/pvrdma/pvrdma_main.c b/providers/pvrdma/pvrdma_main.c
new file mode 100644
index 0000000..909cf1e
--- /dev/null
+++ b/providers/pvrdma/pvrdma_main.c
@@ -0,0 +1,214 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "pvrdma.h"
+
+static struct ibv_context_ops pvrdma_ctx_ops = {
+ .query_device = pvrdma_query_device,
+ .query_port = pvrdma_query_port,
+ .alloc_pd = pvrdma_alloc_pd,
+ .dealloc_pd = pvrdma_free_pd,
+
+ .reg_mr = pvrdma_reg_mr,
+ .dereg_mr = pvrdma_dereg_mr,
+ .create_cq = pvrdma_create_cq,
+ .poll_cq = pvrdma_poll_cq,
+ .req_notify_cq = pvrdma_req_notify_cq,
+ .destroy_cq = pvrdma_destroy_cq,
+
+ .create_qp = pvrdma_create_qp,
+ .query_qp = pvrdma_query_qp,
+ .modify_qp = pvrdma_modify_qp,
+ .destroy_qp = pvrdma_destroy_qp,
+
+ .post_send = pvrdma_post_send,
+ .post_recv = pvrdma_post_recv,
+ .create_ah = pvrdma_create_ah,
+ .destroy_ah = pvrdma_destroy_ah,
+};
+
+int pvrdma_alloc_buf(struct pvrdma_buf *buf, size_t size, int page_size)
+{
+ int ret;
+
+ buf->length = align(size, page_size);
+ buf->buf = mmap(NULL, buf->length, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ if (buf->buf == MAP_FAILED)
+ return errno;
+
+ ret = ibv_dontfork_range(buf->buf, size);
+ if (ret)
+ munmap(buf->buf, buf->length);
+
+ return ret;
+}
+
+void pvrdma_free_buf(struct pvrdma_buf *buf)
+{
+ ibv_dofork_range(buf->buf, buf->length);
+ munmap(buf->buf, buf->length);
+}
+
+static int pvrdma_init_context_shared(struct pvrdma_context *context,
+ struct ibv_device *ibdev,
+ int cmd_fd)
+{
+ struct ibv_get_context cmd;
+ struct pvrdma_alloc_ucontext_resp resp;
+
+ context->ibv_ctx.cmd_fd = cmd_fd;
+ if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof(cmd),
+ &resp.ibv_resp, sizeof(resp)))
+ return errno;
+
+ context->qp_tbl = calloc(resp.qp_tab_size & 0xFFFF,
+ sizeof(struct pvrdma_qp *));
+ if (!context->qp_tbl)
+ return -ENOMEM;
+
+ context->uar = mmap(NULL, to_vdev(ibdev)->page_size, PROT_WRITE,
+ MAP_SHARED, cmd_fd, 0);
+ if (context->uar == MAP_FAILED) {
+ free(context->qp_tbl);
+ return errno;
+ }
+
+ pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE);
+ context->ibv_ctx.ops = pvrdma_ctx_ops;
+
+ return 0;
+}
+
+static void pvrdma_free_context_shared(struct pvrdma_context *context,
+ struct pvrdma_device *dev)
+{
+ munmap(context->uar, dev->page_size);
+ free(context->qp_tbl);
+}
+
+static struct ibv_context *pvrdma_alloc_context(struct ibv_device *ibdev,
+ int cmd_fd)
+{
+ struct pvrdma_context *context;
+
+ context = malloc(sizeof(*context));
+ if (!context)
+ return NULL;
+
+ memset(context, 0, sizeof(*context));
+
+ if (pvrdma_init_context_shared(context, ibdev, cmd_fd)) {
+ free(context);
+ return NULL;
+ }
+
+ return &context->ibv_ctx;
+}
+
+static void pvrdma_free_context(struct ibv_context *ibctx)
+{
+ struct pvrdma_context *context = to_vctx(ibctx);
+
+ pvrdma_free_context_shared(context, to_vdev(ibctx->device));
+ free(context);
+}
+
+static struct ibv_device_ops pvrdma_dev_ops = {
+ .alloc_context = pvrdma_alloc_context,
+ .free_context = pvrdma_free_context
+};
+
+static struct pvrdma_device *pvrdma_driver_init_shared(
+ const char *uverbs_sys_path,
+ int abi_version)
+{
+ struct pvrdma_device *dev;
+ char name[16];
+
+ /* We support only a single ABI version for now. */
+ if (abi_version != PVRDMA_UVERBS_ABI_VERSION) {
+ fprintf(stderr, PFX "ABI version %d of %s is not "
+ "supported (supported %d)\n",
+ abi_version, uverbs_sys_path,
+ PVRDMA_UVERBS_ABI_VERSION);
+ return NULL;
+ }
+
+ if (ibv_read_sysfs_file(uverbs_sys_path,
+ "ibdev", name, sizeof(name)) < 0) {
+ fprintf(stderr, PFX "not ib device\n");
+ return NULL;
+ }
+
+ dev = malloc(sizeof(*dev));
+ if (!dev) {
+ fprintf(stderr, PFX "couldn't allocate device for %s\n",
+ uverbs_sys_path);
+ return NULL;
+ }
+
+ dev->abi_version = abi_version;
+ dev->page_size = sysconf(_SC_PAGESIZE);
+ dev->ibv_dev.ops = pvrdma_dev_ops;
+
+ return dev;
+}
+
+static struct ibv_device *pvrdma_driver_init(const char *uverbs_sys_path,
+ int abi_version)
+{
+ struct pvrdma_device *dev = pvrdma_driver_init_shared(uverbs_sys_path,
+ abi_version);
+ if (!dev)
+ return NULL;
+
+ return &dev->ibv_dev;
+}
+
+static __attribute__((constructor)) void pvrdma_register_driver(void)
+{
+ ibv_register_driver("pvrdma", pvrdma_driver_init);
+}
--
2.7.4
* [PATCH 5/8] libpvrdma: Add misc verbs functions
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Add the remaining verbs functions that don't fit in the CQ or QP files:
device and port queries, protection domains, memory regions and address
handles.
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
providers/pvrdma/verbs.c | 234 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 234 insertions(+)
create mode 100644 providers/pvrdma/verbs.c
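As an illustration of the firmware version handling in pvrdma_query_device: the 64-bit raw value is split into three 16-bit fields (bits 47:32, 31:16 and 15:0) and printed as major.minor.sub_minor. The sketch below shows only that decoding step. demo_format_fw_ver is a hypothetical name, and the sketch uses unsigned conversions throughout, which is an assumption about the intended output rather than a copy of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Decode a raw firmware version laid out as
 * [47:32] major, [31:16] minor, [15:0] sub-minor
 * and format it as "major.minor.sub_minor" (sub-minor padded to 3 digits).
 * Returns the snprintf() result.
 */
static int demo_format_fw_ver(uint64_t raw_fw_ver, char *out, size_t len)
{
	unsigned int major = (raw_fw_ver >> 32) & 0xffff;
	unsigned int minor = (raw_fw_ver >> 16) & 0xffff;
	unsigned int sub_minor = raw_fw_ver & 0xffff;

	assert(out != NULL && len > 0);
	return snprintf(out, len, "%u.%u.%03u", major, minor, sub_minor);
}
```

Bits 63:48 of the raw value are ignored here, matching the masks used in the patch.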
diff --git a/providers/pvrdma/verbs.c b/providers/pvrdma/verbs.c
new file mode 100644
index 0000000..1646708
--- /dev/null
+++ b/providers/pvrdma/verbs.c
@@ -0,0 +1,234 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "pvrdma.h"
+
+int pvrdma_query_device(struct ibv_context *context,
+ struct ibv_device_attr *attr)
+{
+ struct ibv_query_device cmd;
+ uint64_t raw_fw_ver;
+ unsigned major, minor, sub_minor;
+ int ret;
+
+ ret = ibv_cmd_query_device(context, attr, &raw_fw_ver,
+ &cmd, sizeof(cmd));
+ if (ret)
+ return ret;
+
+ major = (raw_fw_ver >> 32) & 0xffff;
+ minor = (raw_fw_ver >> 16) & 0xffff;
+ sub_minor = raw_fw_ver & 0xffff;
+
+ snprintf(attr->fw_ver, sizeof(attr->fw_ver),
+ "%u.%u.%03u", major, minor, sub_minor);
+
+ return 0;
+}
+
+int pvrdma_query_port(struct ibv_context *context, uint8_t port,
+ struct ibv_port_attr *attr)
+{
+ struct ibv_query_port cmd;
+
+ return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd));
+}
+
+struct ibv_pd *pvrdma_alloc_pd(struct ibv_context *context)
+{
+ struct ibv_alloc_pd cmd;
+ struct pvrdma_alloc_pd_resp resp;
+ struct pvrdma_pd *pd;
+
+ pd = malloc(sizeof(*pd));
+ if (!pd)
+ return NULL;
+
+ if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd),
+ &resp.ibv_resp, sizeof(resp))) {
+ free(pd);
+ return NULL;
+ }
+
+ pd->pdn = resp.pdn;
+
+ return &pd->ibv_pd;
+}
+
+int pvrdma_free_pd(struct ibv_pd *pd)
+{
+ int ret;
+
+ ret = ibv_cmd_dealloc_pd(pd);
+ if (ret)
+ return ret;
+
+ free(to_vpd(pd));
+
+ return 0;
+}
+
+struct ibv_mr *pvrdma_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
+ int access)
+{
+ struct ibv_mr *mr;
+ struct ibv_reg_mr cmd;
+ struct ibv_reg_mr_resp resp;
+ int ret;
+
+ mr = malloc(sizeof(*mr));
+ if (!mr)
+ return NULL;
+
+ ret = ibv_cmd_reg_mr(pd, addr, length, (uintptr_t) addr,
+ access, mr, &cmd, sizeof(cmd),
+ &resp, sizeof(resp));
+ if (ret) {
+ free(mr);
+ return NULL;
+ }
+
+ return mr;
+}
+
+int pvrdma_dereg_mr(struct ibv_mr *mr)
+{
+ int ret;
+
+ ret = ibv_cmd_dereg_mr(mr);
+ if (ret)
+ return ret;
+
+ free(mr);
+
+ return 0;
+}
+
+static int is_multicast_gid(const union ibv_gid *gid)
+{
+ return gid->raw[0] == 0xff;
+}
+
+static int is_link_local_gid(const union ibv_gid *gid)
+{
+ uint32_t *hi = (uint32_t *)(gid->raw);
+ uint32_t *lo = (uint32_t *)(gid->raw + 4);
+ if (hi[0] == htonl(0xfe800000) && lo[0] == 0)
+ return 1;
+
+ return 0;
+}
+
+static int is_ipv6_addr_v4mapped(const struct in6_addr *a)
+{
+ return ((a->s6_addr32[0] | a->s6_addr32[1]) |
+ (a->s6_addr32[2] ^ htonl(0x0000ffff))) == 0UL ||
+ /* IPv4 encoded multicast addresses */
+ (a->s6_addr32[0] == htonl(0xff0e0000) &&
+ ((a->s6_addr32[1] |
+ (a->s6_addr32[2] ^ htonl(0x0000ffff))) == 0UL));
+}
+
+static void set_mac_from_gid(const union ibv_gid *gid,
+ __u8 mac[6])
+{
+ if (is_link_local_gid(gid)) {
+ /*
+ * The MAC is embedded in GID[8-10,13-15] with the
+ * 7th most significant bit inverted.
+ */
+ memcpy(mac, gid->raw + 8, 3);
+ memcpy(mac + 3, gid->raw + 13, 3);
+ mac[0] ^= 2;
+ }
+}
+
+struct ibv_ah *pvrdma_create_ah(struct ibv_pd *pd,
+ struct ibv_ah_attr *attr)
+{
+ struct pvrdma_ah *ah;
+ struct pvrdma_av *av;
+ struct ibv_port_attr port_attr;
+
+ if (!attr->is_global)
+ return NULL;
+
+ if (ibv_query_port(pd->context, attr->port_num, &port_attr))
+ return NULL;
+
+ if (port_attr.link_layer == IBV_LINK_LAYER_UNSPECIFIED ||
+ port_attr.link_layer == IBV_LINK_LAYER_INFINIBAND)
+ return NULL;
+
+ if (port_attr.link_layer == IBV_LINK_LAYER_ETHERNET &&
+ (!is_link_local_gid(&attr->grh.dgid) &&
+ !is_multicast_gid(&attr->grh.dgid) &&
+ !is_ipv6_addr_v4mapped((struct in6_addr *)attr->grh.dgid.raw)))
+ return NULL;
+
+ ah = calloc(1, sizeof(*ah));
+ if (!ah)
+ return NULL;
+
+ av = &ah->av;
+ av->port_pd = to_vpd(pd)->pdn | (attr->port_num << 24);
+ av->src_path_bits = attr->src_path_bits;
+ av->src_path_bits |= 0x80;
+ av->gid_index = attr->grh.sgid_index;
+ av->hop_limit = attr->grh.hop_limit;
+ av->sl_tclass_flowlabel = (attr->grh.traffic_class << 20) |
+ attr->grh.flow_label;
+ memcpy(av->dgid, attr->grh.dgid.raw, 16);
+ set_mac_from_gid(&attr->grh.dgid, av->dmac);
+
+ return &ah->ibv_ah;
+}
+
+int pvrdma_destroy_ah(struct ibv_ah *ah)
+{
+ free(to_vah(ah));
+
+ return 0;
+}
--
2.7.4
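The MAC recovery in set_mac_from_gid can be tried in isolation. The sketch below assumes the GID is a link-local address whose interface ID was built from the MAC in the usual modified EUI-64 way (ff:fe inserted in the middle, universal/local bit flipped). demo_is_link_local_gid and demo_mac_from_gid are hypothetical names. Unlike the patch, the prefix check compares bytes with memcmp instead of casting to uint32_t, which sidesteps alignment and byte-order concerns. Like the patch, it does not verify that bytes 11 and 12 really are ff:fe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the 16-byte GID starts with the fe80::/64 link-local prefix. */
static int demo_is_link_local_gid(const uint8_t gid[16])
{
	static const uint8_t prefix[8] = { 0xfe, 0x80, 0, 0, 0, 0, 0, 0 };

	return memcmp(gid, prefix, sizeof(prefix)) == 0;
}

/*
 * Recover the MAC embedded in GID bytes 8-10 and 13-15.
 * Returns 1 and fills mac[] for link-local GIDs, 0 (mac[] untouched) otherwise.
 */
static int demo_mac_from_gid(const uint8_t gid[16], uint8_t mac[6])
{
	assert(gid != NULL && mac != NULL);

	if (!demo_is_link_local_gid(gid))
		return 0;

	memcpy(mac, gid + 8, 3);
	memcpy(mac + 3, gid + 13, 3);
	mac[0] ^= 0x02; /* undo the universal/local bit flip */
	return 1;
}
```

For example, the GID fe80::0250:56ff:fe12:3456 yields the MAC 00:50:56:12:34:56.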
* [PATCH 4/8] libpvrdma: Add queue pair functions
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Add functions to create, query, modify and destroy queue pairs, and to
post send and receive work requests on them.
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
providers/pvrdma/qp.c | 505 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 505 insertions(+)
create mode 100644 providers/pvrdma/qp.c
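The queue sizing in pvrdma_create_qp rounds every requested capacity up to a power of two, and pvrdma_post_recv then advances its slot index with a mask instead of a compare. A small sketch of both steps follows. demo_next_power2 and demo_ring_next are hypothetical helpers, and the sketch assumes that align_next_power2 (defined elsewhere in the series) returns the smallest power of two that is greater than or equal to its argument.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two >= v; v == 0 yields 1. */
static uint32_t demo_next_power2(uint32_t v)
{
	uint32_t p = 1;

	/* Larger values have no 32-bit power of two above them. */
	assert(v <= UINT32_C(0x80000000));
	while (p < v)
		p <<= 1;
	return p;
}

/* Advance a slot index in a ring whose size is a power of two. */
static uint32_t demo_ring_next(uint32_t ind, uint32_t wqe_cnt)
{
	assert(wqe_cnt != 0 && (wqe_cnt & (wqe_cnt - 1)) == 0);
	return (ind + 1) & (wqe_cnt - 1);
}
```

With a power-of-two wqe_cnt, the masked form in pvrdma_post_recv and the compare-and-reset form in pvrdma_post_send produce the same sequence of indices.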
diff --git a/providers/pvrdma/qp.c b/providers/pvrdma/qp.c
new file mode 100644
index 0000000..46a0e32
--- /dev/null
+++ b/providers/pvrdma/qp.c
@@ -0,0 +1,505 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <infiniband/arch.h>
+
+#include "pvrdma.h"
+
+int pvrdma_alloc_qp_buf(struct pvrdma_device *dev, struct ibv_qp_cap *cap,
+ enum ibv_qp_type type, struct pvrdma_qp *qp)
+{
+ qp->sq.wrid = malloc(qp->sq.wqe_cnt * sizeof(uint64_t));
+ if (!qp->sq.wrid)
+ return -1;
+
+ qp->rq.wrid = malloc(qp->rq.wqe_cnt * sizeof(uint64_t));
+ if (!qp->rq.wrid) {
+ free(qp->sq.wrid);
+ return -1;
+ }
+
+ /* Round the [rq] and [sq] buffer lengths up to a multiple of the page size */
+ qp->rbuf.length = align(qp->rq.offset +
+ qp->rq.wqe_cnt * qp->rq.wqe_size,
+ dev->page_size);
+ qp->sbuf.length = align(qp->sq.offset +
+ qp->sq.wqe_cnt * qp->sq.wqe_size,
+ dev->page_size);
+ qp->buf_size = qp->rbuf.length + qp->sbuf.length;
+
+ if (pvrdma_alloc_buf(&qp->rbuf, qp->rbuf.length, dev->page_size)) {
+ free(qp->sq.wrid);
+ free(qp->rq.wrid);
+ return -1;
+ }
+
+ if (pvrdma_alloc_buf(&qp->sbuf, qp->sbuf.length, dev->page_size)) {
+ free(qp->sq.wrid);
+ free(qp->rq.wrid);
+ pvrdma_free_buf(&qp->rbuf);
+ return -1;
+ }
+
+ memset(qp->rbuf.buf, 0, qp->rbuf.length);
+ memset(qp->sbuf.buf, 0, qp->sbuf.length);
+
+ return 0;
+}
+
+static void pvrdma_init_qp_queue(struct pvrdma_qp *qp)
+{
+ atomic_set(&(qp->sq.ring_state->cons_head), 0);
+ atomic_set(&(qp->sq.ring_state->prod_tail), 0);
+ atomic_set(&(qp->rq.ring_state->cons_head), 0);
+ atomic_set(&(qp->rq.ring_state->prod_tail), 0);
+}
+
+struct ibv_qp *pvrdma_create_qp(struct ibv_pd *pd,
+ struct ibv_qp_init_attr *attr)
+{
+ struct pvrdma_device *dev = to_vdev(pd->context->device);
+ struct pvrdma_create_qp cmd;
+ struct ibv_create_qp_resp resp;
+ struct pvrdma_qp *qp;
+ int ret;
+
+ attr->cap.max_recv_sge =
+ align_next_power2(max(1U, attr->cap.max_recv_sge));
+ attr->cap.max_recv_wr =
+ align_next_power2(max(1U, attr->cap.max_recv_wr));
+ attr->cap.max_send_sge =
+ align_next_power2(max(1U, attr->cap.max_send_sge));
+ attr->cap.max_send_wr =
+ align_next_power2(max(1U, attr->cap.max_send_wr));
+
+ qp = calloc(1, sizeof(*qp));
+ if (!qp)
+ return NULL;
+
+ qp->rq.max_gs = attr->cap.max_recv_sge;
+ qp->rq.wqe_cnt = attr->cap.max_recv_wr;
+ qp->rq.offset = 0;
+ qp->rq.wqe_size = align_next_power2(sizeof(struct pvrdma_rq_wqe_hdr) +
+ sizeof(struct ibv_sge) *
+ qp->rq.max_gs);
+
+ qp->sq.max_gs = attr->cap.max_send_sge;
+ qp->sq.wqe_cnt = attr->cap.max_send_wr;
+ /* Extra page for shared ring state */
+ qp->sq.offset = dev->page_size;
+ qp->sq.wqe_size = align_next_power2(sizeof(struct pvrdma_sq_wqe_hdr) +
+ sizeof(struct ibv_sge) *
+ qp->sq.max_gs);
+
+ /* Reset attr.cap, no srq for now */
+ if (attr->srq) {
+ attr->cap.max_recv_wr = 0;
+ qp->rq.wqe_cnt = 0;
+ }
+
+ /* Allocate [rq][sq] memory */
+ if (pvrdma_alloc_qp_buf(dev, &attr->cap, attr->qp_type, qp))
+ goto err;
+
+ qp->sq.ring_state = qp->sbuf.buf;
+ qp->rq.ring_state = (struct pvrdma_ring *)&qp->sq.ring_state[1];
+ pvrdma_init_qp_queue(qp);
+
+ if (pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE) ||
+ pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE))
+ goto err_free;
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.rbuf_addr = (uintptr_t)qp->rbuf.buf;
+ cmd.rbuf_size = qp->rbuf.length;
+ cmd.sbuf_addr = (uintptr_t)qp->sbuf.buf;
+ cmd.sbuf_size = qp->sbuf.length;
+ cmd.qp_addr = (uintptr_t) qp;
+
+ ret = ibv_cmd_create_qp(pd, &qp->ibv_qp, attr,
+ &cmd.ibv_cmd, sizeof(cmd),
+ &resp, sizeof(resp));
+
+ if (ret)
+ goto err_free;
+
+ to_vctx(pd->context)->qp_tbl[qp->ibv_qp.qp_num & 0xFFFF] = qp;
+
+ /* If set, each WR submitted to the SQ generates a completion entry */
+ if (attr->sq_sig_all)
+ qp->sq_signal_bits = htonl(PVRDMA_WQE_CTRL_CQ_UPDATE);
+ else
+ qp->sq_signal_bits = 0;
+
+ return &qp->ibv_qp;
+
+err_free:
+ if (qp->sq.wqe_cnt)
+ free(qp->sq.wrid);
+ if (qp->rq.wqe_cnt)
+ free(qp->rq.wrid);
+ pvrdma_free_buf(&qp->rbuf);
+ pvrdma_free_buf(&qp->sbuf);
+err:
+ free(qp);
+
+ return NULL;
+}
+
+int pvrdma_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr,
+ int attr_mask,
+ struct ibv_qp_init_attr *init_attr)
+{
+ struct ibv_query_qp cmd;
+ struct pvrdma_qp *qp = to_vqp(ibqp);
+ int ret;
+
+ ret = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr,
+ &cmd, sizeof(cmd));
+ if (ret)
+ return ret;
+
+ /* Report the send queue capabilities this provider actually allocated */
+ init_attr->cap.max_send_wr = qp->sq.wqe_cnt;
+ init_attr->cap.max_send_sge = qp->sq.max_gs;
+ init_attr->cap.max_inline_data = qp->max_inline_data;
+
+ attr->cap = init_attr->cap;
+
+ return 0;
+}
+
+int pvrdma_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ int attr_mask)
+{
+ struct ibv_modify_qp cmd;
+ int ret;
+
+ /* Sanity check */
+ if (!attr_mask)
+ return 0;
+
+ ret = ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof(cmd));
+
+ if (!ret &&
+ (attr_mask & IBV_QP_STATE) &&
+ attr->qp_state == IBV_QPS_RESET) {
+ pvrdma_cq_clean(to_vcq(qp->recv_cq), qp->qp_num);
+ if (qp->send_cq != qp->recv_cq)
+ pvrdma_cq_clean(to_vcq(qp->send_cq), qp->qp_num);
+ pvrdma_init_qp_queue(to_vqp(qp));
+ }
+
+ return ret;
+}
+
+static void pvrdma_lock_cqs(struct ibv_qp *qp)
+{
+ struct pvrdma_cq *send_cq = to_vcq(qp->send_cq);
+ struct pvrdma_cq *recv_cq = to_vcq(qp->recv_cq);
+
+ if (send_cq == recv_cq)
+ pthread_spin_lock(&send_cq->lock);
+ else if (send_cq->cqn < recv_cq->cqn) {
+ pthread_spin_lock(&send_cq->lock);
+ pthread_spin_lock(&recv_cq->lock);
+ } else {
+ pthread_spin_lock(&recv_cq->lock);
+ pthread_spin_lock(&send_cq->lock);
+ }
+}
+
+static void pvrdma_unlock_cqs(struct ibv_qp *qp)
+{
+ struct pvrdma_cq *send_cq = to_vcq(qp->send_cq);
+ struct pvrdma_cq *recv_cq = to_vcq(qp->recv_cq);
+
+ if (send_cq == recv_cq)
+ pthread_spin_unlock(&send_cq->lock);
+ else if (send_cq->cqn < recv_cq->cqn) {
+ pthread_spin_unlock(&recv_cq->lock);
+ pthread_spin_unlock(&send_cq->lock);
+ } else {
+ pthread_spin_unlock(&send_cq->lock);
+ pthread_spin_unlock(&recv_cq->lock);
+ }
+}
+
+int pvrdma_destroy_qp(struct ibv_qp *ibqp)
+{
+ struct pvrdma_context *ctx = to_vctx(ibqp->context);
+ struct pvrdma_qp *qp = to_vqp(ibqp);
+ int ret;
+
+ ret = ibv_cmd_destroy_qp(ibqp);
+ if (ret) {
+ return ret;
+ }
+
+ pvrdma_lock_cqs(ibqp);
+ /* Remove this QP's stale completions from its CQs */
+ __pvrdma_cq_clean(to_vcq(ibqp->recv_cq), ibqp->qp_num);
+
+ if (ibqp->send_cq != ibqp->recv_cq)
+ __pvrdma_cq_clean(to_vcq(ibqp->send_cq), ibqp->qp_num);
+ pvrdma_unlock_cqs(ibqp);
+
+ free(qp->sq.wrid);
+ free(qp->rq.wrid);
+ pvrdma_free_buf(&qp->rbuf);
+ pvrdma_free_buf(&qp->sbuf);
+ ctx->qp_tbl[ibqp->qp_num & 0xFFFF] = NULL;
+ free(qp);
+
+ return 0;
+}
+
+static void *get_rq_wqe(struct pvrdma_qp *qp, int n)
+{
+ return qp->rbuf.buf + qp->rq.offset + (n * qp->rq.wqe_size);
+}
+
+static void *get_sq_wqe(struct pvrdma_qp *qp, int n)
+{
+ return qp->sbuf.buf + qp->sq.offset + (n * qp->sq.wqe_size);
+}
+
+int pvrdma_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
+ struct ibv_send_wr **bad_wr)
+{
+ struct pvrdma_context *ctx = to_vctx(ibqp->context);
+ struct pvrdma_qp *qp = to_vqp(ibqp);
+ int ind;
+ int nreq = 0;
+ struct pvrdma_sq_wqe_hdr *wqe_hdr;
+ struct ibv_sge *sge;
+ int ret = 0;
+ int i;
+
+ /*
+ * In states lower than RTS, we can fail immediately. In other states,
+ * just post and let the device figure it out.
+ */
+ if (ibqp->state < IBV_QPS_RTS) {
+ *bad_wr = wr;
+ return EINVAL;
+ }
+
+ pthread_spin_lock(&qp->sq.lock);
+ ind = pvrdma_idx(&(qp->sq.ring_state->prod_tail), qp->sq.wqe_cnt);
+ if (ind < 0) {
+ pthread_spin_unlock(&qp->sq.lock);
+ *bad_wr = wr;
+ return EINVAL;
+ }
+
+ for (nreq = 0; wr; ++nreq, wr = wr->next) {
+ unsigned int tail;
+
+ if (pvrdma_idx_ring_has_space(qp->sq.ring_state,
+ qp->sq.wqe_cnt, &tail) <= 0) {
+ ret = ENOMEM;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ if (wr->num_sge > qp->sq.max_gs) {
+ ret = EINVAL;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ wqe_hdr = (struct pvrdma_sq_wqe_hdr *)get_sq_wqe(qp, ind);
+ wqe_hdr->wr_id = wr->wr_id;
+ wqe_hdr->num_sge = wr->num_sge;
+ wqe_hdr->opcode = ibv_wr_opcode_to_pvrdma(wr->opcode);
+ wqe_hdr->send_flags = ibv_send_flags_to_pvrdma(wr->send_flags);
+ if (wr->opcode == IBV_WR_SEND_WITH_IMM ||
+ wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM)
+ wqe_hdr->ex.imm_data = wr->imm_data;
+
+ switch (ibqp->qp_type) {
+ case IBV_QPT_UD:
+ wqe_hdr->wr.ud.remote_qpn = wr->wr.ud.remote_qpn;
+ wqe_hdr->wr.ud.remote_qkey = wr->wr.ud.remote_qkey;
+ wqe_hdr->wr.ud.av = to_vah(wr->wr.ud.ah)->av;
+ break;
+ case IBV_QPT_RC:
+ switch (wr->opcode) {
+ case IBV_WR_RDMA_READ:
+ case IBV_WR_RDMA_WRITE:
+ case IBV_WR_RDMA_WRITE_WITH_IMM:
+ wqe_hdr->wr.rdma.remote_addr =
+ wr->wr.rdma.remote_addr;
+ wqe_hdr->wr.rdma.rkey = wr->wr.rdma.rkey;
+ break;
+ case IBV_WR_ATOMIC_CMP_AND_SWP:
+ case IBV_WR_ATOMIC_FETCH_AND_ADD:
+ wqe_hdr->wr.atomic.remote_addr = wr->wr.atomic.remote_addr;
+ wqe_hdr->wr.atomic.rkey = wr->wr.atomic.rkey;
+ wqe_hdr->wr.atomic.compare_add = wr->wr.atomic.compare_add;
+ if (wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP)
+ wqe_hdr->wr.atomic.swap = wr->wr.atomic.swap;
+ break;
+ default:
+ /* No extra segments required for sends */
+ break;
+ }
+ break;
+ default:
+ fprintf(stderr, PFX "unsupported QP type for post send\n");
+ ret = EINVAL;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ /* Write each segment */
+ sge = (struct ibv_sge *)&wqe_hdr[1];
+ for (i = 0; i < wr->num_sge; i++) {
+ sge->addr = wr->sg_list[i].addr;
+ sge->length = wr->sg_list[i].length;
+ sge->lkey = wr->sg_list[i].lkey;
+ sge++;
+ }
+
+ pvrdma_idx_ring_inc(&(qp->sq.ring_state->prod_tail),
+ qp->sq.wqe_cnt);
+
+ wmb();
+
+ qp->sq.wrid[ind] = wr->wr_id;
+ ++ind;
+ if (ind >= qp->sq.wqe_cnt)
+ ind = 0;
+ }
+
+out:
+ if (nreq)
+ pvrdma_write_uar_qp(ctx->uar,
+ PVRDMA_UAR_QP_SEND | ibqp->qp_num);
+
+ wmb();
+ pthread_spin_unlock(&qp->sq.lock);
+
+ return ret;
+}
+
+int pvrdma_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
+ struct ibv_recv_wr **bad_wr)
+{
+ struct pvrdma_context *ctx = to_vctx(ibqp->context);
+ struct pvrdma_qp *qp = to_vqp(ibqp);
+ struct pvrdma_rq_wqe_hdr *wqe_hdr;
+ struct ibv_sge *sge;
+ int nreq;
+ int ind;
+ int i;
+ int ret = 0;
+
+ if (!wr || !bad_wr)
+ return EINVAL;
+
+ /*
+ * In the RESET state, we can fail immediately. For other states,
+ * just post and let the device figure it out.
+ */
+ if (ibqp->state == IBV_QPS_RESET) {
+ *bad_wr = wr;
+ return EINVAL;
+ }
+
+ pthread_spin_lock(&qp->rq.lock);
+
+ ind = pvrdma_idx(&(qp->rq.ring_state->prod_tail), qp->rq.wqe_cnt);
+ if (ind < 0) {
+ pthread_spin_unlock(&qp->rq.lock);
+ *bad_wr = wr;
+ return EINVAL;
+ }
+
+ for (nreq = 0; wr; ++nreq, wr = wr->next) {
+ unsigned int tail;
+
+ if (pvrdma_idx_ring_has_space(qp->rq.ring_state,
+ qp->rq.wqe_cnt, &tail) <= 0) {
+ ret = ENOMEM;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ if (wr->num_sge > qp->rq.max_gs) {
+ ret = EINVAL;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ /* Fetch wqe */
+ wqe_hdr = (struct pvrdma_rq_wqe_hdr *)get_rq_wqe(qp, ind);
+ wqe_hdr->wr_id = wr->wr_id;
+ wqe_hdr->num_sge = wr->num_sge;
+
+ sge = (struct ibv_sge *)(wqe_hdr + 1);
+ for (i = 0; i < wr->num_sge; ++i) {
+ sge->addr = (uint64_t)wr->sg_list[i].addr;
+ sge->length = wr->sg_list[i].length;
+ sge->lkey = wr->sg_list[i].lkey;
+ sge++;
+ }
+
+ pvrdma_idx_ring_inc(&qp->rq.ring_state->prod_tail,
+ qp->rq.wqe_cnt);
+
+ qp->rq.wrid[ind] = wr->wr_id;
+ ind = (ind + 1) & (qp->rq.wqe_cnt - 1);
+ }
+
+out:
+ if (nreq)
+ pvrdma_write_uar_qp(ctx->uar,
+ PVRDMA_UAR_QP_RECV | ibqp->qp_num);
+
+ pthread_spin_unlock(&qp->rq.lock);
+ return ret;
+}
--
2.7.4
* [PATCH 3/8] libpvrdma: Add completion queue functions
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Add support for completion queue creation, destruction, polling and
events.
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
providers/pvrdma/cq.c | 287 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 287 insertions(+)
create mode 100644 providers/pvrdma/cq.c
diff --git a/providers/pvrdma/cq.c b/providers/pvrdma/cq.c
new file mode 100644
index 0000000..f99873c
--- /dev/null
+++ b/providers/pvrdma/cq.c
@@ -0,0 +1,287 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <infiniband/arch.h>
+
+#include "pvrdma.h"
+
+enum {
+ CQ_OK = 0,
+ CQ_EMPTY = -1,
+ CQ_POLL_ERR = -2,
+};
+
+enum {
+ PVRDMA_CQE_IS_SEND_MASK = 0x40,
+ PVRDMA_CQE_OPCODE_MASK = 0x1f,
+};
+
+int pvrdma_alloc_cq_buf(struct pvrdma_device *dev, struct pvrdma_cq *cq,
+ struct pvrdma_buf *buf, int entries)
+{
+ if (pvrdma_alloc_buf(buf, cq->offset +
+ entries * (sizeof(struct pvrdma_cqe)),
+ dev->page_size))
+ return -1;
+ memset(buf->buf, 0, buf->length);
+
+ return 0;
+}
+
+static struct pvrdma_cqe *get_cqe(struct pvrdma_cq *cq, int entry)
+{
+ return cq->buf.buf + cq->offset +
+ entry * (sizeof(struct pvrdma_cqe));
+}
+
+static int pvrdma_poll_one(struct pvrdma_cq *cq,
+ struct pvrdma_qp **cur_qp,
+ struct ibv_wc *wc)
+{
+ struct pvrdma_context *ctx = to_vctx(cq->ibv_cq.context);
+ int has_data;
+ unsigned int head;
+ int tried = 0;
+ struct pvrdma_cqe *cqe;
+
+retry:
+ has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
+ cq->cqe_cnt, &head);
+ if (has_data == 0) {
+ unsigned int val;
+
+ if (tried)
+ return CQ_EMPTY;
+
+ /* Pass down POLL to give physical HCA a chance to poll. */
+ val = cq->cqn | PVRDMA_UAR_CQ_POLL;
+ pvrdma_write_uar_cq(ctx->uar, val);
+
+ tried = 1;
+ goto retry;
+ } else if (has_data == -1) {
+ return CQ_POLL_ERR;
+ }
+
+ cqe = get_cqe(cq, head);
+ if (!cqe)
+ return CQ_EMPTY;
+
+ rmb();
+
+ if (ctx->qp_tbl[cqe->qp & 0xFFFF])
+ *cur_qp = (struct pvrdma_qp *)ctx->qp_tbl[cqe->qp & 0xFFFF];
+ else
+ return CQ_POLL_ERR;
+
+ wc->opcode = pvrdma_wc_opcode_to_ibv(cqe->opcode);
+ wc->status = pvrdma_wc_status_to_ibv(cqe->status);
+ wc->wr_id = cqe->wr_id;
+ wc->qp_num = (*cur_qp)->ibv_qp.qp_num;
+ wc->byte_len = cqe->byte_len;
+ wc->imm_data = cqe->imm_data;
+ wc->src_qp = cqe->src_qp;
+ wc->wc_flags = cqe->wc_flags;
+ wc->pkey_index = cqe->pkey_index;
+ wc->slid = cqe->slid;
+ wc->sl = cqe->sl;
+ wc->dlid_path_bits = cqe->dlid_path_bits;
+ wc->vendor_err = 0;
+
+ /* Update shared ring state. */
+ pvrdma_idx_ring_inc(&(cq->ring_state->rx.cons_head), cq->cqe_cnt);
+
+ return CQ_OK;
+}
+
+int pvrdma_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc)
+{
+ struct pvrdma_cq *cq = to_vcq(ibcq);
+ struct pvrdma_qp *qp;
+ int npolled = 0;
+
+ if (num_entries < 1 || wc == NULL)
+ return 0;
+
+ pthread_spin_lock(&cq->lock);
+
+ for (npolled = 0; npolled < num_entries; ++npolled) {
+ if (pvrdma_poll_one(cq, &qp, wc + npolled) != CQ_OK)
+ break;
+ }
+
+ pthread_spin_unlock(&cq->lock);
+
+ return npolled;
+}
+
+void __pvrdma_cq_clean(struct pvrdma_cq *cq, uint32_t qpn)
+{
+ /* Flush CQEs from specified QP */
+ int has_data;
+ unsigned int head;
+
+ /* Lock held */
+ has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
+ cq->cqe_cnt, &head);
+
+ if (unlikely(has_data > 0)) {
+ int items;
+ int curr;
+ int tail = pvrdma_idx(&cq->ring_state->rx.prod_tail,
+ cq->cqe_cnt);
+ struct pvrdma_cqe *cqe;
+ struct pvrdma_cqe *curr_cqe;
+
+ items = (tail > head) ? (tail - head) :
+ (cq->cqe_cnt - head + tail);
+ curr = --tail;
+ while (items-- > 0) {
+ if (curr < 0)
+ curr = cq->cqe_cnt - 1;
+ if (tail < 0)
+ tail = cq->cqe_cnt - 1;
+ curr_cqe = get_cqe(cq, curr);
+ rmb();
+ if ((curr_cqe->qp & 0xFFFF) != qpn) {
+ if (curr != tail) {
+ cqe = get_cqe(cq, tail);
+ rmb();
+ *cqe = *curr_cqe;
+ }
+ tail--;
+ } else {
+ pvrdma_idx_ring_inc(
+ &cq->ring_state->rx.cons_head,
+ cq->cqe_cnt);
+ }
+ curr--;
+ }
+ }
+}
+
+void pvrdma_cq_clean(struct pvrdma_cq *cq, uint32_t qpn)
+{
+ pthread_spin_lock(&cq->lock);
+ __pvrdma_cq_clean(cq, qpn);
+ pthread_spin_unlock(&cq->lock);
+}
+
+struct ibv_cq *pvrdma_create_cq(struct ibv_context *context, int cqe,
+ struct ibv_comp_channel *channel,
+ int comp_vector)
+{
+ struct pvrdma_device *dev = to_vdev(context->device);
+ struct pvrdma_create_cq cmd;
+ struct pvrdma_create_cq_resp resp;
+ struct pvrdma_cq *cq;
+ int ret;
+
+ if (cqe < 1)
+ return NULL;
+
+ cq = malloc(sizeof(*cq));
+ if (!cq)
+ return NULL;
+
+ /* Extra page for shared ring state */
+ cq->offset = dev->page_size;
+
+ if (pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE))
+ goto err;
+
+ cqe = align_next_power2(cqe);
+
+ if (pvrdma_alloc_cq_buf(dev, cq, &cq->buf, cqe))
+ goto err;
+
+ cq->ring_state = cq->buf.buf;
+
+ cmd.buf_addr = (uintptr_t) cq->buf.buf;
+ cmd.buf_size = cq->buf.length;
+ ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector,
+ &cq->ibv_cq, &cmd.ibv_cmd, sizeof(cmd),
+ &resp.ibv_resp, sizeof(resp));
+ if (ret)
+ goto err_buf;
+
+ cq->cqn = resp.cqn;
+ cq->cqe_cnt = cq->ibv_cq.cqe;
+
+ return &cq->ibv_cq;
+
+err_buf:
+ pvrdma_free_buf(&cq->buf);
+err:
+ free(cq);
+
+ return NULL;
+}
+
+int pvrdma_destroy_cq(struct ibv_cq *cq)
+{
+ int ret;
+
+ ret = ibv_cmd_destroy_cq(cq);
+ if (ret)
+ return ret;
+
+ pvrdma_free_buf(&to_vcq(cq)->buf);
+ free(to_vcq(cq));
+
+ return 0;
+}
+
+int pvrdma_req_notify_cq(struct ibv_cq *ibcq, int solicited)
+{
+ struct pvrdma_context *ctx = to_vctx(ibcq->context);
+ struct pvrdma_cq *cq = to_vcq(ibcq);
+ unsigned int val = cq->cqn;
+
+ val |= solicited ? PVRDMA_UAR_CQ_ARM_SOL : PVRDMA_UAR_CQ_ARM;
+ pvrdma_write_uar_cq(ctx->uar, val);
+
+ return 0;
+}
--
2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 2/8] libpvrdma: Add ring traversal
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
CQs and QPs use these structures to traverse the CQE/WQE rings.
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
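Note for readers: a minimal, self-contained sketch of the index scheme
that the helpers in the diff below implement. Indices run over
[0, 2 * max_elems), with max_elems a power of two. The low bits select
the slot and the extra high bit acts as a generation flag, so an empty
ring (tail == head) can be told apart from a full one
(tail == head ^ max_elems). The names struct ring, ring_inc() and so on
are hypothetical, and plain uint32_t stands in for the patch's atomic_t.

```c
#include <assert.h>
#include <stdint.h>

struct ring {
	uint32_t prod_tail;	/* advanced by the producer */
	uint32_t cons_head;	/* advanced by the consumer */
};

/* Advance an index modulo 2 * max_elems; the generation bit flips on wrap. */
static void ring_inc(uint32_t *idx, uint32_t max_elems)
{
	assert((max_elems & (max_elems - 1)) == 0);	/* power of two */
	*idx = (*idx + 1) & ((max_elems << 1) - 1);
}

/* Slot number for an index: drop the generation bit. */
static uint32_t ring_slot(uint32_t idx, uint32_t max_elems)
{
	return idx & (max_elems - 1);
}

/* Non-empty when producer and consumer indices differ. */
static int ring_has_data(const struct ring *r)
{
	return r->prod_tail != r->cons_head;
}

/* Full when the indices match in slot but differ in generation. */
static int ring_has_space(const struct ring *r, uint32_t max_elems)
{
	return r->prod_tail != (r->cons_head ^ max_elems);
}

/* Fill the ring, then drain it; returns how many entries fit. */
static uint32_t ring_fill_and_drain(struct ring *r, uint32_t max_elems)
{
	uint32_t n = 0;

	while (ring_has_space(r, max_elems)) {
		ring_inc(&r->prod_tail, max_elems);
		n++;
	}
	while (ring_has_data(r))
		ring_inc(&r->cons_head, max_elems);
	return n;
}
```

With this encoding all max_elems slots are usable; a scheme that compares
only slot numbers would have to leave one slot empty to distinguish full
from empty.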
providers/pvrdma/pvrdma_ring.h | 136 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 136 insertions(+)
create mode 100644 providers/pvrdma/pvrdma_ring.h
diff --git a/providers/pvrdma/pvrdma_ring.h b/providers/pvrdma/pvrdma_ring.h
new file mode 100644
index 0000000..e99a551
--- /dev/null
+++ b/providers/pvrdma/pvrdma_ring.h
@@ -0,0 +1,136 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program in the file COPYING. If not, write to the
+ * Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
+ * Boston, MA 02110-1301, USA.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_RING_H__
+#define __PVRDMA_RING_H__
+
+#include <linux/types.h>
+
+#define PVRDMA_INVALID_IDX (-1) /* Invalid index. */
+#define atomic_read(_x) (*(_x))
+#define atomic_set(_x, _y) (*(_x) = (_y))
+
+typedef uint32_t atomic_t;
+
+struct pvrdma_ring {
+ atomic_t prod_tail; /* Producer tail. */
+ atomic_t cons_head; /* Consumer head. */
+};
+
+struct pvrdma_ring_state {
+ struct pvrdma_ring tx; /* Tx ring. */
+ struct pvrdma_ring rx; /* Rx ring. */
+};
+
+static inline int pvrdma_idx_valid(__u32 idx, __u32 max_elems)
+{
+ /* Generates fewer instructions than a less-than. */
+ return (idx & ~((max_elems << 1) - 1)) == 0;
+}
+
+static inline __s32 pvrdma_idx(atomic_t *var, __u32 max_elems)
+{
+ const unsigned idx = atomic_read(var);
+
+ if (pvrdma_idx_valid(idx, max_elems))
+ return idx & (max_elems - 1);
+ return PVRDMA_INVALID_IDX;
+}
+
+static inline void pvrdma_idx_ring_inc(atomic_t *var, __u32 max_elems)
+{
+ __u32 idx = atomic_read(var) + 1; /* Increment. */
+
+ idx &= (max_elems << 1) - 1; /* Modulo size, flip gen. */
+ atomic_set(var, idx);
+}
+
+static inline __s32 pvrdma_idx_ring_has_space(const struct pvrdma_ring *r,
+ __u32 max_elems, __u32 *out_tail)
+{
+ const __u32 tail = atomic_read(&r->prod_tail);
+ const __u32 head = atomic_read(&r->cons_head);
+
+ if (pvrdma_idx_valid(tail, max_elems) &&
+ pvrdma_idx_valid(head, max_elems)) {
+ *out_tail = tail & (max_elems - 1);
+ return tail != (head ^ max_elems);
+ }
+ return PVRDMA_INVALID_IDX;
+}
+
+static inline __s32 pvrdma_idx_ring_has_data(const struct pvrdma_ring *r,
+ __u32 max_elems, __u32 *out_head)
+{
+ const __u32 tail = atomic_read(&r->prod_tail);
+ const __u32 head = atomic_read(&r->cons_head);
+
+ if (pvrdma_idx_valid(tail, max_elems) &&
+ pvrdma_idx_valid(head, max_elems)) {
+ *out_head = head & (max_elems - 1);
+ return tail != head;
+ }
+ return PVRDMA_INVALID_IDX;
+}
+
+static inline __s32 pvrdma_idx_ring_is_valid_idx(const struct pvrdma_ring *r,
+ __u32 max_elems, __u32 *idx)
+{
+ const __u32 tail = atomic_read(&r->prod_tail);
+ const __u32 head = atomic_read(&r->cons_head);
+
+ if (pvrdma_idx_valid(tail, max_elems) &&
+ pvrdma_idx_valid(head, max_elems) &&
+ pvrdma_idx_valid(*idx, max_elems)) {
+ if (tail > head && (*idx < tail && *idx >= head))
+ return 1;
+ else if (head > tail && (*idx >= head || *idx < tail))
+ return 1;
+ }
+ return 0;
+}
+
+#endif /* __PVRDMA_RING_H__ */
--
2.7.4
* [PATCH 1/8] libpvrdma: Add ABI and main header files
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
In-Reply-To: <1478216677-6150-1-git-send-email-aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
Add the structures and types needed to interact with the kernel driver.
Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
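Note for readers: a minimal sketch of how a UAR doorbell word is composed
from the definitions in pvrdma-abi.h below. The bottom 24 bits carry the
QP or CQ handle and the top bits select the operation. The constant
values are copied from this patch under shortened names; the helper names
cq_doorbell() and doorbell_handle() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define UAR_HANDLE_MASK	0x00FFFFFFu	/* bottom 24 bits: handle */
#define UAR_CQ_ARM_SOL	(1u << 29)	/* arm solicited bit */
#define UAR_CQ_ARM	(1u << 30)	/* arm bit */
#define UAR_CQ_POLL	(1u << 31)	/* poll bit */

/* Hypothetical helper: build the 32-bit value for the CQ doorbell. */
static uint32_t cq_doorbell(uint32_t cqn, uint32_t op)
{
	assert((cqn & ~UAR_HANDLE_MASK) == 0);	/* handle must fit in 24 bits */
	return cqn | op;
}

/* Hypothetical helper: recover the handle from a doorbell value. */
static uint32_t doorbell_handle(uint32_t val)
{
	return val & UAR_HANDLE_MASK;
}
```

In the patch, pvrdma_write_uar_cq() stores such a value, converted with
htole32(), at PVRDMA_UAR_CQ_OFFSET within the UAR page.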
providers/pvrdma/pvrdma-abi.h | 297 ++++++++++++++++++++++++++++++++++++
providers/pvrdma/pvrdma.h | 347 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 644 insertions(+)
create mode 100644 providers/pvrdma/pvrdma-abi.h
create mode 100644 providers/pvrdma/pvrdma.h
diff --git a/providers/pvrdma/pvrdma-abi.h b/providers/pvrdma/pvrdma-abi.h
new file mode 100644
index 0000000..c7a38c5
--- /dev/null
+++ b/providers/pvrdma/pvrdma-abi.h
@@ -0,0 +1,297 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_ABI_H__
+#define __PVRDMA_ABI_H__
+
+#include <infiniband/kern-abi.h>
+
+#define PVRDMA_UVERBS_ABI_VERSION 3
+#define PVRDMA_UAR_HANDLE_MASK 0x00FFFFFF /* Bottom 24 bits. */
+#define PVRDMA_UAR_QP_OFFSET 0 /* QP doorbell offset. */
+#define PVRDMA_UAR_QP_SEND BIT(30) /* Send bit. */
+#define PVRDMA_UAR_QP_RECV BIT(31) /* Recv bit. */
+#define PVRDMA_UAR_CQ_OFFSET 4 /* CQ doorbell offset. */
+#define PVRDMA_UAR_CQ_ARM_SOL BIT(29) /* Arm solicited bit. */
+#define PVRDMA_UAR_CQ_ARM BIT(30) /* Arm bit. */
+#define PVRDMA_UAR_CQ_POLL BIT(31) /* Poll bit. */
+
+enum pvrdma_wr_opcode {
+ PVRDMA_WR_RDMA_WRITE,
+ PVRDMA_WR_RDMA_WRITE_WITH_IMM,
+ PVRDMA_WR_SEND,
+ PVRDMA_WR_SEND_WITH_IMM,
+ PVRDMA_WR_RDMA_READ,
+ PVRDMA_WR_ATOMIC_CMP_AND_SWP,
+ PVRDMA_WR_ATOMIC_FETCH_AND_ADD,
+ PVRDMA_WR_LSO,
+ PVRDMA_WR_SEND_WITH_INV,
+ PVRDMA_WR_RDMA_READ_WITH_INV,
+ PVRDMA_WR_LOCAL_INV,
+ PVRDMA_WR_FAST_REG_MR,
+ PVRDMA_WR_MASKED_ATOMIC_CMP_AND_SWP,
+ PVRDMA_WR_MASKED_ATOMIC_FETCH_AND_ADD,
+ PVRDMA_WR_BIND_MW,
+ PVRDMA_WR_REG_SIG_MR,
+};
+
+enum pvrdma_wc_status {
+ PVRDMA_WC_SUCCESS,
+ PVRDMA_WC_LOC_LEN_ERR,
+ PVRDMA_WC_LOC_QP_OP_ERR,
+ PVRDMA_WC_LOC_EEC_OP_ERR,
+ PVRDMA_WC_LOC_PROT_ERR,
+ PVRDMA_WC_WR_FLUSH_ERR,
+ PVRDMA_WC_MW_BIND_ERR,
+ PVRDMA_WC_BAD_RESP_ERR,
+ PVRDMA_WC_LOC_ACCESS_ERR,
+ PVRDMA_WC_REM_INV_REQ_ERR,
+ PVRDMA_WC_REM_ACCESS_ERR,
+ PVRDMA_WC_REM_OP_ERR,
+ PVRDMA_WC_RETRY_EXC_ERR,
+ PVRDMA_WC_RNR_RETRY_EXC_ERR,
+ PVRDMA_WC_LOC_RDD_VIOL_ERR,
+ PVRDMA_WC_REM_INV_RD_REQ_ERR,
+ PVRDMA_WC_REM_ABORT_ERR,
+ PVRDMA_WC_INV_EECN_ERR,
+ PVRDMA_WC_INV_EEC_STATE_ERR,
+ PVRDMA_WC_FATAL_ERR,
+ PVRDMA_WC_RESP_TIMEOUT_ERR,
+ PVRDMA_WC_GENERAL_ERR,
+};
+
+enum pvrdma_wc_opcode {
+ PVRDMA_WC_SEND,
+ PVRDMA_WC_RDMA_WRITE,
+ PVRDMA_WC_RDMA_READ,
+ PVRDMA_WC_COMP_SWAP,
+ PVRDMA_WC_FETCH_ADD,
+ PVRDMA_WC_BIND_MW,
+ PVRDMA_WC_LSO,
+ PVRDMA_WC_LOCAL_INV,
+ PVRDMA_WC_FAST_REG_MR,
+ PVRDMA_WC_MASKED_COMP_SWAP,
+ PVRDMA_WC_MASKED_FETCH_ADD,
+ PVRDMA_WC_RECV = 1 << 7,
+ PVRDMA_WC_RECV_RDMA_WITH_IMM,
+};
+
+enum pvrdma_wc_flags {
+ PVRDMA_WC_GRH = 1 << 0,
+ PVRDMA_WC_WITH_IMM = 1 << 1,
+ PVRDMA_WC_WITH_INVALIDATE = 1 << 2,
+ PVRDMA_WC_IP_CSUM_OK = 1 << 3,
+ PVRDMA_WC_WITH_SMAC = 1 << 4,
+ PVRDMA_WC_WITH_VLAN = 1 << 5,
+ PVRDMA_WC_FLAGS_MAX = PVRDMA_WC_WITH_VLAN,
+};
+
+struct pvrdma_alloc_ucontext_resp {
+ struct ibv_get_context_resp ibv_resp;
+ __u32 qp_tab_size;
+ __u32 reserved;
+};
+
+struct pvrdma_alloc_pd_resp {
+ struct ibv_alloc_pd_resp ibv_resp;
+ __u32 pdn;
+ __u32 reserved;
+};
+
+struct pvrdma_create_cq {
+ struct ibv_create_cq ibv_cmd;
+ __u64 buf_addr;
+ __u32 buf_size;
+ __u32 reserved;
+};
+
+struct pvrdma_create_cq_resp {
+ struct ibv_create_cq_resp ibv_resp;
+ __u32 cqn;
+ __u32 reserved;
+};
+
+struct pvrdma_resize_cq {
+ struct ibv_resize_cq ibv_cmd;
+ __u64 buf_addr;
+ __u32 buf_size;
+ __u32 reserved;
+};
+
+struct pvrdma_create_srq {
+ struct ibv_create_srq ibv_cmd;
+ __u64 buf_addr;
+};
+
+struct pvrdma_create_srq_resp {
+ struct ibv_create_srq_resp ibv_resp;
+ __u32 srqn;
+ __u32 reserved;
+};
+
+struct pvrdma_create_qp {
+ struct ibv_create_qp ibv_cmd;
+ __u64 rbuf_addr;
+ __u64 sbuf_addr;
+ __u32 rbuf_size;
+ __u32 sbuf_size;
+ __u64 qp_addr;
+};
+
+/* PVRDMA masked atomic compare and swap */
+struct pvrdma_ex_cmp_swap {
+ __u64 swap_val;
+ __u64 compare_val;
+ __u64 swap_mask;
+ __u64 compare_mask;
+};
+
+/* PVRDMA masked atomic fetch and add */
+struct pvrdma_ex_fetch_add {
+ __u64 add_val;
+ __u64 field_boundary;
+};
+
+/* PVRDMA address vector. */
+struct pvrdma_av {
+ __u32 port_pd;
+ __u32 sl_tclass_flowlabel;
+ __u8 dgid[16];
+ __u8 src_path_bits;
+ __u8 gid_index;
+ __u8 stat_rate;
+ __u8 hop_limit;
+ __u8 dmac[6];
+ __u8 reserved[6];
+};
+
+/* PVRDMA scatter/gather entry */
+struct pvrdma_sge {
+ __u64 addr;
+ __u32 length;
+ __u32 lkey;
+};
+
+/* PVRDMA receive queue work request */
+struct pvrdma_rq_wqe_hdr {
+ __u64 wr_id; /* wr id */
+ __u32 num_sge; /* size of s/g array */
+ __u32 total_len; /* reserved */
+};
+/* Use pvrdma_sge (ib_sge) for receive queue s/g array elements. */
+
+/* PVRDMA send queue work request */
+struct pvrdma_sq_wqe_hdr {
+ __u64 wr_id; /* wr id */
+ __u32 num_sge; /* size of s/g array */
+ __u32 total_len; /* reserved */
+ __u32 opcode; /* operation type */
+ __u32 send_flags; /* wr flags */
+ union {
+ __u32 imm_data;
+ __u32 invalidate_rkey;
+ } ex;
+ __u32 reserved;
+ union {
+ struct {
+ __u64 remote_addr;
+ __u32 rkey;
+ __u8 reserved[4];
+ } rdma;
+ struct {
+ __u64 remote_addr;
+ __u64 compare_add;
+ __u64 swap;
+ __u32 rkey;
+ __u32 reserved;
+ } atomic;
+ struct {
+ __u64 remote_addr;
+ __u32 log_arg_sz;
+ __u32 rkey;
+ union {
+ struct pvrdma_ex_cmp_swap cmp_swap;
+ struct pvrdma_ex_fetch_add fetch_add;
+ } wr_data;
+ } masked_atomics;
+ struct {
+ __u64 iova_start;
+ __u64 pl_pdir_dma;
+ __u32 page_shift;
+ __u32 page_list_len;
+ __u32 length;
+ __u32 access_flags;
+ __u32 rkey;
+ } fast_reg;
+ struct {
+ __u32 remote_qpn;
+ __u32 remote_qkey;
+ struct pvrdma_av av;
+ } ud;
+ } wr;
+};
+/* Use pvrdma_sge (ib_sge) for send queue s/g array elements. */
+
+/* Completion queue element. */
+struct pvrdma_cqe {
+ __u64 wr_id;
+ __u64 qp;
+ __u32 opcode;
+ __u32 status;
+ __u32 byte_len;
+ __u32 imm_data;
+ __u32 src_qp;
+ __u32 wc_flags;
+ __u32 vendor_err;
+ __u16 pkey_index;
+ __u16 slid;
+ __u8 sl;
+ __u8 dlid_path_bits;
+ __u8 port_num;
+ __u8 smac[6];
+ __u8 reserved2[7]; /* Pad to next power of 2 (64). */
+};
+
+#endif /* __PVRDMA_ABI_H__ */
diff --git a/providers/pvrdma/pvrdma.h b/providers/pvrdma/pvrdma.h
new file mode 100644
index 0000000..d3df07d
--- /dev/null
+++ b/providers/pvrdma/pvrdma.h
@@ -0,0 +1,347 @@
+/*
+ * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of EITHER the GNU General Public License
+ * version 2 as published by the Free Software Foundation or the BSD
+ * 2-Clause License. This program is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
+ * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License version 2 for more details at
+ * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program available in the file COPYING in the main
+ * directory of this source tree.
+ *
+ * The BSD 2-Clause License
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __PVRDMA_H__
+#define __PVRDMA_H__
+
+#include <config.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <pthread.h>
+#include <unistd.h>
+#include <netinet/in.h>
+#include <sys/mman.h>
+#include <infiniband/driver.h>
+
+#define BIT(nr) (1UL << (nr))
+
+#include "pvrdma-abi.h"
+#include "pvrdma_ring.h"
+
+#ifndef rmb
+# define rmb() mb()
+#endif
+
+#ifndef wmb
+# define wmb() mb()
+#endif
+
+/*
+ * Branch-prediction hints. Define each only if it is not already
+ * provided; an #else branch here would redefine an existing macro.
+ */
+#ifndef likely
+#define likely(x) __builtin_expect(!!(x), 1)
+#endif
+
+#ifndef unlikely
+#define unlikely(x) __builtin_expect(!!(x), 0)
+#endif
+
+#ifndef max
+#define max(a, b) \
+ ({ typeof (a) _a = (a); \
+ typeof (b) _b = (b); \
+ _a > _b ? _a : _b; })
+#endif
+
+#ifndef min
+#define min(a, b) \
+ ({ typeof (a) _a = (a); \
+ typeof (b) _b = (b); \
+ _a < _b ? _a : _b; })
+#endif
+
+#define PFX "pvrdma: "
+
+enum {
+ PVRDMA_OPCODE_NOP = 0x00,
+ PVRDMA_OPCODE_SEND_INVAL = 0x01,
+ PVRDMA_OPCODE_RDMA_WRITE = 0x08,
+ PVRDMA_OPCODE_RDMA_WRITE_IMM = 0x09,
+ PVRDMA_OPCODE_SEND = 0x0a,
+ PVRDMA_OPCODE_SEND_IMM = 0x0b,
+ PVRDMA_OPCODE_LSO = 0x0e,
+ PVRDMA_OPCODE_RDMA_READ = 0x10,
+ PVRDMA_OPCODE_ATOMIC_CS = 0x11,
+ PVRDMA_OPCODE_ATOMIC_FA = 0x12,
+ PVRDMA_OPCODE_ATOMIC_MASK_CS = 0x14,
+ PVRDMA_OPCODE_ATOMIC_MASK_FA = 0x15,
+ PVRDMA_OPCODE_BIND_MW = 0x18,
+ PVRDMA_OPCODE_FMR = 0x19,
+ PVRDMA_OPCODE_LOCAL_INVAL = 0x1b,
+ PVRDMA_OPCODE_CONFIG_CMD = 0x1f,
+
+ PVRDMA_RECV_OPCODE_RDMA_WRITE_IMM = 0x00,
+ PVRDMA_RECV_OPCODE_SEND = 0x01,
+ PVRDMA_RECV_OPCODE_SEND_IMM = 0x02,
+ PVRDMA_RECV_OPCODE_SEND_INVAL = 0x03,
+
+ PVRDMA_CQE_OPCODE_ERROR = 0x1e,
+ PVRDMA_CQE_OPCODE_RESIZE = 0x16,
+};
+
+enum {
+ PVRDMA_WQE_CTRL_FENCE = 1 << 6,
+ PVRDMA_WQE_CTRL_CQ_UPDATE = 3 << 2,
+ PVRDMA_WQE_CTRL_SOLICIT = 1 << 1,
+};
+
+struct pvrdma_device {
+ struct ibv_device ibv_dev;
+ int page_size;
+ int abi_version;
+};
+
+struct pvrdma_context {
+ struct ibv_context ibv_ctx;
+ void *uar;
+ pthread_spinlock_t uar_lock;
+ int max_qp_wr;
+ int max_sge;
+ int max_cqe;
+ struct pvrdma_qp **qp_tbl;
+};
+
+struct pvrdma_buf {
+ void *buf;
+ size_t length;
+};
+
+struct pvrdma_pd {
+ struct ibv_pd ibv_pd;
+ uint32_t pdn;
+};
+
+struct pvrdma_cq {
+ struct ibv_cq ibv_cq;
+ struct pvrdma_buf buf;
+ struct pvrdma_buf resize_buf;
+ pthread_spinlock_t lock;
+ struct pvrdma_ring_state *ring_state;
+ uint32_t cqe_cnt;
+ uint32_t offset;
+ uint32_t cqn;
+};
+
+struct pvrdma_wq {
+ uint64_t *wrid;
+ pthread_spinlock_t lock;
+ int wqe_cnt;
+ int wqe_size;
+ struct pvrdma_ring *ring_state;
+ int max_gs;
+ int wqe_shift;
+ int offset;
+};
+
+struct pvrdma_qp {
+ struct ibv_qp ibv_qp;
+ struct pvrdma_buf rbuf;
+ struct pvrdma_buf sbuf;
+ int max_inline_data;
+ int buf_size;
+ uint32_t sq_signal_bits;
+ int sq_spare_wqes;
+ struct pvrdma_wq sq;
+ struct pvrdma_wq rq;
+};
+
+struct pvrdma_ah {
+ struct ibv_ah ibv_ah;
+ struct pvrdma_av av;
+};
+
+static inline unsigned long align(unsigned long val, unsigned long align)
+{
+ return (val + align - 1) & ~(align - 1);
+}
+
+static inline int align_next_power2(int size)
+{
+ int val = 1;
+
+ while (val < size)
+ val <<= 1;
+
+ return val;
+}
+
+#define to_vxxx(xxx, type) \
+ ((struct pvrdma_##type *) \
+ ((void *) ib##xxx - offsetof(struct pvrdma_##type, ibv_##xxx)))
+
+static inline struct pvrdma_device *to_vdev(struct ibv_device *ibdev)
+{
+ return to_vxxx(dev, device);
+}
+
+static inline struct pvrdma_context *to_vctx(struct ibv_context *ibctx)
+{
+ return to_vxxx(ctx, context);
+}
+
+static inline struct pvrdma_pd *to_vpd(struct ibv_pd *ibpd)
+{
+ return to_vxxx(pd, pd);
+}
+
+static inline struct pvrdma_cq *to_vcq(struct ibv_cq *ibcq)
+{
+ return to_vxxx(cq, cq);
+}
+
+static inline struct pvrdma_qp *to_vqp(struct ibv_qp *ibqp)
+{
+ return to_vxxx(qp, qp);
+}
+
+static inline struct pvrdma_ah *to_vah(struct ibv_ah *ibah)
+{
+ return to_vxxx(ah, ah);
+}
+
+static inline void pvrdma_write_uar_qp(void *uar, unsigned value)
+{
+ *(uint32_t *)(uar + PVRDMA_UAR_QP_OFFSET) = htole32(value);
+}
+
+static inline void pvrdma_write_uar_cq(void *uar, unsigned value)
+{
+ *(uint32_t *)(uar + PVRDMA_UAR_CQ_OFFSET) = htole32(value);
+}
+
+static inline int ibv_send_flags_to_pvrdma(int flags)
+{
+ return flags;
+}
+
+static inline enum pvrdma_wr_opcode ibv_wr_opcode_to_pvrdma(
+ enum ibv_wr_opcode op)
+{
+ return (enum pvrdma_wr_opcode)op;
+}
+
+static inline enum ibv_wc_status pvrdma_wc_status_to_ibv(
+ enum pvrdma_wc_status status)
+{
+ return (enum ibv_wc_status)status;
+}
+
+static inline enum ibv_wc_opcode pvrdma_wc_opcode_to_ibv(
+ enum pvrdma_wc_opcode op)
+{
+ return (enum ibv_wc_opcode)op;
+}
+
+static inline int pvrdma_wc_flags_to_ibv(int flags)
+{
+ return flags;
+}
+
+int pvrdma_alloc_buf(struct pvrdma_buf *buf, size_t size, int page_size);
+void pvrdma_free_buf(struct pvrdma_buf *buf);
+
+int pvrdma_query_device(struct ibv_context *context,
+ struct ibv_device_attr *attr);
+int pvrdma_query_port(struct ibv_context *context, uint8_t port,
+ struct ibv_port_attr *attr);
+
+struct ibv_pd *pvrdma_alloc_pd(struct ibv_context *context);
+int pvrdma_free_pd(struct ibv_pd *pd);
+
+struct ibv_mr *pvrdma_reg_mr(struct ibv_pd *pd, void *addr,
+ size_t length, int access);
+int pvrdma_dereg_mr(struct ibv_mr *mr);
+
+struct ibv_cq *pvrdma_create_cq(struct ibv_context *context, int cqe,
+ struct ibv_comp_channel *channel,
+ int comp_vector);
+int pvrdma_alloc_cq_buf(struct pvrdma_device *dev, struct pvrdma_cq *cq,
+ struct pvrdma_buf *buf, int nent);
+int pvrdma_destroy_cq(struct ibv_cq *cq);
+int pvrdma_req_notify_cq(struct ibv_cq *cq, int solicited);
+int pvrdma_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc);
+void pvrdma_cq_event(struct ibv_cq *cq);
+void __pvrdma_cq_clean(struct pvrdma_cq *cq, uint32_t qpn);
+void pvrdma_cq_clean(struct pvrdma_cq *cq, uint32_t qpn);
+int pvrdma_get_outstanding_cqes(struct pvrdma_cq *cq);
+void pvrdma_cq_resize_copy_cqes(struct pvrdma_cq *cq, void *buf,
+ int new_cqe);
+
+struct ibv_qp *pvrdma_create_qp(struct ibv_pd *pd,
+ struct ibv_qp_init_attr *attr);
+int pvrdma_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ int attr_mask, struct ibv_qp_init_attr *init_attr);
+int pvrdma_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ int attr_mask);
+int pvrdma_destroy_qp(struct ibv_qp *qp);
+void pvrdma_init_qp_indices(struct pvrdma_qp *qp);
+void pvrdma_qp_init_sq_ownership(struct pvrdma_qp *qp);
+int pvrdma_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
+ struct ibv_send_wr **bad_wr);
+int pvrdma_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
+ struct ibv_recv_wr **bad_wr);
+void pvrdma_calc_sq_wqe_size(struct ibv_qp_cap *cap, enum ibv_qp_type type,
+ struct pvrdma_qp *qp);
+int pvrdma_alloc_qp_buf(struct pvrdma_device *dev, struct ibv_qp_cap *cap,
+ enum ibv_qp_type type, struct pvrdma_qp *qp);
+void pvrdma_set_sq_sizes(struct pvrdma_qp *qp, struct ibv_qp_cap *cap,
+ enum ibv_qp_type type);
+struct pvrdma_qp *pvrdma_find_qp(struct pvrdma_context *ctx,
+ uint32_t qpn);
+int pvrdma_store_qp(struct pvrdma_context *ctx, uint32_t qpn,
+ struct pvrdma_qp *qp);
+void pvrdma_clear_qp(struct pvrdma_context *ctx, uint32_t qpn);
+
+struct ibv_ah *pvrdma_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr);
+int pvrdma_destroy_ah(struct ibv_ah *ah);
+
+int pvrdma_alloc_av(struct pvrdma_pd *pd, struct ibv_ah_attr *attr,
+ struct pvrdma_ah *ah);
+void pvrdma_free_av(struct pvrdma_ah *ah);
+
+#endif /* __PVRDMA_H__ */
--
2.7.4
* [PATCH rdma-core 0/8] libpvrdma: userspace library for PVRDMA
From: Adit Ranadive @ 2016-11-03 23:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA
Cc: Adit Ranadive
This patch series adds the userspace library for our paravirtual RDMA device.
The patchset for the kernel RDMA driver was sent out earlier [1]. I will
also send a pull request on GitHub.
I have included the shared ABI file here, following the rdma-core
fixup-include mechanism that Jason pointed me to.
The patch series was built on 28d918732cc4efb2185d91a500be1e41aeb149e7.
[1] http://marc.info/?l=linux-rdma&m=147546066322315&w=2
Adit Ranadive (8):
libpvrdma: Add ABI and main header files
libpvrdma: Add ring traversal
libpvrdma: Add completion queue functions
libpvrdma: Add queue pair functions
libpvrdma: Add misc verbs functions
libpvrdma: Add main library file
libpvrdma: Add to consolidated rdma-core
libpvrdma: Add fix up for ABI file
CMakeLists.txt | 1 +
MAINTAINERS | 6 +
README.md | 1 +
buildlib/RDMA_LinuxHeaders.cmake | 1 +
buildlib/fixup-include/rdma-pvrdma-abi.h | 297 ++++++++++++++++++
providers/pvrdma/CMakeLists.txt | 6 +
providers/pvrdma/cq.c | 287 ++++++++++++++++++
providers/pvrdma/pvrdma.h | 347 +++++++++++++++++++++
providers/pvrdma/pvrdma_main.c | 214 +++++++++++++
providers/pvrdma/pvrdma_ring.h | 136 +++++++++
providers/pvrdma/qp.c | 505 +++++++++++++++++++++++++++++++
providers/pvrdma/verbs.c | 234 ++++++++++++++
12 files changed, 2035 insertions(+)
create mode 100644 buildlib/fixup-include/rdma-pvrdma-abi.h
create mode 100644 providers/pvrdma/CMakeLists.txt
create mode 100644 providers/pvrdma/cq.c
create mode 100644 providers/pvrdma/pvrdma.h
create mode 100644 providers/pvrdma/pvrdma_main.c
create mode 100644 providers/pvrdma/pvrdma_ring.h
create mode 100644 providers/pvrdma/qp.c
create mode 100644 providers/pvrdma/verbs.c
--
2.7.4
^ permalink raw reply
* Re: [PATCH for-next V2 00/15][PULL request] Mellanox mlx5 core driver updates 2016-10-25
From: Doug Ledford @ 2016-11-03 22:33 UTC (permalink / raw)
To: David Miller, saeedm@mellanox.com
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
ogerlitz@mellanox.com, leonro@mellanox.com, talal@mellanox.com,
matanb@mellanox.com
In-Reply-To: <20161030.173522.1563055187413070239.davem@davemloft.net>
[-- Attachment #1.1: Type: text/plain, Size: 637 bytes --]
On 10/30/16 3:35 PM, David Miller wrote:
> From: Saeed Mahameed <saeedm@mellanox.com>
> Date: Sun, 30 Oct 2016 23:21:53 +0200
>
>> This series contains some updates and fixes of mlx5 core and
>> IB drivers with the addition of two features that demand
>> new low level commands and infrastructure updates.
>> - SRIOV VF max rate limit support
>> - mlx5e tc support for FWD rules with counter.
>>
>> Needed for both net and rdma subsystems.
>
> Pulled, thanks.
Thanks, done here as well.
--
Doug Ledford <dledford@redhat.com> GPG Key ID: 0E572FDD
Red Hat, Inc.
100 E. Davie St
Raleigh, NC 27601 USA
^ permalink raw reply
* RE: RDMA developer gatherings around Kernel Summit and Linux Plumbers in Santa Fe
From: Liran Liss @ 2016-11-03 21:54 UTC (permalink / raw)
To: Doug Ledford, Christoph Lameter,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: skc-YOWKrPYUwWM@public.gmane.org,
ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
Jason Gunthorpe,
john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
knut.omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org, Matan Barak
In-Reply-To: <581BAEFD.70501-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Matan told me that he will announce a git tree with the latest patches applied by EOD.
> -----Original Message-----
> From: Doug Ledford [mailto:dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> Sent: Thursday, November 03, 2016 3:41 PM
> To: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: skc-YOWKrPYUwWM@public.gmane.org; ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org; Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>; john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org; leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org;
> Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>; knut.omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org; Matan Barak
> <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Subject: Re: RDMA developer gatherings around Kernel Summit and Linux
> Plumbers in Santa Fe
>
> On 11/3/16 2:49 PM, Christoph Lameter wrote:
> >> Saturday sessions 9am till 4pm. 12-1pm Lunchtime
> >>
> >> 9am Refine TODO list for consolidated library - Jason Gunthorpe
> >> 10am Submission process for multi subsystem drivers - Doug Ledford
> >> 11am Multicast features and gaps - Christoph Lameter
> >>
> >> 1pm Licensing carryover - Susan/Christoph
> >> 2pm Standard network tools, integrating to the regular network stack - Christoph
> >> 3pm Open Discussion/Reserve Session - TBD
> >> 4pm Closing Session - TBD
> >
> > Ok we have an ongoing conversation regarding the ioctl and I think
> > that is of high importance. We tried to find a room for a meeting on
> > Friday on this but we do not have access to a projector. I would like
> > to have this issue dealt with first on Saturday and then we can
> > rearrange times for the other presentations. I could skip some of my
> > sessions if necessary and we have 2 hours that are pretty flexible at
> > the end anyways. I hope that is agreeable to everyone?
> >
>
> I'm agreeable with that.
>
> --
> Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> GPG Key ID: 0E572FDD
> Red Hat, Inc.
> 100 E. Davie St
> Raleigh, NC 27601 USA
^ permalink raw reply
* Re: RDMA developer gatherings around Kernel Summit and Linux Plumbers in Santa Fe
From: Doug Ledford @ 2016-11-03 21:41 UTC (permalink / raw)
To: Christoph Lameter,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: skc-YOWKrPYUwWM@public.gmane.org,
ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
Jason Gunthorpe,
john.fleck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
knut.omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
In-Reply-To: <alpine.DEB.2.20.1611031544040.13528-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
[-- Attachment #1.1: Type: text/plain, Size: 1198 bytes --]
On 11/3/16 2:49 PM, Christoph Lameter wrote:
>> Saturday sessions 9am till 4pm. 12-1pm Lunchtime
>>
>> 9am Refine TODO list for consolidated library - Jason Gunthorpe
>> 10am Submission process for multi subsystem drivers - Doug Ledford
>> 11am Multicast features and gaps - Christoph Lameter
>>
>> 1pm Licensing carryover - Susan/Christoph
>> 2pm Standard network tools, integrating to the regular network stack - Christoph
>> 3pm Open Discussion/Reserve Session - TBD
>> 4pm Closing Session - TBD
>
> Ok we have an ongoing conversation regarding the ioctl and I think that
> is of high importance. We tried to find a room for a meeting on Friday on
> this but we do not have access to a projector. I would like to have this
> issue dealt with first on Saturday and then we can rearrange times for the
> other presentations. I could skip some of my sessions if necessary and we
> have 2 hours that are pretty flexible at the end anyways. I hope that is
> agreeable to everyone?
>
I'm agreeable with that.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> GPG Key ID: 0E572FDD
Red Hat, Inc.
100 E. Davie St
Raleigh, NC 27601 USA
^ permalink raw reply
* Re: RDMA developer gatherings around Kernel Summit and Linux Plumbers in Santa Fe
From: Doug Ledford @ 2016-11-03 21:41 UTC (permalink / raw)
To: Fleck John, Christoph Lameter
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
skc-YOWKrPYUwWM@public.gmane.org, Weiny Ira, Jason Gunthorpe,
leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
knut.omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
In-Reply-To: <D1AF4CEA-5C50-4346-A599-C8417484723F-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
[-- Attachment #1.1: Type: text/plain, Size: 1519 bytes --]
On 11/3/16 3:27 PM, Fleck, John wrote:
> Is there a kernel tree with the patches applied?
No. I haven't seen the revised version with CONFIG_EXPERIMENTAL
wrapping the new changes posted yet. Did I miss them?
> John
>
> On Nov 3, 2016, at 2:49 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
>
>>> Saturday sessions 9am till 4pm. 12-1pm Lunchtime
>>>
>>> 9am Refine TODO list for consolidated library - Jason Gunthorpe
>>> 10am Submission process for multi subsystem drivers - Doug Ledford
>>> 11am Multicast features and gaps - Christoph Lameter
>>>
>>> 1pm Licensing carryover - Susan/Christoph
>>> 2pm Standard network tools, integrating to the regular network stack - Christoph
>>> 3pm Open Discussion/Reserve Session - TBD
>>> 4pm Closing Session - TBD
>>
>> Ok we have an ongoing conversation regarding the ioctl and I think that
>> is of high importance. We tried to find a room for a meeting on Friday on
>> this but we do not have access to a projector. I would like to have this
>> issue dealt with first on Saturday and then we can rearrange times for the
>> other presentations. I could skip some of my sessions if necessary and we
>> have 2 hours that are pretty flexible at the end anyways. I hope that is
>> agreeable to everyone?
>>
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> GPG Key ID: 0E572FDD
Red Hat, Inc.
100 E. Davie St
Raleigh, NC 27601 USA
^ permalink raw reply
* Re: RDMA developer gatherings around Kernel Summit and Linux Plumbers in Santa Fe
From: Fleck, John @ 2016-11-03 21:27 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Doug Ledford,
skc-YOWKrPYUwWM@public.gmane.org, Weiny, Ira, Jason Gunthorpe,
leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
knut.omang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
In-Reply-To: <alpine.DEB.2.20.1611031544040.13528-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
Is there a kernel tree with the patches applied?
John
On Nov 3, 2016, at 2:49 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
>> Saturday sessions 9am till 4pm. 12-1pm Lunchtime
>>
>> 9am Refine TODO list for consolidated library - Jason Gunthorpe
>> 10am Submission process for multi subsystem drivers - Doug Ledford
>> 11am Multicast features and gaps - Christoph Lameter
>>
>> 1pm Licensing carryover - Susan/Christoph
>> 2pm Standard network tools, integrating to the regular network stack - Christoph
>> 3pm Open Discussion/Reserve Session - TBD
>> 4pm Closing Session - TBD
>
> Ok we have an ongoing conversation regarding the ioctl and I think that
> is of high importance. We tried to find a room for a meeting on Friday on
> this but we do not have access to a projector. I would like to have this
> issue dealt with first on Saturday and then we can rearrange times for the
> other presentations. I could skip some of my sessions if necessary and we
> have 2 hours that are pretty flexible at the end anyways. I hope that is
> agreeable to everyone?
>
^ permalink raw reply
* Re: [PATCH rdma-next V1 16/17] IB/isert: Remove and fix debug prints after allocation failure
From: Sagi Grimberg @ 2016-11-03 21:24 UTC (permalink / raw)
To: Leon Romanovsky, dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1478184265-9620-17-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
^ permalink raw reply
* Re: RDMA developer gatherings around Kernel Summit and Linux Plumbers in Santa Fe
From: Christoph Lameter @ 2016-11-03 20:49 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: Doug Ledford, skc-YOWKrPYUwWM, ira.weiny-ral2JQCrhuEAvxtiuMwx3w,
Jason Gunthorpe, john.fleck-ral2JQCrhuEAvxtiuMwx3w,
leon-DgEjT+Ai2ygdnm+yROfE0A, liranl-VPRAkNaXOzVWk0Htik3J/w,
knut.omang-QHcLZuEGTsvQT0dZR+AlfA, matanb-VPRAkNaXOzVWk0Htik3J/w
In-Reply-To: <alpine.DEB.2.20.1610281212220.8691-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
> Saturday sessions 9am till 4pm. 12-1pm Lunchtime
>
> 9am Refine TODO list for consolidated library - Jason Gunthorpe
> 10am Submission process for multi subsystem drivers - Doug Ledford
> 11am Multicast features and gaps - Christoph Lameter
>
> 1pm Licensing carryover - Susan/Christoph
> 2pm Standard network tools, integrating to the regular network stack - Christoph
> 3pm Open Discussion/Reserve Session - TBD
> 4pm Closing Session - TBD
Ok we have an ongoing conversation regarding the ioctl and I think that
is of high importance. We tried to find a room for a meeting on Friday on
this but we do not have access to a projector. I would like to have this
issue dealt with first on Saturday and then we can rearrange times for the
other presentations. I could skip some of my sessions if necessary and we
have 2 hours that are pretty flexible at the end anyways. I hope that is
agreeable to everyone?
^ permalink raw reply
* Re: [PATCH rdma-core v2 4/4] redhat/spec: build split rpm packages
From: Doug Ledford @ 2016-11-03 20:35 UTC (permalink / raw)
To: Jarod Wilson, Jason Gunthorpe
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <20161028171147.GJ42084-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
[-- Attachment #1.1: Type: text/plain, Size: 2473 bytes --]
On 10/28/16 11:11 AM, Jarod Wilson wrote:
> On Thu, Oct 27, 2016 at 03:10:59PM -0600, Jason Gunthorpe wrote:
>> On Thu, Oct 20, 2016 at 11:33:57AM -0400, Jarod Wilson wrote:
>>> Url: http://openfabrics.org/
>>
>> I guess we should change this url to
>> https://github.com/linux-rdma/rdma-core ?
>
> Either one works for me.
We should get the github url in.
>>> Source: rdma-core-%{version}.tgz
>>> -BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root
>>> +# https://github.com/linux-rdma/rdma-core
>>> +BuildRoot: %(mktemp -ud %{_tmppath}/%{name}-%{version}-%{release}-XXXXXX)
>>
>> I always wondered why there was so much variability in spec files
>> here.. I followed the Fedora guidelines, should we copy the above into
>> the other spec file?
>
> I believe the current Fedora guidelines actually say "just omit
> BuildRoot", because rpm will figure out a sane default by itself. The one
> with mktemp was introduced by the security-conscious/paranoid, I just
> copied it over from another of the specs I was merging together here, not
> sure what the "best" route is here now.
We won't be putting this anywhere that requires the buildroot be
specified, so I would leave it out.
>>> +%package -n librdmacm-utils
>>> +Summary: Examples for the librdmacm library
>>> +Requires: librdmacm%{?_isa} = %{version}-%{release}
>>
>> Why the requires? Shouldn't auto shlib dependencies take care of that?
>
> Probably. I think this was another legacy bit copied over from a
> stand-alone spec file.
Actually, no. When you have a -utils package that goes with a library
package, standard procedure is to tie them directly like this. The auto
dependency stuff will allow, say, librdmacm-1.1.17-1 and
librdmacm-utils-1.1.16-1 to happily satisfy each other, since the later
librdmacm provides all of the sonames and APIs that the -utils package
needs. This is by design: you want a librdmacm update not to
trigger a required update of, say, openmpi, unless there is truly a
change that requires it. But for the utils that go with the library,
even though we don't *have* to update them with the library, we want
that to happen automatically, so the explicit Requires makes that happen
even if librdmacm-utils was excluded from the update command.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> GPG Key ID: 0E572FDD
Red Hat, Inc.
100 E. Davie St
Raleigh, NC 27601 USA
^ permalink raw reply
* [PATCH 4.9-rc] iw_cxgb4: invalidate the mr when posting a read_w_inv wr
From: Steve Wise @ 2016-11-03 19:09 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Also, rearrange things a bit to have a common c4iw_invalidate_mr()
function used everywhere that we need to invalidate.
Fixes: 49b53a93a64a ("iw_cxgb4: add fast-path for small REG_MR operations")
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/infiniband/hw/cxgb4/cq.c | 17 +++--------------
drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 2 +-
drivers/infiniband/hw/cxgb4/mem.c | 12 ++++++++++++
drivers/infiniband/hw/cxgb4/qp.c | 16 ++++++++--------
4 files changed, 24 insertions(+), 23 deletions(-)
diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index 867b8cf..19c6477 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -666,18 +666,6 @@ static int poll_cq(struct t4_wq *wq, struct t4_cq *cq, struct t4_cqe *cqe,
return ret;
}
-static void invalidate_mr(struct c4iw_dev *rhp, u32 rkey)
-{
- struct c4iw_mr *mhp;
- unsigned long flags;
-
- spin_lock_irqsave(&rhp->lock, flags);
- mhp = get_mhp(rhp, rkey >> 8);
- if (mhp)
- mhp->attr.state = 0;
- spin_unlock_irqrestore(&rhp->lock, flags);
-}
-
/*
* Get one cq entry from c4iw and map it to openib.
*
@@ -733,7 +721,7 @@ static int c4iw_poll_cq_one(struct c4iw_cq *chp, struct ib_wc *wc)
CQE_OPCODE(&cqe) == FW_RI_SEND_WITH_SE_INV) {
wc->ex.invalidate_rkey = CQE_WRID_STAG(&cqe);
wc->wc_flags |= IB_WC_WITH_INVALIDATE;
- invalidate_mr(qhp->rhp, wc->ex.invalidate_rkey);
+ c4iw_invalidate_mr(qhp->rhp, wc->ex.invalidate_rkey);
}
} else {
switch (CQE_OPCODE(&cqe)) {
@@ -762,7 +750,8 @@ static int c4iw_poll_cq_one(struct c4iw_cq *chp, struct ib_wc *wc)
/* Invalidate the MR if the fastreg failed */
if (CQE_STATUS(&cqe) != T4_ERR_SUCCESS)
- invalidate_mr(qhp->rhp, CQE_WRID_FR_STAG(&cqe));
+ c4iw_invalidate_mr(qhp->rhp,
+ CQE_WRID_FR_STAG(&cqe));
break;
default:
printk(KERN_ERR MOD "Unexpected opcode %d "
diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
index 7e7f79e..4788e1a 100644
--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
+++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
@@ -999,6 +999,6 @@ extern int db_coalescing_threshold;
extern int use_dsgl;
void c4iw_drain_rq(struct ib_qp *qp);
void c4iw_drain_sq(struct ib_qp *qp);
-
+void c4iw_invalidate_mr(struct c4iw_dev *rhp, u32 rkey);
#endif
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
index 80e2774..410408f 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -770,3 +770,15 @@ int c4iw_dereg_mr(struct ib_mr *ib_mr)
kfree(mhp);
return 0;
}
+
+void c4iw_invalidate_mr(struct c4iw_dev *rhp, u32 rkey)
+{
+ struct c4iw_mr *mhp;
+ unsigned long flags;
+
+ spin_lock_irqsave(&rhp->lock, flags);
+ mhp = get_mhp(rhp, rkey >> 8);
+ if (mhp)
+ mhp->attr.state = 0;
+ spin_unlock_irqrestore(&rhp->lock, flags);
+}
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index 5790e1d..b7ac97b 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -706,12 +706,8 @@ static int build_memreg(struct t4_sq *sq, union t4_wr *wqe,
return 0;
}
-static int build_inv_stag(struct c4iw_dev *dev, union t4_wr *wqe,
- struct ib_send_wr *wr, u8 *len16)
+static int build_inv_stag(union t4_wr *wqe, struct ib_send_wr *wr, u8 *len16)
{
- struct c4iw_mr *mhp = get_mhp(dev, wr->ex.invalidate_rkey >> 8);
-
- mhp->attr.state = 0;
wqe->inv.stag_inv = cpu_to_be32(wr->ex.invalidate_rkey);
wqe->inv.r2 = 0;
*len16 = DIV_ROUND_UP(sizeof wqe->inv, 16);
@@ -842,10 +838,13 @@ int c4iw_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
case IB_WR_RDMA_READ_WITH_INV:
fw_opcode = FW_RI_RDMA_READ_WR;
swsqe->opcode = FW_RI_READ_REQ;
- if (wr->opcode == IB_WR_RDMA_READ_WITH_INV)
+ if (wr->opcode == IB_WR_RDMA_READ_WITH_INV) {
+ c4iw_invalidate_mr(qhp->rhp,
+ wr->sg_list[0].lkey);
fw_flags = FW_RI_RDMA_READ_INVALIDATE;
- else
+ } else {
fw_flags = 0;
+ }
err = build_rdma_read(wqe, wr, &len16);
if (err)
break;
@@ -878,7 +877,8 @@ int c4iw_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
fw_flags |= FW_RI_LOCAL_FENCE_FLAG;
fw_opcode = FW_RI_INV_LSTAG_WR;
swsqe->opcode = FW_RI_LOCAL_INV;
- err = build_inv_stag(qhp->rhp, wqe, wr, &len16);
+ err = build_inv_stag(wqe, wr, &len16);
+ c4iw_invalidate_mr(qhp->rhp, wr->ex.invalidate_rkey);
break;
default:
PDBG("%s post of type=%d TBD!\n", __func__,
--
2.7.0
^ permalink raw reply related