* [patch 00/44] RDMA over Ethernet
@ 2011-07-01 13:18 rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 01/44] ib_pack.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (44 more replies)
0 siblings, 45 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
This patch set implements a software emulation of the RoCE and InfiniBand transports.
It consists of two kernel modules. The first, ib_rxe, implements the RDMA
transport and registers with the RDMA core as a kernel verbs provider. The
second, either ib_rxe_net or ib_sample, implements the packet I/O layer.
ib_rxe_net attaches to the Linux netdev stack as a network protocol and can
send and receive packets over any Ethernet device; it uses the RoCE protocol to
carry the RDMA transport. ib_sample is a pure loopback device that uses the
InfiniBand transport, i.e. it includes the LRH header, whereas RoCE includes
only the GRH header.
The modules are configured through entries in /sys. A configuration script
(rxe_cfg) simplifies the use of this interface; rxe_cfg is part of the rxe
user-space package, librxe.
Using rxe verbs from user space requires librxe as a device-specific plug-in
to libibverbs. Librxe is packaged separately.
Copies of the user-space library and tools for 'upstream' and a tar file of
these patches are available at support.systemfabricworks.com/downloads/rxe.
* [patch 01/44] ib_pack.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 02/44] rxe_hdr.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (43 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch1 --]
[-- Type: text/plain, Size: 2154 bytes --]
Bring ib_pack.h up to date with the current version of the IBTA spec:
- add new opcodes for RC and RD
- add new groups of opcodes for CN and XRC
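For context, the real opcode constants are still built with the existing
IB_OPCODE() macro in ib_pack.h, which adds the transport group base to the
per-operation value (macro reproduced here for reference only):

#define IB_OPCODE(transport, op) \
	IB_OPCODE_ ## transport ## _ ## op = \
		IB_OPCODE_ ## transport + IB_OPCODE_ ## op

So, for example, IB_OPCODE(RC, SEND_ONLY_INV) defines
IB_OPCODE_RC_SEND_ONLY_INV = 0x00 + 0x17 = 0x17, and IB_OPCODE(CN, CNP)
defines IB_OPCODE_CN_CNP = 0x80 + 0x00 = 0x80.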
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
include/rdma/ib_pack.h | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
Index: infiniband/include/rdma/ib_pack.h
===================================================================
--- infiniband.orig/include/rdma/ib_pack.h
+++ infiniband/include/rdma/ib_pack.h
@@ -75,6 +75,7 @@ enum {
IB_OPCODE_UC = 0x20,
IB_OPCODE_RD = 0x40,
IB_OPCODE_UD = 0x60,
+ IB_OPCODE_CN = 0x80,
/* operations -- just used to define real constants */
IB_OPCODE_SEND_FIRST = 0x00,
@@ -98,6 +99,10 @@ enum {
IB_OPCODE_ATOMIC_ACKNOWLEDGE = 0x12,
IB_OPCODE_COMPARE_SWAP = 0x13,
IB_OPCODE_FETCH_ADD = 0x14,
+ IB_OPCODE_CNP = 0x00,
+ IB_OPCODE_RESYNC = 0x15,
+ IB_OPCODE_SEND_LAST_INV = 0x16,
+ IB_OPCODE_SEND_ONLY_INV = 0x17,
/* real constants follow -- see comment about above IB_OPCODE()
macro for more details */
@@ -124,6 +129,8 @@ enum {
IB_OPCODE(RC, ATOMIC_ACKNOWLEDGE),
IB_OPCODE(RC, COMPARE_SWAP),
IB_OPCODE(RC, FETCH_ADD),
+ IB_OPCODE(RC, SEND_LAST_INV),
+ IB_OPCODE(RC, SEND_ONLY_INV),
/* UC */
IB_OPCODE(UC, SEND_FIRST),
@@ -161,10 +168,15 @@ enum {
IB_OPCODE(RD, ATOMIC_ACKNOWLEDGE),
IB_OPCODE(RD, COMPARE_SWAP),
IB_OPCODE(RD, FETCH_ADD),
+ IB_OPCODE(RD, RESYNC),
/* UD */
IB_OPCODE(UD, SEND_ONLY),
- IB_OPCODE(UD, SEND_ONLY_WITH_IMMEDIATE)
+ IB_OPCODE(UD, SEND_ONLY_WITH_IMMEDIATE),
+
+ /* CN */
+ IB_OPCODE(CN, CNP),
+
};
enum {
* [patch 02/44] rxe_hdr.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 01/44] ib_pack.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 03/44] rxe_opcode.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (42 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch2 --]
[-- Type: text/plain, Size: 33398 bytes --]
Add declarations for the IBA packet headers used by rxe (LRH, GRH, BTH and
the extended transport headers), inline routines to get and set their fields,
and the rxe_pkt_info struct that is carried in skb->cb.
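As an illustration of the accessor convention, here is a minimal usage sketch
(not part of this patch; the function name is made up) for a receive path that
has filled in the skb cb area as a struct rxe_pkt_info:

static int rxe_check_psn_example(struct sk_buff *skb, u32 expected_psn)
{
	struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);

	/* bth_psn() locates the BTH at pkt->hdr + pkt->offset and
	 * converts the 24-bit PSN field from network byte order */
	if (bth_psn(pkt) != expected_psn)
		return -EINVAL;

	/* the hhh_set_fff() forms write a field in place, e.g. set
	 * the ack-request bit on an outgoing packet */
	bth_set_ack(pkt, 1);

	return 0;
}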
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_hdr.h | 1294 ++++++++++++++++++++++++++++++++++++
1 file changed, 1294 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_hdr.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_hdr.h
@@ -0,0 +1,1294 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_HDR_H
+#define RXE_HDR_H
+
+/* extracted information about a packet carried in an sk_buff struct;
+   it is stored in the skb cb[] array, so it must be at most 48 bytes. */
+struct rxe_pkt_info {
+ struct rxe_dev *rxe; /* device that owns packet */
+ struct rxe_qp *qp; /* qp that owns packet */
+ u8 *hdr; /* points to grh or bth */
+ u32 mask; /* useful info about pkt */
+ u32 psn; /* bth psn of packet */
+ u16 pkey_index; /* partition of pkt */
+ u16 paylen; /* length of bth - icrc */
+ u8 port_num; /* port pkt received on */
+ u8 opcode; /* bth opcode of packet */
+ u8 offset; /* bth offset from pkt->hdr */
+};
+
+#define SKB_TO_PKT(skb) ((struct rxe_pkt_info *)(skb)->cb)
+#define PKT_TO_SKB(pkt) ((struct sk_buff *)((char *)(pkt) \
+ - offsetof(struct sk_buff, cb)))
+
+/*
+ * IBA header types and methods
+ *
+ * Some of these are for reference and completeness only, since
+ * rxe does not currently support the RD transport. Most of this
+ * could be moved into the OFED core; ib_pack.h has part of it
+ * but is incomplete.
+ *
+ * Header specific routines to insert/extract values to/from headers
+ * the routines that are named __hhh_(set_)fff() take a pointer to a
+ * hhh header and get(set) the fff field. The routines named
+ * hhh_(set_)fff take a packet info struct and find the
+ * header and field based on the opcode in the packet.
+ * Conversion to/from network byte order from cpu order is also done.
+ */
+
+#define RXE_ICRC_SIZE (4)
+#define RXE_MAX_HDR_LENGTH (80)
+
+/******************************************************************************
+ * Local Route Header
+ ******************************************************************************/
+struct rxe_lrh {
+ u8 vlver; /* vl and lver */
+ u8 slnh; /* sl and lnh */
+ __be16 dlid;
+ __be16 length;
+ __be16 slid;
+};
+
+#define LRH_LVER (0)
+
+#define LRH_VL_MASK (0xf0)
+#define LRH_LVER_MASK (0x0f)
+#define LRH_SL_MASK (0xf0)
+#define LRH_RESV2_MASK (0x0c)
+#define LRH_LNH_MASK (0x03)
+#define LRH_RESV5_MASK (0xf800)
+#define LRH_LENGTH_MASK (0x07ff)
+
+enum lrh_lnh {
+ LRH_LNH_RAW = 0,
+ LRH_LNH_IP = 1,
+ LRH_LNH_IBA_LOC = 2,
+ LRH_LNH_IBA_GBL = 3,
+};
+
+static inline u8 __lrh_vl(void *arg)
+{
+ struct rxe_lrh *lrh = arg;
+ return lrh->vlver >> 4;
+}
+
+static inline void __lrh_set_vl(void *arg, u8 vl)
+{
+ struct rxe_lrh *lrh = arg;
+ lrh->vlver = (vl << 4) | (~LRH_VL_MASK & lrh->vlver);
+}
+
+static inline u8 __lrh_lver(void *arg)
+{
+ struct rxe_lrh *lrh = arg;
+ return lrh->vlver & LRH_LVER_MASK;
+}
+
+static inline void __lrh_set_lver(void *arg, u8 lver)
+{
+ struct rxe_lrh *lrh = arg;
+ lrh->vlver = (lver & LRH_LVER_MASK) | (~LRH_LVER_MASK & lrh->vlver);
+}
+
+static inline u8 __lrh_sl(void *arg)
+{
+ struct rxe_lrh *lrh = arg;
+ return lrh->slnh >> 4;
+}
+
+static inline void __lrh_set_sl(void *arg, u8 sl)
+{
+ struct rxe_lrh *lrh = arg;
+ lrh->slnh = (sl << 4) | (~LRH_SL_MASK & lrh->slnh);
+}
+
+static inline u8 __lrh_lnh(void *arg)
+{
+ struct rxe_lrh *lrh = arg;
+ return lrh->slnh & LRH_LNH_MASK;
+}
+
+static inline void __lrh_set_lnh(void *arg, u8 lnh)
+{
+ struct rxe_lrh *lrh = arg;
+ lrh->slnh = (lnh & LRH_LNH_MASK) | (~LRH_LNH_MASK & lrh->slnh);
+}
+
+static inline u16 __lrh_dlid(void *arg)
+{
+ struct rxe_lrh *lrh = arg;
+ return be16_to_cpu(lrh->dlid);
+}
+
+static inline void __lrh_set_dlid(void *arg, u16 dlid)
+{
+ struct rxe_lrh *lrh = arg;
+ lrh->dlid = cpu_to_be16(dlid);
+}
+
+static inline u16 __lrh_length(void *arg)
+{
+ struct rxe_lrh *lrh = arg;
+ return be16_to_cpu(lrh->length) & LRH_LENGTH_MASK;
+}
+
+static inline void __lrh_set_length(void *arg, u16 length)
+{
+ struct rxe_lrh *lrh = arg;
+ u16 was = be16_to_cpu(lrh->length);
+ lrh->length = cpu_to_be16((length & LRH_LENGTH_MASK)
+ | (~LRH_LENGTH_MASK & was));
+}
+
+static inline u16 __lrh_slid(void *arg)
+{
+ struct rxe_lrh *lrh = arg;
+ return be16_to_cpu(lrh->slid);
+}
+
+static inline void __lrh_set_slid(void *arg, u16 slid)
+{
+ struct rxe_lrh *lrh = arg;
+ lrh->slid = cpu_to_be16(slid);
+}
+
+static inline void lrh_init(struct rxe_pkt_info *pkt, u8 vl, u8 sl,
+ u8 lnh, u16 length, u16 dlid, u16 slid)
+{
+ struct rxe_lrh *lrh = (struct rxe_lrh *)pkt->hdr;
+
+ lrh->vlver = (vl << 4);
+ lrh->slnh = (sl << 4) | (lnh & LRH_LNH_MASK);
+ lrh->length = cpu_to_be16(length & LRH_LENGTH_MASK);
+ lrh->dlid = cpu_to_be16(dlid);
+ lrh->slid = cpu_to_be16(slid);
+}
+
+/******************************************************************************
+ * Global Route Header
+ ******************************************************************************/
+struct rxe_grh {
+ __be32 vflow; /* ipver, tclass and flow */
+ __be16 paylen;
+ u8 next_hdr;
+ u8 hop_limit;
+ union ib_gid sgid;
+ union ib_gid dgid;
+};
+
+#define GRH_IPV6 (6)
+#define GRH_RXE_NEXT_HDR (0x1b)
+
+#define GRH_IPVER_MASK (0xf0000000)
+#define GRH_TCLASS_MASK (0x0ff00000)
+#define GRH_FLOW_MASK (0x000fffff)
+
+static inline u8 __grh_ipver(void *arg)
+{
+ struct rxe_grh *grh = arg;
+ return (GRH_IPVER_MASK & be32_to_cpu(grh->vflow)) >> 28;
+}
+
+static inline void __grh_set_ipver(void *arg, u8 ipver)
+{
+ struct rxe_grh *grh = arg;
+ u32 vflow = be32_to_cpu(grh->vflow);
+ grh->vflow = cpu_to_be32((GRH_IPVER_MASK & (ipver << 28))
+ | (~GRH_IPVER_MASK & vflow));
+}
+
+static inline u8 __grh_tclass(void *arg)
+{
+ struct rxe_grh *grh = arg;
+ return (GRH_TCLASS_MASK & be32_to_cpu(grh->vflow)) >> 20;
+}
+
+static inline void __grh_set_tclass(void *arg, u8 tclass)
+{
+ struct rxe_grh *grh = arg;
+ u32 vflow = be32_to_cpu(grh->vflow);
+ grh->vflow = cpu_to_be32((GRH_TCLASS_MASK & (tclass << 20))
+ | (~GRH_TCLASS_MASK & vflow));
+}
+
+static inline u32 __grh_flow(void *arg)
+{
+ struct rxe_grh *grh = arg;
+ return GRH_FLOW_MASK & be32_to_cpu(grh->vflow);
+}
+
+static inline void __grh_set_flow(void *arg, u32 flow)
+{
+ struct rxe_grh *grh = arg;
+ u32 vflow = be32_to_cpu(grh->vflow);
+ grh->vflow = cpu_to_be32((GRH_FLOW_MASK & flow)
+ | (~GRH_FLOW_MASK & vflow));
+}
+
+static inline void __grh_set_vflow(void *arg, u8 tclass, u32 flow)
+{
+ struct rxe_grh *grh = arg;
+ grh->vflow = cpu_to_be32((GRH_IPV6 << 28) | (tclass << 20)
+ | (flow & GRH_FLOW_MASK));
+}
+
+static inline u16 __grh_paylen(void *arg)
+{
+ struct rxe_grh *grh = arg;
+ return be16_to_cpu(grh->paylen);
+}
+
+static inline void __grh_set_paylen(void *arg, u16 paylen)
+{
+ struct rxe_grh *grh = arg;
+ grh->paylen = cpu_to_be16(paylen);
+}
+
+static inline u8 __grh_next_hdr(void *arg)
+{
+ struct rxe_grh *grh = arg;
+ return grh->next_hdr;
+}
+
+static inline void __grh_set_next_hdr(void *arg, u8 next_hdr)
+{
+ struct rxe_grh *grh = arg;
+ grh->next_hdr = next_hdr;
+}
+
+static inline u8 __grh_hop_limit(void *arg)
+{
+ struct rxe_grh *grh = arg;
+ return grh->hop_limit;
+}
+
+static inline void __grh_set_hop_limit(void *arg, u8 hop_limit)
+{
+ struct rxe_grh *grh = arg;
+ grh->hop_limit = hop_limit;
+}
+
+static inline union ib_gid *__grh_sgid(void *arg)
+{
+ struct rxe_grh *grh = arg;
+ return &grh->sgid;
+}
+
+static inline void __grh_set_sgid(void *arg, __be64 prefix, __be64 guid)
+{
+ struct rxe_grh *grh = arg;
+ grh->sgid.global.subnet_prefix = prefix;
+ grh->sgid.global.interface_id = guid;
+}
+
+static inline union ib_gid *__grh_dgid(void *arg)
+{
+ struct rxe_grh *grh = arg;
+ return &grh->dgid;
+}
+
+static inline void __grh_set_dgid(void *arg, __be64 prefix, __be64 guid)
+{
+ struct rxe_grh *grh = arg;
+ grh->dgid.global.subnet_prefix = prefix;
+ grh->dgid.global.interface_id = guid;
+}
+
+static inline u8 grh_ipver(struct rxe_pkt_info *pkt)
+{
+ return __grh_ipver(pkt->hdr);
+}
+
+static inline void grh_set_ipver(struct rxe_pkt_info *pkt, u8 ipver)
+{
+ __grh_set_ipver(pkt->hdr, ipver);
+ return;
+}
+
+static inline u8 grh_tclass(struct rxe_pkt_info *pkt)
+{
+ return __grh_tclass(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh));
+}
+
+static inline void grh_set_tclass(struct rxe_pkt_info *pkt, u8 tclass)
+{
+ __grh_set_tclass(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh), tclass);
+ return;
+}
+
+static inline u32 grh_flow(struct rxe_pkt_info *pkt)
+{
+ return __grh_flow(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh));
+}
+
+static inline void grh_set_flow(struct rxe_pkt_info *pkt, u32 flow)
+{
+ __grh_set_flow(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh), flow);
+ return;
+}
+
+static inline u16 grh_paylen(struct rxe_pkt_info *pkt)
+{
+ return __grh_paylen(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh));
+}
+
+static inline void grh_set_paylen(struct rxe_pkt_info *pkt, u16 paylen)
+{
+ __grh_set_paylen(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh), paylen);
+ return;
+}
+
+static inline u8 grh_next_hdr(struct rxe_pkt_info *pkt)
+{
+ return __grh_next_hdr(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh));
+}
+
+static inline void grh_set_next_hdr(struct rxe_pkt_info *pkt, u8 next_hdr)
+{
+ __grh_set_next_hdr(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh), next_hdr);
+ return;
+}
+
+static inline u8 grh_hop_limit(struct rxe_pkt_info *pkt)
+{
+ return __grh_hop_limit(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh));
+}
+
+static inline void grh_set_hop_limit(struct rxe_pkt_info *pkt, u8 hop_limit)
+{
+ __grh_set_hop_limit(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh), hop_limit);
+ return;
+}
+
+static inline union ib_gid *grh_sgid(struct rxe_pkt_info *pkt)
+{
+ return __grh_sgid(pkt->hdr + pkt->offset - sizeof(struct rxe_grh));
+}
+
+static inline void grh_set_sgid(struct rxe_pkt_info *pkt,
+ __be64 prefix, __be64 guid)
+{
+ __grh_set_sgid(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh), prefix, guid);
+ return;
+}
+
+static inline union ib_gid *grh_dgid(struct rxe_pkt_info *pkt)
+{
+ return __grh_dgid(pkt->hdr + pkt->offset - sizeof(struct rxe_grh));
+}
+
+static inline void grh_set_dgid(struct rxe_pkt_info *pkt,
+ __be64 prefix, __be64 guid)
+{
+ __grh_set_dgid(pkt->hdr + pkt->offset
+ - sizeof(struct rxe_grh), prefix, guid);
+ return;
+}
+
+static inline void grh_init(struct rxe_pkt_info *pkt, u8 tclass, u32 flow,
+ u16 paylen, u8 hop_limit)
+{
+ struct rxe_grh *grh = (struct rxe_grh *)(pkt->hdr
+ + pkt->offset - sizeof(struct rxe_grh));
+
+ grh->vflow = cpu_to_be32((GRH_IPV6 << 28) | (tclass << 20)
+ | (flow & GRH_FLOW_MASK));
+ grh->paylen = cpu_to_be16(paylen);
+ grh->next_hdr = GRH_RXE_NEXT_HDR;
+ grh->hop_limit = hop_limit;
+}
+
+/******************************************************************************
+ * Base Transport Header
+ ******************************************************************************/
+struct rxe_bth {
+ u8 opcode;
+ u8 flags;
+ __be16 pkey;
+ __be32 qpn;
+ __be32 apsn;
+};
+
+#define BTH_TVER (0)
+#define BTH_DEF_PKEY (0xffff)
+
+#define BTH_SE_MASK (0x80)
+#define BTH_MIG_MASK (0x40)
+#define BTH_PAD_MASK (0x30)
+#define BTH_TVER_MASK (0x0f)
+#define BTH_FECN_MASK (0x80000000)
+#define BTH_BECN_MASK (0x40000000)
+#define BTH_RESV6A_MASK (0x3f000000)
+#define BTH_QPN_MASK (0x00ffffff)
+#define BTH_ACK_MASK (0x80000000)
+#define BTH_RESV7_MASK (0x7f000000)
+#define BTH_PSN_MASK (0x00ffffff)
+
+static inline u8 __bth_opcode(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return bth->opcode;
+}
+
+static inline void __bth_set_opcode(void *arg, u8 opcode)
+{
+ struct rxe_bth *bth = arg;
+ bth->opcode = opcode;
+}
+
+static inline u8 __bth_se(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return 0 != (BTH_SE_MASK & bth->flags);
+}
+
+static inline void __bth_set_se(void *arg, int se)
+{
+ struct rxe_bth *bth = arg;
+ if (se)
+ bth->flags |= BTH_SE_MASK;
+ else
+ bth->flags &= ~BTH_SE_MASK;
+}
+
+static inline u8 __bth_mig(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return 0 != (BTH_MIG_MASK & bth->flags);
+}
+
+static inline void __bth_set_mig(void *arg, u8 mig)
+{
+ struct rxe_bth *bth = arg;
+ if (mig)
+ bth->flags |= BTH_MIG_MASK;
+ else
+ bth->flags &= ~BTH_MIG_MASK;
+}
+
+static inline u8 __bth_pad(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return (BTH_PAD_MASK & bth->flags) >> 4;
+}
+
+static inline void __bth_set_pad(void *arg, u8 pad)
+{
+ struct rxe_bth *bth = arg;
+ bth->flags = (BTH_PAD_MASK & (pad << 4)) |
+ (~BTH_PAD_MASK & bth->flags);
+}
+
+static inline u8 __bth_tver(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return BTH_TVER_MASK & bth->flags;
+}
+
+static inline void __bth_set_tver(void *arg, u8 tver)
+{
+ struct rxe_bth *bth = arg;
+ bth->flags = (BTH_TVER_MASK & tver) |
+ (~BTH_TVER_MASK & bth->flags);
+}
+
+static inline u16 __bth_pkey(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return be16_to_cpu(bth->pkey);
+}
+
+static inline void __bth_set_pkey(void *arg, u16 pkey)
+{
+ struct rxe_bth *bth = arg;
+ bth->pkey = cpu_to_be16(pkey);
+}
+
+static inline u32 __bth_qpn(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return BTH_QPN_MASK & be32_to_cpu(bth->qpn);
+}
+
+static inline void __bth_set_qpn(void *arg, u32 qpn)
+{
+ struct rxe_bth *bth = arg;
+ u32 resvqpn = be32_to_cpu(bth->qpn);
+ bth->qpn = cpu_to_be32((BTH_QPN_MASK & qpn) |
+ (~BTH_QPN_MASK & resvqpn));
+}
+
+static inline int __bth_fecn(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return 0 != (__constant_cpu_to_be32(BTH_FECN_MASK) & bth->qpn);
+}
+
+static inline void __bth_set_fecn(void *arg, int fecn)
+{
+ struct rxe_bth *bth = arg;
+ if (fecn)
+ bth->qpn |= __constant_cpu_to_be32(BTH_FECN_MASK);
+ else
+ bth->qpn &= ~__constant_cpu_to_be32(BTH_FECN_MASK);
+}
+
+static inline int __bth_becn(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return 0 != (__constant_cpu_to_be32(BTH_BECN_MASK) & bth->qpn);
+}
+
+static inline void __bth_set_becn(void *arg, int becn)
+{
+ struct rxe_bth *bth = arg;
+ if (becn)
+ bth->qpn |= __constant_cpu_to_be32(BTH_BECN_MASK);
+ else
+ bth->qpn &= ~__constant_cpu_to_be32(BTH_BECN_MASK);
+}
+
+static inline u8 __bth_resv6a(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return (BTH_RESV6A_MASK & be32_to_cpu(bth->qpn)) >> 24;
+}
+
+static inline void __bth_set_resv6a(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ bth->qpn &= ~__constant_cpu_to_be32(BTH_RESV6A_MASK);
+}
+
+static inline int __bth_ack(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return 0 != (__constant_cpu_to_be32(BTH_ACK_MASK) & bth->apsn);
+}
+
+static inline void __bth_set_ack(void *arg, int ack)
+{
+ struct rxe_bth *bth = arg;
+ if (ack)
+ bth->apsn |= __constant_cpu_to_be32(BTH_ACK_MASK);
+ else
+ bth->apsn &= ~__constant_cpu_to_be32(BTH_ACK_MASK);
+}
+
+static inline void __bth_set_resv7(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ bth->apsn &= ~__constant_cpu_to_be32(BTH_RESV7_MASK);
+}
+
+static inline u32 __bth_psn(void *arg)
+{
+ struct rxe_bth *bth = arg;
+ return BTH_PSN_MASK & be32_to_cpu(bth->apsn);
+}
+
+static inline void __bth_set_psn(void *arg, u32 psn)
+{
+ struct rxe_bth *bth = arg;
+ u32 apsn = be32_to_cpu(bth->apsn);
+ bth->apsn = cpu_to_be32((BTH_PSN_MASK & psn) |
+ (~BTH_PSN_MASK & apsn));
+}
+
+static inline u8 bth_opcode(struct rxe_pkt_info *pkt)
+{
+ return __bth_opcode(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_opcode(struct rxe_pkt_info *pkt, u8 opcode)
+{
+ __bth_set_opcode(pkt->hdr + pkt->offset, opcode);
+ return;
+}
+
+static inline u8 bth_se(struct rxe_pkt_info *pkt)
+{
+ return __bth_se(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_se(struct rxe_pkt_info *pkt, int se)
+{
+ __bth_set_se(pkt->hdr + pkt->offset, se);
+ return;
+}
+
+static inline u8 bth_mig(struct rxe_pkt_info *pkt)
+{
+ return __bth_mig(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_mig(struct rxe_pkt_info *pkt, u8 mig)
+{
+ __bth_set_mig(pkt->hdr + pkt->offset, mig);
+ return;
+}
+
+static inline u8 bth_pad(struct rxe_pkt_info *pkt)
+{
+ return __bth_pad(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_pad(struct rxe_pkt_info *pkt, u8 pad)
+{
+ __bth_set_pad(pkt->hdr + pkt->offset, pad);
+ return;
+}
+
+static inline u8 bth_tver(struct rxe_pkt_info *pkt)
+{
+ return __bth_tver(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_tver(struct rxe_pkt_info *pkt, u8 tver)
+{
+ __bth_set_tver(pkt->hdr + pkt->offset, tver);
+ return;
+}
+
+static inline u16 bth_pkey(struct rxe_pkt_info *pkt)
+{
+ return __bth_pkey(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_pkey(struct rxe_pkt_info *pkt, u16 pkey)
+{
+ __bth_set_pkey(pkt->hdr + pkt->offset, pkey);
+ return;
+}
+
+static inline u32 bth_qpn(struct rxe_pkt_info *pkt)
+{
+ return __bth_qpn(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_qpn(struct rxe_pkt_info *pkt, u32 qpn)
+{
+ __bth_set_qpn(pkt->hdr + pkt->offset, qpn);
+ return;
+}
+
+static inline int bth_fecn(struct rxe_pkt_info *pkt)
+{
+ return __bth_fecn(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_fecn(struct rxe_pkt_info *pkt, int fecn)
+{
+ __bth_set_fecn(pkt->hdr + pkt->offset, fecn);
+ return;
+}
+
+static inline int bth_becn(struct rxe_pkt_info *pkt)
+{
+ return __bth_becn(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_becn(struct rxe_pkt_info *pkt, int becn)
+{
+ __bth_set_becn(pkt->hdr + pkt->offset, becn);
+ return;
+}
+
+static inline u8 bth_resv6a(struct rxe_pkt_info *pkt)
+{
+ return __bth_resv6a(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_resv6a(struct rxe_pkt_info *pkt)
+{
+ __bth_set_resv6a(pkt->hdr + pkt->offset);
+ return;
+}
+
+static inline int bth_ack(struct rxe_pkt_info *pkt)
+{
+ return __bth_ack(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_ack(struct rxe_pkt_info *pkt, int ack)
+{
+ __bth_set_ack(pkt->hdr + pkt->offset, ack);
+ return;
+}
+
+static inline void bth_set_resv7(struct rxe_pkt_info *pkt)
+{
+ __bth_set_resv7(pkt->hdr + pkt->offset);
+ return;
+}
+
+static inline u32 bth_psn(struct rxe_pkt_info *pkt)
+{
+ return __bth_psn(pkt->hdr + pkt->offset);
+}
+
+static inline void bth_set_psn(struct rxe_pkt_info *pkt, u32 psn)
+{
+ __bth_set_psn(pkt->hdr + pkt->offset, psn);
+ return;
+}
+
+static inline void bth_init(struct rxe_pkt_info *pkt, u8 opcode, int se,
+ int mig, int pad, u16 pkey, u32 qpn, int ack_req, u32 psn)
+{
+ struct rxe_bth *bth = (struct rxe_bth *)(pkt->hdr + pkt->offset);
+
+ bth->opcode = opcode;
+ bth->flags = (pad << 4) & BTH_PAD_MASK;
+ if (se)
+ bth->flags |= BTH_SE_MASK;
+ if (mig)
+ bth->flags |= BTH_MIG_MASK;
+ bth->pkey = cpu_to_be16(pkey);
+ bth->qpn = cpu_to_be32(qpn & BTH_QPN_MASK);
+ psn &= BTH_PSN_MASK;
+ if (ack_req)
+ psn |= BTH_ACK_MASK;
+ bth->apsn = cpu_to_be32(psn);
+}
+
+/******************************************************************************
+ * Reliable Datagram Extended Transport Header
+ ******************************************************************************/
+struct rxe_rdeth {
+ __be32 een;
+};
+
+#define RDETH_EEN_MASK (0x00ffffff)
+
+static inline u32 __rdeth_een(void *arg)
+{
+ struct rxe_rdeth *rdeth = arg;
+ return RDETH_EEN_MASK & be32_to_cpu(rdeth->een);
+}
+
+static inline void __rdeth_set_een(void *arg, u32 een)
+{
+ struct rxe_rdeth *rdeth = arg;
+ rdeth->een = cpu_to_be32(RDETH_EEN_MASK & een);
+}
+
+static inline u32 rdeth_een(struct rxe_pkt_info *pkt)
+{
+ return __rdeth_een(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_RDETH]);
+}
+
+static inline void rdeth_set_een(struct rxe_pkt_info *pkt, u32 een)
+{
+ __rdeth_set_een(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_RDETH], een);
+ return;
+}
+
+/******************************************************************************
+ * Datagram Extended Transport Header
+ ******************************************************************************/
+struct rxe_deth {
+ __be32 qkey;
+ __be32 sqp;
+};
+
+#define GSI_QKEY (0x80010000)
+#define DETH_SQP_MASK (0x00ffffff)
+
+static inline u32 __deth_qkey(void *arg)
+{
+ struct rxe_deth *deth = arg;
+ return be32_to_cpu(deth->qkey);
+}
+
+static inline void __deth_set_qkey(void *arg, u32 qkey)
+{
+ struct rxe_deth *deth = arg;
+ deth->qkey = cpu_to_be32(qkey);
+}
+
+static inline u32 __deth_sqp(void *arg)
+{
+ struct rxe_deth *deth = arg;
+ return DETH_SQP_MASK & be32_to_cpu(deth->sqp);
+}
+
+static inline void __deth_set_sqp(void *arg, u32 sqp)
+{
+ struct rxe_deth *deth = arg;
+ deth->sqp = cpu_to_be32(DETH_SQP_MASK & sqp);
+}
+
+static inline u32 deth_qkey(struct rxe_pkt_info *pkt)
+{
+ return __deth_qkey(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_DETH]);
+}
+
+static inline void deth_set_qkey(struct rxe_pkt_info *pkt, u32 qkey)
+{
+ __deth_set_qkey(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_DETH], qkey);
+ return;
+}
+
+static inline u32 deth_sqp(struct rxe_pkt_info *pkt)
+{
+ return __deth_sqp(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_DETH]);
+}
+
+static inline void deth_set_sqp(struct rxe_pkt_info *pkt, u32 sqp)
+{
+ __deth_set_sqp(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_DETH], sqp);
+ return;
+}
+
+/******************************************************************************
+ * RDMA Extended Transport Header
+ ******************************************************************************/
+struct rxe_reth {
+ __be64 va;
+ __be32 rkey;
+ __be32 len;
+};
+
+static inline u64 __reth_va(void *arg)
+{
+ struct rxe_reth *reth = arg;
+ return be64_to_cpu(reth->va);
+}
+
+static inline void __reth_set_va(void *arg, u64 va)
+{
+ struct rxe_reth *reth = arg;
+ reth->va = cpu_to_be64(va);
+}
+
+static inline u32 __reth_rkey(void *arg)
+{
+ struct rxe_reth *reth = arg;
+ return be32_to_cpu(reth->rkey);
+}
+
+static inline void __reth_set_rkey(void *arg, u32 rkey)
+{
+ struct rxe_reth *reth = arg;
+ reth->rkey = cpu_to_be32(rkey);
+}
+
+static inline u32 __reth_len(void *arg)
+{
+ struct rxe_reth *reth = arg;
+ return be32_to_cpu(reth->len);
+}
+
+static inline void __reth_set_len(void *arg, u32 len)
+{
+ struct rxe_reth *reth = arg;
+ reth->len = cpu_to_be32(len);
+}
+
+static inline u64 reth_va(struct rxe_pkt_info *pkt)
+{
+ return __reth_va(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_RETH]);
+}
+
+static inline void reth_set_va(struct rxe_pkt_info *pkt, u64 va)
+{
+ __reth_set_va(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_RETH], va);
+ return;
+}
+
+static inline u32 reth_rkey(struct rxe_pkt_info *pkt)
+{
+ return __reth_rkey(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_RETH]);
+}
+
+static inline void reth_set_rkey(struct rxe_pkt_info *pkt, u32 rkey)
+{
+ __reth_set_rkey(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_RETH], rkey);
+ return;
+}
+
+static inline u32 reth_len(struct rxe_pkt_info *pkt)
+{
+ return __reth_len(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_RETH]);
+}
+
+static inline void reth_set_len(struct rxe_pkt_info *pkt, u32 len)
+{
+ __reth_set_len(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_RETH], len);
+ return;
+}
+
+/******************************************************************************
+ * Atomic Extended Transport Header
+ ******************************************************************************/
+struct rxe_atmeth {
+ __be64 va;
+ __be32 rkey;
+ __be64 swap_add;
+ __be64 comp;
+} __attribute__((__packed__));
+
+static inline u64 __atmeth_va(void *arg)
+{
+ struct rxe_atmeth *atmeth = arg;
+ return be64_to_cpu(atmeth->va);
+}
+
+static inline void __atmeth_set_va(void *arg, u64 va)
+{
+ struct rxe_atmeth *atmeth = arg;
+ atmeth->va = cpu_to_be64(va);
+}
+
+static inline u32 __atmeth_rkey(void *arg)
+{
+ struct rxe_atmeth *atmeth = arg;
+ return be32_to_cpu(atmeth->rkey);
+}
+
+static inline void __atmeth_set_rkey(void *arg, u32 rkey)
+{
+ struct rxe_atmeth *atmeth = arg;
+ atmeth->rkey = cpu_to_be32(rkey);
+}
+
+static inline u64 __atmeth_swap_add(void *arg)
+{
+ struct rxe_atmeth *atmeth = arg;
+ return be64_to_cpu(atmeth->swap_add);
+}
+
+static inline void __atmeth_set_swap_add(void *arg, u64 swap_add)
+{
+ struct rxe_atmeth *atmeth = arg;
+ atmeth->swap_add = cpu_to_be64(swap_add);
+}
+
+static inline u64 __atmeth_comp(void *arg)
+{
+ struct rxe_atmeth *atmeth = arg;
+ return be64_to_cpu(atmeth->comp);
+}
+
+static inline void __atmeth_set_comp(void *arg, u64 comp)
+{
+ struct rxe_atmeth *atmeth = arg;
+ atmeth->comp = cpu_to_be64(comp);
+}
+
+static inline u64 atmeth_va(struct rxe_pkt_info *pkt)
+{
+ return __atmeth_va(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMETH]);
+}
+
+static inline void atmeth_set_va(struct rxe_pkt_info *pkt, u64 va)
+{
+ __atmeth_set_va(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMETH], va);
+ return;
+}
+
+static inline u32 atmeth_rkey(struct rxe_pkt_info *pkt)
+{
+ return __atmeth_rkey(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMETH]);
+}
+
+static inline void atmeth_set_rkey(struct rxe_pkt_info *pkt, u32 rkey)
+{
+ __atmeth_set_rkey(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMETH], rkey);
+ return;
+}
+
+static inline u64 atmeth_swap_add(struct rxe_pkt_info *pkt)
+{
+ return __atmeth_swap_add(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMETH]);
+}
+
+static inline void atmeth_set_swap_add(struct rxe_pkt_info *pkt, u64 swap_add)
+{
+ __atmeth_set_swap_add(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMETH], swap_add);
+ return;
+}
+
+static inline u64 atmeth_comp(struct rxe_pkt_info *pkt)
+{
+ return __atmeth_comp(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMETH]);
+}
+
+static inline void atmeth_set_comp(struct rxe_pkt_info *pkt, u64 comp)
+{
+ __atmeth_set_comp(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMETH], comp);
+ return;
+}
+
+/******************************************************************************
+ * Ack Extended Transport Header
+ ******************************************************************************/
+struct rxe_aeth {
+ __be32 smsn;
+};
+
+#define AETH_SYN_MASK (0xff000000)
+#define AETH_MSN_MASK (0x00ffffff)
+
+enum aeth_syndrome {
+ AETH_TYPE_MASK = 0xe0,
+ AETH_ACK = 0x00,
+ AETH_RNR_NAK = 0x20,
+ AETH_RSVD = 0x40,
+ AETH_NAK = 0x60,
+ AETH_ACK_UNLIMITED = 0x1f,
+ AETH_NAK_PSN_SEQ_ERROR = 0x60,
+ AETH_NAK_INVALID_REQ = 0x61,
+ AETH_NAK_REM_ACC_ERR = 0x62,
+ AETH_NAK_REM_OP_ERR = 0x63,
+ AETH_NAK_INV_RD_REQ = 0x64,
+};
+
+static inline u8 __aeth_syn(void *arg)
+{
+ struct rxe_aeth *aeth = arg;
+ return (AETH_SYN_MASK & be32_to_cpu(aeth->smsn)) >> 24;
+}
+
+static inline void __aeth_set_syn(void *arg, u8 syn)
+{
+ struct rxe_aeth *aeth = arg;
+ u32 smsn = be32_to_cpu(aeth->smsn);
+ aeth->smsn = cpu_to_be32((AETH_SYN_MASK & (syn << 24)) |
+ (~AETH_SYN_MASK & smsn));
+}
+
+static inline u32 __aeth_msn(void *arg)
+{
+ struct rxe_aeth *aeth = arg;
+ return AETH_MSN_MASK & be32_to_cpu(aeth->smsn);
+}
+
+static inline void __aeth_set_msn(void *arg, u32 msn)
+{
+ struct rxe_aeth *aeth = arg;
+ u32 smsn = be32_to_cpu(aeth->smsn);
+ aeth->smsn = cpu_to_be32((AETH_MSN_MASK & msn) |
+ (~AETH_MSN_MASK & smsn));
+}
+
+static inline u8 aeth_syn(struct rxe_pkt_info *pkt)
+{
+ return __aeth_syn(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_AETH]);
+}
+
+static inline void aeth_set_syn(struct rxe_pkt_info *pkt, u8 syn)
+{
+ __aeth_set_syn(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_AETH], syn);
+ return;
+}
+
+static inline u32 aeth_msn(struct rxe_pkt_info *pkt)
+{
+ return __aeth_msn(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_AETH]);
+}
+
+static inline void aeth_set_msn(struct rxe_pkt_info *pkt, u32 msn)
+{
+ __aeth_set_msn(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_AETH], msn);
+ return;
+}
+
+/******************************************************************************
+ * Atomic Ack Extended Transport Header
+ ******************************************************************************/
+struct rxe_atmack {
+ __be64 orig;
+};
+
+static inline u64 __atmack_orig(void *arg)
+{
+ struct rxe_atmack *atmack = arg;
+ return be64_to_cpu(atmack->orig);
+}
+
+static inline void __atmack_set_orig(void *arg, u64 orig)
+{
+ struct rxe_atmack *atmack = arg;
+ atmack->orig = cpu_to_be64(orig);
+}
+
+static inline u64 atmack_orig(struct rxe_pkt_info *pkt)
+{
+ return __atmack_orig(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMACK]);
+}
+
+static inline void atmack_set_orig(struct rxe_pkt_info *pkt, u64 orig)
+{
+ __atmack_set_orig(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_ATMACK], orig);
+ return;
+}
+
+/******************************************************************************
+ * Immediate Extended Transport Header
+ ******************************************************************************/
+struct rxe_immdt {
+ __be32 imm;
+};
+
+static inline __be32 __immdt_imm(void *arg)
+{
+ struct rxe_immdt *immdt = arg;
+ return immdt->imm;
+}
+
+static inline void __immdt_set_imm(void *arg, __be32 imm)
+{
+ struct rxe_immdt *immdt = arg;
+ immdt->imm = imm;
+}
+
+static inline __be32 immdt_imm(struct rxe_pkt_info *pkt)
+{
+ return __immdt_imm(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_IMMDT]);
+}
+
+static inline void immdt_set_imm(struct rxe_pkt_info *pkt, __be32 imm)
+{
+ __immdt_set_imm(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_IMMDT], imm);
+ return;
+}
+
+/******************************************************************************
+ * Invalid Extended Transport Header
+ ******************************************************************************/
+struct rxe_ieth {
+ __be32 rkey;
+};
+
+static inline u32 __ieth_rkey(void *arg)
+{
+ struct rxe_ieth *ieth = arg;
+ return be32_to_cpu(ieth->rkey);
+}
+
+static inline void __ieth_set_rkey(void *arg, u32 rkey)
+{
+ struct rxe_ieth *ieth = arg;
+ ieth->rkey = cpu_to_be32(rkey);
+}
+
+static inline u32 ieth_rkey(struct rxe_pkt_info *pkt)
+{
+ return __ieth_rkey(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_IETH]);
+}
+
+static inline void ieth_set_rkey(struct rxe_pkt_info *pkt, u32 rkey)
+{
+ __ieth_set_rkey(pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_IETH], rkey);
+ return;
+}
+
+enum rxe_hdr_length {
+ RXE_LRH_BYTES = sizeof(struct rxe_lrh),
+ RXE_GRH_BYTES = sizeof(struct rxe_grh),
+ RXE_BTH_BYTES = sizeof(struct rxe_bth),
+ RXE_DETH_BYTES = sizeof(struct rxe_deth),
+ RXE_IMMDT_BYTES = sizeof(struct rxe_immdt),
+ RXE_RETH_BYTES = sizeof(struct rxe_reth),
+ RXE_AETH_BYTES = sizeof(struct rxe_aeth),
+ RXE_ATMACK_BYTES = sizeof(struct rxe_atmack),
+ RXE_ATMETH_BYTES = sizeof(struct rxe_atmeth),
+ RXE_IETH_BYTES = sizeof(struct rxe_ieth),
+ RXE_RDETH_BYTES = sizeof(struct rxe_rdeth),
+};
+
+static inline size_t header_size(struct rxe_pkt_info *pkt)
+{
+ return pkt->offset + rxe_opcode[pkt->opcode].length;
+}
+
+static inline void *payload_addr(struct rxe_pkt_info *pkt)
+{
+ return pkt->hdr + pkt->offset
+ + rxe_opcode[pkt->opcode].offset[RXE_PAYLOAD];
+}
+
+static inline size_t payload_size(struct rxe_pkt_info *pkt)
+{
+ return pkt->paylen - rxe_opcode[pkt->opcode].offset[RXE_PAYLOAD]
+ - bth_pad(pkt) - RXE_ICRC_SIZE;
+}
+
+#endif /* RXE_HDR_H */
* [patch 03/44] rxe_opcode.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 01/44] ib_pack.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 02/44] rxe_hdr.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 04/44] rxe_opcode.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (41 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch3 --]
[-- Type: text/plain, Size: 4726 bytes --]
Add declarations for the data structures used to hold the per-opcode
and per-work-request opcode tables.
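As a hypothetical consumer sketch (not part of this patch; the function name
is made up), a packet handler indexes rxe_opcode[] with the 8-bit BTH opcode
and tests the mask bits, e.g.:

static void rxe_dump_opcode_example(u8 opcode)
{
	struct rxe_opcode_info *info = &rxe_opcode[opcode];

	pr_info("%s: header length %d\n", info->name, info->length);

	/* the low bits of the mask say which headers are present;
	 * the offset[] array gives their position relative to the BTH */
	if (info->mask & RXE_IMMDT_MASK)
		pr_info("immediate data header at offset %d\n",
			info->offset[RXE_IMMDT]);

	/* the higher bits (RXE_REQ_MASK, RXE_ACK_MASK, ...) classify
	 * the packet for the requester/responder state machines */
	if (info->mask & RXE_ACK_MASK)
		pr_info("responder acknowledge packet\n");
}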
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_opcode.h | 130 +++++++++++++++++++++++++++++++++
1 file changed, 130 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_opcode.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_opcode.h
@@ -0,0 +1,130 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_OPCODE_H
+#define RXE_OPCODE_H
+
+/*
+ * contains header bit mask definitions and header lengths, plus
+ * declarations of the rxe_opcode_info struct and the
+ * rxe_wr_opcode_info struct
+ */
+
+enum rxe_wr_mask {
+ WR_INLINE_MASK = (1 << 0),
+ WR_ATOMIC_MASK = (1 << 1),
+ WR_SEND_MASK = (1 << 2),
+ WR_READ_MASK = (1 << 3),
+ WR_WRITE_MASK = (1 << 4),
+ WR_LOCAL_MASK = (1 << 5),
+
+ WR_READ_OR_WRITE_MASK = WR_READ_MASK | WR_WRITE_MASK,
+ WR_READ_WRITE_OR_SEND_MASK = WR_READ_OR_WRITE_MASK | WR_SEND_MASK,
+ WR_WRITE_OR_SEND_MASK = WR_WRITE_MASK | WR_SEND_MASK,
+ WR_ATOMIC_OR_READ_MASK = WR_ATOMIC_MASK | WR_READ_MASK,
+};
+
+#define WR_MAX_QPT (8)
+
+struct rxe_wr_opcode_info {
+ char *name;
+ enum rxe_wr_mask mask[WR_MAX_QPT];
+};
+
+extern struct rxe_wr_opcode_info rxe_wr_opcode_info[];
+
+enum rxe_hdr_type {
+ RXE_LRH,
+ RXE_GRH,
+ RXE_BTH,
+ RXE_RETH,
+ RXE_AETH,
+ RXE_ATMETH,
+ RXE_ATMACK,
+ RXE_IETH,
+ RXE_RDETH,
+ RXE_DETH,
+ RXE_IMMDT,
+ RXE_PAYLOAD,
+ NUM_HDR_TYPES
+};
+
+enum rxe_hdr_mask {
+ RXE_LRH_MASK = (1 << RXE_LRH),
+ RXE_GRH_MASK = (1 << RXE_GRH),
+ RXE_BTH_MASK = (1 << RXE_BTH),
+ RXE_IMMDT_MASK = (1 << RXE_IMMDT),
+ RXE_RETH_MASK = (1 << RXE_RETH),
+ RXE_AETH_MASK = (1 << RXE_AETH),
+ RXE_ATMETH_MASK = (1 << RXE_ATMETH),
+ RXE_ATMACK_MASK = (1 << RXE_ATMACK),
+ RXE_IETH_MASK = (1 << RXE_IETH),
+ RXE_RDETH_MASK = (1 << RXE_RDETH),
+ RXE_DETH_MASK = (1 << RXE_DETH),
+ RXE_PAYLOAD_MASK = (1 << RXE_PAYLOAD),
+
+ RXE_REQ_MASK = (1 << (NUM_HDR_TYPES+0)),
+ RXE_ACK_MASK = (1 << (NUM_HDR_TYPES+1)),
+ RXE_SEND_MASK = (1 << (NUM_HDR_TYPES+2)),
+ RXE_WRITE_MASK = (1 << (NUM_HDR_TYPES+3)),
+ RXE_READ_MASK = (1 << (NUM_HDR_TYPES+4)),
+ RXE_ATOMIC_MASK = (1 << (NUM_HDR_TYPES+5)),
+
+ RXE_RWR_MASK = (1 << (NUM_HDR_TYPES+6)),
+ RXE_COMP_MASK = (1 << (NUM_HDR_TYPES+7)),
+
+ RXE_START_MASK = (1 << (NUM_HDR_TYPES+8)),
+ RXE_MIDDLE_MASK = (1 << (NUM_HDR_TYPES+9)),
+ RXE_END_MASK = (1 << (NUM_HDR_TYPES+10)),
+
+ RXE_CNP_MASK = (1 << (NUM_HDR_TYPES+11)),
+
+ RXE_LOOPBACK_MASK = (1 << (NUM_HDR_TYPES+12)),
+
+ RXE_READ_OR_ATOMIC = (RXE_READ_MASK | RXE_ATOMIC_MASK),
+ RXE_WRITE_OR_SEND = (RXE_WRITE_MASK | RXE_SEND_MASK),
+};
+
+#define OPCODE_NONE (-1)
+#define RXE_NUM_OPCODE 256
+
+struct rxe_opcode_info {
+ char *name;
+ enum rxe_hdr_mask mask;
+ int length;
+ int offset[NUM_HDR_TYPES];
+};
+
+extern struct rxe_opcode_info rxe_opcode[RXE_NUM_OPCODE];
+
+#endif /* RXE_OPCODE_H */
* [patch 04/44] rxe_opcode.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (2 preceding siblings ...)
2011-07-01 13:18 ` [patch 03/44] rxe_opcode.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 05/44] rxe_param.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (40 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch4 --]
[-- Type: text/plain, Size: 30279 bytes --]
Add the data structures used to hold the per-opcode
and per-work-request opcode tables.
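As a worked example (header sizes taken from the structs in rxe_hdr.h), the
entry added below for IB_OPCODE_RC_SEND_ONLY_WITH_IMMEDIATE resolves to:

	/* struct rxe_bth is 1 + 1 + 2 + 4 + 4 = 12 bytes and
	 * struct rxe_immdt is 4 bytes, so for this opcode:
	 *
	 *   .length              = RXE_BTH_BYTES + RXE_IMMDT_BYTES = 16
	 *   .offset[RXE_BTH]     = 0
	 *   .offset[RXE_IMMDT]   = 12
	 *   .offset[RXE_PAYLOAD] = 16
	 *
	 * header_size() and payload_addr() in rxe_hdr.h then add
	 * pkt->offset (the BTH offset within the buffer) to these values.
	 */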
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_opcode.c | 982 +++++++++++++++++++++++++++++++++
1 file changed, 982 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_opcode.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_opcode.c
@@ -0,0 +1,982 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <rdma/ib_pack.h>
+#include "rxe_opcode.h"
+#include "rxe_hdr.h"
+
+/* useful information about work request opcodes and pkt opcodes
+ in table form */
+
+struct rxe_wr_opcode_info rxe_wr_opcode_info[] = {
+ [IB_WR_RDMA_WRITE] = {
+ .name = "IB_WR_RDMA_WRITE",
+ .mask = {
+ [IB_QPT_RC] = WR_INLINE_MASK | WR_WRITE_MASK,
+ [IB_QPT_UC] = WR_INLINE_MASK | WR_WRITE_MASK,
+ },
+ },
+ [IB_WR_RDMA_WRITE_WITH_IMM] = {
+ .name = "IB_WR_RDMA_WRITE_WITH_IMM",
+ .mask = {
+ [IB_QPT_RC] = WR_INLINE_MASK | WR_WRITE_MASK,
+ [IB_QPT_UC] = WR_INLINE_MASK | WR_WRITE_MASK,
+ },
+ },
+ [IB_WR_SEND] = {
+ .name = "IB_WR_SEND",
+ .mask = {
+ [IB_QPT_SMI] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_GSI] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_RC] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_UC] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_UD] = WR_INLINE_MASK | WR_SEND_MASK,
+ },
+ },
+ [IB_WR_SEND_WITH_IMM] = {
+ .name = "IB_WR_SEND_WITH_IMM",
+ .mask = {
+ [IB_QPT_SMI] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_GSI] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_RC] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_UC] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_UD] = WR_INLINE_MASK | WR_SEND_MASK,
+ },
+ },
+ [IB_WR_RDMA_READ] = {
+ .name = "IB_WR_RDMA_READ",
+ .mask = {
+ [IB_QPT_RC] = WR_READ_MASK,
+ },
+ },
+ [IB_WR_ATOMIC_CMP_AND_SWP] = {
+ .name = "IB_WR_ATOMIC_CMP_AND_SWP",
+ .mask = {
+ [IB_QPT_RC] = WR_ATOMIC_MASK,
+ },
+ },
+ [IB_WR_ATOMIC_FETCH_AND_ADD] = {
+ .name = "IB_WR_ATOMIC_FETCH_AND_ADD",
+ .mask = {
+ [IB_QPT_RC] = WR_ATOMIC_MASK,
+ },
+ },
+ [IB_WR_LSO] = {
+ .name = "IB_WR_LSO",
+ .mask = {
+ /* not supported */
+ },
+ },
+ [IB_WR_SEND_WITH_INV] = {
+ .name = "IB_WR_SEND_WITH_INV",
+ .mask = {
+ [IB_QPT_RC] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_UC] = WR_INLINE_MASK | WR_SEND_MASK,
+ [IB_QPT_UD] = WR_INLINE_MASK | WR_SEND_MASK,
+ },
+ },
+ [IB_WR_RDMA_READ_WITH_INV] = {
+ .name = "IB_WR_RDMA_READ_WITH_INV",
+ .mask = {
+ [IB_QPT_RC] = WR_READ_MASK,
+ },
+ },
+ [IB_WR_LOCAL_INV] = {
+ .name = "IB_WR_LOCAL_INV",
+ .mask = {
+ /* not supported */
+ },
+ },
+ [IB_WR_FAST_REG_MR] = {
+ .name = "IB_WR_FAST_REG_MR",
+ .mask = {
+ /* not supported */
+ },
+ },
+};
+
+struct rxe_opcode_info rxe_opcode[RXE_NUM_OPCODE] = {
+ [IB_OPCODE_RC_SEND_FIRST] = {
+ .name = "IB_OPCODE_RC_SEND_FIRST",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_RWR_MASK
+ | RXE_SEND_MASK | RXE_START_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_SEND_MIDDLE] = {
+ .name = "IB_OPCODE_RC_SEND_MIDDLE]",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_SEND_MASK
+ | RXE_MIDDLE_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_SEND_LAST] = {
+ .name = "IB_OPCODE_RC_SEND_LAST",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_COMP_MASK
+ | RXE_SEND_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_SEND_LAST_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_RC_SEND_LAST_WITH_IMMEDIATE",
+ .mask = RXE_IMMDT_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_SEND_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_IMMDT] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_SEND_ONLY] = {
+ .name = "IB_OPCODE_RC_SEND_ONLY",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_COMP_MASK
+ | RXE_RWR_MASK | RXE_SEND_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_SEND_ONLY_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_RC_SEND_ONLY_WITH_IMMEDIATE",
+ .mask = RXE_IMMDT_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_RWR_MASK | RXE_SEND_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_IMMDT] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_WRITE_FIRST] = {
+ .name = "IB_OPCODE_RC_RDMA_WRITE_FIRST",
+ .mask = RXE_RETH_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_START_MASK,
+ .length = RXE_BTH_BYTES + RXE_RETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_WRITE_MIDDLE] = {
+ .name = "IB_OPCODE_RC_RDMA_WRITE_MIDDLE",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_MIDDLE_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_WRITE_LAST] = {
+ .name = "IB_OPCODE_RC_RDMA_WRITE_LAST",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_WRITE_LAST_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_RC_RDMA_WRITE_LAST_WITH_IMMEDIATE",
+ .mask = RXE_IMMDT_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_COMP_MASK | RXE_RWR_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_IMMDT] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_WRITE_ONLY] = {
+ .name = "IB_OPCODE_RC_RDMA_WRITE_ONLY",
+ .mask = RXE_RETH_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_START_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_RETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_WRITE_ONLY_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_RC_RDMA_WRITE_ONLY_WITH_IMMEDIATE",
+ .mask = RXE_RETH_MASK | RXE_IMMDT_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_COMP_MASK | RXE_RWR_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES + RXE_RETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RETH] = RXE_BTH_BYTES,
+ [RXE_IMMDT] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_READ_REQUEST] = {
+ .name = "IB_OPCODE_RC_RDMA_READ_REQUEST",
+ .mask = RXE_RETH_MASK | RXE_REQ_MASK | RXE_READ_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_RETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST] = {
+ .name = "IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST",
+ .mask = RXE_AETH_MASK | RXE_PAYLOAD_MASK | RXE_ACK_MASK
+ | RXE_START_MASK,
+ .length = RXE_BTH_BYTES + RXE_AETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_AETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE] = {
+ .name = "IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE",
+ .mask = RXE_PAYLOAD_MASK | RXE_ACK_MASK | RXE_MIDDLE_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST] = {
+ .name = "IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST",
+ .mask = RXE_AETH_MASK | RXE_PAYLOAD_MASK | RXE_ACK_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_AETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_AETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY] = {
+ .name = "IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY",
+ .mask = RXE_AETH_MASK | RXE_PAYLOAD_MASK | RXE_ACK_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_AETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_AETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_ACKNOWLEDGE] = {
+ .name = "IB_OPCODE_RC_ACKNOWLEDGE",
+ .mask = RXE_AETH_MASK | RXE_ACK_MASK | RXE_START_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_AETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_AETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_ATOMIC_ACKNOWLEDGE] = {
+ .name = "IB_OPCODE_RC_ATOMIC_ACKNOWLEDGE",
+ .mask = RXE_AETH_MASK | RXE_ATMACK_MASK | RXE_ACK_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_ATMACK_BYTES + RXE_AETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_AETH] = RXE_BTH_BYTES,
+ [RXE_ATMACK] = RXE_BTH_BYTES
+ + RXE_AETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_ATMACK_BYTES + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_COMPARE_SWAP] = {
+ .name = "IB_OPCODE_RC_COMPARE_SWAP",
+ .mask = RXE_ATMETH_MASK | RXE_REQ_MASK | RXE_ATOMIC_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_ATMETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_ATMETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_ATMETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_FETCH_ADD] = {
+ .name = "IB_OPCODE_RC_FETCH_ADD",
+ .mask = RXE_ATMETH_MASK | RXE_REQ_MASK | RXE_ATOMIC_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_ATMETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_ATMETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_ATMETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_SEND_LAST_INV] = {
+ .name = "IB_OPCODE_RC_SEND_LAST_INV",
+ .mask = RXE_IETH_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_SEND_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_IETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_IETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RC_SEND_ONLY_INV] = {
+ .name = "IB_OPCODE_RC_SEND_ONLY_INV",
+ .mask = RXE_IETH_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_RWR_MASK | RXE_SEND_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_IETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_IETH_BYTES,
+ }
+ },
+
+ /* UC */
+ [IB_OPCODE_UC_SEND_FIRST] = {
+ .name = "IB_OPCODE_UC_SEND_FIRST",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_RWR_MASK
+ | RXE_SEND_MASK | RXE_START_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_SEND_MIDDLE] = {
+ .name = "IB_OPCODE_UC_SEND_MIDDLE",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_SEND_MASK
+ | RXE_MIDDLE_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_SEND_LAST] = {
+ .name = "IB_OPCODE_UC_SEND_LAST",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_COMP_MASK
+ | RXE_SEND_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_SEND_LAST_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_UC_SEND_LAST_WITH_IMMEDIATE",
+ .mask = RXE_IMMDT_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_SEND_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_IMMDT] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_SEND_ONLY] = {
+ .name = "IB_OPCODE_UC_SEND_ONLY",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_COMP_MASK
+ | RXE_RWR_MASK | RXE_SEND_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_SEND_ONLY_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_UC_SEND_ONLY_WITH_IMMEDIATE",
+ .mask = RXE_IMMDT_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_RWR_MASK | RXE_SEND_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_IMMDT] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_RDMA_WRITE_FIRST] = {
+ .name = "IB_OPCODE_UC_RDMA_WRITE_FIRST",
+ .mask = RXE_RETH_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_START_MASK,
+ .length = RXE_BTH_BYTES + RXE_RETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_RDMA_WRITE_MIDDLE] = {
+ .name = "IB_OPCODE_UC_RDMA_WRITE_MIDDLE",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_MIDDLE_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_RDMA_WRITE_LAST] = {
+ .name = "IB_OPCODE_UC_RDMA_WRITE_LAST",
+ .mask = RXE_PAYLOAD_MASK | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_RDMA_WRITE_LAST_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_UC_RDMA_WRITE_LAST_WITH_IMMEDIATE",
+ .mask = RXE_IMMDT_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_COMP_MASK | RXE_RWR_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_IMMDT] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_RDMA_WRITE_ONLY] = {
+ .name = "IB_OPCODE_UC_RDMA_WRITE_ONLY",
+ .mask = RXE_RETH_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_START_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_RETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES,
+ }
+ },
+ [IB_OPCODE_UC_RDMA_WRITE_ONLY_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_UC_RDMA_WRITE_ONLY_WITH_IMMEDIATE",
+ .mask = RXE_RETH_MASK | RXE_IMMDT_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_COMP_MASK | RXE_RWR_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES + RXE_RETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RETH] = RXE_BTH_BYTES,
+ [RXE_IMMDT] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+
+ /* RD */
+ [IB_OPCODE_RD_SEND_FIRST] = {
+ .name = "IB_OPCODE_RD_SEND_FIRST",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_RWR_MASK | RXE_SEND_MASK
+ | RXE_START_MASK,
+ .length = RXE_BTH_BYTES + RXE_DETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_SEND_MIDDLE] = {
+ .name = "IB_OPCODE_RD_SEND_MIDDLE",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_SEND_MASK
+ | RXE_MIDDLE_MASK,
+ .length = RXE_BTH_BYTES + RXE_DETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_SEND_LAST] = {
+ .name = "IB_OPCODE_RD_SEND_LAST",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_COMP_MASK | RXE_SEND_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_DETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_SEND_LAST_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_RD_SEND_LAST_WITH_IMMEDIATE",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_IMMDT_MASK
+ | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_SEND_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_IMMDT] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_SEND_ONLY] = {
+ .name = "IB_OPCODE_RD_SEND_ONLY",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_COMP_MASK | RXE_RWR_MASK
+ | RXE_SEND_MASK | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_DETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_SEND_ONLY_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_RD_SEND_ONLY_WITH_IMMEDIATE",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_IMMDT_MASK
+ | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_RWR_MASK | RXE_SEND_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_IMMDT] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_WRITE_FIRST] = {
+ .name = "IB_OPCODE_RD_RDMA_WRITE_FIRST",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_RETH_MASK
+ | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_START_MASK,
+ .length = RXE_BTH_BYTES + RXE_RETH_BYTES + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_RETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_RETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_WRITE_MIDDLE] = {
+ .name = "IB_OPCODE_RD_RDMA_WRITE_MIDDLE",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_MIDDLE_MASK,
+ .length = RXE_BTH_BYTES + RXE_DETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_WRITE_LAST] = {
+ .name = "IB_OPCODE_RD_RDMA_WRITE_LAST",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_DETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_WRITE_LAST_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_RD_RDMA_WRITE_LAST_WITH_IMMEDIATE",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_IMMDT_MASK
+ | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_COMP_MASK | RXE_RWR_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_IMMDT] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_WRITE_ONLY] = {
+ .name = "IB_OPCODE_RD_RDMA_WRITE_ONLY",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_RETH_MASK
+ | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_WRITE_MASK | RXE_START_MASK
+ | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_RETH_BYTES + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_RETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_RETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_WRITE_ONLY_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_RD_RDMA_WRITE_ONLY_WITH_IMMEDIATE",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_RETH_MASK
+ | RXE_IMMDT_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_WRITE_MASK
+ | RXE_COMP_MASK | RXE_RWR_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES + RXE_RETH_BYTES
+ + RXE_DETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_RETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_IMMDT] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_RETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_RETH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_READ_REQUEST] = {
+ .name = "IB_OPCODE_RD_RDMA_READ_REQUEST",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_RETH_MASK
+ | RXE_REQ_MASK | RXE_READ_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_RETH_BYTES + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_RETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_READ_RESPONSE_FIRST] = {
+ .name = "IB_OPCODE_RD_RDMA_READ_RESPONSE_FIRST",
+ .mask = RXE_RDETH_MASK | RXE_AETH_MASK
+ | RXE_PAYLOAD_MASK | RXE_ACK_MASK
+ | RXE_START_MASK,
+ .length = RXE_BTH_BYTES + RXE_AETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_AETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_READ_RESPONSE_MIDDLE] = {
+ .name = "IB_OPCODE_RD_RDMA_READ_RESPONSE_MIDDLE",
+ .mask = RXE_RDETH_MASK | RXE_PAYLOAD_MASK | RXE_ACK_MASK
+ | RXE_MIDDLE_MASK,
+ .length = RXE_BTH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_READ_RESPONSE_LAST] = {
+ .name = "IB_OPCODE_RD_RDMA_READ_RESPONSE_LAST",
+ .mask = RXE_RDETH_MASK | RXE_AETH_MASK | RXE_PAYLOAD_MASK
+ | RXE_ACK_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_AETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_AETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RDMA_READ_RESPONSE_ONLY] = {
+ .name = "IB_OPCODE_RD_RDMA_READ_RESPONSE_ONLY",
+ .mask = RXE_RDETH_MASK | RXE_AETH_MASK | RXE_PAYLOAD_MASK
+ | RXE_ACK_MASK | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_AETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_AETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_ACKNOWLEDGE] = {
+ .name = "IB_OPCODE_RD_ACKNOWLEDGE",
+ .mask = RXE_RDETH_MASK | RXE_AETH_MASK | RXE_ACK_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_AETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_AETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_ATOMIC_ACKNOWLEDGE] = {
+ .name = "IB_OPCODE_RD_ATOMIC_ACKNOWLEDGE",
+ .mask = RXE_RDETH_MASK | RXE_AETH_MASK | RXE_ATMACK_MASK
+ | RXE_ACK_MASK | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_ATMACK_BYTES + RXE_AETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_AETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_ATMACK] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_AETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_COMPARE_SWAP] = {
+ .name = "RD_COMPARE_SWAP",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_ATMETH_MASK
+ | RXE_REQ_MASK | RXE_ATOMIC_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_ATMETH_BYTES + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_ATMETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_ATMETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_FETCH_ADD] = {
+ .name = "IB_OPCODE_RD_FETCH_ADD",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_ATMETH_MASK
+ | RXE_REQ_MASK | RXE_ATOMIC_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_ATMETH_BYTES + RXE_DETH_BYTES
+ + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ [RXE_ATMETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_ATMETH_BYTES,
+ }
+ },
+ [IB_OPCODE_RD_RESYNC] = {
+ .name = "RD_RESYNC",
+ .mask = RXE_RDETH_MASK | RXE_DETH_MASK | RXE_REQ_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_DETH_BYTES + RXE_RDETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_RDETH] = RXE_BTH_BYTES,
+ [RXE_DETH] = RXE_BTH_BYTES
+ + RXE_RDETH_BYTES,
+ }
+ },
+
+ /* UD */
+ [IB_OPCODE_UD_SEND_ONLY] = {
+ .name = "IB_OPCODE_UD_SEND_ONLY",
+ .mask = RXE_DETH_MASK | RXE_PAYLOAD_MASK | RXE_REQ_MASK
+ | RXE_COMP_MASK | RXE_RWR_MASK | RXE_SEND_MASK
+ | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_DETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_DETH] = RXE_BTH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_DETH_BYTES,
+ }
+ },
+ [IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE] = {
+ .name = "IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE",
+ .mask = RXE_DETH_MASK | RXE_IMMDT_MASK | RXE_PAYLOAD_MASK
+ | RXE_REQ_MASK | RXE_COMP_MASK | RXE_RWR_MASK
+ | RXE_SEND_MASK | RXE_START_MASK | RXE_END_MASK,
+ .length = RXE_BTH_BYTES + RXE_IMMDT_BYTES + RXE_DETH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ [RXE_DETH] = RXE_BTH_BYTES,
+ [RXE_IMMDT] = RXE_BTH_BYTES
+ + RXE_DETH_BYTES,
+ [RXE_PAYLOAD] = RXE_BTH_BYTES
+ + RXE_DETH_BYTES
+ + RXE_IMMDT_BYTES,
+ }
+ },
+
+ /* CNP */
+ [IB_OPCODE_CN_CNP] = {
+ .name = "IB_OPCODE_CN_CNP",
+ .mask = RXE_CNP_MASK,
+ .length = RXE_BTH_BYTES,
+ .offset = {
+ [RXE_BTH] = 0,
+ }
+ },
+};
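
A minimal usage sketch (illustration only, not part of the patch) of how a receive path can consult this table; it assumes the array is exported as rxe_opcode[] with the field names shown above (the declaration lives in the companion rxe_opcode.h patch) and that hdr points at the BTH of a received packet:

static void *pkt_aeth(u8 opcode, void *hdr)
{
	/* offsets are only meaningful for headers present in the mask */
	if (!(rxe_opcode[opcode].mask & RXE_AETH_MASK))
		return NULL;

	return (u8 *)hdr + rxe_opcode[opcode].offset[RXE_AETH];
}

static void *pkt_payload(u8 opcode, void *hdr)
{
	return (u8 *)hdr + rxe_opcode[opcode].offset[RXE_PAYLOAD];
}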
* [patch 05/44] rxe_param.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (3 preceding siblings ...)
2011-07-01 13:18 ` [patch 04/44] rxe_opcode.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 06/44] rxe.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (39 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch5 --]
[-- Type: text/plain, Size: 6607 bytes --]
default rxe device and port parameters
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_param.h | 212 ++++++++++++++++++++++++++++++++++
1 file changed, 212 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_param.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_param.h
@@ -0,0 +1,212 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_PARAM_H
+#define RXE_PARAM_H
+
+enum rxe_mtu {
+ RXE_MTU_INVALID = 0,
+ RXE_MTU_256 = 1,
+ RXE_MTU_512 = 2,
+ RXE_MTU_1024 = 3,
+ RXE_MTU_2048 = 4,
+ RXE_MTU_4096 = 5,
+ RXE_MTU_8192 = 6,
+};
+
+static inline int rxe_mtu_enum_to_int(enum rxe_mtu mtu)
+{
+ switch (mtu) {
+ case RXE_MTU_256: return 256;
+ case RXE_MTU_512: return 512;
+ case RXE_MTU_1024: return 1024;
+ case RXE_MTU_2048: return 2048;
+ case RXE_MTU_4096: return 4096;
+ case RXE_MTU_8192: return 8192;
+ default: return -1;
+ }
+}
+
+static inline enum rxe_mtu rxe_mtu_int_to_enum(int mtu)
+{
+ if (mtu < 256)
+ return RXE_MTU_INVALID;
+ else if (mtu < 512)
+ return RXE_MTU_256;
+ else if (mtu < 1024)
+ return RXE_MTU_512;
+ else if (mtu < 2048)
+ return RXE_MTU_1024;
+ else if (mtu < 4096)
+ return RXE_MTU_2048;
+ else if (mtu < 8192)
+ return RXE_MTU_4096;
+ else
+ return RXE_MTU_8192;
+}
+
+/* Find the IB mtu for a given network MTU. */
+static inline enum rxe_mtu eth_mtu_int_to_enum(int mtu)
+{
+ mtu -= RXE_MAX_HDR_LENGTH;
+
+ return rxe_mtu_int_to_enum(mtu);
+}
+
+/*
+ * default/initial rxe device parameter settings
+ */
+enum rxe_device_param {
+ RXE_FW_VER = 0,
+ RXE_MAX_MR_SIZE = -1ull,
+ RXE_PAGE_SIZE_CAP = 0xfffff000,
+ RXE_VENDOR_ID = 0,
+ RXE_VENDOR_PART_ID = 0,
+ RXE_HW_VER = 0,
+ RXE_MAX_QP = 0x10000,
+ RXE_MAX_QP_WR = 0x4000,
+ RXE_MAX_INLINE_DATA = 400,
+ RXE_DEVICE_CAP_FLAGS = IB_DEVICE_BAD_PKEY_CNTR
+ | IB_DEVICE_BAD_QKEY_CNTR
+ | IB_DEVICE_AUTO_PATH_MIG
+ | IB_DEVICE_CHANGE_PHY_PORT
+ | IB_DEVICE_UD_AV_PORT_ENFORCE
+ | IB_DEVICE_PORT_ACTIVE_EVENT
+ | IB_DEVICE_SYS_IMAGE_GUID
+ | IB_DEVICE_RC_RNR_NAK_GEN
+ | IB_DEVICE_SRQ_RESIZE,
+ RXE_MAX_SGE = 27,
+ RXE_MAX_SGE_RD = 0,
+ RXE_MAX_CQ = 16384,
+ RXE_MAX_LOG_CQE = 13,
+ RXE_MAX_MR = 2*1024,
+ RXE_MAX_PD = 0x7ffc,
+ RXE_MAX_QP_RD_ATOM = 128,
+ RXE_MAX_EE_RD_ATOM = 0,
+ RXE_MAX_RES_RD_ATOM = 0x3f000,
+ RXE_MAX_QP_INIT_RD_ATOM = 128,
+ RXE_MAX_EE_INIT_RD_ATOM = 0,
+ RXE_ATOMIC_CAP = 1,
+ RXE_MAX_EE = 0,
+ RXE_MAX_RDD = 0,
+ RXE_MAX_MW = 0,
+ RXE_MAX_RAW_IPV6_QP = 0,
+ RXE_MAX_RAW_ETHY_QP = 0,
+ RXE_MAX_MCAST_GRP = 8192,
+ RXE_MAX_MCAST_QP_ATTACH = 56,
+ RXE_MAX_TOT_MCAST_QP_ATTACH = 0x70000,
+ RXE_MAX_AH = 100,
+ RXE_MAX_FMR = 2*1024,
+ RXE_MAX_MAP_PER_FMR = 100,
+ RXE_MAX_SRQ = 960,
+ RXE_MAX_SRQ_WR = 0x4000,
+ RXE_MIN_SRQ_WR = 1,
+ RXE_MAX_SRQ_SGE = 27,
+ RXE_MIN_SRQ_SGE = 1,
+ RXE_MAX_FMR_PAGE_LIST_LEN = 0,
+ RXE_MAX_PKEYS = 64,
+ RXE_LOCAL_CA_ACK_DELAY = 15,
+
+ RXE_MAX_UCONTEXT = 128,
+
+ RXE_NUM_PORT = 1,
+ RXE_NUM_COMP_VECTORS = 1,
+
+ RXE_MIN_QP_INDEX = 16,
+ RXE_MAX_QP_INDEX = 0x00020000,
+
+ RXE_MIN_SRQ_INDEX = 0x00020001,
+ RXE_MAX_SRQ_INDEX = 0x00040000,
+
+ RXE_MIN_MR_INDEX = 0x00000001,
+ RXE_MAX_MR_INDEX = 0x00020000,
+ RXE_MIN_FMR_INDEX = 0x00020001,
+ RXE_MAX_FMR_INDEX = 0x00040000,
+ RXE_MIN_MW_INDEX = 0x00040001,
+ RXE_MAX_MW_INDEX = 0x00060000,
+};
+
+/*
+ * default/initial rxe port parameters
+ */
+enum rxe_port_param {
+ RXE_PORT_STATE = IB_PORT_DOWN,
+ RXE_PORT_MAX_MTU = RXE_MTU_4096,
+ RXE_PORT_ACTIVE_MTU = RXE_MTU_256,
+ RXE_PORT_GID_TBL_LEN = 32,
+ RXE_PORT_PORT_CAP_FLAGS = 0,
+ RXE_PORT_MAX_MSG_SZ = 0x800000,
+ RXE_PORT_BAD_PKEY_CNTR = 0,
+ RXE_PORT_QKEY_VIOL_CNTR = 0,
+ RXE_PORT_LID = 0,
+ RXE_PORT_SM_LID = 0,
+ RXE_PORT_SM_SL = 0,
+ RXE_PORT_LMC = 0,
+ RXE_PORT_MAX_VL_NUM = 1,
+ RXE_PORT_SUBNET_TIMEOUT = 0,
+ RXE_PORT_INIT_TYPE_REPLY = 0,
+ RXE_PORT_ACTIVE_WIDTH = IB_WIDTH_1X,
+ RXE_PORT_ACTIVE_SPEED = 1,
+ RXE_PORT_PKEY_TBL_LEN = 64,
+ RXE_PORT_PHYS_STATE = 2,
+ RXE_PORT_SUBNET_PREFIX = 0xfe80000000000000ULL,
+ RXE_PORT_CC_TBL_LEN = 128,
+ RXE_PORT_CC_TIMER = 1024,
+ RXE_PORT_CC_INCREASE = 1,
+ RXE_PORT_CC_THRESHOLD = 64,
+ RXE_PORT_CC_CCTI_MIN = 0,
+};
+
+/*
+ * default/initial port info parameters
+ */
+enum rxe_port_info_param {
+ RXE_PORT_INFO_VL_CAP = 4, /* 1-8 */
+ RXE_PORT_INFO_MTU_CAP = 5, /* 4096 */
+ RXE_PORT_INFO_OPER_VL = 1, /* 1 */
+};
+
+extern int rxe_debug_flags;
+extern int rxe_crc_disable;
+extern int rxe_nsec_per_packet;
+extern int rxe_nsec_per_kbyte;
+extern int rxe_max_skb_per_qp;
+extern int rxe_max_req_comp_gap;
+extern int rxe_max_pkt_per_ack;
+extern int rxe_default_mtu;
+extern int rxe_fast_comp;
+extern int rxe_fast_resp;
+extern int rxe_fast_req;
+extern int rxe_fast_arb;
+
+#endif /* RXE_PARAM_H */
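
A small worked example (illustration only, not part of the patch) of the MTU helpers above: rxe_mtu_int_to_enum() rounds a byte count down to the largest IB MTU that still fits and rxe_mtu_enum_to_int() maps back, so a 1500 byte value resolves to RXE_MTU_1024:

static void example_mtu(void)
{
	enum rxe_mtu m = rxe_mtu_int_to_enum(1500);	/* RXE_MTU_1024 */
	int bytes = rxe_mtu_enum_to_int(m);		/* 1024 */

	(void)bytes;
}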
* [patch 06/44] rxe.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (4 preceding siblings ...)
2011-07-01 13:18 ` [patch 05/44] rxe_param.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 07/44] rxe_loc.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (38 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch6 --]
[-- Type: text/plain, Size: 2814 bytes --]
ib_rxe external interface to lower level modules
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe.h | 63 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 63 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe.h
@@ -0,0 +1,63 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_H
+#define RXE_H
+
+#include <linux/skbuff.h>
+
+#include <rdma/ib_verbs.h>
+#include <rdma/ib_user_verbs.h>
+#include <rdma/ib_pack.h>
+#include <rdma/ib_smi.h>
+#include <rdma/ib_umem.h>
+
+#include "rxe_opcode.h"
+#include "rxe_hdr.h"
+#include "rxe_param.h"
+#include "rxe_verbs.h"
+
+#define RXE_UVERBS_ABI_VERSION (1)
+
+#define IB_PHYS_STATE_LINK_UP (5)
+
+int rxe_set_mtu(struct rxe_dev *rxe, unsigned int dev_mtu,
+ unsigned int port_num);
+
+int rxe_add(struct rxe_dev *rxe, unsigned int mtu);
+
+void rxe_remove(struct rxe_dev *rxe);
+
+int rxe_rcv(struct sk_buff *skb);
+
+#endif /* RXE_H */
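
A rough sketch (illustration only, not part of the patch) of how a packet I/O module such as ib_rxe_net is expected to drive this interface; the function names my_attach/my_detach/my_rx are hypothetical:

static int my_attach(struct rxe_dev *rxe, unsigned int ndev_mtu)
{
	return rxe_add(rxe, ndev_mtu);	/* register as a verbs provider */
}

static void my_detach(struct rxe_dev *rxe)
{
	rxe_remove(rxe);
}

static int my_rx(struct sk_buff *skb)
{
	return rxe_rcv(skb);		/* hand a received packet to ib_rxe */
}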
* [patch 07/44] rxe_loc.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (5 preceding siblings ...)
2011-07-01 13:18 ` [patch 06/44] rxe.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 08/44] rxe_mmap.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (37 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch7 --]
[-- Type: text/plain, Size: 3423 bytes --]
misc local interfaces between files in ib_rxe.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_loc.h | 78 ++++++++++++++++++++++++++++++++++++
1 file changed, 78 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_loc.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_loc.h
@@ -0,0 +1,78 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_LOC_H
+#define RXE_LOC_H
+
+/* local declarations shared between rxe files */
+
+extern struct ib_dma_mapping_ops rxe_dma_mapping_ops;
+
+void rxe_release(struct kref *kref);
+
+void arbiter_skb_queue(struct rxe_dev *rxe,
+ struct rxe_qp *qp, struct sk_buff *skb);
+
+int rxe_arbiter(void *arg);
+int rxe_completer(void *arg);
+int rxe_requester(void *arg);
+int rxe_responder(void *arg);
+
+__be32 rxe_sb8_ib_headers(struct rxe_pkt_info *pkt);
+__be32 rxe_sb8(struct rxe_pkt_info *pkt);
+__be32 sb8_le_copy(void *dest, const void *data,
+ const size_t length, __be32 last_crc);
+
+static inline __be32 sb8_copy(void *dest, const void *src,
+ const size_t length, __be32 crc)
+{
+#ifdef __LITTLE_ENDIAN
+ return sb8_le_copy(dest, src, length, crc);
+#else
+ /* TODO this is wrong */
+ return sb8_le_copy(dest, src, length, crc);
+#endif
+}
+
+void rxe_resp_queue_pkt(struct rxe_dev *rxe,
+ struct rxe_qp *qp, struct sk_buff *skb);
+
+void rxe_comp_queue_pkt(struct rxe_dev *rxe,
+ struct rxe_qp *qp, struct sk_buff *skb);
+
+static inline unsigned wr_opcode_mask(int opcode, struct rxe_qp *qp)
+{
+ return rxe_wr_opcode_info[opcode].mask[qp->ibqp.qp_type];
+}
+
+#endif /* RXE_LOC_H */
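
A usage sketch (illustration only, not part of the patch), assuming the sb8 helpers implement the running checksum used for the ICRC (the actual semantics are defined in a later patch): seed the value from the IB headers, then let sb8_copy() extend it while copying the payload into the packet:

static __be32 example_copy_payload(struct rxe_pkt_info *pkt,
				   void *dst, const void *src, size_t len)
{
	__be32 crc = rxe_sb8_ib_headers(pkt);

	return sb8_copy(dst, src, len, crc);
}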
* [patch 08/44] rxe_mmap.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (6 preceding siblings ...)
2011-07-01 13:18 ` [patch 07/44] rxe_loc.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 09/44] rxe_mmap.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (36 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch8 --]
[-- Type: text/plain, Size: 2855 bytes --]
declarations for mmap routines.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_mmap.h | 63 +++++++++++++++++++++++++++++++++++
1 file changed, 63 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_mmap.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_mmap.h
@@ -0,0 +1,63 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_MMAP_H
+#define RXE_MMAP_H
+
+/* This info has to be copied to user space so that it can call mmap
+ * and map the correct queue buffer */
+struct mminfo {
+ __u64 offset;
+ __u32 size;
+ __u32 pad;
+};
+
+struct rxe_mmap_info {
+ struct list_head pending_mmaps;
+ struct ib_ucontext *context;
+ struct kref ref;
+ void *obj;
+
+ struct mminfo info;
+};
+
+void rxe_mmap_release(struct kref *ref);
+
+struct rxe_mmap_info *rxe_create_mmap_info(struct rxe_dev *dev,
+ u32 size,
+ struct ib_ucontext *context,
+ void *obj);
+
+int rxe_mmap(struct ib_ucontext *context, struct vm_area_struct *vma);
+
+#endif /* RXE_MMAP_H */
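
On the user side (illustration only, not part of the patch), struct mminfo is consumed by passing its offset and size straight to mmap() on the uverbs file descriptor, presumably in librxe:

#include <sys/mman.h>

static void *map_queue(int uverbs_fd, const struct mminfo *mi)
{
	return mmap(NULL, mi->size, PROT_READ | PROT_WRITE,
		    MAP_SHARED, uverbs_fd, (off_t)mi->offset);
}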
* [patch 09/44] rxe_mmap.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (7 preceding siblings ...)
2011-07-01 13:18 ` [patch 08/44] rxe_mmap.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 10/44] rxe_queue.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (35 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch9 --]
[-- Type: text/plain, Size: 5465 bytes --]
mmap routines.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_mmap.c | 171 +++++++++++++++++++++++++++++++++++
1 file changed, 171 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_mmap.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_mmap.c
@@ -0,0 +1,171 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ * Copyright (c) 2006, 2007 QLogic Corporation. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/module.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+#include <linux/errno.h>
+#include <asm/pgtable.h>
+
+#include "rxe.h"
+#include "rxe_queue.h"
+#include "rxe_mmap.h"
+
+void rxe_mmap_release(struct kref *ref)
+{
+ struct rxe_mmap_info *ip = container_of(ref,
+ struct rxe_mmap_info, ref);
+ struct rxe_dev *rxe = to_rdev(ip->context->device);
+
+ spin_lock_bh(&rxe->pending_lock);
+
+ if (!list_empty(&ip->pending_mmaps))
+ list_del(&ip->pending_mmaps);
+
+ spin_unlock_bh(&rxe->pending_lock);
+
+ vfree(ip->obj); /* buf */
+ kfree(ip);
+}
+
+/*
+ * open and close keep track of how many times the queue buffer is
+ * mapped, to avoid releasing it while it is still in use.
+ */
+static void rxe_vma_open(struct vm_area_struct *vma)
+{
+ struct rxe_mmap_info *ip = vma->vm_private_data;
+
+ kref_get(&ip->ref);
+}
+
+static void rxe_vma_close(struct vm_area_struct *vma)
+{
+ struct rxe_mmap_info *ip = vma->vm_private_data;
+
+ kref_put(&ip->ref, rxe_mmap_release);
+
+}
+
+static struct vm_operations_struct rxe_vm_ops = {
+ .open = rxe_vma_open,
+ .close = rxe_vma_close,
+};
+
+/**
+ * rxe_mmap - create a new mmap region
+ * @context: the IB user context of the process making the mmap() call
+ * @vma: the VMA to be initialized
+ * Return zero if the mmap is OK. Otherwise, return an errno.
+ */
+int rxe_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
+{
+ struct rxe_dev *rxe = to_rdev(context->device);
+ unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
+ unsigned long size = vma->vm_end - vma->vm_start;
+ struct rxe_mmap_info *ip, *pp;
+ int ret;
+
+ /*
+ * Search the device's list of objects waiting for a mmap call.
+ * Normally, this list is very short since a call to create a
+ * CQ, QP, or SRQ is soon followed by a call to mmap().
+ */
+ spin_lock_bh(&rxe->pending_lock);
+ list_for_each_entry_safe(ip, pp, &rxe->pending_mmaps, pending_mmaps) {
+ if (context != ip->context || (__u64)offset != ip->info.offset)
+ continue;
+
+ /* Don't allow a mmap larger than the object. */
+ if (size > ip->info.size)
+ break;
+
+ goto found_it;
+ }
+ pr_warning("unable to find pending mmap info\n");
+ spin_unlock_bh(&rxe->pending_lock);
+ ret = -EINVAL;
+ goto done;
+
+found_it:
+ list_del_init(&ip->pending_mmaps);
+ spin_unlock_bh(&rxe->pending_lock);
+
+ ret = remap_vmalloc_range(vma, ip->obj, 0);
+ if (ret) {
+ pr_info("rxe: err %d from remap_vmalloc_range\n", ret);
+ goto done;
+ }
+
+ vma->vm_ops = &rxe_vm_ops;
+ vma->vm_private_data = ip;
+ rxe_vma_open(vma);
+done:
+ return ret;
+}
+
+/*
+ * Allocate information for rxe_mmap
+ */
+struct rxe_mmap_info *rxe_create_mmap_info(struct rxe_dev *rxe,
+ u32 size,
+ struct ib_ucontext *context,
+ void *obj)
+{
+ struct rxe_mmap_info *ip;
+
+ ip = kmalloc(sizeof *ip, GFP_KERNEL);
+ if (!ip)
+ return NULL;
+
+ size = PAGE_ALIGN(size);
+
+ spin_lock_bh(&rxe->mmap_offset_lock);
+
+ if (rxe->mmap_offset == 0)
+ rxe->mmap_offset = PAGE_SIZE;
+
+ ip->info.offset = rxe->mmap_offset;
+ rxe->mmap_offset += size;
+
+ spin_unlock_bh(&rxe->mmap_offset_lock);
+
+ INIT_LIST_HEAD(&ip->pending_mmaps);
+ ip->info.size = size;
+ ip->context = context;
+ ip->obj = obj;
+ kref_init(&ip->ref);
+
+ return ip;
+}
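
A sketch (illustration only, not part of the patch) of the expected kernel-side caller: allocate the mmap info for a vmalloc_user() buffer, copy ip->info to user space, then queue the entry so the subsequent mmap() call can find it. The real caller is do_mmap_info() in the rxe_queue.c patch:

static int example_expose(struct rxe_dev *rxe, struct ib_ucontext *ctx,
			  void *buf, u32 size, struct rxe_mmap_info **ipp)
{
	struct rxe_mmap_info *ip = rxe_create_mmap_info(rxe, size, ctx, buf);

	if (!ip)
		return -ENOMEM;

	/* ...copy ip->info into the uverbs response here... */

	spin_lock_bh(&rxe->pending_lock);
	list_add(&ip->pending_mmaps, &rxe->pending_mmaps);
	spin_unlock_bh(&rxe->pending_lock);

	*ipp = ip;
	return 0;
}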
* [patch 10/44] rxe_queue.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (8 preceding siblings ...)
2011-07-01 13:18 ` [patch 09/44] rxe_mmap.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 11/44] rxe_queue.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (34 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch10 --]
[-- Type: text/plain, Size: 6178 bytes --]
declarations for common work and completion queue.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_queue.h | 174 ++++++++++++++++++++++++++++++++++
1 file changed, 174 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_queue.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_queue.h
@@ -0,0 +1,174 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_QUEUE_H
+#define RXE_QUEUE_H
+
+/* implements a simple circular buffer that can optionally be
+   shared between user space and the kernel and can be resized.
+
+   The requested element size is rounded up to a power of 2
+   and the number of elements in the buffer is also rounded
+   up to a power of 2. Since the queue is empty when the
+   producer and consumer indices match, the maximum capacity
+   of the queue is one less than the number of element slots */
+
+/* this data structure is shared between user space and kernel
+   space for those cases where the queue is shared. It contains
+   the producer and consumer indices. It also contains a copy
+   of the queue size parameters for user space to use, but the
+   kernel must use the parameters in the rxe_queue struct.
+   This MUST MATCH the corresponding struct in librxe.
+   For performance reasons the producer and consumer indices are
+   placed in separate cache lines.
+   The kernel should always mask the indices to avoid accessing
+   memory outside of the data area. */
+struct rxe_queue_buf {
+ __u32 log2_elem_size;
+ __u32 index_mask;
+ __u32 pad_1[30];
+ __u32 producer_index;
+ __u32 pad_2[31];
+ __u32 consumer_index;
+ __u32 pad_3[31];
+ __u8 data[0];
+};
+
+struct rxe_queue {
+ struct rxe_dev *rxe;
+ struct rxe_queue_buf *buf;
+ struct rxe_mmap_info *ip;
+ size_t buf_size;
+ size_t elem_size;
+ unsigned int log2_elem_size;
+ unsigned int index_mask;
+};
+
+int do_mmap_info(struct rxe_dev *rxe,
+ struct ib_udata *udata,
+ int offset,
+ struct ib_ucontext *context,
+ struct rxe_queue_buf *buf,
+ size_t buf_size,
+ struct rxe_mmap_info **ip_p);
+
+struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe,
+ unsigned int *num_elem,
+ unsigned int elem_size);
+
+int rxe_queue_resize(struct rxe_queue *q,
+ unsigned int *num_elem_p,
+ unsigned int elem_size,
+ struct ib_ucontext *context,
+ struct ib_udata *udata,
+ spinlock_t *producer_lock,
+ spinlock_t *consumer_lock);
+
+void rxe_queue_cleanup(struct rxe_queue *queue);
+
+static inline int next_index(struct rxe_queue *q, int index)
+{
+ return (index + 1) & q->buf->index_mask;
+}
+
+static inline int queue_empty(struct rxe_queue *q)
+{
+ return ((q->buf->producer_index - q->buf->consumer_index)
+ & q->index_mask) == 0;
+}
+
+static inline int queue_full(struct rxe_queue *q)
+{
+ return ((q->buf->producer_index + 1 - q->buf->consumer_index)
+ & q->index_mask) == 0;
+}
+
+static inline void advance_producer(struct rxe_queue *q)
+{
+ q->buf->producer_index = (q->buf->producer_index + 1)
+ & q->index_mask;
+}
+
+static inline void advance_consumer(struct rxe_queue *q)
+{
+ q->buf->consumer_index = (q->buf->consumer_index + 1)
+ & q->index_mask;
+}
+
+static inline void *producer_addr(struct rxe_queue *q)
+{
+ return q->buf->data + ((q->buf->producer_index & q->index_mask)
+ << q->log2_elem_size);
+}
+
+static inline void *consumer_addr(struct rxe_queue *q)
+{
+ return q->buf->data + ((q->buf->consumer_index & q->index_mask)
+ << q->log2_elem_size);
+}
+
+static inline unsigned int producer_index(struct rxe_queue *q)
+{
+ return q->buf->producer_index;
+}
+
+static inline unsigned int consumer_index(struct rxe_queue *q)
+{
+ return q->buf->consumer_index;
+}
+
+static inline void *addr_from_index(struct rxe_queue *q, unsigned int index)
+{
+ return q->buf->data + ((index & q->index_mask)
+ << q->buf->log2_elem_size);
+}
+
+static inline unsigned int index_from_addr(const struct rxe_queue *q,
+ const void *addr)
+{
+ return (((u8 *)addr - q->buf->data) >> q->log2_elem_size)
+ & q->index_mask;
+}
+
+static inline unsigned int queue_count(const struct rxe_queue *q)
+{
+ return (q->buf->producer_index - q->buf->consumer_index)
+ & q->index_mask;
+}
+
+static inline void *queue_head(struct rxe_queue *q)
+{
+ return queue_empty(q) ? NULL : consumer_addr(q);
+}
+
+#endif /* RXE_QUEUE_H */
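
A worked example (illustration only, not part of the patch): calling rxe_queue_init(rxe, &n, 24) with n == 100 rounds the element size up to 32 bytes and the slot count up to 128 (index_mask == 127), so at most 127 elements can be queued. A producer would then post with:

static int example_post(struct rxe_queue *q, const void *elem)
{
	if (queue_full(q))	/* producer + 1 would catch the consumer */
		return -ENOMEM;

	memcpy(producer_addr(q), elem, q->elem_size);
	advance_producer(q);
	return 0;
}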
* [patch 11/44] rxe_queue.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (9 preceding siblings ...)
2011-07-01 13:18 ` [patch 10/44] rxe_queue.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 12/44] rxe_verbs.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (33 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch11 --]
[-- Type: text/plain, Size: 5868 bytes --]
common work and completion queue implementation.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_queue.c | 209 ++++++++++++++++++++++++++++++++++
1 file changed, 209 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_queue.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_queue.c
@@ -0,0 +1,209 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/vmalloc.h>
+#include "rxe.h"
+#include "rxe_queue.h"
+#include "rxe_mmap.h"
+
+int do_mmap_info(struct rxe_dev *rxe,
+ struct ib_udata *udata,
+ int offset,
+ struct ib_ucontext *context,
+ struct rxe_queue_buf *buf,
+ size_t buf_size,
+ struct rxe_mmap_info **ip_p)
+{
+ int err;
+ struct rxe_mmap_info *ip = NULL;
+
+ if (udata) {
+ if ((udata->outlen - offset) < sizeof(struct mminfo))
+ goto err1;
+
+ ip = rxe_create_mmap_info(rxe, buf_size, context, buf);
+ if (!ip)
+ goto err1;
+
+ err = copy_to_user(udata->outbuf + offset, &ip->info,
+ sizeof(ip->info));
+ if (err)
+ goto err2;
+
+ spin_lock_bh(&rxe->pending_lock);
+ list_add(&ip->pending_mmaps, &rxe->pending_mmaps);
+ spin_unlock_bh(&rxe->pending_lock);
+ }
+
+ *ip_p = ip;
+
+ return 0;
+
+err2:
+ kfree(ip);
+err1:
+ return -EINVAL;
+}
+
+struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe,
+ unsigned int *num_elem,
+ unsigned int elem_size)
+{
+ struct rxe_queue *q;
+ size_t buf_size;
+ unsigned int num_slots;
+
+ /* num_elem == 0 is allowed, but uninteresting */
+ if (*num_elem < 0)
+ goto err1;
+
+ q = kmalloc(sizeof(*q), GFP_KERNEL);
+ if (!q)
+ goto err1;
+
+ q->rxe = rxe;
+
+ q->elem_size = elem_size;
+ elem_size = roundup_pow_of_two(elem_size);
+ q->log2_elem_size = order_base_2(elem_size);
+
+ num_slots = *num_elem + 1;
+ num_slots = roundup_pow_of_two(num_slots);
+ q->index_mask = num_slots - 1;
+
+ buf_size = sizeof(struct rxe_queue_buf) + num_slots*elem_size;
+
+ q->buf = vmalloc_user(buf_size);
+ if (!q->buf)
+ goto err2;
+
+ q->buf->log2_elem_size = q->log2_elem_size;
+ q->buf->index_mask = q->index_mask;
+
+ q->buf->producer_index = 0;
+ q->buf->consumer_index = 0;
+
+ q->buf_size = buf_size;
+
+ *num_elem = num_slots - 1;
+ return q;
+
+err2:
+ kfree(q);
+err1:
+ return NULL;
+}
+
+/* copies elements from original q to new q and then
+ swaps the contents of the two q headers. This is
+ so that if anyone is holding a pointer to q it will
+ still work */
+static int resize_finish(struct rxe_queue *q, struct rxe_queue *new_q,
+ unsigned int num_elem)
+{
+ struct rxe_queue temp;
+
+ if (!queue_empty(q) && (num_elem < queue_count(q)))
+ return -EINVAL;
+
+ while (!queue_empty(q)) {
+ memcpy(producer_addr(new_q), consumer_addr(q),
+ new_q->elem_size);
+ advance_producer(new_q);
+ advance_consumer(q);
+ }
+
+ temp = *q;
+ *q = *new_q;
+ *new_q = temp;
+
+
+ return 0;
+}
+
+int rxe_queue_resize(struct rxe_queue *q,
+ unsigned int *num_elem_p,
+ unsigned int elem_size,
+ struct ib_ucontext *context,
+ struct ib_udata *udata,
+ spinlock_t *producer_lock,
+ spinlock_t *consumer_lock)
+{
+ struct rxe_queue *new_q;
+ unsigned int num_elem = *num_elem_p;
+ int err;
+ unsigned long flags = 0;
+
+ new_q = rxe_queue_init(q->rxe, &num_elem, elem_size);
+ if (!new_q)
+ return -ENOMEM;
+
+ err = do_mmap_info(new_q->rxe, udata, 0, context, new_q->buf,
+ new_q->buf_size, &new_q->ip);
+ if (err) {
+ vfree(new_q->buf);
+ kfree(new_q);
+ goto err1;
+ }
+
+ spin_lock_bh(consumer_lock);
+
+ if (producer_lock) {
+ spin_lock_irqsave(producer_lock, flags);
+ err = resize_finish(q, new_q, num_elem);
+ spin_unlock_irqrestore(producer_lock, flags);
+ } else
+ err = resize_finish(q, new_q, num_elem);
+
+ spin_unlock_bh(consumer_lock);
+
+ rxe_queue_cleanup(new_q); /* new/old dep on err */
+ if (err)
+ goto err1;
+
+ *num_elem_p = num_elem;
+ return 0;
+
+err1:
+ return err;
+}
+
+void rxe_queue_cleanup(struct rxe_queue *q)
+{
+ if (q->ip)
+ kref_put(&q->ip->ref, rxe_mmap_release);
+ else
+ vfree(q->buf);
+
+ kfree(q);
+}
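
A usage sketch (illustration only, not part of the patch) of the resize entry point, using struct rxe_rq from the later rxe_verbs.h patch: the caller hands in its own producer and consumer locks so resize_finish() can swap the queue headers without racing posters or pollers:

static int example_resize_rq(struct rxe_rq *rq, unsigned int nelem,
			     struct ib_ucontext *context,
			     struct ib_udata *udata)
{
	return rxe_queue_resize(rq->queue, &nelem, rq->queue->elem_size,
				context, udata,
				&rq->producer_lock, &rq->consumer_lock);
}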
* [patch 12/44] rxe_verbs.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (10 preceding siblings ...)
2011-07-01 13:18 ` [patch 11/44] rxe_queue.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 13/44] rxe_verbs.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (32 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch12 --]
[-- Type: text/plain, Size: 13034 bytes --]
declarations for rxe interface to rdma/core
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_verbs.h | 571 ++++++++++++++++++++++++++++++++++
1 file changed, 571 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_verbs.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_verbs.h
@@ -0,0 +1,571 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_VERBS_H
+#define RXE_VERBS_H
+
+#include <linux/interrupt.h>
+#include "rxe_pool.h"
+#include "rxe_task.h"
+
+static inline int pkey_match(u16 key1, u16 key2)
+{
+ return (((key1 & 0x7fff) != 0) &&
+ ((key1 & 0x7fff) == (key2 & 0x7fff)) &&
+ ((key1 & 0x8000) || (key2 & 0x8000))) ? 1 : 0;
+}
+
+/* Return >0 if psn_a > psn_b
+ * 0 if psn_a == psn_b
+ * <0 if psn_a < psn_b
+ */
+static inline int psn_compare(u32 psn_a, u32 psn_b)
+{
+ s32 diff;
+ diff = (psn_a - psn_b) << 8;
+ return diff;
+}
+
+struct rxe_ucontext {
+ struct rxe_pool_entry pelem;
+ struct ib_ucontext ibuc;
+};
+
+struct rxe_pd {
+ struct rxe_pool_entry pelem;
+ struct ib_pd ibpd;
+};
+
+#define RXE_LL_ADDR_LEN (16)
+
+struct rxe_av {
+ struct ib_ah_attr attr;
+ u8 ll_addr[RXE_LL_ADDR_LEN];
+
+};
+
+struct rxe_ah {
+ struct rxe_pool_entry pelem;
+ struct ib_ah ibah;
+ struct rxe_pd *pd;
+ struct rxe_av av;
+};
+
+struct rxe_cqe {
+ union {
+ struct ib_wc ibwc;
+ struct ib_uverbs_wc uibwc;
+ };
+};
+
+struct rxe_cq {
+ struct rxe_pool_entry pelem;
+ struct ib_cq ibcq;
+ struct rxe_queue *queue;
+ spinlock_t cq_lock;
+ u8 notify;
+ u8 special;
+ int is_user;
+ struct tasklet_struct comp_task;
+};
+
+struct rxe_dma_info {
+ __u32 length;
+ __u32 resid;
+ __u32 cur_sge;
+ __u32 num_sge;
+ __u32 sge_offset;
+ union {
+ u8 inline_data[0];
+ struct ib_sge sge[0];
+ };
+};
+
+enum wqe_state {
+ wqe_state_posted,
+ wqe_state_processing,
+ wqe_state_pending,
+ wqe_state_done,
+ wqe_state_error,
+};
+
+/* must match corresponding data structure in librxe */
+struct rxe_send_wqe {
+ struct ib_send_wr ibwr;
+ struct rxe_av av; /* UD only */
+ u32 status;
+ u32 state;
+ u64 iova;
+ u32 mask;
+ u32 first_psn;
+ u32 last_psn;
+ u32 ack_length;
+ u32 ssn;
+ u32 has_rd_atomic;
+ struct rxe_dma_info dma; /* must go last */
+};
+
+struct rxe_recv_wqe {
+ __u64 wr_id;
+ __u32 num_sge;
+ __u32 padding;
+ struct rxe_dma_info dma;
+};
+
+struct rxe_sq {
+ int max_wr;
+ int max_sge;
+ int max_inline;
+ struct ib_wc next_wc;
+ spinlock_t sq_lock;
+ struct rxe_queue *queue;
+};
+
+struct rxe_rq {
+ int max_wr;
+ int max_sge;
+ struct ib_wc next_wc;
+ spinlock_t producer_lock;
+ spinlock_t consumer_lock;
+ struct rxe_queue *queue;
+};
+
+struct rxe_srq {
+ struct rxe_pool_entry pelem;
+ struct ib_srq ibsrq;
+ struct rxe_pd *pd;
+ struct rxe_cq *cq;
+ struct rxe_rq rq;
+ u32 srq_num;
+
+ void (*event_handler)(
+ struct ib_event *, void *);
+ void *context;
+
+ int limit;
+ int error;
+};
+
+enum rxe_qp_state {
+ QP_STATE_RESET,
+ QP_STATE_INIT,
+ QP_STATE_READY,
+ QP_STATE_DRAIN, /* req only */
+ QP_STATE_DRAINED, /* req only */
+ QP_STATE_ERROR
+};
+
+extern char *rxe_qp_state_name[];
+
+struct rxe_req_info {
+ enum rxe_qp_state state;
+ int wqe_index;
+ u32 psn;
+ int opcode;
+ atomic_t rd_atomic;
+ int wait_fence;
+ int need_rd_atomic;
+ int wait_psn;
+ int need_retry;
+ int noack_pkts;
+ struct rxe_task task;
+};
+
+struct rxe_comp_info {
+ u32 psn;
+ int opcode;
+ int timeout;
+ int timeout_retry;
+ u32 retry_cnt;
+ u32 rnr_retry;
+ struct rxe_task task;
+};
+
+enum rdatm_res_state {
+ rdatm_res_state_next,
+ rdatm_res_state_new,
+ rdatm_res_state_replay,
+};
+
+struct resp_res {
+ int type;
+ u32 first_psn;
+ u32 last_psn;
+ u32 cur_psn;
+ enum rdatm_res_state state;
+
+ union {
+ struct {
+ struct sk_buff *skb;
+ } atomic;
+ struct {
+ struct rxe_mem *mr;
+ u64 va_org;
+ u32 rkey;
+ u32 length;
+ u64 va;
+ u32 resid;
+ } read;
+ };
+};
+
+struct rxe_resp_info {
+ enum rxe_qp_state state;
+ u32 msn;
+ u32 psn;
+ int opcode;
+ int drop_msg;
+ int goto_error;
+ int sent_psn_nak;
+ enum ib_wc_status status;
+ u8 aeth_syndrome;
+
+ /* Receive only */
+ struct rxe_recv_wqe *wqe;
+
+ /* RDMA read / atomic only */
+ u64 va;
+ struct rxe_mem *mr;
+ u32 resid;
+ u32 rkey;
+ u64 atomic_orig;
+
+ /* SRQ only */
+ struct {
+ struct rxe_recv_wqe wqe;
+ struct ib_sge sge[RXE_MAX_SGE];
+ } srq_wqe;
+
+ /* Responder resources. It's a circular list where the oldest
+ * resource is dropped first. */
+ struct resp_res *resources;
+ unsigned int res_head;
+ unsigned int res_tail;
+ struct resp_res *res;
+ struct rxe_task task;
+};
+
+struct rxe_qp {
+ struct rxe_pool_entry pelem;
+ struct ib_qp ibqp;
+ struct ib_qp_attr attr;
+ unsigned int valid;
+ unsigned int mtu;
+ int is_user;
+
+ struct rxe_pd *pd;
+ struct rxe_srq *srq;
+ struct rxe_cq *scq;
+ struct rxe_cq *rcq;
+
+ enum ib_sig_type sq_sig_type;
+
+ struct rxe_sq sq;
+ struct rxe_rq rq;
+
+ struct rxe_av pri_av;
+ struct rxe_av alt_av;
+
+ /* list of mcast groups qp has joined
+ (for cleanup) */
+ struct list_head grp_list;
+ spinlock_t grp_lock;
+
+ struct sk_buff_head req_pkts;
+ struct sk_buff_head resp_pkts;
+ struct sk_buff_head send_pkts;
+
+ struct rxe_req_info req;
+ struct rxe_comp_info comp;
+ struct rxe_resp_info resp;
+
+ struct ib_udata *udata;
+
+ struct list_head arbiter_list;
+ atomic_t ssn;
+ atomic_t req_skb_in;
+ atomic_t resp_skb_in;
+ atomic_t req_skb_out;
+ atomic_t resp_skb_out;
+ int need_req_skb;
+
+	/* Timer for retransmitting packets when ACKs have been lost. RC
+ * only. The requester sets it when it is not already
+ * started. The responder resets it whenever an ack is
+ * received. */
+ struct timer_list retrans_timer;
+ u64 qp_timeout_jiffies;
+
+ /* Timer for handling RNR NAKS. */
+ struct timer_list rnr_nak_timer;
+
+ spinlock_t state_lock;
+};
+
+enum rxe_mem_state {
+ RXE_MEM_STATE_ZOMBIE,
+ RXE_MEM_STATE_INVALID,
+ RXE_MEM_STATE_FREE,
+ RXE_MEM_STATE_VALID,
+};
+
+enum rxe_mem_type {
+ RXE_MEM_TYPE_NONE,
+ RXE_MEM_TYPE_DMA,
+ RXE_MEM_TYPE_MR,
+ RXE_MEM_TYPE_FMR,
+ RXE_MEM_TYPE_MW,
+};
+
+#define RXE_BUF_PER_MAP (PAGE_SIZE/sizeof(struct ib_phys_buf))
+
+struct rxe_map {
+ struct ib_phys_buf buf[RXE_BUF_PER_MAP];
+};
+
+struct rxe_mem {
+ struct rxe_pool_entry pelem;
+ union {
+ struct ib_mr ibmr;
+ struct ib_fmr ibfmr;
+ struct ib_mw ibmw;
+ };
+
+ struct rxe_pd *pd;
+ struct ib_umem *umem;
+
+ u32 lkey;
+ u32 rkey;
+
+ enum rxe_mem_state state;
+ enum rxe_mem_type type;
+ u64 va;
+ u64 iova;
+ size_t length;
+ u32 offset;
+ int access;
+
+ int page_shift;
+ int page_mask;
+ int map_shift;
+ int map_mask;
+
+ u32 num_buf;
+
+ u32 max_buf;
+ u32 num_map;
+
+ struct rxe_map **map;
+};
+
+struct rxe_fast_reg_page_list {
+ struct ib_fast_reg_page_list ibfrpl;
+};
+
+struct rxe_mc_grp {
+ struct rxe_pool_entry pelem;
+ spinlock_t mcg_lock;
+ struct rxe_dev *rxe;
+ struct list_head qp_list;
+ union ib_gid mgid;
+ int num_qp;
+ u32 qkey;
+ u16 mlid;
+ u16 pkey;
+};
+
+struct rxe_mc_elem {
+ struct rxe_pool_entry pelem;
+ struct list_head qp_list;
+ struct list_head grp_list;
+ struct rxe_qp *qp;
+ struct rxe_mc_grp *grp;
+};
+
+struct rxe_port {
+ struct ib_port_attr attr;
+ u16 *pkey_tbl;
+ __be64 *guid_tbl;
+ __be64 subnet_prefix;
+
+ /* rate control */
+ /* TODO */
+
+ spinlock_t port_lock;
+
+ unsigned int mtu_cap;
+
+ /* special QPs */
+ u32 qp_smi_index;
+ u32 qp_gsi_index;
+};
+
+struct rxe_arbiter {
+ struct rxe_task task;
+ struct list_head qp_list;
+ spinlock_t list_lock;
+ struct timespec time;
+ int delay;
+ int queue_stalled;
+};
+
+/* callbacks from ib_rxe to network interface layer */
+struct rxe_ifc_ops {
+ void (*release)(struct rxe_dev *rxe);
+ __be64 (*node_guid)(struct rxe_dev *rxe);
+ __be64 (*port_guid)(struct rxe_dev *rxe, unsigned int port_num);
+ struct device *(*dma_device)(struct rxe_dev *rxe);
+ int (*mcast_add)(struct rxe_dev *rxe, union ib_gid *mgid);
+ int (*mcast_delete)(struct rxe_dev *rxe, union ib_gid *mgid);
+ int (*send)(struct rxe_dev *rxe, struct sk_buff *skb);
+ int (*loopback)(struct rxe_dev *rxe, struct sk_buff *skb);
+ struct sk_buff *(*init_packet)(struct rxe_dev *rxe, struct rxe_av *av,
+ int paylen);
+ int (*init_av)(struct rxe_dev *rxe, struct ib_ah_attr *attr,
+ struct rxe_av *av);
+ char *(*parent_name)(struct rxe_dev *rxe, unsigned int port_num);
+ enum rdma_link_layer (*link_layer)(struct rxe_dev *rxe,
+ unsigned int port_num);
+};
+
+#define RXE_QUEUE_STOPPED (999)
+
+struct rxe_dev {
+ struct ib_device ib_dev;
+ struct ib_device_attr attr;
+ int max_ucontext;
+ int max_inline_data;
+ struct kref ref_cnt;
+
+ struct rxe_ifc_ops *ifc_ops;
+
+ struct net_device *ndev;
+
+ struct rxe_arbiter arbiter;
+
+ atomic_t ind;
+
+ atomic_t req_skb_in;
+ atomic_t resp_skb_in;
+ atomic_t req_skb_out;
+ atomic_t resp_skb_out;
+
+ int xmit_errors;
+
+ struct rxe_pool uc_pool;
+ struct rxe_pool pd_pool;
+ struct rxe_pool ah_pool;
+ struct rxe_pool srq_pool;
+ struct rxe_pool qp_pool;
+ struct rxe_pool cq_pool;
+ struct rxe_pool mr_pool;
+ struct rxe_pool mw_pool;
+ struct rxe_pool fmr_pool;
+ struct rxe_pool mc_grp_pool;
+ struct rxe_pool mc_elem_pool;
+
+ spinlock_t pending_lock;
+ struct list_head pending_mmaps;
+
+ spinlock_t mmap_offset_lock;
+ int mmap_offset;
+
+ u8 num_ports;
+ struct rxe_port *port;
+
+ enum rxe_mtu pref_mtu;
+};
+
+static inline struct rxe_dev *to_rdev(struct ib_device *dev)
+{
+ return dev ? container_of(dev, struct rxe_dev, ib_dev) : NULL;
+}
+
+static inline struct rxe_ucontext *to_ruc(struct ib_ucontext *uc)
+{
+ return uc ? container_of(uc, struct rxe_ucontext, ibuc) : NULL;
+}
+
+static inline struct rxe_pd *to_rpd(struct ib_pd *pd)
+{
+ return pd ? container_of(pd, struct rxe_pd, ibpd) : NULL;
+}
+
+static inline struct rxe_ah *to_rah(struct ib_ah *ah)
+{
+ return ah ? container_of(ah, struct rxe_ah, ibah) : NULL;
+}
+
+static inline struct rxe_srq *to_rsrq(struct ib_srq *srq)
+{
+ return srq ? container_of(srq, struct rxe_srq, ibsrq) : NULL;
+}
+
+static inline struct rxe_qp *to_rqp(struct ib_qp *qp)
+{
+ return qp ? container_of(qp, struct rxe_qp, ibqp) : NULL;
+}
+
+static inline struct rxe_cq *to_rcq(struct ib_cq *cq)
+{
+ return cq ? container_of(cq, struct rxe_cq, ibcq) : NULL;
+}
+
+static inline struct rxe_mem *to_rmr(struct ib_mr *mr)
+{
+ return mr ? container_of(mr, struct rxe_mem, ibmr) : NULL;
+}
+
+static inline struct rxe_mem *to_rfmr(struct ib_fmr *fmr)
+{
+ return fmr ? container_of(fmr, struct rxe_mem, ibfmr) : NULL;
+}
+
+static inline struct rxe_mem *to_rmw(struct ib_mw *mw)
+{
+ return mw ? container_of(mw, struct rxe_mem, ibmw) : NULL;
+}
+
+static inline struct rxe_fast_reg_page_list *to_rfrpl(
+ struct ib_fast_reg_page_list *frpl)
+{
+ return frpl ? container_of(frpl,
+ struct rxe_fast_reg_page_list, ibfrpl) : NULL;
+}
+
+int rxe_register_device(struct rxe_dev *rxe);
+int rxe_unregister_device(struct rxe_dev *rxe);
+
+void rxe_mc_cleanup(void *arg);
+
+#endif /* RXE_VERBS_H */
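A quick way to see why psn_compare() above copes with 24 bit PSN wraparound is
to drop the same arithmetic into a stand-alone user space program. The sketch
below is illustrative only and is not part of the patch; the test values are
made up and stdint types stand in for the kernel's u32/s32.

#include <stdio.h>
#include <stdint.h>

/* same arithmetic as psn_compare() in rxe_verbs.h */
static int psn_compare(uint32_t psn_a, uint32_t psn_b)
{
	int32_t diff = (psn_a - psn_b) << 8;

	return diff;
}

int main(void)
{
	/* PSN 5 follows 0xfffffa once the 24 bit counter wraps */
	printf("%d\n", psn_compare(5, 0xfffffa) > 0);		/* prints 1 */
	printf("%d\n", psn_compare(0x000100, 0x000200) < 0);	/* prints 1 */
	return 0;
}

Shifting the 24 bit difference into the high bits discards anything above bit
23, so the sign of the 32 bit result gives the circular ordering of the two
sequence numbers.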
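The rxe_ifc_ops table declared in this header is the set of callbacks through
which ib_rxe reaches the packet IO layer underneath it; the pointer ends up in
rxe->ifc_ops and is what rxe_get_link_layer() and friends call through. The
following is only a sketch of the shape of such a table: the sample_* names
are made up for illustration, only two hooks are filled in, and a real
provider would implement every callback.

#include <linux/skbuff.h>
#include <rdma/ib_verbs.h>
#include "rxe_verbs.h"

/* hypothetical packet IO layer that simply discards outgoing packets */
static int sample_send(struct rxe_dev *rxe, struct sk_buff *skb)
{
	kfree_skb(skb);		/* a real layer would hand the skb to a netdev */
	return 0;
}

static enum rdma_link_layer sample_link_layer(struct rxe_dev *rxe,
					      unsigned int port_num)
{
	return IB_LINK_LAYER_INFINIBAND;
}

static struct rxe_ifc_ops sample_ops = {
	.send		= sample_send,
	.link_layer	= sample_link_layer,
	/* the remaining hooks are omitted from this sketch */
};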
* [patch 13/44] rxe_verbs.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (11 preceding siblings ...)
2011-07-01 13:18 ` [patch 12/44] rxe_verbs.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 14/44] rxe_pool.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (31 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch13 --]
[-- Type: text/plain, Size: 30888 bytes --]
Add rxe_verbs.c, the rxe verbs interface to rdma/core.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_verbs.c | 1351 ++++++++++++++++++++++++++++++++++
1 file changed, 1351 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_verbs.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_verbs.c
@@ -0,0 +1,1351 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_queue.h"
+#include "rxe_av.h"
+#include "rxe_mmap.h"
+#include "rxe_srq.h"
+#include "rxe_cq.h"
+#include "rxe_qp.h"
+#include "rxe_mr.h"
+#include "rxe_mcast.h"
+
+static int rxe_query_device(struct ib_device *dev, struct ib_device_attr *attr)
+{
+ struct rxe_dev *rxe = to_rdev(dev);
+
+ *attr = rxe->attr;
+ return 0;
+}
+
+static int rxe_query_port(struct ib_device *dev,
+ u8 port_num, struct ib_port_attr *attr)
+{
+ struct rxe_dev *rxe = to_rdev(dev);
+ struct rxe_port *port;
+
+ if (unlikely(port_num < 1 || port_num > rxe->num_ports)) {
+ pr_warn("invalid port_number %d\n", port_num);
+ goto err1;
+ }
+
+ port = &rxe->port[port_num - 1];
+
+ *attr = port->attr;
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static int rxe_query_gid(struct ib_device *device,
+ u8 port_num, int index, union ib_gid *gid)
+{
+ struct rxe_dev *rxe = to_rdev(device);
+ struct rxe_port *port;
+
+ if (unlikely(port_num < 1 || port_num > rxe->num_ports)) {
+ pr_warn("invalid port_num = %d\n", port_num);
+ goto err1;
+ }
+
+ port = &rxe->port[port_num - 1];
+
+ if (unlikely(index < 0 || index >= port->attr.gid_tbl_len)) {
+ pr_warn("invalid index = %d\n", index);
+ goto err1;
+ }
+
+ gid->global.subnet_prefix = port->subnet_prefix;
+ gid->global.interface_id = port->guid_tbl[index];
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static int rxe_query_pkey(struct ib_device *device,
+ u8 port_num, u16 index, u16 *pkey)
+{
+ struct rxe_dev *rxe = to_rdev(device);
+ struct rxe_port *port;
+
+ if (unlikely(port_num < 1 || port_num > rxe->num_ports)) {
+ pr_warn("invalid port_num = %d\n", port_num);
+ goto err1;
+ }
+
+ port = &rxe->port[port_num - 1];
+
+ if (unlikely(index >= port->attr.pkey_tbl_len)) {
+ pr_warn("invalid index = %d\n", index);
+ goto err1;
+ }
+
+ *pkey = port->pkey_tbl[index];
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static int rxe_modify_device(struct ib_device *dev,
+ int mask, struct ib_device_modify *attr)
+{
+ struct rxe_dev *rxe = to_rdev(dev);
+
+ if (mask & IB_DEVICE_MODIFY_SYS_IMAGE_GUID)
+ rxe->attr.sys_image_guid = attr->sys_image_guid;
+
+ if (mask & IB_DEVICE_MODIFY_NODE_DESC) {
+ memcpy(rxe->ib_dev.node_desc,
+ attr->node_desc, sizeof(rxe->ib_dev.node_desc));
+ }
+
+ return 0;
+}
+
+static int rxe_modify_port(struct ib_device *dev,
+ u8 port_num, int mask, struct ib_port_modify *attr)
+{
+ struct rxe_dev *rxe = to_rdev(dev);
+ struct rxe_port *port;
+
+ if (unlikely(port_num < 1 || port_num > rxe->num_ports)) {
+ pr_warn("invalid port_num = %d\n", port_num);
+ goto err1;
+ }
+
+ port = &rxe->port[port_num - 1];
+
+ port->attr.port_cap_flags |= attr->set_port_cap_mask;
+ port->attr.port_cap_flags &= ~attr->clr_port_cap_mask;
+
+ if (mask & IB_PORT_RESET_QKEY_CNTR)
+ port->attr.qkey_viol_cntr = 0;
+
+ if (mask & IB_PORT_INIT_TYPE)
+ /* TODO init type */
+ ;
+
+ if (mask & IB_PORT_SHUTDOWN)
+ /* TODO shutdown port */
+ ;
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static enum rdma_link_layer rxe_get_link_layer(struct ib_device *dev,
+ u8 port_num)
+{
+ struct rxe_dev *rxe = to_rdev(dev);
+
+ return rxe->ifc_ops->link_layer(rxe, port_num);
+}
+
+static struct ib_ucontext *rxe_alloc_ucontext(struct ib_device *dev,
+ struct ib_udata *udata)
+{
+ struct rxe_dev *rxe = to_rdev(dev);
+ struct rxe_ucontext *uc;
+
+ uc = rxe_alloc(&rxe->uc_pool);
+ return uc ? &uc->ibuc : ERR_PTR(-ENOMEM);
+}
+
+static int rxe_dealloc_ucontext(struct ib_ucontext *ibuc)
+{
+ struct rxe_ucontext *uc = to_ruc(ibuc);
+
+ rxe_drop_ref(uc);
+ return 0;
+}
+
+static struct ib_pd *rxe_alloc_pd(struct ib_device *dev,
+ struct ib_ucontext *context,
+ struct ib_udata *udata)
+{
+ struct rxe_dev *rxe = to_rdev(dev);
+ struct rxe_pd *pd;
+
+ pd = rxe_alloc(&rxe->pd_pool);
+ return pd ? &pd->ibpd : ERR_PTR(-ENOMEM);
+}
+
+static int rxe_dealloc_pd(struct ib_pd *ibpd)
+{
+ struct rxe_pd *pd = to_rpd(ibpd);
+
+ rxe_drop_ref(pd);
+ return 0;
+}
+
+static struct ib_ah *rxe_create_ah(struct ib_pd *ibpd, struct ib_ah_attr *attr)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_ah *ah;
+
+ err = rxe_av_chk_attr(rxe, attr);
+ if (err)
+ goto err1;
+
+ ah = rxe_alloc(&rxe->ah_pool);
+ if (!ah) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_add_ref(pd);
+ ah->pd = pd;
+
+ err = rxe_av_from_attr(rxe, attr->port_num, &ah->av, attr);
+ if (err)
+ goto err2;
+
+ return &ah->ibah;
+
+err2:
+ rxe_drop_ref(pd);
+ rxe_drop_ref(ah);
+err1:
+ return ERR_PTR(err);
+}
+
+static int rxe_modify_ah(struct ib_ah *ibah, struct ib_ah_attr *attr)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(ibah->device);
+ struct rxe_ah *ah = to_rah(ibah);
+
+ err = rxe_av_chk_attr(rxe, attr);
+ if (err)
+ goto err1;
+
+ err = rxe_av_from_attr(rxe, attr->port_num, &ah->av, attr);
+err1:
+ return err;
+}
+
+static int rxe_query_ah(struct ib_ah *ibah, struct ib_ah_attr *attr)
+{
+ struct rxe_dev *rxe = to_rdev(ibah->device);
+ struct rxe_ah *ah = to_rah(ibah);
+
+ rxe_av_to_attr(rxe, &ah->av, attr);
+ return 0;
+}
+
+static int rxe_destroy_ah(struct ib_ah *ibah)
+{
+ struct rxe_ah *ah = to_rah(ibah);
+
+ rxe_drop_ref(ah->pd);
+ rxe_drop_ref(ah);
+ return 0;
+}
+
+static int post_one_recv(struct rxe_rq *rq, struct ib_recv_wr *ibwr)
+{
+ int err;
+ int i;
+ u32 length;
+ struct rxe_recv_wqe *recv_wqe;
+ int num_sge = ibwr->num_sge;
+
+ if (unlikely(queue_full(rq->queue))) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ if (unlikely(num_sge > rq->max_sge)) {
+ err = -EINVAL;
+ goto err1;
+ }
+
+ length = 0;
+ for (i = 0; i < num_sge; i++)
+ length += ibwr->sg_list[i].length;
+
+ recv_wqe = producer_addr(rq->queue);
+ recv_wqe->wr_id = ibwr->wr_id;
+ recv_wqe->num_sge = num_sge;
+
+ memcpy(recv_wqe->dma.sge, ibwr->sg_list,
+ num_sge*sizeof(struct ib_sge));
+
+ recv_wqe->dma.length = length;
+ recv_wqe->dma.resid = length;
+ recv_wqe->dma.num_sge = num_sge;
+ recv_wqe->dma.cur_sge = 0;
+ recv_wqe->dma.sge_offset = 0;
+
+ /* make sure all changes to the work queue are
+ written before we update the producer pointer */
+ wmb();
+
+ advance_producer(rq->queue);
+ return 0;
+
+err1:
+ return err;
+}
+
+static struct ib_srq *rxe_create_srq(struct ib_pd *ibpd,
+ struct ib_srq_init_attr *init,
+ struct ib_udata *udata)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_srq *srq;
+ struct ib_ucontext *context = udata ? ibpd->uobject->context : NULL;
+
+ err = rxe_srq_chk_attr(rxe, NULL, &init->attr, IB_SRQ_INIT_MASK);
+ if (err)
+ goto err1;
+
+ srq = rxe_alloc(&rxe->srq_pool);
+ if (!srq) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_add_index(srq);
+ rxe_add_ref(pd);
+ srq->pd = pd;
+
+ err = rxe_srq_from_init(rxe, srq, init, context, udata);
+ if (err)
+ goto err2;
+
+ return &srq->ibsrq;
+
+err2:
+ rxe_drop_ref(pd);
+ rxe_drop_index(srq);
+ rxe_drop_ref(srq);
+err1:
+ return ERR_PTR(err);
+}
+
+static int rxe_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
+ enum ib_srq_attr_mask mask,
+ struct ib_udata *udata)
+{
+ int err;
+ struct rxe_srq *srq = to_rsrq(ibsrq);
+ struct rxe_dev *rxe = to_rdev(ibsrq->device);
+
+ err = rxe_srq_chk_attr(rxe, srq, attr, mask);
+ if (err)
+ goto err1;
+
+ err = rxe_srq_from_attr(rxe, srq, attr, mask, udata);
+ if (err)
+ goto err1;
+
+ return 0;
+
+err1:
+ return err;
+}
+
+static int rxe_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr)
+{
+ struct rxe_srq *srq = to_rsrq(ibsrq);
+
+ if (srq->error)
+ return -EINVAL;
+
+ attr->max_wr = srq->rq.queue->buf->index_mask;
+ attr->max_sge = srq->rq.max_sge;
+ attr->srq_limit = srq->limit;
+ return 0;
+}
+
+static int rxe_destroy_srq(struct ib_srq *ibsrq)
+{
+ struct rxe_srq *srq = to_rsrq(ibsrq);
+
+ if (srq->cq)
+ rxe_drop_ref(srq->cq);
+
+ rxe_drop_ref(srq->pd);
+ rxe_drop_index(srq);
+ rxe_drop_ref(srq);
+ return 0;
+}
+
+static int rxe_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr,
+ struct ib_recv_wr **bad_wr)
+{
+ int err = 0;
+ unsigned long flags;
+ struct rxe_srq *srq = to_rsrq(ibsrq);
+
+ spin_lock_irqsave(&srq->rq.producer_lock, flags);
+
+ while (wr) {
+ err = post_one_recv(&srq->rq, wr);
+ if (unlikely(err))
+ break;
+ wr = wr->next;
+ }
+
+ spin_unlock_irqrestore(&srq->rq.producer_lock, flags);
+
+ if (err)
+ *bad_wr = wr;
+
+ return err;
+}
+
+static struct ib_qp *rxe_create_qp(struct ib_pd *ibpd,
+ struct ib_qp_init_attr *init,
+ struct ib_udata *udata)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_qp *qp;
+
+ err = rxe_qp_chk_init(rxe, init);
+ if (err)
+ goto err1;
+
+ qp = rxe_alloc(&rxe->qp_pool);
+ if (!qp) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_add_index(qp);
+
+ if (udata)
+ qp->is_user = 1;
+
+ err = rxe_qp_from_init(rxe, qp, pd, init, udata, ibpd);
+ if (err)
+ goto err2;
+
+ return &qp->ibqp;
+
+err2:
+ rxe_drop_index(qp);
+ rxe_drop_ref(qp);
+err1:
+ return ERR_PTR(err);
+}
+
+static int rxe_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+ int mask, struct ib_udata *udata)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(ibqp->device);
+ struct rxe_qp *qp = to_rqp(ibqp);
+
+ err = rxe_qp_chk_attr(rxe, qp, attr, mask);
+ if (err)
+ goto err1;
+
+ err = rxe_qp_from_attr(qp, attr, mask, udata);
+ if (err)
+ goto err1;
+
+ return 0;
+
+err1:
+ return err;
+}
+
+static int rxe_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+ int mask, struct ib_qp_init_attr *init)
+{
+ struct rxe_qp *qp = to_rqp(ibqp);
+
+ rxe_qp_to_init(qp, init);
+ rxe_qp_to_attr(qp, attr, mask);
+
+ return 0;
+}
+
+static int rxe_destroy_qp(struct ib_qp *ibqp)
+{
+ struct rxe_qp *qp = to_rqp(ibqp);
+
+ rxe_qp_destroy(qp);
+ rxe_drop_index(qp);
+ rxe_drop_ref(qp);
+ return 0;
+}
+
+static int validate_send_wr(struct rxe_qp *qp, struct ib_send_wr *ibwr,
+ unsigned int mask, unsigned int length)
+{
+ int num_sge = ibwr->num_sge;
+ struct rxe_sq *sq = &qp->sq;
+
+ if (unlikely(num_sge > sq->max_sge))
+ goto err1;
+
+ if (unlikely(mask & WR_ATOMIC_MASK)) {
+ if (length < 8)
+ goto err1;
+
+ if (ibwr->wr.atomic.remote_addr & 0x7)
+ goto err1;
+ }
+
+ if (unlikely((ibwr->send_flags & IB_SEND_INLINE)
+ && (length > sq->max_inline)))
+ goto err1;
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,
+ unsigned int mask, unsigned int length,
+ struct rxe_send_wqe *wqe)
+{
+ int num_sge = ibwr->num_sge;
+ struct ib_sge *sge;
+ int i;
+ u8 *p;
+
+ memcpy(&wqe->ibwr, ibwr, sizeof(wqe->ibwr));
+
+ if (qp_type(qp) == IB_QPT_UD
+ || qp_type(qp) == IB_QPT_SMI
+ || qp_type(qp) == IB_QPT_GSI)
+ memcpy(&wqe->av, &to_rah(ibwr->wr.ud.ah)->av, sizeof(wqe->av));
+
+ if (unlikely(ibwr->send_flags & IB_SEND_INLINE)) {
+ p = wqe->dma.inline_data;
+
+ sge = ibwr->sg_list;
+ for (i = 0; i < num_sge; i++, sge++) {
+			if (qp->is_user && copy_from_user(p, (void __user *)
+					   (uintptr_t)sge->addr, sge->length))
+				return -EFAULT;
+			else if (!qp->is_user)
+				memcpy(p, (void *)(uintptr_t)sge->addr,
+				       sge->length);
+
+ p += sge->length;
+ }
+ } else
+ memcpy(wqe->dma.sge, ibwr->sg_list,
+ num_sge*sizeof(struct ib_sge));
+
+ wqe->iova = (mask & WR_ATOMIC_MASK) ?
+ ibwr->wr.atomic.remote_addr :
+ ibwr->wr.rdma.remote_addr;
+ wqe->mask = mask;
+ wqe->dma.length = length;
+ wqe->dma.resid = length;
+ wqe->dma.num_sge = num_sge;
+ wqe->dma.cur_sge = 0;
+ wqe->dma.sge_offset = 0;
+ wqe->state = wqe_state_posted;
+ wqe->ssn = atomic_add_return(1, &qp->ssn);
+
+ return 0;
+}
+
+static int post_one_send(struct rxe_qp *qp, struct ib_send_wr *ibwr,
+ unsigned mask, u32 length)
+{
+ int err;
+ struct rxe_sq *sq = &qp->sq;
+ struct rxe_send_wqe *send_wqe;
+ unsigned long flags;
+
+ err = validate_send_wr(qp, ibwr, mask, length);
+ if (err)
+ return err;
+
+ spin_lock_irqsave(&qp->sq.sq_lock, flags);
+
+ if (unlikely(queue_full(sq->queue))) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ send_wqe = producer_addr(sq->queue);
+
+ err = init_send_wqe(qp, ibwr, mask, length, send_wqe);
+ if (unlikely(err))
+ goto err1;
+
+ /* make sure all changes to the work queue are
+ written before we update the producer pointer */
+ wmb();
+
+ advance_producer(sq->queue);
+ spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
+
+ return 0;
+
+err1:
+ spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
+ return err;
+}
+
+static int rxe_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
+ struct ib_send_wr **bad_wr)
+{
+ int err = 0;
+ struct rxe_qp *qp = to_rqp(ibqp);
+ unsigned int mask;
+ unsigned int length = 0;
+ int i;
+ int must_sched;
+
+ if (unlikely(!qp->valid || qp->req.state == QP_STATE_ERROR)) {
+ *bad_wr = wr;
+ return -EINVAL;
+ }
+
+ if (unlikely(qp->req.state < QP_STATE_READY)) {
+ *bad_wr = wr;
+ return -EINVAL;
+ }
+
+ if (qp->is_user) {
+ unsigned int wqe_index = producer_index(qp->sq.queue);
+ struct rxe_send_wqe *wqe;
+
+ WARN_ON(wr);
+
+ /* compute length of last wr for use in
+ fastpath decision below */
+ wqe_index = (wqe_index - 1) & qp->sq.queue->index_mask;
+ wqe = addr_from_index(qp->sq.queue, wqe_index);
+ length = wqe->dma.resid;
+ }
+
+ while (wr) {
+ mask = wr_opcode_mask(wr->opcode, qp);
+ if (unlikely(!mask)) {
+ err = -EINVAL;
+ *bad_wr = wr;
+ break;
+ }
+
+ if (unlikely((wr->send_flags & IB_SEND_INLINE) &&
+ !(mask & WR_INLINE_MASK))) {
+ err = -EINVAL;
+ *bad_wr = wr;
+ break;
+ }
+
+ length = 0;
+ for (i = 0; i < wr->num_sge; i++)
+ length += wr->sg_list[i].length;
+
+ err = post_one_send(qp, wr, mask, length);
+
+ if (err) {
+ *bad_wr = wr;
+ break;
+ }
+ wr = wr->next;
+ }
+
+ must_sched = queue_count(qp->sq.queue) > 1;
+ rxe_run_task(&qp->req.task, must_sched);
+
+ return err;
+}
+
+static int rxe_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
+ struct ib_recv_wr **bad_wr)
+{
+ int err = 0;
+ struct rxe_qp *qp = to_rqp(ibqp);
+ struct rxe_rq *rq = &qp->rq;
+ unsigned long flags;
+
+ if (unlikely((qp_state(qp) < IB_QPS_INIT) || !qp->valid)) {
+ *bad_wr = wr;
+ err = -EINVAL;
+ goto err1;
+ }
+
+ if (unlikely(qp->srq)) {
+ *bad_wr = wr;
+ err = -EINVAL;
+ goto err1;
+ }
+
+ if (unlikely(qp->is_user)) {
+ *bad_wr = wr;
+ err = -EINVAL;
+ goto err1;
+ }
+
+ spin_lock_irqsave(&rq->producer_lock, flags);
+
+ while (wr) {
+ err = post_one_recv(rq, wr);
+ if (unlikely(err)) {
+ *bad_wr = wr;
+ break;
+ }
+ wr = wr->next;
+ }
+
+ spin_unlock_irqrestore(&rq->producer_lock, flags);
+
+err1:
+ return err;
+}
+
+static struct ib_cq *rxe_create_cq(struct ib_device *dev, int cqe,
+ int comp_vector,
+ struct ib_ucontext *context,
+ struct ib_udata *udata)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(dev);
+ struct rxe_cq *cq;
+
+ err = rxe_cq_chk_attr(rxe, NULL, cqe, comp_vector, udata);
+ if (err)
+ goto err1;
+
+ cq = rxe_alloc(&rxe->cq_pool);
+ if (!cq) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ err = rxe_cq_from_init(rxe, cq, cqe, comp_vector, context, udata);
+ if (err)
+ goto err2;
+
+ return &cq->ibcq;
+
+err2:
+ rxe_drop_ref(cq);
+err1:
+ return ERR_PTR(err);
+}
+
+static int rxe_destroy_cq(struct ib_cq *ibcq)
+{
+ struct rxe_cq *cq = to_rcq(ibcq);
+
+ rxe_drop_ref(cq);
+ return 0;
+}
+
+static int rxe_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
+{
+ int err;
+ struct rxe_cq *cq = to_rcq(ibcq);
+ struct rxe_dev *rxe = to_rdev(ibcq->device);
+
+ err = rxe_cq_chk_attr(rxe, cq, cqe, 0, udata);
+ if (err)
+ goto err1;
+
+ err = rxe_cq_resize_queue(cq, cqe, udata);
+ if (err)
+ goto err1;
+
+ return 0;
+
+err1:
+ return err;
+}
+
+static int rxe_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
+{
+ int i;
+ struct rxe_cq *cq = to_rcq(ibcq);
+ struct rxe_cqe *cqe;
+
+ for (i = 0; i < num_entries; i++) {
+ cqe = queue_head(cq->queue);
+ if (!cqe)
+ break;
+
+ memcpy(wc++, &cqe->ibwc, sizeof(*wc));
+ advance_consumer(cq->queue);
+ }
+
+ return i;
+}
+
+static int rxe_peek_cq(struct ib_cq *ibcq, int wc_cnt)
+{
+ struct rxe_cq *cq = to_rcq(ibcq);
+ int count = queue_count(cq->queue);
+
+ return (count > wc_cnt) ? wc_cnt : count;
+}
+
+static int rxe_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
+{
+ struct rxe_cq *cq = to_rcq(ibcq);
+
+ if (cq->notify != IB_CQ_NEXT_COMP)
+ cq->notify = flags & IB_CQ_SOLICITED_MASK;
+
+ return 0;
+}
+
+static int rxe_req_ncomp_notif(struct ib_cq *ibcq, int wc_cnt)
+{
+ return -EINVAL;
+}
+
+static struct ib_mr *rxe_get_dma_mr(struct ib_pd *ibpd, int access)
+{
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_mem *mr;
+ int err;
+
+ mr = rxe_alloc(&rxe->mr_pool);
+ if (!mr) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_add_index(mr);
+
+ rxe_add_ref(pd);
+
+ err = rxe_mem_init_dma(rxe, pd, access, mr);
+ if (err)
+ goto err2;
+
+ return &mr->ibmr;
+
+err2:
+ rxe_drop_ref(pd);
+ rxe_drop_index(mr);
+ rxe_drop_ref(mr);
+err1:
+ return ERR_PTR(err);
+}
+
+static struct ib_mr *rxe_reg_phys_mr(struct ib_pd *ibpd,
+ struct ib_phys_buf *phys_buf_array,
+ int num_phys_buf,
+ int access, u64 *iova_start)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_mem *mr;
+ u64 iova = *iova_start;
+
+ mr = rxe_alloc(&rxe->mr_pool);
+ if (!mr) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_add_index(mr);
+
+ rxe_add_ref(pd);
+
+ err = rxe_mem_init_phys(rxe, pd, access, iova,
+ phys_buf_array, num_phys_buf, mr);
+ if (err)
+ goto err2;
+
+ return &mr->ibmr;
+
+err2:
+ rxe_drop_ref(pd);
+ rxe_drop_index(mr);
+ rxe_drop_ref(mr);
+err1:
+ return ERR_PTR(err);
+}
+
+static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd,
+ u64 start,
+ u64 length,
+ u64 iova,
+ int access, struct ib_udata *udata)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_mem *mr;
+
+ mr = rxe_alloc(&rxe->mr_pool);
+ if (!mr) {
+ err = -ENOMEM;
+ goto err2;
+ }
+
+ rxe_add_index(mr);
+
+ rxe_add_ref(pd);
+
+ err = rxe_mem_init_user(rxe, pd, start, length, iova,
+ access, udata, mr);
+ if (err)
+ goto err3;
+
+ return &mr->ibmr;
+
+err3:
+ rxe_drop_ref(pd);
+ rxe_drop_index(mr);
+ rxe_drop_ref(mr);
+err2:
+ return ERR_PTR(err);
+}
+
+static int rxe_dereg_mr(struct ib_mr *ibmr)
+{
+ struct rxe_mem *mr = to_rmr(ibmr);
+
+ mr->state = RXE_MEM_STATE_ZOMBIE;
+ rxe_drop_ref(mr->pd);
+ rxe_drop_index(mr);
+ rxe_drop_ref(mr);
+ return 0;
+}
+
+static struct ib_mr *rxe_alloc_fast_reg_mr(struct ib_pd *ibpd, int max_pages)
+{
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_mem *mr;
+ int err;
+
+ mr = rxe_alloc(&rxe->mr_pool);
+ if (!mr) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_add_index(mr);
+
+ rxe_add_ref(pd);
+
+ err = rxe_mem_init_fast(rxe, pd, max_pages, mr);
+ if (err)
+ goto err2;
+
+ return &mr->ibmr;
+
+err2:
+ rxe_drop_ref(pd);
+ rxe_drop_index(mr);
+ rxe_drop_ref(mr);
+err1:
+ return ERR_PTR(err);
+}
+
+static struct ib_fast_reg_page_list *
+ rxe_alloc_fast_reg_page_list(struct ib_device *device,
+ int page_list_len)
+{
+ struct rxe_fast_reg_page_list *frpl;
+ int err;
+
+ frpl = kmalloc(sizeof(*frpl), GFP_KERNEL);
+ if (!frpl) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ frpl->ibfrpl.page_list = kmalloc(page_list_len*sizeof(u64), GFP_KERNEL);
+ if (!frpl->ibfrpl.page_list) {
+ err = -ENOMEM;
+ goto err2;
+ }
+
+ return &frpl->ibfrpl;
+
+err2:
+ kfree(frpl);
+err1:
+ return ERR_PTR(err);
+}
+
+static void rxe_free_fast_reg_page_list(struct ib_fast_reg_page_list *ibfrpl)
+{
+ struct rxe_fast_reg_page_list *frpl = to_rfrpl(ibfrpl);
+
+ kfree(frpl->ibfrpl.page_list);
+ kfree(frpl);
+ return;
+}
+
+static int rxe_rereg_phys_mr(struct ib_mr *ibmr, int mr_rereg_mask,
+ struct ib_pd *ibpd,
+ struct ib_phys_buf *phys_buf_array,
+ int num_phys_buf, int mr_access_flags,
+ u64 *iova_start)
+{
+ return -EINVAL;
+}
+
+static struct ib_mw *rxe_alloc_mw(struct ib_pd *ibpd)
+{
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_mem *mw;
+ int err;
+
+ mw = rxe_alloc(&rxe->mw_pool);
+ if (!mw) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_add_index(mw);
+
+ rxe_add_ref(pd);
+
+ err = rxe_mem_init_mw(rxe, pd, mw);
+ if (err)
+ goto err2;
+
+ return &mw->ibmw;
+
+err2:
+ rxe_drop_ref(pd);
+ rxe_drop_index(mw);
+ rxe_drop_ref(mw);
+err1:
+ return ERR_PTR(err);
+}
+
+static int rxe_bind_mw(struct ib_qp *ibqp,
+ struct ib_mw *ibmw, struct ib_mw_bind *mw_bind)
+{
+ return -EINVAL;
+}
+
+static int rxe_dealloc_mw(struct ib_mw *ibmw)
+{
+ struct rxe_mem *mw = to_rmw(ibmw);
+
+ mw->state = RXE_MEM_STATE_ZOMBIE;
+ rxe_drop_ref(mw->pd);
+ rxe_drop_index(mw);
+ rxe_drop_ref(mw);
+ return 0;
+}
+
+static struct ib_fmr *rxe_alloc_fmr(struct ib_pd *ibpd,
+ int access, struct ib_fmr_attr *attr)
+{
+ struct rxe_dev *rxe = to_rdev(ibpd->device);
+ struct rxe_pd *pd = to_rpd(ibpd);
+ struct rxe_mem *fmr;
+ int err;
+
+ fmr = rxe_alloc(&rxe->fmr_pool);
+ if (!fmr) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_add_index(fmr);
+
+ rxe_add_ref(pd);
+
+ err = rxe_mem_init_fmr(rxe, pd, access, attr, fmr);
+ if (err)
+ goto err2;
+
+ return &fmr->ibfmr;
+
+err2:
+ rxe_drop_ref(pd);
+ rxe_drop_index(fmr);
+ rxe_drop_ref(fmr);
+err1:
+ return ERR_PTR(err);
+}
+
+static int rxe_map_phys_fmr(struct ib_fmr *ibfmr,
+ u64 *page_list, int list_length, u64 iova)
+{
+ struct rxe_mem *fmr = to_rfmr(ibfmr);
+ struct rxe_dev *rxe = to_rdev(ibfmr->device);
+
+ return rxe_mem_map_pages(rxe, fmr, page_list, list_length, iova);
+}
+
+static int rxe_unmap_fmr(struct list_head *fmr_list)
+{
+ struct rxe_mem *fmr;
+
+ list_for_each_entry(fmr, fmr_list, ibfmr.list) {
+ if (fmr->state != RXE_MEM_STATE_VALID)
+ continue;
+
+ fmr->va = 0;
+ fmr->iova = 0;
+ fmr->length = 0;
+ fmr->num_buf = 0;
+ fmr->state = RXE_MEM_STATE_FREE;
+ }
+
+ return 0;
+}
+
+static int rxe_dealloc_fmr(struct ib_fmr *ibfmr)
+{
+ struct rxe_mem *fmr = to_rfmr(ibfmr);
+
+ fmr->state = RXE_MEM_STATE_ZOMBIE;
+ rxe_drop_ref(fmr->pd);
+ rxe_drop_index(fmr);
+ rxe_drop_ref(fmr);
+ return 0;
+}
+
+static int rxe_attach_mcast(struct ib_qp *ibqp, union ib_gid *mgid, u16 mlid)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(ibqp->device);
+ struct rxe_qp *qp = to_rqp(ibqp);
+ struct rxe_mc_grp *grp;
+
+ /* takes a ref on grp if successful */
+ err = rxe_mcast_get_grp(rxe, mgid, mlid, &grp);
+ if (err)
+ return err;
+
+ err = rxe_mcast_add_grp_elem(rxe, qp, grp);
+
+ rxe_drop_ref(grp);
+ return err;
+}
+
+static int rxe_detach_mcast(struct ib_qp *ibqp, union ib_gid *mgid, u16 mlid)
+{
+ struct rxe_dev *rxe = to_rdev(ibqp->device);
+ struct rxe_qp *qp = to_rqp(ibqp);
+
+ return rxe_mcast_drop_grp_elem(rxe, qp, mgid, mlid);
+}
+
+static ssize_t rxe_show_skb_num(struct device *device,
+ struct device_attribute *attr, char *buf)
+{
+ struct rxe_dev *rxe = container_of(device, struct rxe_dev,
+ ib_dev.dev);
+
+ return sprintf(buf, "req_in:%d resp_in:%d req_out:%d resp_out:%d\n",
+ atomic_read(&rxe->req_skb_in),
+ atomic_read(&rxe->resp_skb_in),
+ atomic_read(&rxe->req_skb_out),
+ atomic_read(&rxe->resp_skb_out));
+}
+
+static DEVICE_ATTR(skb_num, S_IRUGO, rxe_show_skb_num, NULL);
+
+static ssize_t rxe_show_parent(struct device *device,
+ struct device_attribute *attr, char *buf)
+{
+ struct rxe_dev *rxe = container_of(device, struct rxe_dev,
+ ib_dev.dev);
+ char *name;
+
+ name = rxe->ifc_ops->parent_name(rxe, 1);
+ return snprintf(buf, 16, "%s\n", name);
+}
+
+static DEVICE_ATTR(parent, S_IRUGO, rxe_show_parent, NULL);
+
+static struct device_attribute *rxe_dev_attributes[] = {
+ &dev_attr_skb_num,
+ &dev_attr_parent,
+};
+
+int rxe_register_device(struct rxe_dev *rxe)
+{
+ int err;
+ int i;
+ struct ib_device *dev = &rxe->ib_dev;
+
+ strlcpy(dev->name, "rxe%d", IB_DEVICE_NAME_MAX);
+ strlcpy(dev->node_desc, "rxe", sizeof(dev->node_desc));
+
+ dev->owner = THIS_MODULE;
+ dev->node_type = RDMA_NODE_IB_CA;
+ dev->phys_port_cnt = rxe->num_ports;
+ dev->num_comp_vectors = RXE_NUM_COMP_VECTORS;
+ dev->dma_device = rxe->ifc_ops->dma_device(rxe);
+ dev->local_dma_lkey = 0; /* TODO */
+ dev->node_guid = rxe->ifc_ops->node_guid(rxe);
+ dev->dma_ops = &rxe_dma_mapping_ops;
+
+ dev->uverbs_abi_ver = RXE_UVERBS_ABI_VERSION;
+ dev->uverbs_cmd_mask = (1ull << IB_USER_VERBS_CMD_GET_CONTEXT)
+ | (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL)
+ | (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE)
+ | (1ull << IB_USER_VERBS_CMD_QUERY_PORT)
+ | (1ull << IB_USER_VERBS_CMD_ALLOC_PD)
+ | (1ull << IB_USER_VERBS_CMD_DEALLOC_PD)
+ | (1ull << IB_USER_VERBS_CMD_CREATE_SRQ)
+ | (1ull << IB_USER_VERBS_CMD_MODIFY_SRQ)
+ | (1ull << IB_USER_VERBS_CMD_QUERY_SRQ)
+ | (1ull << IB_USER_VERBS_CMD_DESTROY_SRQ)
+ | (1ull << IB_USER_VERBS_CMD_POST_SRQ_RECV)
+ | (1ull << IB_USER_VERBS_CMD_CREATE_QP)
+ | (1ull << IB_USER_VERBS_CMD_MODIFY_QP)
+ | (1ull << IB_USER_VERBS_CMD_QUERY_QP)
+ | (1ull << IB_USER_VERBS_CMD_DESTROY_QP)
+ | (1ull << IB_USER_VERBS_CMD_POST_SEND)
+ | (1ull << IB_USER_VERBS_CMD_POST_RECV)
+ | (1ull << IB_USER_VERBS_CMD_CREATE_CQ)
+ | (1ull << IB_USER_VERBS_CMD_RESIZE_CQ)
+ | (1ull << IB_USER_VERBS_CMD_DESTROY_CQ)
+ | (1ull << IB_USER_VERBS_CMD_POLL_CQ)
+ | (1ull << IB_USER_VERBS_CMD_PEEK_CQ)
+ | (1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ)
+ | (1ull << IB_USER_VERBS_CMD_REG_MR)
+ | (1ull << IB_USER_VERBS_CMD_DEREG_MR)
+ | (1ull << IB_USER_VERBS_CMD_CREATE_AH)
+ | (1ull << IB_USER_VERBS_CMD_MODIFY_AH)
+ | (1ull << IB_USER_VERBS_CMD_QUERY_AH)
+ | (1ull << IB_USER_VERBS_CMD_DESTROY_AH)
+ | (1ull << IB_USER_VERBS_CMD_ATTACH_MCAST)
+ | (1ull << IB_USER_VERBS_CMD_DETACH_MCAST)
+ ;
+
+ dev->query_device = rxe_query_device;
+ dev->modify_device = rxe_modify_device;
+ dev->query_port = rxe_query_port;
+ dev->modify_port = rxe_modify_port;
+ dev->get_link_layer = rxe_get_link_layer;
+ dev->query_gid = rxe_query_gid;
+ dev->query_pkey = rxe_query_pkey;
+ dev->alloc_ucontext = rxe_alloc_ucontext;
+ dev->dealloc_ucontext = rxe_dealloc_ucontext;
+ dev->mmap = rxe_mmap;
+ dev->alloc_pd = rxe_alloc_pd;
+ dev->dealloc_pd = rxe_dealloc_pd;
+ dev->create_ah = rxe_create_ah;
+ dev->modify_ah = rxe_modify_ah;
+ dev->query_ah = rxe_query_ah;
+ dev->destroy_ah = rxe_destroy_ah;
+ dev->create_srq = rxe_create_srq;
+ dev->modify_srq = rxe_modify_srq;
+ dev->query_srq = rxe_query_srq;
+ dev->destroy_srq = rxe_destroy_srq;
+ dev->post_srq_recv = rxe_post_srq_recv;
+ dev->create_qp = rxe_create_qp;
+ dev->modify_qp = rxe_modify_qp;
+ dev->query_qp = rxe_query_qp;
+ dev->destroy_qp = rxe_destroy_qp;
+ dev->post_send = rxe_post_send;
+ dev->post_recv = rxe_post_recv;
+ dev->create_cq = rxe_create_cq;
+ dev->modify_cq = NULL;
+ dev->destroy_cq = rxe_destroy_cq;
+ dev->resize_cq = rxe_resize_cq;
+ dev->poll_cq = rxe_poll_cq;
+ dev->peek_cq = rxe_peek_cq;
+ dev->req_notify_cq = rxe_req_notify_cq;
+ dev->req_ncomp_notif = rxe_req_ncomp_notif;
+ dev->get_dma_mr = rxe_get_dma_mr;
+ dev->reg_phys_mr = rxe_reg_phys_mr;
+ dev->reg_user_mr = rxe_reg_user_mr;
+ dev->rereg_phys_mr = rxe_rereg_phys_mr;
+ dev->dereg_mr = rxe_dereg_mr;
+ dev->alloc_fast_reg_mr = rxe_alloc_fast_reg_mr;
+ dev->alloc_fast_reg_page_list = rxe_alloc_fast_reg_page_list;
+ dev->free_fast_reg_page_list = rxe_free_fast_reg_page_list;
+ dev->alloc_mw = rxe_alloc_mw;
+ dev->bind_mw = rxe_bind_mw;
+ dev->dealloc_mw = rxe_dealloc_mw;
+ dev->alloc_fmr = rxe_alloc_fmr;
+ dev->map_phys_fmr = rxe_map_phys_fmr;
+ dev->unmap_fmr = rxe_unmap_fmr;
+ dev->dealloc_fmr = rxe_dealloc_fmr;
+ dev->attach_mcast = rxe_attach_mcast;
+ dev->detach_mcast = rxe_detach_mcast;
+ dev->process_mad = NULL;
+
+ err = ib_register_device(dev, NULL);
+ if (err) {
+ pr_warn("rxe_register_device failed, err = %d\n", err);
+ goto err1;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(rxe_dev_attributes); ++i) {
+ err = device_create_file(&dev->dev, rxe_dev_attributes[i]);
+ if (err) {
+ pr_warn("device_create_file failed, "
+ "i = %d, err = %d\n", i, err);
+ goto err2;
+ }
+ }
+
+ return 0;
+
+err2:
+ ib_unregister_device(dev);
+err1:
+ return err;
+}
+
+int rxe_unregister_device(struct rxe_dev *rxe)
+{
+ int i;
+ struct ib_device *dev = &rxe->ib_dev;
+
+ for (i = 0; i < ARRAY_SIZE(rxe_dev_attributes); ++i)
+ device_remove_file(&dev->dev, rxe_dev_attributes[i]);
+
+ ib_unregister_device(dev);
+
+ return 0;
+}
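For reference, the sketch below shows how a kernel consumer's post and poll
calls land in the rxe_post_send() and rxe_poll_cq() entry points registered
above. It is illustrative only: example_post_and_poll() is a made-up name,
the QP, CQ and memory registration are assumed to have been set up elsewhere,
and a real user would wait on CQ events instead of busy polling.

#include <rdma/ib_verbs.h>

static int example_post_and_poll(struct ib_qp *qp, struct ib_cq *cq,
				 u64 dma_addr, u32 lkey, u32 len)
{
	struct ib_sge sge = {
		.addr	= dma_addr,
		.length	= len,
		.lkey	= lkey,
	};
	struct ib_send_wr wr = {
		.wr_id		= 1,
		.sg_list	= &sge,
		.num_sge	= 1,
		.opcode		= IB_WR_SEND,
		.send_flags	= IB_SEND_SIGNALED,
	};
	struct ib_send_wr *bad_wr;
	struct ib_wc wc;
	int ret;

	ret = ib_post_send(qp, &wr, &bad_wr);	/* ends up in rxe_post_send() */
	if (ret)
		return ret;

	do {					/* ends up in rxe_poll_cq() */
		ret = ib_poll_cq(cq, 1, &wc);
	} while (ret == 0);

	if (ret < 0)
		return ret;

	return wc.status == IB_WC_SUCCESS ? 0 : -EIO;
}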
* [patch 14/44] rxe_pool.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (12 preceding siblings ...)
2011-07-01 13:18 ` [patch 13/44] rxe_verbs.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 15/44] rxe_pool.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (30 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch14 --]
[-- Type: text/plain, Size: 5379 bytes --]
Add rxe_pool.h with the declarations for pools of rdma objects.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_pool.h | 163 +++++++++++++++++++++++++++++++++++
1 file changed, 163 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_pool.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_pool.h
@@ -0,0 +1,163 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_POOL_H
+#define RXE_POOL_H
+
+/* declarations for pools of managed objects */
+
+#define RXE_POOL_ALIGN (16)
+#define RXE_POOL_CACHE_FLAGS (0)
+
+enum rxe_pool_flags {
+ RXE_POOL_ATOMIC = (1 << 0),
+ RXE_POOL_INDEX = (1 << 1),
+ RXE_POOL_KEY = (1 << 2),
+};
+
+enum rxe_elem_type {
+ RXE_TYPE_UC,
+ RXE_TYPE_PD,
+ RXE_TYPE_AH,
+ RXE_TYPE_SRQ,
+ RXE_TYPE_QP,
+ RXE_TYPE_CQ,
+ RXE_TYPE_MR,
+ RXE_TYPE_MW,
+ RXE_TYPE_FMR,
+ RXE_TYPE_MC_GRP,
+ RXE_TYPE_MC_ELEM,
+ RXE_NUM_TYPES, /* keep me last */
+};
+
+struct rxe_type_info {
+ char *name;
+ size_t size;
+ void (*cleanup)(void *obj);
+ enum rxe_pool_flags flags;
+ u32 max_index;
+ u32 min_index;
+ size_t key_offset;
+ size_t key_size;
+ struct kmem_cache *cache;
+};
+
+extern struct rxe_type_info rxe_type_info[];
+
+enum rxe_pool_state {
+ rxe_pool_invalid,
+ rxe_pool_valid,
+};
+
+struct rxe_pool_entry {
+ struct rxe_pool *pool;
+ struct kref ref_cnt;
+ struct list_head list;
+
+ /* only used if index'ed or key'ed */
+ struct rb_node node;
+ u32 index;
+};
+
+struct rxe_pool {
+ struct rxe_dev *rxe;
+ spinlock_t pool_lock;
+ size_t elem_size;
+ struct kref ref_cnt;
+ void (*cleanup)(void *obj);
+ enum rxe_pool_state state;
+ enum rxe_pool_flags flags;
+ enum rxe_elem_type type;
+
+ unsigned int max_elem;
+ atomic_t num_elem;
+
+ /* only used if index'ed or key'ed */
+ struct rb_root tree;
+ unsigned long *table;
+ size_t table_size;
+ u32 max_index;
+ u32 min_index;
+ u32 last;
+ size_t key_offset;
+ size_t key_size;
+};
+
+/* initialize slab caches for managed objects */
+int __init rxe_cache_init(void);
+
+/* cleanup slab caches for managed objects */
+void __exit rxe_cache_exit(void);
+
+/* initialize a pool of objects with given limit on
+ number of elements. gets parameters from rxe_type_info
+ pool elements will be allocated out of a slab cache */
+int rxe_pool_init(struct rxe_dev *rxe, struct rxe_pool *pool,
+ enum rxe_elem_type type, u32 max_elem);
+
+/* free resources from object pool */
+int rxe_pool_cleanup(struct rxe_pool *pool);
+
+/* allocate an object from pool */
+void *rxe_alloc(struct rxe_pool *pool);
+
+/* assign an index to an indexed object and insert object into
+ pool's rb tree */
+void rxe_add_index(void *elem);
+
+/* drop an index and remove object from rb tree */
+void rxe_drop_index(void *elem);
+
+/* assign a key to a keyed object and insert object into
+ pool's rb tree */
+void rxe_add_key(void *elem, void *key);
+
+/* remove elem from rb tree */
+void rxe_drop_key(void *elem);
+
+/* lookup an indexed object from index. takes a reference on object */
+void *rxe_pool_get_index(struct rxe_pool *pool, u32 index);
+
+/* lookup keyed object from key. takes a reference on the object */
+void *rxe_pool_get_key(struct rxe_pool *pool, void *key);
+
+/* cleanup an object when all references are dropped */
+void rxe_elem_release(struct kref *kref);
+
+/* take a reference on an object */
+#define rxe_add_ref(elem) kref_get(&(elem)->pelem.ref_cnt)
+
+/* drop a reference on an object */
+#define rxe_drop_ref(elem) kref_put(&(elem)->pelem.ref_cnt, rxe_elem_release)
+
+#endif /* RXE_POOL_H */
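To make the calling convention concrete, here is a minimal sketch (not part of
the patch) of the lifecycle of an indexed object, using a QP as the example.
example_indexed_pool() is a made-up name, and the qp_pool is assumed to have
been initialized elsewhere with rxe_pool_init() during device bring-up.

#include "rxe_verbs.h"		/* struct rxe_dev, struct rxe_qp */
#include "rxe_pool.h"

static void example_indexed_pool(struct rxe_dev *rxe)
{
	struct rxe_qp *qp, *found;

	qp = rxe_alloc(&rxe->qp_pool);		/* kref starts at 1 */
	if (!qp)
		return;

	rxe_add_index(qp);			/* assign index, insert in rb tree */

	/* look the object up again by its index, as the receive path does */
	found = rxe_pool_get_index(&rxe->qp_pool, qp->pelem.index);
	if (found)
		rxe_drop_ref(found);		/* put the lookup reference */

	rxe_drop_index(qp);			/* remove from the index space */
	rxe_drop_ref(qp);			/* final put runs cleanup and frees */
}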
* [patch 15/44] rxe_pool.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (13 preceding siblings ...)
2011-07-01 13:18 ` [patch 14/44] rxe_pool.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 16/44] rxe_task.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (29 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch15 --]
[-- Type: text/plain, Size: 13364 bytes --]
Add rxe_pool.c, which implements the pools of rdma objects (PD, QP, CQ, MR, etc.).
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_pool.c | 524 +++++++++++++++++++++++++++++++++++
1 file changed, 524 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_pool.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_pool.c
@@ -0,0 +1,524 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_srq.h"
+#include "rxe_cq.h"
+#include "rxe_qp.h"
+#include "rxe_mr.h"
+
+/* info about object pools
+ note that mr, fmr and mw share a single index space
+ so that one can map an lkey to the correct type of object */
+struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = {
+ [RXE_TYPE_UC] = {
+ .name = "rxe_uc",
+ .size = sizeof(struct rxe_ucontext),
+ },
+ [RXE_TYPE_PD] = {
+ .name = "rxe_pd",
+ .size = sizeof(struct rxe_pd),
+ },
+ [RXE_TYPE_AH] = {
+ .name = "rxe_ah",
+ .size = sizeof(struct rxe_ah),
+ .flags = RXE_POOL_ATOMIC,
+ },
+ [RXE_TYPE_SRQ] = {
+ .name = "rxe_srq",
+ .size = sizeof(struct rxe_srq),
+ .flags = RXE_POOL_INDEX,
+ .min_index = RXE_MIN_SRQ_INDEX,
+ .max_index = RXE_MAX_SRQ_INDEX,
+ .cleanup = rxe_srq_cleanup,
+ },
+ [RXE_TYPE_QP] = {
+ .name = "rxe_qp",
+ .size = sizeof(struct rxe_qp),
+ .cleanup = rxe_qp_cleanup,
+ .flags = RXE_POOL_INDEX,
+ .min_index = RXE_MIN_QP_INDEX,
+ .max_index = RXE_MAX_QP_INDEX,
+ },
+ [RXE_TYPE_CQ] = {
+ .name = "rxe_cq",
+ .size = sizeof(struct rxe_cq),
+ .cleanup = rxe_cq_cleanup,
+ },
+ [RXE_TYPE_MR] = {
+ .name = "rxe_mr",
+ .size = sizeof(struct rxe_mem),
+ .cleanup = rxe_mem_cleanup,
+ .flags = RXE_POOL_INDEX,
+ .max_index = RXE_MAX_MR_INDEX,
+ .min_index = RXE_MIN_MR_INDEX,
+ },
+ [RXE_TYPE_FMR] = {
+ .name = "rxe_fmr",
+ .size = sizeof(struct rxe_mem),
+ .cleanup = rxe_mem_cleanup,
+ .flags = RXE_POOL_INDEX,
+ .max_index = RXE_MAX_FMR_INDEX,
+ .min_index = RXE_MIN_FMR_INDEX,
+ },
+ [RXE_TYPE_MW] = {
+ .name = "rxe_mw",
+ .size = sizeof(struct rxe_mem),
+ .flags = RXE_POOL_INDEX,
+ .max_index = RXE_MAX_MW_INDEX,
+ .min_index = RXE_MIN_MW_INDEX,
+ },
+ [RXE_TYPE_MC_GRP] = {
+ .name = "rxe_mc_grp",
+ .size = sizeof(struct rxe_mc_grp),
+ .cleanup = rxe_mc_cleanup,
+ .flags = RXE_POOL_KEY,
+ .key_offset = offsetof(struct rxe_mc_grp, mgid),
+ .key_size = sizeof(union ib_gid),
+ },
+ [RXE_TYPE_MC_ELEM] = {
+ .name = "rxe_mc_elem",
+ .size = sizeof(struct rxe_mc_elem),
+ .flags = RXE_POOL_ATOMIC,
+ },
+};
+
+static inline char *pool_name(struct rxe_pool *pool)
+{
+ return rxe_type_info[pool->type].name + 4;
+}
+
+static inline struct kmem_cache *pool_cache(struct rxe_pool *pool)
+{
+ return rxe_type_info[pool->type].cache;
+}
+
+static inline enum rxe_elem_type rxe_type(void *arg)
+{
+ struct rxe_pool_entry *elem = arg;
+ return elem->pool->type;
+}
+
+int __init rxe_cache_init(void)
+{
+ int err;
+ int i;
+ size_t size;
+ struct rxe_type_info *type;
+
+ for (i = 0; i < RXE_NUM_TYPES; i++) {
+ type = &rxe_type_info[i];
+ size = (type->size + RXE_POOL_ALIGN - 1) &
+ ~(RXE_POOL_ALIGN - 1);
+ type->cache = kmem_cache_create(type->name, size,
+ RXE_POOL_ALIGN,
+ RXE_POOL_CACHE_FLAGS, NULL);
+ if (!type->cache) {
+ pr_info("Unable to init kmem cache for %s\n",
+ type->name);
+ err = -ENOMEM;
+ goto err1;
+ }
+ }
+
+ return 0;
+
+err1:
+	while (--i >= 0) {
+		type = &rxe_type_info[i];
+		kmem_cache_destroy(type->cache);
+		type->cache = NULL;
+	}
+
+ return err;
+}
+
+void __exit rxe_cache_exit(void)
+{
+ int i;
+ struct rxe_type_info *type;
+
+ for (i = 0; i < RXE_NUM_TYPES; i++) {
+ type = &rxe_type_info[i];
+ kmem_cache_destroy(type->cache);
+ type->cache = NULL;
+ }
+}
+
+static int rxe_pool_init_index(struct rxe_pool *pool, u32 max, u32 min)
+{
+ int err = 0;
+ size_t size;
+
+ if ((max - min + 1) < pool->max_elem) {
+ pr_warn("not enough indices for max_elem\n");
+ err = -EINVAL;
+ goto out;
+ }
+
+ pool->max_index = max;
+ pool->min_index = min;
+
+ size = BITS_TO_LONGS(max - min + 1) * sizeof(long);
+ pool->table = kmalloc(size, GFP_KERNEL);
+ if (!pool->table) {
+ pr_warn("no memory for bit table\n");
+ err = -ENOMEM;
+ goto out;
+ }
+
+ pool->table_size = size;
+ bitmap_zero(pool->table, max - min + 1);
+
+out:
+ return err;
+}
+
+int rxe_pool_init(
+ struct rxe_dev *rxe,
+ struct rxe_pool *pool,
+ enum rxe_elem_type type,
+ unsigned max_elem)
+{
+ int err = 0;
+ size_t size = rxe_type_info[type].size;
+
+ memset(pool, 0, sizeof(*pool));
+
+ pool->rxe = rxe;
+ pool->type = type;
+ pool->max_elem = max_elem;
+ pool->elem_size = (size + RXE_POOL_ALIGN - 1) &
+ ~(RXE_POOL_ALIGN - 1);
+ pool->flags = rxe_type_info[type].flags;
+ pool->tree = RB_ROOT;
+ pool->cleanup = rxe_type_info[type].cleanup;
+
+ atomic_set(&pool->num_elem, 0);
+
+ kref_init(&pool->ref_cnt);
+
+ spin_lock_init(&pool->pool_lock);
+
+ if (rxe_type_info[type].flags & RXE_POOL_INDEX) {
+ err = rxe_pool_init_index(pool,
+ rxe_type_info[type].max_index,
+ rxe_type_info[type].min_index);
+ if (err)
+ goto out;
+ }
+
+ if (rxe_type_info[type].flags & RXE_POOL_KEY) {
+ pool->key_offset = rxe_type_info[type].key_offset;
+ pool->key_size = rxe_type_info[type].key_size;
+ }
+
+ pool->state = rxe_pool_valid;
+
+out:
+ return err;
+}
+
+static void rxe_pool_release(struct kref *kref)
+{
+ struct rxe_pool *pool = container_of(kref, struct rxe_pool, ref_cnt);
+ unsigned long flags;
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+ pool->state = rxe_pool_invalid;
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+
+ kfree(pool->table);
+
+ return;
+}
+
+int rxe_pool_cleanup(struct rxe_pool *pool)
+{
+ int num_elem;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+ pool->state = rxe_pool_invalid;
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+
+ num_elem = atomic_read(&pool->num_elem);
+ if (num_elem > 0)
+ pr_warn("%s pool destroyed with %d unfree'd elem\n",
+ pool_name(pool), num_elem);
+
+ kref_put(&pool->ref_cnt, rxe_pool_release);
+
+ return 0;
+}
+
+static u32 alloc_index(struct rxe_pool *pool)
+{
+ u32 index;
+ u32 range = pool->max_index - pool->min_index + 1;
+
+ index = find_next_zero_bit(pool->table, range, pool->last);
+ if (index >= range)
+ index = find_first_zero_bit(pool->table, range);
+
+ set_bit(index, pool->table);
+ pool->last = index;
+ return index + pool->min_index;
+}
+
+static void insert_index(struct rxe_pool *pool, struct rxe_pool_entry *new)
+{
+ struct rb_node **link = &pool->tree.rb_node;
+ struct rb_node *parent = NULL;
+ struct rxe_pool_entry *elem;
+
+ while (*link) {
+ parent = *link;
+ elem = rb_entry(parent, struct rxe_pool_entry, node);
+
+ if (elem->index == new->index)
+ goto out;
+
+ if (elem->index > new->index)
+ link = &(*link)->rb_left;
+ else
+ link = &(*link)->rb_right;
+ }
+
+ rb_link_node(&new->node, parent, link);
+ rb_insert_color(&new->node, &pool->tree);
+out:
+ return;
+}
+
+static void insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new)
+{
+ struct rb_node **link = &pool->tree.rb_node;
+ struct rb_node *parent = NULL;
+ struct rxe_pool_entry *elem;
+ int cmp;
+
+ while (*link) {
+ parent = *link;
+ elem = rb_entry(parent, struct rxe_pool_entry, node);
+
+ cmp = memcmp((u8 *)elem + pool->key_offset,
+ (u8 *)new + pool->key_offset, pool->key_size);
+
+ if (cmp == 0)
+ goto out;
+
+ if (cmp > 0)
+ link = &(*link)->rb_left;
+ else
+ link = &(*link)->rb_right;
+ }
+
+ rb_link_node(&new->node, parent, link);
+ rb_insert_color(&new->node, &pool->tree);
+out:
+ return;
+}
+
+void rxe_add_key(void *arg, void *key)
+{
+ struct rxe_pool_entry *elem = arg;
+ struct rxe_pool *pool = elem->pool;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+ memcpy((u8 *)elem + pool->key_offset, key, pool->key_size);
+ insert_key(pool, elem);
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+ return;
+}
+
+void rxe_drop_key(void *arg)
+{
+ struct rxe_pool_entry *elem = arg;
+ struct rxe_pool *pool = elem->pool;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+ rb_erase(&elem->node, &pool->tree);
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+ return;
+}
+
+void rxe_add_index(void *arg)
+{
+ struct rxe_pool_entry *elem = arg;
+ struct rxe_pool *pool = elem->pool;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+ elem->index = alloc_index(pool);
+ insert_index(pool, elem);
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+ return;
+}
+
+void rxe_drop_index(void *arg)
+{
+ struct rxe_pool_entry *elem = arg;
+ struct rxe_pool *pool = elem->pool;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+ clear_bit(elem->index - pool->min_index, pool->table);
+ rb_erase(&elem->node, &pool->tree);
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+ return;
+}
+
+void *rxe_alloc(struct rxe_pool *pool)
+{
+ struct rxe_pool_entry *elem;
+ unsigned long flags;
+
+ if (!(pool->flags & RXE_POOL_ATOMIC)
+ && (in_irq() || irqs_disabled())) {
+ pr_warn("pool alloc %s in context %d/%d\n",
+ pool_name(pool), (int)in_irq(),
+ (int)irqs_disabled());
+ }
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+ if (pool->state != rxe_pool_valid) {
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+ return NULL;
+ }
+ kref_get(&pool->ref_cnt);
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+
+ kref_get(&pool->rxe->ref_cnt);
+
+ if (atomic_inc_return(&pool->num_elem) > pool->max_elem) {
+ atomic_dec(&pool->num_elem);
+ kref_put(&pool->rxe->ref_cnt, rxe_release);
+ kref_put(&pool->ref_cnt, rxe_pool_release);
+ return NULL;
+ }
+
+	elem = kmem_cache_zalloc(pool_cache(pool),
+				 (pool->flags & RXE_POOL_ATOMIC) ?
+				 GFP_ATOMIC : GFP_KERNEL);
+	if (!elem) {
+		atomic_dec(&pool->num_elem);
+		kref_put(&pool->rxe->ref_cnt, rxe_release);
+		kref_put(&pool->ref_cnt, rxe_pool_release);
+		return NULL;
+	}
+
+	elem->pool = pool;
+	kref_init(&elem->ref_cnt);
+
+ return elem;
+}
+
+void rxe_elem_release(struct kref *kref)
+{
+ struct rxe_pool_entry *elem =
+ container_of(kref, struct rxe_pool_entry, ref_cnt);
+ struct rxe_pool *pool = elem->pool;
+
+ if (pool->cleanup)
+ pool->cleanup(elem);
+
+ kmem_cache_free(pool_cache(pool), elem);
+ atomic_dec(&pool->num_elem);
+ kref_put(&pool->rxe->ref_cnt, rxe_release);
+ kref_put(&pool->ref_cnt, rxe_pool_release);
+}
+
+void *rxe_pool_get_index(struct rxe_pool *pool, u32 index)
+{
+ struct rb_node *node = NULL;
+ struct rxe_pool_entry *elem = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+
+ if (pool->state != rxe_pool_valid)
+ goto out;
+
+ node = pool->tree.rb_node;
+
+ while (node) {
+ elem = rb_entry(node, struct rxe_pool_entry, node);
+
+ if (elem->index > index)
+ node = node->rb_left;
+ else if (elem->index < index)
+ node = node->rb_right;
+ else
+ break;
+ }
+
+ if (node)
+ kref_get(&elem->ref_cnt);
+
+out:
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+ return node ? (void *)elem : NULL;
+}
+
+void *rxe_pool_get_key(struct rxe_pool *pool, void *key)
+{
+ struct rb_node *node = NULL;
+ struct rxe_pool_entry *elem = NULL;
+ int cmp;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pool->pool_lock, flags);
+
+ if (pool->state != rxe_pool_valid)
+ goto out;
+
+ node = pool->tree.rb_node;
+
+ while (node) {
+ elem = rb_entry(node, struct rxe_pool_entry, node);
+
+ cmp = memcmp((u8 *)elem + pool->key_offset,
+ key, pool->key_size);
+
+ if (cmp > 0)
+ node = node->rb_left;
+ else if (cmp < 0)
+ node = node->rb_right;
+ else
+ break;
+ }
+
+ if (node)
+ kref_get(&elem->ref_cnt);
+
+out:
+ spin_unlock_irqrestore(&pool->pool_lock, flags);
+ return node ? ((void *)elem) : NULL;
+}
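A rough sketch of the object lifecycle this interface expects from its callers
(the demo_* helpers and the qp_pool name are hypothetical; only the rxe_* pool
calls and rxe_elem_release come from this file):

	/* hypothetical caller, e.g. one of the verbs objects; qp_pool stands
	 * for any pool created with rxe_pool_init for an indexed type */
	static struct rxe_pool_entry *demo_create(struct rxe_pool *qp_pool)
	{
		struct rxe_pool_entry *elem;

		elem = rxe_alloc(qp_pool);	/* takes pool and device references */
		if (!elem)
			return NULL;

		rxe_add_index(elem);		/* make the object visible to lookups */
		return elem;
	}

	static void demo_destroy(struct rxe_pool_entry *elem)
	{
		rxe_drop_index(elem);		/* remove from the index tree */
		kref_put(&elem->ref_cnt, rxe_elem_release);
	}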
* [patch 16/44] rxe_task.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (14 preceding siblings ...)
2011-07-01 13:18 ` [patch 15/44] rxe_pool.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 17/44] rxe_task.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (28 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch16 --]
[-- Type: text/plain, Size: 3919 bytes --]
Declarations for tasklet handling.
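A rough sketch of the intended calling pattern (the demo_* names are made up
for illustration; only the rxe_task calls and the "return 0 to be called
again" convention come from this header):

	#include "rxe_task.h"

	/* hypothetical consumer state embedding a task */
	struct demo_ctx {
		struct rxe_task task;
	};

	/* control flag: 0 = never use fast path, 1 = only outside irq, >1 = always */
	static unsigned int demo_fast = 1;

	/* work function: return 0 while more work remains, non-zero when done */
	static int demo_work(void *arg)
	{
		struct demo_ctx *ctx = arg;

		(void)ctx;		/* a real consumer would drain one work item here */
		return 1;		/* non-zero: nothing left to do */
	}

	static int demo_setup(struct demo_ctx *ctx)
	{
		int err;

		/* obj is only stored in the task; the ctx is reused for obj and arg */
		err = rxe_init_task(ctx, &ctx->task, &demo_fast,
				    ctx, demo_work, "demo");
		if (err)
			return err;

		/* sched = 0 lets rxe_run_task use the fast path when allowed */
		rxe_run_task(&ctx->task, 0);
		return 0;
	}

	static void demo_teardown(struct demo_ctx *ctx)
	{
		rxe_cleanup_task(&ctx->task);
	}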
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_task.h | 107 +++++++++++++++++++++++++++++++++++
1 file changed, 107 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_task.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_task.h
@@ -0,0 +1,107 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_TASK_H
+#define RXE_TASK_H
+
+enum {
+ TASK_STATE_START = 0,
+ TASK_STATE_BUSY = 1,
+ TASK_STATE_ARMED = 2,
+};
+
+/*
+ * data structure to describe a 'task' which is a short
+ * function that returns 0 as long as it needs to be
+ * called again.
+ */
+struct rxe_task {
+ void *obj;
+ unsigned int *fast;
+ struct tasklet_struct tasklet;
+ int state;
+ spinlock_t state_lock;
+ void *arg;
+ int (*func)(void *arg);
+ int ret;
+ char name[16];
+};
+
+/*
+ * init rxe_task structure
+ * fast => address of control flag, likely a module parameter
+ * arg => parameter to pass to func
+ * func => function to call until it returns != 0
+ */
+int rxe_init_task(void *obj, struct rxe_task *task, unsigned int *fast,
+ void *arg, int (*func)(void *), char *name);
+
+/*
+ * cleanup task
+ */
+void rxe_cleanup_task(struct rxe_task *task);
+
+/*
+ * raw call to func in loop without any checking
+ * can call when tasklets are disabled
+ */
+int __rxe_do_task(struct rxe_task *task);
+
+/*
+ * common function called by any of the main tasklets
+ * or directly by a caller on its own thread (the fast path).
+ * The task state machine (see rxe_do_task in rxe_task.c)
+ * ensures that a caller who finds the task already running
+ * still gets its work picked up: the running instance makes
+ * at least one more call to func before it returns.
+ */
+void rxe_do_task(unsigned long data);
+
+/*
+ * run a task, use fast path if no one else
+ * is currently running it and fast path is ok
+ * else schedule it to run as a tasklet
+ */
+void rxe_run_task(struct rxe_task *task, int sched);
+
+/*
+ * keep a task from scheduling
+ */
+void rxe_disable_task(struct rxe_task *task);
+
+/*
+ * allow task to run
+ */
+void rxe_enable_task(struct rxe_task *task);
+
+#endif /* RXE_TASK_H */
* [patch 17/44] rxe_task.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (15 preceding siblings ...)
2011-07-01 13:18 ` [patch 16/44] rxe_task.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 18/44] rxe_av.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (27 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch17 --]
[-- Type: text/plain, Size: 5094 bytes --]
Tasklet handling details.
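The state machine in rxe_do_task exists to close a race where a second caller
finds the task already running just after its last call to func. An
illustrative interleaving (a sketch for the reader, not code from the patch):

	/*
	 * CPU0: rxe_do_task()                CPU1: rxe_do_task()
	 *   START -> BUSY
	 *   ret = func()   (queue empty)
	 *                                      posts new work
	 *                                      sees BUSY -> ARMED, returns
	 *   sees ARMED -> BUSY, cont = 1
	 *   ret = func()   (picks up CPU1's work)
	 *   sees BUSY, ret != 0 -> START, done
	 */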
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_task.c | 169 +++++++++++++++++++++++++++++++++++
1 file changed, 169 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_task.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_task.c
@@ -0,0 +1,169 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/kernel.h>
+#include <linux/interrupt.h>
+#include <linux/hardirq.h>
+
+#include "rxe_task.h"
+
+int __rxe_do_task(struct rxe_task *task)
+
+{
+ int ret;
+
+ while ((ret = task->func(task->arg)) == 0)
+ ;
+
+ task->ret = ret;
+
+ return ret;
+}
+
+/*
+ * this locking is due to a potential race where
+ * a second caller finds the task already running
+ * but looks just after the last call to func
+ */
+void rxe_do_task(unsigned long data)
+{
+ int cont;
+ int ret;
+ unsigned long flags;
+ struct rxe_task *task = (struct rxe_task *)data;
+
+ spin_lock_irqsave(&task->state_lock, flags);
+ switch (task->state) {
+ case TASK_STATE_START:
+ task->state = TASK_STATE_BUSY;
+ spin_unlock_irqrestore(&task->state_lock, flags);
+ break;
+
+ case TASK_STATE_BUSY:
+ task->state = TASK_STATE_ARMED;
+ /* fall through to */
+ case TASK_STATE_ARMED:
+ spin_unlock_irqrestore(&task->state_lock, flags);
+ return;
+
+ default:
+ spin_unlock_irqrestore(&task->state_lock, flags);
+ pr_warn("bad state = %d in rxe_do_task\n", task->state);
+ return;
+ }
+
+ do {
+ cont = 0;
+ ret = task->func(task->arg);
+
+ spin_lock_irqsave(&task->state_lock, flags);
+ switch (task->state) {
+ case TASK_STATE_BUSY:
+ if (ret)
+ task->state = TASK_STATE_START;
+ else
+ cont = 1;
+ break;
+
+		/* someone tried to run the task since the
+ last time we called func, so we will call
+ one more time regardless of the return value */
+ case TASK_STATE_ARMED:
+ task->state = TASK_STATE_BUSY;
+ cont = 1;
+ break;
+
+ default:
+ pr_warn("bad state = %d in rxe_do_task\n",
+ task->state);
+ }
+ spin_unlock_irqrestore(&task->state_lock, flags);
+ } while (cont);
+
+ task->ret = ret;
+}
+
+int rxe_init_task(void *obj, struct rxe_task *task, unsigned int *fast,
+ void *arg, int (*func)(void *), char *name)
+{
+ task->obj = obj;
+ task->fast = fast;
+ task->arg = arg;
+ task->func = func;
+ snprintf(task->name, sizeof(task->name), "%s", name);
+
+ tasklet_init(&task->tasklet, rxe_do_task, (unsigned long)task);
+
+ task->state = TASK_STATE_START;
+ spin_lock_init(&task->state_lock);
+
+ return 0;
+}
+
+void rxe_cleanup_task(struct rxe_task *task)
+{
+ tasklet_kill(&task->tasklet);
+}
+
+/*
+ * depending on value of fast allow bypassing
+ * tasklet call or not
+ * 0 => never
+ * 1 => only if not in interrupt level
+ * >1 => always
+ */
+static inline int rxe_fast_path_ok(unsigned int fast)
+{
+ if (fast == 0 || (fast == 1 && (in_irq() || irqs_disabled())))
+ return 0;
+ else
+ return 1;
+}
+
+void rxe_run_task(struct rxe_task *task, int sched)
+{
+ if (sched || !rxe_fast_path_ok(*task->fast))
+ tasklet_schedule(&task->tasklet);
+ else
+ rxe_do_task((unsigned long)task);
+}
+
+void rxe_disable_task(struct rxe_task *task)
+{
+ tasklet_disable(&task->tasklet);
+}
+
+void rxe_enable_task(struct rxe_task *task)
+{
+ tasklet_enable(&task->tasklet);
+}
* [patch 18/44] rxe_av.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (16 preceding siblings ...)
2011-07-01 13:18 ` [patch 17/44] rxe_task.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 19/44] rxe_av.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (26 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch18 --]
[-- Type: text/plain, Size: 2532 bytes --]
Declarations for address vector details.
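These helpers are shared by the ah and qp verbs (see the comment in rxe_av.c).
A rough sketch of the expected call pattern; the wrapper itself is
hypothetical, only the rxe_av_* calls come from this header:

	/* hypothetical verb-layer wrapper */
	static int demo_set_av(struct rxe_dev *rxe, u8 port_num,
			       struct ib_ah_attr *attr, struct rxe_av *av)
	{
		int err;

		/* validate port number and sgid index first */
		err = rxe_av_chk_attr(rxe, attr);
		if (err)
			return err;

		/* then fill the av; the interface layer's init_av finishes it */
		return rxe_av_from_attr(rxe, port_num, av, attr);
	}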
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_av.h | 45 +++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_av.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_av.h
@@ -0,0 +1,45 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_AV_H
+#define RXE_AV_H
+
+int rxe_av_chk_attr(struct rxe_dev *rxe, struct ib_ah_attr *attr);
+
+int rxe_av_from_attr(struct rxe_dev *rxe, u8 port_num,
+ struct rxe_av *av, struct ib_ah_attr *attr);
+
+int rxe_av_to_attr(struct rxe_dev *rxe, struct rxe_av *av,
+ struct ib_ah_attr *attr);
+
+#endif /* RXE_AV_H */
* [patch 19/44] rxe_av.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (17 preceding siblings ...)
2011-07-01 13:18 ` [patch 18/44] rxe_av.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 20/44] rxe_srq.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (25 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch19 --]
[-- Type: text/plain, Size: 3177 bytes --]
Address vector implementation details.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_av.c | 75 +++++++++++++++++++++++++++++++++++++
1 file changed, 75 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_av.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_av.c
@@ -0,0 +1,75 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* address handle implementation shared by ah and qp verbs */
+
+#include "rxe.h"
+#include "rxe_av.h"
+
+int rxe_av_chk_attr(struct rxe_dev *rxe, struct ib_ah_attr *attr)
+{
+ struct rxe_port *port;
+
+ if (attr->port_num < 1 || attr->port_num > rxe->num_ports) {
+ pr_info("rxe: invalid port_num = %d\n", attr->port_num);
+ return -EINVAL;
+ }
+
+ port = &rxe->port[attr->port_num - 1];
+
+ if (attr->ah_flags & IB_AH_GRH) {
+ if (attr->grh.sgid_index > port->attr.gid_tbl_len) {
+ pr_info("rxe: invalid sgid index = %d\n",
+ attr->grh.sgid_index);
+ return -EINVAL;
+ }
+ }
+
+ return 0;
+}
+
+int rxe_av_from_attr(struct rxe_dev *rxe, u8 port_num,
+ struct rxe_av *av, struct ib_ah_attr *attr)
+{
+ memset(av, 0, sizeof(*av));
+ av->attr = *attr;
+ av->attr.port_num = port_num;
+ return rxe->ifc_ops->init_av(rxe, attr, av);
+}
+
+int rxe_av_to_attr(struct rxe_dev *rxe, struct rxe_av *av,
+ struct ib_ah_attr *attr)
+{
+ *attr = av->attr;
+ return 0;
+}
* [patch 20/44] rxe_srq.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (18 preceding siblings ...)
2011-07-01 13:18 ` [patch 19/44] rxe_av.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 21/44] rxe_srq.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (24 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch20 --]
[-- Type: text/plain, Size: 2843 bytes --]
Declarations for shared receive queue details.
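A rough sketch of how the create path is expected to chain these helpers (the
wrapper is hypothetical; srq is NULL for the check because no SRQ exists yet,
and IB_SRQ_INIT_MASK requests the create-time checks):

	/* hypothetical verb-layer flow for create_srq */
	static int demo_create_srq(struct rxe_dev *rxe, struct rxe_srq *srq,
				   struct ib_srq_init_attr *init,
				   struct ib_ucontext *context,
				   struct ib_udata *udata)
	{
		int err;

		err = rxe_srq_chk_attr(rxe, NULL, &init->attr, IB_SRQ_INIT_MASK);
		if (err)
			return err;

		return rxe_srq_from_init(rxe, srq, init, context, udata);
	}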
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_srq.h | 54 ++++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_srq.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_srq.h
@@ -0,0 +1,54 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* srq implementation details */
+
+#ifndef RXE_SRQ_H
+#define RXE_SRQ_H
+
+#define IB_SRQ_INIT_MASK (~IB_SRQ_LIMIT)
+
+int rxe_srq_chk_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
+ struct ib_srq_attr *attr, enum ib_srq_attr_mask mask);
+
+int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
+ struct ib_srq_init_attr *init,
+ struct ib_ucontext *context, struct ib_udata *udata);
+
+int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
+ struct ib_srq_attr *attr, enum ib_srq_attr_mask mask,
+ struct ib_udata *udata);
+
+void rxe_srq_cleanup(void *arg);
+
+#endif /* RXE_SRQ_H */
* [patch 21/44] rxe_srq.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (19 preceding siblings ...)
2011-07-01 13:18 ` [patch 20/44] rxe_srq.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 22/44] rxe_cq.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (23 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch21 --]
[-- Type: text/plain, Size: 6502 bytes --]
Shared receive queue implementation details.
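The udata handling may be easier to review with the implied layouts spelled
out (assumed from the code below, for illustration only):

	/*
	 * create (rxe_srq_from_init), response written to udata->outbuf:
	 *
	 *     | struct mminfo (filled by do_mmap_info) | u32 srq_num |
	 *
	 * resize (rxe_srq_from_attr): the first 8 bytes of udata->inbuf
	 * carry a user-space address; udata->outbuf is redirected to it
	 * and a struct mminfo is written there before the queue is resized.
	 */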
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_srq.c | 215 ++++++++++++++++++++++++++++++++++++
1 file changed, 215 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_srq.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_srq.c
@@ -0,0 +1,215 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* srq implementation details */
+
+#include "rxe.h"
+#include "rxe_queue.h"
+#include "rxe_mmap.h"
+#include "rxe_srq.h"
+#include "rxe_qp.h"
+
+int rxe_srq_chk_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
+ struct ib_srq_attr *attr, enum ib_srq_attr_mask mask)
+{
+ if (srq && srq->error) {
+ pr_warn("srq in error state\n");
+ goto err1;
+ }
+
+ if (mask & IB_SRQ_MAX_WR) {
+ if (attr->max_wr > rxe->attr.max_srq_wr) {
+ pr_warn("max_wr(%d) > max_srq_wr(%d)\n",
+ attr->max_wr, rxe->attr.max_srq_wr);
+ goto err1;
+ }
+
+ if (attr->max_wr <= 0) {
+ pr_warn("max_wr(%d) <= 0\n", attr->max_wr);
+ goto err1;
+ }
+
+ if (srq && !(rxe->attr.device_cap_flags &
+ IB_DEVICE_SRQ_RESIZE)) {
+ pr_warn("srq resize not supported\n");
+ goto err1;
+ }
+
+ if (srq && srq->limit && (attr->max_wr < srq->limit)) {
+ pr_warn("max_wr (%d) < srq->limit (%d)\n",
+ attr->max_wr, srq->limit);
+ goto err1;
+ }
+
+ if (attr->max_wr < RXE_MIN_SRQ_WR)
+ attr->max_wr = RXE_MIN_SRQ_WR;
+ }
+
+ if (mask & IB_SRQ_LIMIT) {
+ if (attr->srq_limit > rxe->attr.max_srq_wr) {
+ pr_warn("srq_limit(%d) > max_srq_wr(%d)\n",
+ attr->srq_limit, rxe->attr.max_srq_wr);
+ goto err1;
+ }
+
+ if (attr->srq_limit > srq->rq.queue->buf->index_mask) {
+ pr_warn("srq_limit (%d) > cur limit(%d)\n",
+ attr->srq_limit,
+ srq->rq.queue->buf->index_mask);
+ goto err1;
+ }
+ }
+
+ if (mask == IB_SRQ_INIT_MASK) {
+ if (attr->max_sge > rxe->attr.max_srq_sge) {
+ pr_warn("max_sge(%d) > max_srq_sge(%d)\n",
+ attr->max_sge, rxe->attr.max_srq_sge);
+ goto err1;
+ }
+
+ if (attr->max_sge < RXE_MIN_SRQ_SGE)
+ attr->max_sge = RXE_MIN_SRQ_SGE;
+ }
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
+ struct ib_srq_init_attr *init,
+ struct ib_ucontext *context, struct ib_udata *udata)
+{
+ int err;
+ int srq_wqe_size;
+ struct rxe_queue *q;
+
+ srq->event_handler = init->event_handler;
+ srq->context = init->srq_context;
+ srq->limit = init->attr.srq_limit;
+ srq->srq_num = srq->pelem.index;
+ srq->rq.max_wr = init->attr.max_wr;
+ srq->rq.max_sge = init->attr.max_sge;
+
+ srq_wqe_size = sizeof(struct rxe_recv_wqe) +
+ srq->rq.max_sge*sizeof(struct ib_sge);
+
+ spin_lock_init(&srq->rq.producer_lock);
+ spin_lock_init(&srq->rq.consumer_lock);
+
+ q = rxe_queue_init(rxe, (unsigned int *)&srq->rq.max_wr,
+ srq_wqe_size);
+ if (!q) {
+ pr_warn("unable to allocate queue for srq\n");
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ err = do_mmap_info(rxe, udata, 0, context, q->buf,
+ q->buf_size, &q->ip);
+ if (err)
+ goto err2;
+
+ srq->rq.queue = q;
+
+	if (udata && udata->outlen >= sizeof(struct mminfo) + sizeof(u32) &&
+	    copy_to_user(udata->outbuf + sizeof(struct mminfo),
+			 &srq->srq_num, sizeof(u32)))
+		return -EFAULT;
+
+	return 0;
+
+err2:
+ vfree(q->buf);
+ kfree(q);
+err1:
+ return err;
+}
+
+int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
+ struct ib_srq_attr *attr, enum ib_srq_attr_mask mask,
+ struct ib_udata *udata)
+{
+ int err;
+ struct rxe_queue *q = srq->rq.queue;
+ struct mminfo mi = { .offset = 1, .size = 0};
+
+ if (mask & IB_SRQ_MAX_WR) {
+ /* Check that we can write the mminfo struct to user space */
+ if (udata && udata->inlen >= sizeof(__u64)) {
+ __u64 mi_addr;
+
+ /* Get address of user space mminfo struct */
+ err = ib_copy_from_udata(&mi_addr, udata,
+ sizeof(mi_addr));
+ if (err)
+ goto err1;
+
+ udata->outbuf = (void __user *)(unsigned long)mi_addr;
+ udata->outlen = sizeof(mi);
+
+ err = ib_copy_to_udata(udata, &mi, sizeof(mi));
+ if (err)
+ goto err1;
+ }
+
+ err = rxe_queue_resize(q, (unsigned int *)&attr->max_wr,
+ RCV_WQE_SIZE(srq->rq.max_sge),
+ srq->rq.queue->ip ?
+ srq->rq.queue->ip->context :
+ NULL,
+ udata, &srq->rq.producer_lock,
+ &srq->rq.consumer_lock);
+ if (err)
+ goto err2;
+ }
+
+ if (mask & IB_SRQ_LIMIT)
+ srq->limit = attr->srq_limit;
+
+ return 0;
+
+err2:
+ rxe_queue_cleanup(q);
+ srq->rq.queue = NULL;
+err1:
+ return err;
+}
+
+void rxe_srq_cleanup(void *arg)
+{
+ struct rxe_srq *srq = (struct rxe_srq *)arg;
+
+ if (srq->rq.queue)
+ rxe_queue_cleanup(srq->rq.queue);
+}
* [patch 22/44] rxe_cq.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (20 preceding siblings ...)
2011-07-01 13:18 ` [patch 21/44] rxe_srq.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 23/44] rxe_cq.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (22 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch22 --]
[-- Type: text/plain, Size: 2769 bytes --]
Declarations for completion queue details.
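A rough sketch of the producer side of rxe_cq_post (the caller and the cqe
initialisation are hypothetical; the overflow behaviour is from rxe_cq.c):

	/* hypothetical completer/responder path posting one completion */
	static int demo_complete(struct rxe_cq *cq, struct rxe_cqe *cqe,
				 int solicited)
	{
		int err;

		/* solicited != 0 only wakes a consumer armed with
		 * IB_CQ_SOLICITED; IB_CQ_NEXT_COMP always wakes it */
		err = rxe_cq_post(cq, cqe, solicited);
		if (err)	/* -EBUSY: queue full, IB_EVENT_CQ_ERR already raised */
			return err;

		return 0;
	}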
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_cq.h | 52 +++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_cq.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_cq.h
@@ -0,0 +1,52 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* cq implementation details */
+
+#ifndef RXE_CQ_H
+#define RXE_CQ_H
+
+int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
+ int cqe, int comp_vector, struct ib_udata *udata);
+
+int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
+ int comp_vector, struct ib_ucontext *context,
+ struct ib_udata *udata);
+
+int rxe_cq_resize_queue(struct rxe_cq *cq, int new_cqe, struct ib_udata *udata);
+
+int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited);
+
+void rxe_cq_cleanup(void *arg);
+
+#endif /* RXE_CQ_H */
* [patch 23/44] rxe_cq.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (21 preceding siblings ...)
2011-07-01 13:18 ` [patch 22/44] rxe_cq.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 24/44] rxe_qp.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (21 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch23 --]
[-- Type: text/plain, Size: 5379 bytes --]
Completion queue implementation details.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_cq.c | 178 +++++++++++++++++++++++++++++++++++++
1 file changed, 178 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_cq.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_cq.c
@@ -0,0 +1,178 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* cq implementation details */
+
+#include "rxe.h"
+#include "rxe_queue.h"
+#include "rxe_cq.h"
+#include "rxe_mmap.h"
+
+int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
+ int cqe, int comp_vector, struct ib_udata *udata)
+{
+ int count;
+
+ if (cqe <= 0) {
+ pr_warn("cqe(%d) <= 0\n", cqe);
+ goto err1;
+ }
+
+ if (cqe > rxe->attr.max_cqe) {
+ pr_warn("cqe(%d) > max_cqe(%d)\n",
+ cqe, rxe->attr.max_cqe);
+ goto err1;
+ }
+
+ if (cq) {
+ count = queue_count(cq->queue);
+ if (cqe < count) {
+			pr_warn("cqe(%d) < current # elements in queue (%d)\n",
+ cqe, count);
+ goto err1;
+ }
+ }
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static void rxe_send_complete(unsigned long data)
+{
+ struct rxe_cq *cq = (struct rxe_cq *)data;
+
+ while (1) {
+ u8 notify = cq->notify;
+
+ cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+
+ /* See if anything was added to the CQ during the
+ * comp_handler call. If so, go around again because we
+ * won't be rescheduled.
+ * XXX: is there a race on enqueue right after this test
+ * but before we're out? */
+ if (notify == cq->notify)
+ return;
+ }
+}
+
+int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
+ int comp_vector, struct ib_ucontext *context,
+ struct ib_udata *udata)
+{
+ int err;
+
+ cq->queue = rxe_queue_init(rxe, (unsigned int *)&cqe,
+ sizeof(struct rxe_cqe));
+ if (!cq->queue) {
+ pr_warn("unable to create cq\n");
+ return -ENOMEM;
+ }
+
+ err = do_mmap_info(rxe, udata, 0, context, cq->queue->buf,
+ cq->queue->buf_size, &cq->queue->ip);
+	if (err) {
+		vfree(cq->queue->buf);
+		kfree(cq->queue);
+		return err;
+	}
+
+ if (udata)
+ cq->is_user = 1;
+
+ tasklet_init(&cq->comp_task, rxe_send_complete, (unsigned long)cq);
+
+ spin_lock_init(&cq->cq_lock);
+ cq->ibcq.cqe = cqe;
+ return 0;
+}
+
+int rxe_cq_resize_queue(struct rxe_cq *cq, int cqe, struct ib_udata *udata)
+{
+ int err;
+
+ err = rxe_queue_resize(cq->queue, (unsigned int *)&cqe,
+ sizeof(struct rxe_cqe),
+ cq->queue->ip ? cq->queue->ip->context : NULL,
+ udata, NULL, &cq->cq_lock);
+ if (!err)
+ cq->ibcq.cqe = cqe;
+
+ return err;
+}
+
+int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
+{
+ struct ib_event ev;
+ unsigned long flags;
+
+ spin_lock_irqsave(&cq->cq_lock, flags);
+
+ if (unlikely(queue_full(cq->queue))) {
+ spin_unlock_irqrestore(&cq->cq_lock, flags);
+ if (cq->ibcq.event_handler) {
+ ev.device = cq->ibcq.device;
+ ev.element.cq = &cq->ibcq;
+ ev.event = IB_EVENT_CQ_ERR;
+ cq->ibcq.event_handler(&ev, cq->ibcq.cq_context);
+ }
+
+ return -EBUSY;
+ }
+
+ memcpy(producer_addr(cq->queue), cqe, sizeof(*cqe));
+
+ /* make sure all changes to the CQ are written before
+ we update the producer pointer */
+ wmb();
+
+ advance_producer(cq->queue);
+ spin_unlock_irqrestore(&cq->cq_lock, flags);
+
+ if ((cq->notify == IB_CQ_NEXT_COMP) ||
+ (cq->notify == IB_CQ_SOLICITED && solicited)) {
+ cq->notify++;
+ tasklet_schedule(&cq->comp_task);
+ }
+
+ return 0;
+}
+
+void rxe_cq_cleanup(void *arg)
+{
+ struct rxe_cq *cq = arg;
+
+ if (cq->queue)
+ rxe_queue_cleanup(cq->queue);
+}
* [patch 24/44] rxe_qp.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (22 preceding siblings ...)
2011-07-01 13:18 ` [patch 23/44] rxe_cq.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 25/44] rxe_qp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (20 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch24 --]
[-- Type: text/plain, Size: 3926 bytes --]
Declarations for queue pair details.
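One note for review: the RCV_WQE_SIZE() macro below captures the same sizing
computation that is open-coded as srq_wqe_size in rxe_srq_from_init and as
wqe_size in rxe_qp_init_resp. For illustration (a max_sge of 4 is just an
example value):

	/*
	 *   RCV_WQE_SIZE(4) == sizeof(struct rxe_recv_wqe) + 4 * sizeof(struct ib_sge)
	 *
	 * i.e. one fixed receive WQE header plus room for four scatter/gather
	 * entries.
	 */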
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_qp.h | 99 +++++++++++++++++++++++++++++++++++++
1 file changed, 99 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_qp.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_qp.h
@@ -0,0 +1,99 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_QP_H
+#define RXE_QP_H
+
+int rxe_qp_chk_init(struct rxe_dev *rxe, struct ib_qp_init_attr *init);
+
+int rxe_qp_from_init(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_pd *pd,
+ struct ib_qp_init_attr *init, struct ib_udata *udata,
+ struct ib_pd *ibpd);
+
+int rxe_qp_to_init(struct rxe_qp *qp, struct ib_qp_init_attr *init);
+
+int rxe_qp_chk_attr(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct ib_qp_attr *attr, int mask);
+
+int rxe_qp_from_attr(struct rxe_qp *qp, struct ib_qp_attr *attr,
+ int mask, struct ib_udata *udata);
+
+int rxe_qp_to_attr(struct rxe_qp *qp, struct ib_qp_attr *attr, int mask);
+
+void rxe_qp_error(struct rxe_qp *qp);
+
+void rxe_qp_destroy(struct rxe_qp *qp);
+
+void rxe_qp_cleanup(void *arg);
+
+static inline int qp_num(struct rxe_qp *qp)
+{
+ return qp->ibqp.qp_num;
+}
+
+static inline enum ib_qp_type qp_type(struct rxe_qp *qp)
+{
+ return qp->ibqp.qp_type;
+}
+
+static inline enum ib_qp_state qp_state(struct rxe_qp *qp)
+{
+ return qp->attr.qp_state;
+}
+
+static inline int qp_mtu(struct rxe_qp *qp)
+{
+ if (qp->ibqp.qp_type == IB_QPT_RC || qp->ibqp.qp_type == IB_QPT_UC)
+ return qp->attr.path_mtu;
+ else
+ return RXE_PORT_MAX_MTU;
+}
+
+#define RCV_WQE_SIZE(max_sge) (sizeof(struct rxe_recv_wqe) + \
+ (max_sge)*sizeof(struct ib_sge))
+
+void free_rd_atomic_resource(struct rxe_qp *qp, struct resp_res *res);
+
+static inline void rxe_advance_resp_resource(struct rxe_qp *qp)
+{
+ qp->resp.res_head++;
+ if (unlikely(qp->resp.res_head == qp->attr.max_rd_atomic))
+ qp->resp.res_head = 0;
+}
+
+void retransmit_timer(unsigned long data);
+void rnr_nak_timer(unsigned long data);
+
+void dump_qp(struct rxe_qp *qp);
+
+#endif /* RXE_QP_H */
* [patch 25/44] rxe_qp.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (23 preceding siblings ...)
2011-07-01 13:18 ` [patch 24/44] rxe_qp.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 26/44] rxe_mr.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (19 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch25 --]
[-- Type: text/plain, Size: 21037 bytes --]
Queue pair implementation details.
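For orientation, the reset path below (rxe_qp_reset) follows this sequence
(a paraphrase of the code for the reader, not additional code):

	/*
	 *   rxe_disable_task(resp[, comp], req)   stop the tasklets
	 *   req/resp state = QP_STATE_RESET
	 *   __rxe_do_task(resp[, comp, req])      let each state machine
	 *                                         drain its work/packet queues
	 *   clear counters, drop qp->resp.mr,
	 *   clean up rd/atomic resources
	 *   rxe_enable_task(resp[, comp], req)    allow scheduling again
	 */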
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_qp.c | 821 +++++++++++++++++++++++++++++++++++++
1 file changed, 821 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_qp.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_qp.c
@@ -0,0 +1,821 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* qp implementation details */
+
+#include <linux/skbuff.h>
+#include <linux/delay.h>
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_mcast.h"
+#include "rxe_queue.h"
+#include "rxe_av.h"
+#include "rxe_qp.h"
+#include "rxe_mmap.h"
+#include "rxe_task.h"
+
+char *rxe_qp_state_name[] = {
+ [QP_STATE_RESET] = "RESET",
+ [QP_STATE_INIT] = "INIT",
+ [QP_STATE_READY] = "READY",
+ [QP_STATE_DRAIN] = "DRAIN",
+ [QP_STATE_DRAINED] = "DRAINED",
+ [QP_STATE_ERROR] = "ERROR",
+};
+
+static int rxe_qp_chk_cap(struct rxe_dev *rxe, struct ib_qp_cap *cap,
+ int has_srq)
+{
+ if (cap->max_send_wr > rxe->attr.max_qp_wr) {
+ pr_warn("invalid send wr = %d > %d\n",
+ cap->max_send_wr, rxe->attr.max_qp_wr);
+ goto err1;
+ }
+
+ if (cap->max_send_sge > rxe->attr.max_sge) {
+ pr_warn("invalid send sge = %d > %d\n",
+ cap->max_send_sge, rxe->attr.max_sge);
+ goto err1;
+ }
+
+ if (!has_srq) {
+ if (cap->max_recv_wr > rxe->attr.max_qp_wr) {
+ pr_warn("invalid recv wr = %d > %d\n",
+ cap->max_recv_wr, rxe->attr.max_qp_wr);
+ goto err1;
+ }
+
+ if (cap->max_recv_sge > rxe->attr.max_sge) {
+ pr_warn("invalid recv sge = %d > %d\n",
+ cap->max_recv_sge, rxe->attr.max_sge);
+ goto err1;
+ }
+ }
+
+ if (cap->max_inline_data > rxe->max_inline_data) {
+ pr_warn("invalid max inline data = %d > %d\n",
+ cap->max_inline_data, rxe->max_inline_data);
+ goto err1;
+ }
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+int rxe_qp_chk_init(struct rxe_dev *rxe, struct ib_qp_init_attr *init)
+{
+ struct ib_qp_cap *cap = &init->cap;
+ struct rxe_port *port;
+ int port_num = init->port_num;
+
+ if (!init->recv_cq || !init->send_cq) {
+ pr_warn("missing cq\n");
+ goto err1;
+ }
+
+ if (rxe_qp_chk_cap(rxe, cap, init->srq != NULL))
+ goto err1;
+
+ if (init->qp_type == IB_QPT_SMI || init->qp_type == IB_QPT_GSI) {
+ if (port_num < 1 || port_num > rxe->num_ports) {
+ pr_warn("invalid port = %d\n", port_num);
+ goto err1;
+ }
+
+ port = &rxe->port[port_num - 1];
+
+ if (init->qp_type == IB_QPT_SMI && port->qp_smi_index) {
+ pr_warn("SMI QP exists for port %d\n", port_num);
+ goto err1;
+ }
+
+ if (init->qp_type == IB_QPT_GSI && port->qp_gsi_index) {
+ pr_warn("GSI QP exists for port %d\n", port_num);
+ goto err1;
+ }
+ }
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static int alloc_rd_atomic_resources(struct rxe_qp *qp, unsigned int n)
+{
+ qp->resp.res_head = 0;
+ qp->resp.res_tail = 0;
+ qp->resp.resources = kcalloc(n, sizeof(struct resp_res), GFP_KERNEL);
+
+ if (!qp->resp.resources)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void free_rd_atomic_resources(struct rxe_qp *qp)
+{
+ if (qp->resp.resources) {
+ int i;
+
+ for (i = 0; i < qp->attr.max_rd_atomic; i++) {
+ struct resp_res *res = &qp->resp.resources[i];
+ free_rd_atomic_resource(qp, res);
+ }
+ kfree(qp->resp.resources);
+ qp->resp.resources = NULL;
+ }
+}
+
+void free_rd_atomic_resource(struct rxe_qp *qp, struct resp_res *res)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+
+ if (res->type == RXE_ATOMIC_MASK) {
+ rxe_drop_ref(qp);
+ kfree_skb(res->atomic.skb);
+ atomic_dec(&rxe->resp_skb_out);
+ } else if (res->type == RXE_READ_MASK) {
+ if (res->read.mr)
+ rxe_drop_ref(res->read.mr);
+ }
+ res->type = 0;
+}
+
+static void cleanup_rd_atomic_resources(struct rxe_qp *qp)
+{
+ int i;
+ struct resp_res *res;
+
+ if (qp->resp.resources) {
+ for (i = 0; i < qp->attr.max_rd_atomic; i++) {
+ res = &qp->resp.resources[i];
+ free_rd_atomic_resource(qp, res);
+ }
+ }
+}
+
+static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct ib_qp_init_attr *init)
+{
+ struct rxe_port *port;
+ u32 qpn;
+
+ qp->sq_sig_type = init->sq_sig_type;
+ qp->attr.path_mtu = 1;
+ qp->mtu = 256;
+
+ qpn = qp->pelem.index;
+ port = &rxe->port[init->port_num - 1];
+
+ switch (init->qp_type) {
+ case IB_QPT_SMI:
+ qp->ibqp.qp_num = 0;
+ port->qp_smi_index = qpn;
+ qp->attr.port_num = init->port_num;
+ break;
+
+ case IB_QPT_GSI:
+ qp->ibqp.qp_num = 1;
+ port->qp_gsi_index = qpn;
+ qp->attr.port_num = init->port_num;
+ break;
+
+ default:
+ qp->ibqp.qp_num = qpn;
+ break;
+ }
+
+ INIT_LIST_HEAD(&qp->arbiter_list);
+ INIT_LIST_HEAD(&qp->grp_list);
+
+ skb_queue_head_init(&qp->send_pkts);
+
+ spin_lock_init(&qp->grp_lock);
+ spin_lock_init(&qp->state_lock);
+
+ atomic_set(&qp->ssn, 0);
+ atomic_set(&qp->req_skb_in, 0);
+ atomic_set(&qp->resp_skb_in, 0);
+ atomic_set(&qp->req_skb_out, 0);
+ atomic_set(&qp->resp_skb_out, 0);
+}
+
+static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct ib_qp_init_attr *init,
+ struct ib_ucontext *context, struct ib_udata *udata)
+{
+ int err;
+ int wqe_size;
+
+ qp->sq.max_wr = init->cap.max_send_wr;
+ qp->sq.max_sge = init->cap.max_send_sge;
+ qp->sq.max_inline = init->cap.max_inline_data;
+
+ wqe_size = max_t(int, sizeof(struct rxe_send_wqe) +
+ qp->sq.max_sge*sizeof(struct ib_sge),
+ sizeof(struct rxe_send_wqe) +
+ qp->sq.max_inline);
+
+ qp->sq.queue = rxe_queue_init(rxe,
+ (unsigned int *)&qp->sq.max_wr,
+ wqe_size);
+ if (!qp->sq.queue)
+ return -ENOMEM;
+
+ err = do_mmap_info(rxe, udata, sizeof(struct mminfo),
+ context, qp->sq.queue->buf,
+ qp->sq.queue->buf_size, &qp->sq.queue->ip);
+
+	if (err) {
+		vfree(qp->sq.queue->buf);
+		kfree(qp->sq.queue);
+		return err;
+	}
+
+ qp->req.wqe_index = producer_index(qp->sq.queue);
+ qp->req.state = QP_STATE_RESET;
+ qp->req.opcode = -1;
+ qp->comp.opcode = -1;
+
+ spin_lock_init(&qp->sq.sq_lock);
+ skb_queue_head_init(&qp->req_pkts);
+
+ rxe_init_task(rxe, &qp->req.task, &rxe_fast_req, qp,
+ rxe_requester, "req");
+ rxe_init_task(rxe, &qp->comp.task, &rxe_fast_comp, qp,
+ rxe_completer, "comp");
+
+ init_timer(&qp->rnr_nak_timer);
+ qp->rnr_nak_timer.function = rnr_nak_timer;
+ qp->rnr_nak_timer.data = (unsigned long)qp;
+
+ init_timer(&qp->retrans_timer);
+ qp->retrans_timer.function = retransmit_timer;
+ qp->retrans_timer.data = (unsigned long)qp;
+ qp->qp_timeout_jiffies = 0; /* Can't be set for UD/UC in modify_qp */
+
+ return 0;
+}
+
+static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct ib_qp_init_attr *init,
+ struct ib_ucontext *context, struct ib_udata *udata)
+{
+ int err;
+ int wqe_size;
+
+ if (!qp->srq) {
+ qp->rq.max_wr = init->cap.max_recv_wr;
+ qp->rq.max_sge = init->cap.max_recv_sge;
+
+ wqe_size = sizeof(struct rxe_recv_wqe) +
+ qp->rq.max_sge*sizeof(struct ib_sge);
+
+ pr_debug("max_wr = %d, max_sge = %d, wqe_size = %d\n",
+ qp->rq.max_wr, qp->rq.max_sge, wqe_size);
+
+ qp->rq.queue = rxe_queue_init(rxe,
+ (unsigned int *)&qp->rq.max_wr,
+ wqe_size);
+ if (!qp->rq.queue)
+ return -ENOMEM;
+
+ err = do_mmap_info(rxe, udata, 0, context, qp->rq.queue->buf,
+ qp->rq.queue->buf_size, &qp->rq.queue->ip);
+		if (err) {
+			vfree(qp->rq.queue->buf);
+			kfree(qp->rq.queue);
+			return err;
+		}
+
+ }
+
+ spin_lock_init(&qp->rq.producer_lock);
+ spin_lock_init(&qp->rq.consumer_lock);
+
+ skb_queue_head_init(&qp->resp_pkts);
+
+ rxe_init_task(rxe, &qp->resp.task, &rxe_fast_resp, qp,
+ rxe_responder, "resp");
+
+ qp->resp.opcode = OPCODE_NONE;
+ qp->resp.msn = 0;
+ qp->resp.state = QP_STATE_RESET;
+
+ return 0;
+}
+
+/* called by the create qp verb */
+int rxe_qp_from_init(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_pd *pd,
+ struct ib_qp_init_attr *init, struct ib_udata *udata,
+ struct ib_pd *ibpd)
+{
+ int err;
+ struct rxe_cq *rcq = to_rcq(init->recv_cq);
+ struct rxe_cq *scq = to_rcq(init->send_cq);
+ struct rxe_srq *srq = init->srq ? to_rsrq(init->srq) : NULL;
+ struct ib_ucontext *context = udata ? ibpd->uobject->context : NULL;
+
+ rxe_add_ref(pd);
+ rxe_add_ref(rcq);
+ rxe_add_ref(scq);
+ if (srq)
+ rxe_add_ref(srq);
+
+ qp->pd = pd;
+ qp->rcq = rcq;
+ qp->scq = scq;
+ qp->srq = srq;
+ qp->udata = udata;
+
+ rxe_qp_init_misc(rxe, qp, init);
+
+ err = rxe_qp_init_req(rxe, qp, init, context, udata);
+ if (err)
+ goto err1;
+
+ err = rxe_qp_init_resp(rxe, qp, init, context, udata);
+ if (err)
+ goto err2;
+
+ qp->attr.qp_state = IB_QPS_RESET;
+ qp->valid = 1;
+
+ return 0;
+
+err2:
+ rxe_queue_cleanup(qp->sq.queue);
+err1:
+ if (srq)
+ rxe_drop_ref(srq);
+ rxe_drop_ref(scq);
+ rxe_drop_ref(rcq);
+ rxe_drop_ref(pd);
+
+ return err;
+}
+
+/* called by the query qp verb */
+int rxe_qp_to_init(struct rxe_qp *qp, struct ib_qp_init_attr *init)
+{
+ init->event_handler = qp->ibqp.event_handler;
+ init->qp_context = qp->ibqp.qp_context;
+ init->send_cq = qp->ibqp.send_cq;
+ init->recv_cq = qp->ibqp.recv_cq;
+ init->srq = qp->ibqp.srq;
+
+ init->cap.max_send_wr = qp->sq.max_wr;
+ init->cap.max_send_sge = qp->sq.max_sge;
+ init->cap.max_inline_data = qp->sq.max_inline;
+
+ if (!qp->srq) {
+ init->cap.max_recv_wr = qp->rq.max_wr;
+ init->cap.max_recv_sge = qp->rq.max_sge;
+ }
+
+ init->sq_sig_type = qp->sq_sig_type;
+
+ init->qp_type = qp->ibqp.qp_type;
+ init->port_num = 1;
+
+ return 0;
+}
+
+/* called by the modify qp verb, this routine
+ checks all the parameters before making any changes */
+int rxe_qp_chk_attr(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct ib_qp_attr *attr, int mask)
+{
+ enum ib_qp_state cur_state = (mask & IB_QP_CUR_STATE) ?
+ attr->cur_qp_state : qp->attr.qp_state;
+ enum ib_qp_state new_state = (mask & IB_QP_STATE) ?
+ attr->qp_state : cur_state;
+
+ if (!ib_modify_qp_is_ok(cur_state, new_state, qp_type(qp), mask)) {
+ pr_warn("invalid mask or state for qp\n");
+ goto err1;
+ }
+
+ if (mask & IB_QP_STATE) {
+ if (cur_state == IB_QPS_SQD) {
+ if (qp->req.state == QP_STATE_DRAIN &&
+ new_state != IB_QPS_ERR)
+ goto err1;
+ }
+ }
+
+ if (mask & IB_QP_PORT) {
+ if (attr->port_num < 1 || attr->port_num > rxe->num_ports) {
+ pr_warn("invalid port %d\n", attr->port_num);
+ goto err1;
+ }
+ }
+
+ if (mask & IB_QP_CAP && rxe_qp_chk_cap(rxe, &attr->cap,
+ qp->srq != NULL))
+ goto err1;
+
+ if (mask & IB_QP_AV && rxe_av_chk_attr(rxe, &attr->ah_attr))
+ goto err1;
+
+ if (mask & IB_QP_ALT_PATH && rxe_av_chk_attr(rxe, &attr->alt_ah_attr))
+ goto err1;
+
+ if (mask & IB_QP_PATH_MTU) {
+ struct rxe_port *port = &rxe->port[qp->attr.port_num - 1];
+ enum rxe_mtu max_mtu = (enum rxe_mtu __force)port->attr.max_mtu;
+ enum rxe_mtu mtu = (enum rxe_mtu __force)attr->path_mtu;
+
+ if (mtu > max_mtu) {
+ pr_debug("invalid mtu (%d) > (%d)\n",
+ rxe_mtu_enum_to_int(mtu),
+ rxe_mtu_enum_to_int(max_mtu));
+ goto err1;
+ }
+ }
+
+ if (mask & IB_QP_MAX_QP_RD_ATOMIC) {
+ if (attr->max_rd_atomic > rxe->attr.max_qp_rd_atom) {
+ pr_warn("invalid max_rd_atomic %d > %d\n",
+ attr->max_rd_atomic,
+ rxe->attr.max_qp_rd_atom);
+ goto err1;
+ }
+ }
+
+ if (mask & IB_QP_TIMEOUT) {
+ if (attr->timeout > 31) {
+ pr_warn("invalid QP timeout %d > 31\n",
+ attr->timeout);
+ goto err1;
+ }
+ }
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+/* move the qp to the reset state */
+static void rxe_qp_reset(struct rxe_qp *qp)
+{
+ /* stop tasks from running */
+ rxe_disable_task(&qp->resp.task);
+
+ /* stop request/comp */
+ if (qp_type(qp) == IB_QPT_RC)
+ rxe_disable_task(&qp->comp.task);
+ rxe_disable_task(&qp->req.task);
+
+ /* move qp to the reset state */
+ qp->req.state = QP_STATE_RESET;
+ qp->resp.state = QP_STATE_RESET;
+
+	/* let the state machines reset themselves and
+	   drain the work and packet queues etc. */
+ __rxe_do_task(&qp->resp.task);
+
+ if (qp->sq.queue) {
+ __rxe_do_task(&qp->comp.task);
+ __rxe_do_task(&qp->req.task);
+ }
+
+ /* cleanup attributes */
+ atomic_set(&qp->ssn, 0);
+ qp->req.opcode = -1;
+ qp->req.need_retry = 0;
+ qp->req.noack_pkts = 0;
+ qp->resp.msn = 0;
+ qp->resp.opcode = -1;
+ qp->resp.drop_msg = 0;
+ qp->resp.goto_error = 0;
+ qp->resp.sent_psn_nak = 0;
+
+ if (qp->resp.mr) {
+ rxe_drop_ref(qp->resp.mr);
+ qp->resp.mr = NULL;
+ }
+
+ cleanup_rd_atomic_resources(qp);
+
+ /* reenable tasks */
+ rxe_enable_task(&qp->resp.task);
+
+ if (qp->sq.queue) {
+ if (qp_type(qp) == IB_QPT_RC)
+ rxe_enable_task(&qp->comp.task);
+
+ rxe_enable_task(&qp->req.task);
+ }
+}
+
+/* drain the send queue */
+static void rxe_qp_drain(struct rxe_qp *qp)
+{
+ if (qp->sq.queue) {
+ if (qp->req.state != QP_STATE_DRAINED) {
+ qp->req.state = QP_STATE_DRAIN;
+ if (qp_type(qp) == IB_QPT_RC)
+ rxe_run_task(&qp->comp.task, 1);
+ else
+ __rxe_do_task(&qp->comp.task);
+ rxe_run_task(&qp->req.task, 1);
+ }
+ }
+}
+
+/* move the qp to the error state */
+void rxe_qp_error(struct rxe_qp *qp)
+{
+ qp->req.state = QP_STATE_ERROR;
+ qp->resp.state = QP_STATE_ERROR;
+
+ /* drain work and packet queues */
+ rxe_run_task(&qp->resp.task, 1);
+
+ if (qp_type(qp) == IB_QPT_RC)
+ rxe_run_task(&qp->comp.task, 1);
+ else
+ __rxe_do_task(&qp->comp.task);
+ rxe_run_task(&qp->req.task, 1);
+}
+
+/* called by the modify qp verb */
+int rxe_qp_from_attr(struct rxe_qp *qp, struct ib_qp_attr *attr, int mask,
+ struct ib_udata *udata)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+
+ /* TODO should handle error by leaving old resources intact */
+ if (mask & IB_QP_MAX_QP_RD_ATOMIC) {
+ int max_rd_atomic = __roundup_pow_of_two(attr->max_rd_atomic);
+
+ free_rd_atomic_resources(qp);
+
+ err = alloc_rd_atomic_resources(qp, max_rd_atomic);
+ if (err)
+ return err;
+
+ qp->attr.max_rd_atomic = max_rd_atomic;
+ atomic_set(&qp->req.rd_atomic, max_rd_atomic);
+ }
+
+ if (mask & IB_QP_CUR_STATE)
+ qp->attr.cur_qp_state = attr->qp_state;
+
+ if (mask & IB_QP_EN_SQD_ASYNC_NOTIFY)
+ qp->attr.en_sqd_async_notify = attr->en_sqd_async_notify;
+
+ if (mask & IB_QP_ACCESS_FLAGS)
+ qp->attr.qp_access_flags = attr->qp_access_flags;
+
+ if (mask & IB_QP_PKEY_INDEX)
+ qp->attr.pkey_index = attr->pkey_index;
+
+ if (mask & IB_QP_PORT)
+ qp->attr.port_num = attr->port_num;
+
+ if (mask & IB_QP_QKEY)
+ qp->attr.qkey = attr->qkey;
+
+ if (mask & IB_QP_AV) {
+ rxe_av_from_attr(rxe, attr->port_num, &qp->pri_av,
+ &attr->ah_attr);
+ }
+
+ if (mask & IB_QP_ALT_PATH) {
+ rxe_av_from_attr(rxe, attr->alt_port_num, &qp->alt_av,
+ &attr->alt_ah_attr);
+ qp->attr.alt_port_num = attr->alt_port_num;
+ qp->attr.alt_pkey_index = attr->alt_pkey_index;
+ qp->attr.alt_timeout = attr->alt_timeout;
+ }
+
+ if (mask & IB_QP_PATH_MTU) {
+ qp->attr.path_mtu = attr->path_mtu;
+ qp->mtu = rxe_mtu_enum_to_int((enum rxe_mtu)attr->path_mtu);
+ }
+
+ if (mask & IB_QP_TIMEOUT) {
+ qp->attr.timeout = attr->timeout;
+ if (attr->timeout == 0) {
+ qp->qp_timeout_jiffies = 0;
+ } else {
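+			/* IBTA defines the local ACK timeout as
+			   4.096 usec * 2^timeout; approximated here as
+			   4 usec << timeout, e.g. timeout = 14 gives
+			   ~65.5 msec instead of ~67 msec */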
+ int j = usecs_to_jiffies(4ULL << attr->timeout);
+ qp->qp_timeout_jiffies = j ? j : 1;
+ }
+ }
+
+ if (mask & IB_QP_RETRY_CNT) {
+ qp->attr.retry_cnt = attr->retry_cnt;
+ qp->comp.retry_cnt = attr->retry_cnt;
+ pr_debug("set retry count = %d\n", attr->retry_cnt);
+ }
+
+ if (mask & IB_QP_RNR_RETRY) {
+ qp->attr.rnr_retry = attr->rnr_retry;
+ qp->comp.rnr_retry = attr->rnr_retry;
+ pr_debug("set rnr retry count = %d\n", attr->rnr_retry);
+ }
+
+ if (mask & IB_QP_RQ_PSN) {
+ qp->attr.rq_psn = (attr->rq_psn & BTH_PSN_MASK);
+ qp->resp.psn = qp->attr.rq_psn;
+ pr_debug("set resp psn = 0x%x\n", qp->resp.psn);
+ }
+
+ if (mask & IB_QP_MIN_RNR_TIMER) {
+ qp->attr.min_rnr_timer = attr->min_rnr_timer;
+ pr_debug("set min rnr timer = 0x%x\n",
+ attr->min_rnr_timer);
+ }
+
+ if (mask & IB_QP_SQ_PSN) {
+ qp->attr.sq_psn = (attr->sq_psn & BTH_PSN_MASK);
+ qp->req.psn = qp->attr.sq_psn;
+ qp->comp.psn = qp->attr.sq_psn;
+ pr_debug("set req psn = 0x%x\n", qp->req.psn);
+ }
+
+ if (mask & IB_QP_MAX_DEST_RD_ATOMIC) {
+ qp->attr.max_dest_rd_atomic =
+ __roundup_pow_of_two(attr->max_dest_rd_atomic);
+ }
+
+ if (mask & IB_QP_PATH_MIG_STATE)
+ qp->attr.path_mig_state = attr->path_mig_state;
+
+ if (mask & IB_QP_DEST_QPN)
+ qp->attr.dest_qp_num = attr->dest_qp_num;
+
+ if (mask & IB_QP_STATE) {
+ qp->attr.qp_state = attr->qp_state;
+
+ switch (attr->qp_state) {
+ case IB_QPS_RESET:
+ pr_debug("qp state -> RESET\n");
+ rxe_qp_reset(qp);
+ break;
+
+ case IB_QPS_INIT:
+ pr_debug("qp state -> INIT\n");
+ qp->req.state = QP_STATE_INIT;
+ qp->resp.state = QP_STATE_INIT;
+ break;
+
+ case IB_QPS_RTR:
+ pr_debug("qp state -> RTR\n");
+ qp->resp.state = QP_STATE_READY;
+ break;
+
+ case IB_QPS_RTS:
+ pr_debug("qp state -> RTS\n");
+ qp->req.state = QP_STATE_READY;
+ break;
+
+ case IB_QPS_SQD:
+ pr_debug("qp state -> SQD\n");
+ rxe_qp_drain(qp);
+ break;
+
+ case IB_QPS_SQE:
+ pr_warn("qp state -> SQE !!?\n");
+ /* Not possible from modify_qp. */
+ break;
+
+ case IB_QPS_ERR:
+ pr_debug("qp state -> ERR\n");
+ rxe_qp_error(qp);
+ break;
+ }
+ }
+
+ return 0;
+}
+
+/* called by the query qp verb */
+int rxe_qp_to_attr(struct rxe_qp *qp, struct ib_qp_attr *attr, int mask)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+
+ *attr = qp->attr;
+
+ attr->rq_psn = qp->resp.psn;
+ attr->sq_psn = qp->req.psn;
+
+ attr->cap.max_send_wr = qp->sq.max_wr;
+ attr->cap.max_send_sge = qp->sq.max_sge;
+ attr->cap.max_inline_data = qp->sq.max_inline;
+
+ if (!qp->srq) {
+ attr->cap.max_recv_wr = qp->rq.max_wr;
+ attr->cap.max_recv_sge = qp->rq.max_sge;
+ }
+
+ rxe_av_to_attr(rxe, &qp->pri_av, &attr->ah_attr);
+ rxe_av_to_attr(rxe, &qp->alt_av, &attr->alt_ah_attr);
+
+ if (qp->req.state == QP_STATE_DRAIN) {
+ attr->sq_draining = 1;
+ msleep(1);
+ } else {
+ attr->sq_draining = 0;
+ }
+
+ pr_debug("attr->sq_draining = %d\n", attr->sq_draining);
+
+ return 0;
+}
+
+/* called by the destroy qp verb */
+void rxe_qp_destroy(struct rxe_qp *qp)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+
+ qp->valid = 0;
+ qp->qp_timeout_jiffies = 0;
+ rxe_cleanup_task(&qp->resp.task);
+
+ del_timer_sync(&qp->retrans_timer);
+ del_timer_sync(&qp->rnr_nak_timer);
+
+ rxe_cleanup_task(&qp->req.task);
+ if (qp_type(qp) == IB_QPT_RC)
+ rxe_cleanup_task(&qp->comp.task);
+
+ /* flush out any receive wr's or pending requests */
+ __rxe_do_task(&qp->req.task);
+ if (qp->sq.queue) {
+ __rxe_do_task(&qp->comp.task);
+ __rxe_do_task(&qp->req.task);
+ }
+
+ /* drain the output queue */
+ while (!list_empty(&qp->arbiter_list))
+ __rxe_do_task(&rxe->arbiter.task);
+}
+
+/* called when the last reference to the qp is dropped */
+void rxe_qp_cleanup(void *arg)
+{
+ struct rxe_qp *qp = arg;
+
+ rxe_drop_all_mcast_groups(qp);
+
+ if (qp->sq.queue)
+ rxe_queue_cleanup(qp->sq.queue);
+
+ if (qp->srq)
+ rxe_drop_ref(qp->srq);
+
+ if (qp->rq.queue)
+ rxe_queue_cleanup(qp->rq.queue);
+
+ if (qp->scq)
+ rxe_drop_ref(qp->scq);
+ if (qp->rcq)
+ rxe_drop_ref(qp->rcq);
+ if (qp->pd)
+ rxe_drop_ref(qp->pd);
+
+ if (qp->resp.mr) {
+ rxe_drop_ref(qp->resp.mr);
+ qp->resp.mr = NULL;
+ }
+
+ free_rd_atomic_resources(qp);
+}
* [patch 26/44] rxe_mr.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (24 preceding siblings ...)
2011-07-01 13:18 ` [patch 25/44] rxe_qp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 27/44] rxe_mr.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (18 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch26 --]
[-- Type: text/plain, Size: 3906 bytes --]
Declarations for the memory region implementation.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_mr.h | 88 +++++++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_mr.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_mr.h
@@ -0,0 +1,88 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_MR_H
+#define RXE_MR_H
+
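+/* direction of rxe_mem_copy()/copy_data() relative to the mem object:
+   direction_in copies data into the mem, direction_out copies out of it */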
+enum copy_direction {
+ direction_in,
+ direction_out,
+};
+
+int rxe_mem_init_dma(struct rxe_dev *rxe, struct rxe_pd *pd,
+ int access, struct rxe_mem *mem);
+
+int rxe_mem_init_phys(struct rxe_dev *rxe, struct rxe_pd *pd,
+ int access, u64 iova, struct ib_phys_buf *buf,
+ int num_buf, struct rxe_mem *mem);
+
+int rxe_mem_init_user(struct rxe_dev *rxe, struct rxe_pd *pd, u64 start,
+ u64 length, u64 iova, int access, struct ib_udata *udata,
+ struct rxe_mem *mr);
+
+int rxe_mem_init_fast(struct rxe_dev *rxe, struct rxe_pd *pd,
+ int max_pages, struct rxe_mem *mem);
+
+int rxe_mem_init_mw(struct rxe_dev *rxe, struct rxe_pd *pd,
+ struct rxe_mem *mw);
+
+int rxe_mem_init_fmr(struct rxe_dev *rxe, struct rxe_pd *pd, int access,
+ struct ib_fmr_attr *attr, struct rxe_mem *fmr);
+
+int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr,
+		 int length, enum copy_direction dir, __be32 *crcp);
+
+int copy_data(struct rxe_dev *rxe, struct rxe_pd *pd, int access,
+ struct rxe_dma_info *dma, void *addr, int length,
+ enum copy_direction dir, __be32 *crcp);
+
+void *iova_to_vaddr(struct rxe_mem *mem, u64 iova, int length);
+
+enum lookup_type {
+ lookup_local,
+ lookup_remote,
+};
+
+struct rxe_mem *lookup_mem(struct rxe_pd *pd, int access, u32 key,
+ enum lookup_type type);
+
+int mem_check_range(struct rxe_mem *mem, u64 iova, size_t length);
+
+int rxe_mem_map_pages(struct rxe_dev *rxe, struct rxe_mem *mem,
+ u64 *page, int num_pages, u64 iova);
+
+void rxe_mem_cleanup(void *arg);
+
+int advance_dma_data(struct rxe_dma_info *dma, unsigned int length);
+
+#endif /* RXE_MR_H */
* [patch 27/44] rxe_mr.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (25 preceding siblings ...)
2011-07-01 13:18 ` [patch 26/44] rxe_mr.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 28/44] rxe_mcast.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (17 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch27 --]
[-- Type: text/plain, Size: 16763 bytes --]
Memory region implementation details.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_mr.c | 769 +++++++++++++++++++++++++++++++++++++
1 file changed, 769 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_mr.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_mr.c
@@ -0,0 +1,769 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_mr.h"
+
+/*
+ * lfsr (linear feedback shift register) with period 255
+ */
+static u8 rxe_get_key(void)
+{
+ static unsigned key = 1;
+
+ key = key << 1;
+
+ key |= (0 != (key & 0x100)) ^ (0 != (key & 0x10))
+ ^ (0 != (key & 0x80)) ^ (0 != (key & 0x40));
+
+ key &= 0xff;
+
+ return key;
+}
+
+int mem_check_range(struct rxe_mem *mem, u64 iova, size_t length)
+{
+ switch (mem->type) {
+ case RXE_MEM_TYPE_DMA:
+ return 0;
+
+ case RXE_MEM_TYPE_MR:
+ case RXE_MEM_TYPE_FMR:
+ return ((iova < mem->iova) ||
+ ((iova + length) > (mem->iova + mem->length))) ?
+ -EFAULT : 0;
+
+ default:
+ return -EFAULT;
+ }
+}
+
+#define IB_ACCESS_REMOTE (IB_ACCESS_REMOTE_READ \
+ | IB_ACCESS_REMOTE_WRITE \
+ | IB_ACCESS_REMOTE_ATOMIC)
+
+static void rxe_mem_init(int access, struct rxe_mem *mem)
+{
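+	/* lkey/rkey layout: mem pool index in bits 31:8, 8 bit lfsr key in
+	   bits 7:0; the rkey is zero unless remote access was requested */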
+ u32 lkey = mem->pelem.index << 8 | rxe_get_key();
+ u32 rkey = (access & IB_ACCESS_REMOTE) ? lkey : 0;
+
+ if (mem->pelem.pool->type == RXE_TYPE_MR) {
+ mem->ibmr.lkey = lkey;
+ mem->ibmr.rkey = rkey;
+ } else {
+ mem->ibfmr.lkey = lkey;
+ mem->ibfmr.rkey = rkey;
+ }
+
+ mem->pd = NULL;
+ mem->umem = NULL;
+ mem->lkey = lkey;
+ mem->rkey = rkey;
+ mem->state = RXE_MEM_STATE_INVALID;
+ mem->type = RXE_MEM_TYPE_NONE;
+ mem->va = 0;
+ mem->iova = 0;
+ mem->length = 0;
+ mem->offset = 0;
+ mem->access = 0;
+ mem->page_shift = 0;
+ mem->page_mask = 0;
+ mem->map_shift = ilog2(RXE_BUF_PER_MAP);
+ mem->map_mask = 0;
+ mem->num_buf = 0;
+ mem->max_buf = 0;
+ mem->num_map = 0;
+ mem->map = NULL;
+}
+
+void rxe_mem_cleanup(void *arg)
+{
+ struct rxe_mem *mem = arg;
+ int i;
+
+ if (mem->umem)
+ ib_umem_release(mem->umem);
+
+ if (mem->map) {
+ for (i = 0; i < mem->num_map; i++)
+ kfree(mem->map[i]);
+
+ kfree(mem->map);
+ }
+}
+
+static int rxe_mem_alloc(struct rxe_dev *rxe, struct rxe_mem *mem, int num_buf)
+{
+ int i;
+ int num_map;
+ struct rxe_map **map = mem->map;
+
+ num_map = (num_buf + RXE_BUF_PER_MAP - 1) / RXE_BUF_PER_MAP;
+
+ mem->map = kmalloc(num_map*sizeof(*map), GFP_KERNEL);
+ if (!mem->map)
+ goto err1;
+
+ for (i = 0; i < num_map; i++) {
+ mem->map[i] = kmalloc(sizeof(**map), GFP_KERNEL);
+ if (!mem->map[i])
+ goto err2;
+ }
+
+ BUG_ON(!is_power_of_2(RXE_BUF_PER_MAP));
+
+ mem->map_shift = ilog2(RXE_BUF_PER_MAP);
+ mem->map_mask = RXE_BUF_PER_MAP - 1;
+
+ mem->num_buf = num_buf;
+ mem->num_map = num_map;
+ mem->max_buf = num_map*RXE_BUF_PER_MAP;
+
+ return 0;
+
+err2:
+ for (i--; i >= 0; i--)
+ kfree(mem->map[i]);
+
+ kfree(mem->map);
+err1:
+ return -ENOMEM;
+}
+
+int rxe_mem_init_dma(struct rxe_dev *rxe, struct rxe_pd *pd,
+ int access, struct rxe_mem *mem)
+{
+ rxe_mem_init(access, mem);
+
+ mem->pd = pd;
+ mem->access = access;
+ mem->state = RXE_MEM_STATE_VALID;
+ mem->type = RXE_MEM_TYPE_DMA;
+
+ return 0;
+}
+
+int rxe_mem_init_phys(struct rxe_dev *rxe, struct rxe_pd *pd, int access,
+ u64 iova, struct ib_phys_buf *phys_buf, int num_buf,
+ struct rxe_mem *mem)
+{
+ int i;
+ struct rxe_map **map;
+ struct ib_phys_buf *buf;
+ size_t length;
+ int err;
+ size_t min_size = (size_t)(-1L);
+ size_t max_size = 0;
+ int n;
+
+ rxe_mem_init(access, mem);
+
+ err = rxe_mem_alloc(rxe, mem, num_buf);
+ if (err)
+ goto err1;
+
+ length = 0;
+ map = mem->map;
+ buf = map[0]->buf;
+ n = 0;
+
+ for (i = 0; i < num_buf; i++) {
+ length += phys_buf->size;
+		max_size = max_t(size_t, max_size, phys_buf->size);
+		min_size = min_t(size_t, min_size, phys_buf->size);
+ *buf++ = *phys_buf++;
+ n++;
+
+ if (n == RXE_BUF_PER_MAP) {
+ map++;
+ buf = map[0]->buf;
+ n = 0;
+ }
+ }
+
+ if (max_size == min_size && is_power_of_2(max_size)) {
+ mem->page_shift = ilog2(max_size);
+ mem->page_mask = max_size - 1;
+ }
+
+ mem->pd = pd;
+ mem->access = access;
+ mem->iova = iova;
+ mem->va = iova;
+ mem->length = length;
+ mem->state = RXE_MEM_STATE_VALID;
+ mem->type = RXE_MEM_TYPE_MR;
+
+ return 0;
+
+err1:
+ return err;
+}
+
+int rxe_mem_init_user(struct rxe_dev *rxe, struct rxe_pd *pd, u64 start,
+ u64 length, u64 iova, int access, struct ib_udata *udata,
+ struct rxe_mem *mem)
+{
+ int i;
+ struct rxe_map **map;
+ struct ib_phys_buf *buf = NULL;
+ struct ib_umem *umem;
+ struct ib_umem_chunk *chunk;
+ int num_buf;
+ void *vaddr;
+ int err;
+
+ umem = ib_umem_get(pd->ibpd.uobject->context, start, length, access, 0);
+ if (IS_ERR(umem)) {
+ pr_warn("err %d from rxe_umem_get\n",
+ (int)PTR_ERR(umem));
+ err = -EINVAL;
+ goto err1;
+ }
+
+ mem->umem = umem;
+
+ num_buf = 0;
+ list_for_each_entry(chunk, &umem->chunk_list, list)
+ num_buf += chunk->nents;
+
+ rxe_mem_init(access, mem);
+
+ err = rxe_mem_alloc(rxe, mem, num_buf);
+ if (err) {
+ pr_warn("err %d from rxe_mem_alloc\n", err);
+ goto err1;
+ }
+
+ BUG_ON(!is_power_of_2(umem->page_size));
+
+ mem->page_shift = ilog2(umem->page_size);
+ mem->page_mask = umem->page_size - 1;
+
+ num_buf = 0;
+ map = mem->map;
+ if (length > 0) {
+ buf = map[0]->buf;
+
+ list_for_each_entry(chunk, &umem->chunk_list, list) {
+ for (i = 0; i < chunk->nents; i++) {
+ vaddr = page_address(sg_page(&chunk->
+ page_list[i]));
+ if (!vaddr) {
+ pr_warn("null vaddr\n");
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ buf->addr = (uintptr_t)vaddr;
+ buf->size = umem->page_size;
+ num_buf++;
+ buf++;
+
+ if (num_buf >= RXE_BUF_PER_MAP) {
+ map++;
+ buf = map[0]->buf;
+ num_buf = 0;
+ }
+ }
+ }
+ }
+
+ mem->pd = pd;
+ mem->umem = umem;
+ mem->access = access;
+ mem->length = length;
+ mem->iova = iova;
+ mem->va = start;
+ mem->offset = umem->offset;
+ mem->state = RXE_MEM_STATE_VALID;
+ mem->type = RXE_MEM_TYPE_MR;
+
+ return 0;
+
+err1:
+ return err;
+}
+
+int rxe_mem_init_fast(struct rxe_dev *rxe, struct rxe_pd *pd,
+ int max_pages, struct rxe_mem *mem)
+{
+ int err;
+
+ rxe_mem_init(0, mem); /* TODO what access does this have */
+
+ err = rxe_mem_alloc(rxe, mem, max_pages);
+ if (err)
+ goto err1;
+
+ /* TODO what page size do we assume */
+
+ mem->pd = pd;
+ mem->max_buf = max_pages;
+ mem->state = RXE_MEM_STATE_FREE;
+ mem->type = RXE_MEM_TYPE_MR;
+
+ return 0;
+
+err1:
+ return err;
+}
+
+int rxe_mem_init_mw(struct rxe_dev *rxe, struct rxe_pd *pd,
+ struct rxe_mem *mem)
+{
+ rxe_mem_init(0, mem);
+
+ mem->pd = pd;
+ mem->state = RXE_MEM_STATE_FREE;
+ mem->type = RXE_MEM_TYPE_MW;
+
+ return 0;
+}
+
+int rxe_mem_init_fmr(struct rxe_dev *rxe, struct rxe_pd *pd, int access,
+ struct ib_fmr_attr *attr, struct rxe_mem *mem)
+{
+ int err;
+
+ if (attr->max_maps > rxe->attr.max_map_per_fmr) {
+ pr_warn("max_mmaps = %d too big, max_map_per_fmr = %d\n",
+ attr->max_maps, rxe->attr.max_map_per_fmr);
+ err = -EINVAL;
+ goto err1;
+ }
+
+ rxe_mem_init(access, mem);
+
+ err = rxe_mem_alloc(rxe, mem, attr->max_pages);
+ if (err)
+ goto err1;
+
+ mem->pd = pd;
+ mem->access = access;
+ mem->page_shift = attr->page_shift;
+ mem->page_mask = (1 << attr->page_shift) - 1;
+ mem->max_buf = attr->max_pages;
+ mem->state = RXE_MEM_STATE_FREE;
+ mem->type = RXE_MEM_TYPE_FMR;
+
+ return 0;
+
+err1:
+ return err;
+}
+
+static void lookup_iova(
+ struct rxe_mem *mem,
+ u64 iova,
+ int *m_out,
+ int *n_out,
+ size_t *offset_out)
+{
+ size_t offset = iova - mem->iova + mem->offset;
+ int map_index;
+ int buf_index;
+ u64 length;
+
+ if (likely(mem->page_shift)) {
+ *offset_out = offset & mem->page_mask;
+ offset >>= mem->page_shift;
+ *n_out = offset & mem->map_mask;
+ *m_out = offset >> mem->map_shift;
+ } else {
+ map_index = 0;
+ buf_index = 0;
+
+ length = mem->map[map_index]->buf[buf_index].size;
+
+ while (offset >= length) {
+ offset -= length;
+ buf_index++;
+
+ if (buf_index == RXE_BUF_PER_MAP) {
+ map_index++;
+ buf_index = 0;
+ }
+ length = mem->map[map_index]->buf[buf_index].size;
+ }
+
+ *m_out = map_index;
+ *n_out = buf_index;
+ *offset_out = offset;
+ }
+}
+
+void *iova_to_vaddr(struct rxe_mem *mem, u64 iova, int length)
+{
+ size_t offset;
+ int m, n;
+ void *addr;
+
+ if (mem->state != RXE_MEM_STATE_VALID) {
+ pr_warn("mem not in valid state\n");
+ addr = NULL;
+ goto out;
+ }
+
+ if (!mem->map) {
+ addr = (void *)(uintptr_t)iova;
+ goto out;
+ }
+
+ if (mem_check_range(mem, iova, length)) {
+ pr_warn("range violation\n");
+ addr = NULL;
+ goto out;
+ }
+
+ lookup_iova(mem, iova, &m, &n, &offset);
+
+ if (offset + length > mem->map[m]->buf[n].size) {
+ pr_warn("crosses page boundary\n");
+ addr = NULL;
+ goto out;
+ }
+
+ addr = (void *)(uintptr_t)mem->map[m]->buf[n].addr + offset;
+
+out:
+ return addr;
+}
+
+/* copy data from a range (vaddr, vaddr+length-1) to or from
+ a mem object starting at iova. Compute incremental value of
+ crc32 if crcp is not zero. caller must hold a reference to mem */
+int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length,
+ enum copy_direction dir, __be32 *crcp)
+{
+ int err;
+ int bytes;
+ u8 *va;
+ struct rxe_map **map;
+ struct ib_phys_buf *buf;
+ int m;
+ int i;
+ size_t offset;
+ __be32 crc = crcp ? (*crcp) : 0;
+
+ if (mem->type == RXE_MEM_TYPE_DMA) {
+ uint8_t *src, *dest;
+
+ src = (dir == direction_in) ?
+ addr : ((void *)(uintptr_t)iova);
+
+ dest = (dir == direction_in) ?
+ ((void *)(uintptr_t)iova) : addr;
+
+ if (!crcp)
+ memcpy(dest, src, length);
+ else
+ *crcp = sb8_copy(dest, src, length, *crcp);
+
+ return 0;
+ }
+
+ BUG_ON(!mem->map);
+
+ err = mem_check_range(mem, iova, length);
+ if (err) {
+ err = -EFAULT;
+ goto err1;
+ }
+
+ lookup_iova(mem, iova, &m, &i, &offset);
+
+ map = mem->map + m;
+ buf = map[0]->buf + i;
+
+ while (length > 0) {
+ uint8_t *src, *dest;
+
+ va = (u8 *)(uintptr_t)buf->addr + offset;
+ src = (dir == direction_in) ? addr : va;
+ dest = (dir == direction_in) ? va : addr;
+
+ bytes = buf->size - offset;
+
+ if (bytes > length)
+ bytes = length;
+
+ if (!crcp)
+ memcpy(dest, src, bytes);
+ else
+ crc = sb8_copy(dest, src, bytes, crc);
+
+ length -= bytes;
+ addr += bytes;
+
+ offset = 0;
+ buf++;
+ i++;
+
+ if (i == RXE_BUF_PER_MAP) {
+ i = 0;
+ map++;
+ buf = map[0]->buf;
+ }
+ }
+
+ if (crcp)
+ *crcp = crc;
+
+ return 0;
+
+err1:
+ return err;
+}
+
+/* copy data in or out of a wqe, i.e. sg list
+ under the control of a dma descriptor */
+int copy_data(
+ struct rxe_dev *rxe,
+ struct rxe_pd *pd,
+ int access,
+ struct rxe_dma_info *dma,
+ void *addr,
+ int length,
+ enum copy_direction dir,
+ __be32 *crcp)
+{
+ int bytes;
+ struct ib_sge *sge = &dma->sge[dma->cur_sge];
+ int offset = dma->sge_offset;
+ int resid = dma->resid;
+ struct rxe_mem *mem = NULL;
+ u64 iova;
+ int err;
+
+ if (length == 0)
+ return 0;
+
+ if (length > resid) {
+ err = -EINVAL;
+ goto err2;
+ }
+
+ if (sge->length && (offset < sge->length)) {
+ mem = lookup_mem(pd, access, sge->lkey, lookup_local);
+ if (!mem) {
+ err = -EINVAL;
+ goto err1;
+ }
+ }
+
+ while (length > 0) {
+ bytes = length;
+
+ if (offset >= sge->length) {
+ if (mem) {
+ rxe_drop_ref(mem);
+ mem = NULL;
+ }
+ sge++;
+ dma->cur_sge++;
+ offset = 0;
+
+ if (dma->cur_sge >= dma->num_sge) {
+ err = -ENOSPC;
+ goto err2;
+ }
+
+ if (sge->length) {
+ mem = lookup_mem(pd, access, sge->lkey,
+ lookup_local);
+ if (!mem) {
+ err = -EINVAL;
+ goto err1;
+ }
+ } else
+ continue;
+ }
+
+ if (bytes > sge->length - offset)
+ bytes = sge->length - offset;
+
+ if (bytes > 0) {
+ iova = sge->addr + offset;
+
+ err = rxe_mem_copy(mem, iova, addr, bytes, dir, crcp);
+ if (err)
+ goto err2;
+
+ offset += bytes;
+ resid -= bytes;
+ length -= bytes;
+ addr += bytes;
+ }
+ }
+
+ dma->sge_offset = offset;
+ dma->resid = resid;
+
+ if (mem)
+ rxe_drop_ref(mem);
+
+ return 0;
+
+err2:
+ if (mem)
+ rxe_drop_ref(mem);
+err1:
+ return err;
+}
+
+int advance_dma_data(struct rxe_dma_info *dma, unsigned int length)
+{
+ struct ib_sge *sge = &dma->sge[dma->cur_sge];
+ int offset = dma->sge_offset;
+ int resid = dma->resid;
+
+ while (length) {
+ unsigned int bytes;
+
+ if (offset >= sge->length) {
+ sge++;
+ dma->cur_sge++;
+ offset = 0;
+ if (dma->cur_sge >= dma->num_sge)
+ return -ENOSPC;
+ }
+
+ bytes = length;
+
+ if (bytes > sge->length - offset)
+ bytes = sge->length - offset;
+
+ offset += bytes;
+ resid -= bytes;
+ length -= bytes;
+ }
+
+ dma->sge_offset = offset;
+ dma->resid = resid;
+
+ return 0;
+}
+
+/* (1) find the mem (mr, fmr or mw) corresponding to lkey/rkey
+ depending on lookup_type
+ (2) verify that the (qp) pd matches the mem pd
+ (3) verify that the mem can support the requested access
+ (4) verify that mem state is valid */
+struct rxe_mem *lookup_mem(struct rxe_pd *pd, int access, u32 key,
+ enum lookup_type type)
+{
+ struct rxe_mem *mem;
+ struct rxe_dev *rxe = to_rdev(pd->ibpd.device);
+ int index = key >> 8;
+
+ if (index >= RXE_MIN_MR_INDEX && index <= RXE_MAX_MR_INDEX) {
+ mem = rxe_pool_get_index(&rxe->mr_pool, index);
+ if (!mem)
+ goto err1;
+ } else if (index >= RXE_MIN_FMR_INDEX && index <= RXE_MAX_FMR_INDEX) {
+ mem = rxe_pool_get_index(&rxe->fmr_pool, index);
+ if (!mem)
+ goto err1;
+ } else if (index >= RXE_MIN_MW_INDEX && index <= RXE_MAX_MW_INDEX) {
+ mem = rxe_pool_get_index(&rxe->mw_pool, index);
+ if (!mem)
+ goto err1;
+ } else
+ goto err1;
+
+ if ((type == lookup_local && mem->lkey != key)
+ || (type == lookup_remote && mem->rkey != key))
+ goto err2;
+
+ if (mem->pd != pd)
+ goto err2;
+
+ if (access && !(access & mem->access))
+ goto err2;
+
+ if (mem->state != RXE_MEM_STATE_VALID)
+ goto err2;
+
+ return mem;
+
+err2:
+ rxe_drop_ref(mem);
+err1:
+ return NULL;
+}
+
+int rxe_mem_map_pages(struct rxe_dev *rxe, struct rxe_mem *mem,
+ u64 *page, int num_pages, u64 iova)
+{
+ int i;
+ int num_buf;
+ int err;
+ struct rxe_map **map;
+ struct ib_phys_buf *buf;
+ int page_size;
+
+ if (num_pages > mem->max_buf) {
+ err = -EINVAL;
+ goto err1;
+ }
+
+ num_buf = 0;
+ page_size = 1 << mem->page_shift;
+ map = mem->map;
+ buf = map[0]->buf;
+
+ for (i = 0; i < num_pages; i++) {
+ buf->addr = *page++;
+ buf->size = page_size;
+ buf++;
+ num_buf++;
+
+ if (num_buf == RXE_BUF_PER_MAP) {
+ map++;
+ buf = map[0]->buf;
+ num_buf = 0;
+ }
+ }
+
+ mem->iova = iova;
+ mem->va = iova;
+ mem->length = num_pages << mem->page_shift;
+ mem->state = RXE_MEM_STATE_VALID;
+
+ return 0;
+
+err1:
+ return err;
+}
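
For reference, a minimal sketch (illustrative only, not part of the patch) of how a
responder can combine lookup_mem() and rxe_mem_copy() above to land an incoming
RDMA WRITE payload; the wrapper function and its arguments are made up:

static int example_write_payload(struct rxe_pd *pd, u32 rkey, u64 iova,
				 void *payload, int len)
{
	struct rxe_mem *mem;
	int err;

	/* checks the key, pd, access rights and memory state and
	   takes a reference on success */
	mem = lookup_mem(pd, IB_ACCESS_REMOTE_WRITE, rkey, lookup_remote);
	if (!mem)
		return -EINVAL;

	/* direction_in copies from payload into the region; a NULL crcp
	   skips the incremental crc32 and uses a plain memcpy */
	err = rxe_mem_copy(mem, iova, payload, len, direction_in, NULL);

	rxe_drop_ref(mem);	/* drop the reference taken by lookup_mem */
	return err;
}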
* [patch 28/44] rxe_mcast.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (26 preceding siblings ...)
2011-07-01 13:18 ` [patch 27/44] rxe_mr.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 29/44] rxe_mcast.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (16 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch28 --]
[-- Type: text/plain, Size: 2721 bytes --]
Declarations for the multicast implementation.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_mcast.h | 52 ++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_mcast.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_mcast.h
@@ -0,0 +1,52 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* multicast implementation details */
+
+#ifndef RXE_MCAST_H
+#define RXE_MCAST_H
+
+int rxe_mcast_get_grp(struct rxe_dev *rxe, union ib_gid *mgid, u16 mlid,
+ struct rxe_mc_grp **grp_p);
+
+int rxe_mcast_add_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct rxe_mc_grp *grp);
+
+int rxe_mcast_drop_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp,
+ union ib_gid *mgid, u16 mlid);
+
+void rxe_drop_all_mcast_groups(struct rxe_qp *qp);
+
+void rxe_mc_cleanup(void *arg);
+
+#endif /* RXE_MCAST_H */
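
For context, a minimal sketch (illustrative only, not part of the patch) of how the
attach path is expected to use these helpers; the wrapper name is made up, and
rxe_drop_ref() comes from the pool code elsewhere in the series:

static int example_attach_mcast(struct rxe_dev *rxe, struct rxe_qp *qp,
				union ib_gid *mgid, u16 mlid)
{
	struct rxe_mc_grp *grp;
	int err;

	/* find an existing group for mgid or create and register a new one */
	err = rxe_mcast_get_grp(rxe, mgid, mlid, &grp);
	if (err)
		return err;

	/* link the qp into the group unless it is already a member */
	err = rxe_mcast_add_grp_elem(rxe, qp, grp);

	rxe_drop_ref(grp);	/* drop the reference taken by get_grp */
	return err;
}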
* [patch 29/44] rxe_mcast.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (27 preceding siblings ...)
2011-07-01 13:18 ` [patch 28/44] rxe_mcast.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 30/44] rxe_recv.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (15 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch29 --]
[-- Type: text/plain, Size: 5445 bytes --]
Multicast implementation details.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_mcast.c | 192 ++++++++++++++++++++++++++++++++++
1 file changed, 192 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_mcast.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_mcast.c
@@ -0,0 +1,192 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* multicast implementation details */
+
+#include "rxe.h"
+#include "rxe_mcast.h"
+
+int rxe_mcast_get_grp(struct rxe_dev *rxe, union ib_gid *mgid, u16 mlid,
+ struct rxe_mc_grp **grp_p)
+{
+ int err;
+ struct rxe_mc_grp *grp;
+
+ if (rxe->attr.max_mcast_qp_attach == 0) {
+ err = -EINVAL;
+ goto err1;
+ }
+
+ grp = rxe_pool_get_key(&rxe->mc_grp_pool, mgid);
+ if (grp)
+ goto done;
+
+ grp = rxe_alloc(&rxe->mc_grp_pool);
+ if (!grp) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ INIT_LIST_HEAD(&grp->qp_list);
+ spin_lock_init(&grp->mcg_lock);
+ grp->mlid = mlid;
+ grp->rxe = rxe;
+
+ err = rxe->ifc_ops->mcast_add(rxe, mgid);
+ if (err)
+ goto err2;
+
+ rxe_add_key(grp, mgid);
+done:
+ *grp_p = grp;
+ return 0;
+
+err2:
+ rxe_drop_ref(grp);
+err1:
+ return err;
+}
+
+int rxe_mcast_add_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct rxe_mc_grp *grp)
+{
+ int err;
+ struct rxe_mc_elem *elem;
+
+	/* check to see if the qp is already a member of the group */
+ spin_lock_bh(&qp->grp_lock);
+ spin_lock_bh(&grp->mcg_lock);
+ list_for_each_entry(elem, &grp->qp_list, qp_list) {
+ if (elem->qp == qp) {
+ err = 0;
+ goto out;
+ }
+ }
+
+ if (grp->num_qp >= rxe->attr.max_mcast_qp_attach) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ elem = rxe_alloc(&rxe->mc_elem_pool);
+ if (!elem) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* each qp holds a ref on the grp */
+ rxe_add_ref(grp);
+
+ grp->num_qp++;
+ elem->qp = qp;
+ elem->grp = grp;
+
+ list_add(&elem->qp_list, &grp->qp_list);
+ list_add(&elem->grp_list, &qp->grp_list);
+
+ err = 0;
+out:
+ spin_unlock_bh(&grp->mcg_lock);
+ spin_unlock_bh(&qp->grp_lock);
+ return err;
+}
+
+int rxe_mcast_drop_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp,
+ union ib_gid *mgid, u16 mlid)
+{
+ struct rxe_mc_grp *grp;
+ struct rxe_mc_elem *elem, *tmp;
+
+ grp = rxe_pool_get_key(&rxe->mc_grp_pool, mgid);
+ if (!grp)
+ goto err1;
+
+ spin_lock_bh(&qp->grp_lock);
+ spin_lock_bh(&grp->mcg_lock);
+
+ list_for_each_entry_safe(elem, tmp, &grp->qp_list, qp_list) {
+ if (elem->qp == qp) {
+ list_del(&elem->qp_list);
+ list_del(&elem->grp_list);
+ grp->num_qp--;
+
+ spin_unlock_bh(&grp->mcg_lock);
+ spin_unlock_bh(&qp->grp_lock);
+ rxe_drop_ref(elem);
+ rxe_drop_ref(grp); /* ref held by QP */
+ rxe_drop_ref(grp); /* ref from get_key */
+ return 0;
+ }
+ }
+
+ spin_unlock_bh(&grp->mcg_lock);
+ spin_unlock_bh(&qp->grp_lock);
+ rxe_drop_ref(grp); /* ref from get_key */
+err1:
+ return -EINVAL;
+}
+
+void rxe_drop_all_mcast_groups(struct rxe_qp *qp)
+{
+ struct rxe_mc_grp *grp;
+ struct rxe_mc_elem *elem;
+
+ while (1) {
+ spin_lock_bh(&qp->grp_lock);
+ if (list_empty(&qp->grp_list)) {
+ spin_unlock_bh(&qp->grp_lock);
+ break;
+ }
+ elem = list_first_entry(&qp->grp_list, struct rxe_mc_elem,
+ grp_list);
+ list_del(&elem->grp_list);
+ spin_unlock_bh(&qp->grp_lock);
+
+ grp = elem->grp;
+ spin_lock_bh(&grp->mcg_lock);
+ list_del(&elem->qp_list);
+ grp->num_qp--;
+ spin_unlock_bh(&grp->mcg_lock);
+ rxe_drop_ref(grp);
+ rxe_drop_ref(elem);
+ }
+}
+
+void rxe_mc_cleanup(void *arg)
+{
+ struct rxe_mc_grp *grp = arg;
+ struct rxe_dev *rxe = grp->rxe;
+
+ rxe_drop_key(grp);
+ rxe->ifc_ops->mcast_delete(rxe, &grp->mgid);
+}
* [patch 30/44] rxe_recv.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (28 preceding siblings ...)
2011-07-01 13:18 ` [patch 29/44] rxe_mcast.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 31/44] rxe_comp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (14 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch30 --]
[-- Type: text/plain, Size: 11042 bytes --]
Handles receiving new packets, which are dispatched to either the
responder (request packets) or the completer (response packets).
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_recv.c | 426 +++++++++++++++++++++++++++++++++++
1 file changed, 426 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_recv.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_recv.c
@@ -0,0 +1,426 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/skbuff.h>
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_qp.h"
+
+static int check_type_state(struct rxe_dev *rxe, struct rxe_pkt_info *pkt,
+ struct rxe_qp *qp)
+{
+ if (unlikely(!qp->valid))
+ goto err1;
+
+ switch (qp_type(qp)) {
+ case IB_QPT_RC:
+ if (unlikely((pkt->opcode >> 5) != 0)) {
+ pr_warn("bad qp type\n");
+ goto err1;
+ }
+ break;
+ case IB_QPT_UC:
+ if (unlikely((pkt->opcode >> 5) != 1)) {
+ pr_warn("bad qp type\n");
+ goto err1;
+ }
+ break;
+ case IB_QPT_UD:
+ case IB_QPT_SMI:
+ case IB_QPT_GSI:
+ if (unlikely((pkt->opcode >> 5) != 3)) {
+ pr_warn("bad qp type\n");
+ goto err1;
+ }
+ break;
+ default:
+ pr_warn("unsupported qp type\n");
+ goto err1;
+ }
+
+ if (pkt->mask & RXE_REQ_MASK) {
+ if (unlikely(qp->resp.state != QP_STATE_READY))
+ goto err1;
+ } else if (unlikely(qp->req.state < QP_STATE_READY ||
+ qp->req.state > QP_STATE_DRAINED))
+ goto err1;
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static int check_keys(struct rxe_dev *rxe, struct rxe_pkt_info *pkt,
+ u32 qpn, struct rxe_qp *qp)
+{
+ int i;
+ int found_pkey = 0;
+ struct rxe_port *port = &rxe->port[pkt->port_num - 1];
+ u16 pkey = bth_pkey(pkt);
+
+ pkt->pkey_index = 0;
+
+ if (qpn == 1) {
+ for (i = 0; i < port->attr.pkey_tbl_len; i++) {
+ if (pkey_match(pkey, port->pkey_tbl[i])) {
+ pkt->pkey_index = i;
+ found_pkey = 1;
+ break;
+ }
+ }
+
+ if (!found_pkey) {
+ pr_warn("bad pkey = 0x%x\n", pkey);
+ spin_lock_bh(&port->port_lock);
+ port->attr.bad_pkey_cntr
+ = (port->attr.bad_pkey_cntr >= 0xffff) ?
+ 0xffff :
+ port->attr.bad_pkey_cntr + 1;
+ spin_unlock_bh(&port->port_lock);
+ goto err1;
+ }
+ } else if (qpn != 0) {
+ if (unlikely(!pkey_match(pkey,
+ port->pkey_tbl[qp->attr.pkey_index]))) {
+			pr_warn("bad pkey = 0x%x\n", pkey);
+ spin_lock_bh(&port->port_lock);
+ port->attr.bad_pkey_cntr
+ = (port->attr.bad_pkey_cntr >= 0xffff) ?
+ 0xffff :
+ port->attr.bad_pkey_cntr + 1;
+ spin_unlock_bh(&port->port_lock);
+ goto err1;
+ }
+ pkt->pkey_index = qp->attr.pkey_index;
+ }
+
+ if (qp_type(qp) == IB_QPT_UD && qpn != 0 && pkt->mask) {
+ u32 qkey = (qpn == 1) ? GSI_QKEY : qp->attr.qkey;
+ if (unlikely(deth_qkey(pkt) != qkey)) {
+ pr_warn("bad qkey, got 0x%x expected 0x%x\n",
+ deth_qkey(pkt), qkey);
+ spin_lock_bh(&port->port_lock);
+ port->attr.qkey_viol_cntr
+ = (port->attr.qkey_viol_cntr >= 0xffff) ?
+ 0xffff :
+ port->attr.qkey_viol_cntr + 1;
+ spin_unlock_bh(&port->port_lock);
+ goto err1;
+ }
+ }
+
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static int check_addr(struct rxe_dev *rxe, struct rxe_pkt_info *pkt,
+ struct rxe_qp *qp)
+{
+ struct rxe_port *port = &rxe->port[pkt->port_num - 1];
+ union ib_gid *sgid;
+ union ib_gid *dgid;
+
+ if (qp_type(qp) != IB_QPT_RC && qp_type(qp) != IB_QPT_UC)
+ goto done;
+
+ if (unlikely(pkt->port_num != qp->attr.port_num)) {
+ pr_warn("port %d != qp port %d\n",
+ pkt->port_num, qp->attr.port_num);
+ goto err1;
+ }
+
+ if ((pkt->mask & RXE_GRH_MASK) == 0) {
+ if (unlikely(qp->pri_av.attr.ah_flags & IB_AH_GRH)) {
+ pr_warn("no grh for global qp\n");
+ goto err1;
+ } else
+ goto done;
+ }
+
+ sgid = grh_sgid(pkt);
+ dgid = grh_dgid(pkt);
+
+ if (unlikely((qp->pri_av.attr.ah_flags & IB_AH_GRH) == 0)) {
+ pr_warn("grh for local qp\n");
+ goto err1;
+ }
+
+ if (unlikely(dgid->global.subnet_prefix == 0 &&
+ be64_to_cpu(dgid->global.interface_id) <= 1)) {
+ pr_warn("bad dgid, subnet_prefix = 0\n");
+ goto err1;
+ }
+
+ if (unlikely(sgid->raw[0] == 0xff)) {
+ pr_warn("bad sgid, multicast gid\n");
+ goto err1;
+ }
+
+ if (unlikely(sgid->global.subnet_prefix == 0 &&
+ be64_to_cpu(sgid->global.interface_id) <= 1)) {
+ pr_warn("bad sgid, subnet prefix = 0 or 1\n");
+ goto err1;
+ }
+
+ if (unlikely(dgid->global.interface_id !=
+ port->guid_tbl[qp->pri_av.attr.grh.sgid_index])) {
+ pr_warn("bad dgid, doesn't match qp\n");
+ goto err1;
+ }
+
+ if (unlikely(sgid->global.interface_id !=
+ qp->pri_av.attr.grh.dgid.global.interface_id)) {
+ pr_warn("bad sgid, doesn't match qp\n");
+ goto err1;
+ }
+done:
+ return 0;
+
+err1:
+ return -EINVAL;
+}
+
+static int hdr_check(struct rxe_pkt_info *pkt)
+{
+ struct rxe_dev *rxe = pkt->rxe;
+ struct rxe_port *port = &rxe->port[pkt->port_num - 1];
+ struct rxe_qp *qp = NULL;
+ union ib_gid *dgid = NULL;
+ u32 qpn = bth_qpn(pkt);
+ int index;
+ int err;
+
+ if (unlikely(bth_tver(pkt) != BTH_TVER)) {
+ pr_warn("bad tver\n");
+ goto err1;
+ }
+
+ if (qpn != IB_MULTICAST_QPN) {
+ index = (qpn == 0) ? port->qp_smi_index :
+ ((qpn == 1) ? port->qp_gsi_index : qpn);
+ qp = rxe_pool_get_index(&rxe->qp_pool, index);
+ if (unlikely(!qp)) {
+ pr_warn("no qp matches qpn 0x%x\n", qpn);
+ goto err1;
+ }
+
+ err = check_type_state(rxe, pkt, qp);
+ if (unlikely(err))
+ goto err2;
+ }
+
+ if (pkt->mask & RXE_GRH_MASK) {
+ dgid = grh_dgid(pkt);
+
+ if (unlikely(grh_next_hdr(pkt) != GRH_RXE_NEXT_HDR)) {
+ pr_warn("bad next hdr\n");
+ goto err2;
+ }
+
+ if (unlikely(grh_ipver(pkt) != GRH_IPV6)) {
+ pr_warn("bad ipver\n");
+ goto err2;
+ }
+ }
+
+ if (qpn != IB_MULTICAST_QPN) {
+ err = check_addr(rxe, pkt, qp);
+ if (unlikely(err))
+ goto err2;
+
+ err = check_keys(rxe, pkt, qpn, qp);
+ if (unlikely(err))
+ goto err2;
+ } else {
+ if (unlikely((pkt->mask & RXE_GRH_MASK) == 0)) {
+ pr_warn("no grh for mcast qpn\n");
+ goto err1;
+ }
+ if (unlikely(dgid->raw[0] != 0xff)) {
+ pr_warn("bad dgid for mcast qpn\n");
+ goto err1;
+ }
+ }
+
+ pkt->qp = qp;
+ return 0;
+
+err2:
+ if (qp)
+ rxe_drop_ref(qp);
+err1:
+ return -EINVAL;
+}
+
+static inline void rxe_rcv_pkt(struct rxe_dev *rxe,
+ struct rxe_pkt_info *pkt,
+ struct sk_buff *skb)
+{
+ if (pkt->mask & RXE_REQ_MASK)
+ rxe_resp_queue_pkt(rxe, pkt->qp, skb);
+ else
+ rxe_comp_queue_pkt(rxe, pkt->qp, skb);
+}
+
+static void rxe_rcv_mcast_pkt(struct rxe_dev *rxe, struct sk_buff *skb)
+{
+ struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);
+ struct rxe_mc_grp *mcg;
+ struct list_head *l;
+ struct sk_buff *skb_copy;
+ struct rxe_mc_elem *mce;
+ struct rxe_qp *qp;
+ int err;
+
+ /* lookup mcast group corresponding to mgid, takes a ref */
+ mcg = rxe_pool_get_key(&rxe->mc_grp_pool, grh_dgid(pkt));
+ if (!mcg)
+ goto err1; /* mcast group not registered */
+
+ spin_lock_bh(&mcg->mcg_lock);
+
+ list_for_each(l, &mcg->qp_list) {
+ mce = container_of(l, struct rxe_mc_elem, qp_list);
+ qp = mce->qp;
+ pkt = SKB_TO_PKT(skb);
+
+ /* validate qp for incoming packet */
+ err = check_type_state(rxe, pkt, qp);
+ if (err)
+ continue;
+
+ err = check_keys(rxe, pkt, bth_qpn(pkt), qp);
+ if (err)
+ continue;
+
+ /* if *not* the last qp in the list
+ make a copy of the skb to post to the next qp */
+ skb_copy = (l->next != &mcg->qp_list) ?
+ skb_clone(skb, GFP_KERNEL) : NULL;
+
+ pkt->qp = qp;
+ rxe_add_ref(qp);
+ rxe_rcv_pkt(rxe, pkt, skb);
+
+ skb = skb_copy;
+ if (!skb)
+ break;
+ }
+
+ spin_unlock_bh(&mcg->mcg_lock);
+
+ rxe_drop_ref(mcg); /* drop ref from rxe_pool_get_key. */
+
+err1:
+ if (skb)
+ kfree_skb(skb);
+}
+
+/* rxe_rcv is called from the interface driver
+ * on entry
+ * pkt->rxe = rdma device
+ * pkt->port_num = rdma device port
+ * For rxe_net:
+ * pkt->mask = RXE_GRH_MASK
+ * pkt->hdr = &grh with no lrh
+ * For IB transport (e.g. rxe_sample)
+ * pkt->mask = RXE_LRH_MASK
+ * pkt->hdr = &lrh with optional grh
+ */
+int rxe_rcv(struct sk_buff *skb)
+{
+ int err;
+ struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);
+ struct rxe_dev *rxe = pkt->rxe;
+
+ pkt->offset = 0;
+
+ if (pkt->mask & RXE_LRH_MASK) {
+ unsigned int length = __lrh_length(pkt->hdr);
+ unsigned int lnh = __lrh_lnh(pkt->hdr);
+
+ if (skb->len < RXE_LRH_BYTES)
+ goto drop;
+
+ if (lnh < LRH_LNH_IBA_LOC)
+ goto drop;
+
+ pkt->paylen = 4*length - RXE_LRH_BYTES;
+ pkt->offset += RXE_LRH_BYTES;
+
+ if (lnh == LRH_LNH_IBA_GBL)
+ pkt->mask |= RXE_GRH_MASK;
+ }
+
+ if (pkt->mask & RXE_GRH_MASK) {
+ if (skb->len < pkt->offset + RXE_GRH_BYTES)
+ goto drop;
+
+ pkt->paylen = __grh_paylen(pkt->hdr);
+ pkt->offset += RXE_GRH_BYTES;
+ }
+
+ if (unlikely(skb->len < pkt->offset + RXE_BTH_BYTES))
+ goto drop;
+
+ pkt->opcode = bth_opcode(pkt);
+ pkt->psn = bth_psn(pkt);
+ pkt->qp = NULL;
+ pkt->mask |= rxe_opcode[pkt->opcode].mask;
+
+ if (unlikely(skb->len < header_size(pkt)))
+ goto drop;
+
+ err = hdr_check(pkt);
+ if (unlikely(err))
+ goto drop;
+
+ if (unlikely(bth_qpn(pkt) == IB_MULTICAST_QPN))
+ rxe_rcv_mcast_pkt(rxe, skb);
+ else
+ rxe_rcv_pkt(rxe, pkt, skb);
+
+ return 0;
+
+drop:
+ if (pkt->qp)
+ rxe_drop_ref(pkt->qp);
+
+ kfree_skb(skb);
+ return 0;
+}
+EXPORT_SYMBOL(rxe_rcv);
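
For reference, a minimal sketch (illustrative only, not part of the patch) of what a
packet IO layer such as ib_rxe_net is expected to fill in before calling rxe_rcv(),
following the comment above rxe_rcv(); the function name and the fixed port number
below are made up:

static void example_deliver(struct rxe_dev *rxe, struct sk_buff *skb)
{
	struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);

	pkt->rxe = rxe;			/* rdma device */
	pkt->port_num = 1;		/* rdma device port */
	pkt->mask = RXE_GRH_MASK;	/* RoCE: header starts at the grh */
	pkt->hdr = skb->data;		/* no lrh present */

	rxe_rcv(skb);			/* consumes the skb on all paths */
}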
* [patch 31/44] rxe_comp.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (29 preceding siblings ...)
2011-07-01 13:18 ` [patch 30/44] rxe_recv.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 32/44] rxe_req.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (13 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch31 --]
[-- Type: text/plain, Size: 19561 bytes --]
Completion processing: handles response packets (acks and read responses)
and generates send work completions.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_comp.c | 734 +++++++++++++++++++++++++++++++++++
1 file changed, 734 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_comp.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_comp.c
@@ -0,0 +1,734 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/skbuff.h>
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_queue.h"
+#include "rxe_cq.h"
+#include "rxe_qp.h"
+#include "rxe_mr.h"
+#include "rxe_task.h"
+
+enum comp_state {
+ COMPST_GET_ACK,
+ COMPST_GET_WQE,
+ COMPST_COMP_WQE,
+ COMPST_COMP_ACK,
+ COMPST_CHECK_PSN,
+ COMPST_CHECK_ACK,
+ COMPST_READ,
+ COMPST_ATOMIC,
+ COMPST_WRITE_SEND,
+ COMPST_UPDATE_COMP,
+ COMPST_ERROR_RETRY,
+ COMPST_RNR_RETRY,
+ COMPST_ERROR,
+ COMPST_EXIT,
+ COMPST_DONE,
+};
+
+static char *comp_state_name[] = {
+ [COMPST_GET_ACK] = "GET ACK",
+ [COMPST_GET_WQE] = "GET WQE",
+ [COMPST_COMP_WQE] = "COMP WQE",
+ [COMPST_COMP_ACK] = "COMP ACK",
+ [COMPST_CHECK_PSN] = "CHECK PSN",
+ [COMPST_CHECK_ACK] = "CHECK ACK",
+ [COMPST_READ] = "READ",
+ [COMPST_ATOMIC] = "ATOMIC",
+ [COMPST_WRITE_SEND] = "WRITE/SEND",
+ [COMPST_UPDATE_COMP] = "UPDATE COMP",
+ [COMPST_ERROR_RETRY] = "ERROR RETRY",
+ [COMPST_RNR_RETRY] = "RNR RETRY",
+ [COMPST_ERROR] = "ERROR",
+ [COMPST_EXIT] = "EXIT",
+ [COMPST_DONE] = "DONE",
+};
+
+static unsigned long rnrnak_usec[32] = {
+ [IB_RNR_TIMER_655_36] = 655360,
+ [IB_RNR_TIMER_000_01] = 10,
+ [IB_RNR_TIMER_000_02] = 20,
+ [IB_RNR_TIMER_000_03] = 30,
+ [IB_RNR_TIMER_000_04] = 40,
+ [IB_RNR_TIMER_000_06] = 60,
+ [IB_RNR_TIMER_000_08] = 80,
+ [IB_RNR_TIMER_000_12] = 120,
+ [IB_RNR_TIMER_000_16] = 160,
+ [IB_RNR_TIMER_000_24] = 240,
+ [IB_RNR_TIMER_000_32] = 320,
+ [IB_RNR_TIMER_000_48] = 480,
+ [IB_RNR_TIMER_000_64] = 640,
+ [IB_RNR_TIMER_000_96] = 960,
+ [IB_RNR_TIMER_001_28] = 1280,
+ [IB_RNR_TIMER_001_92] = 1920,
+ [IB_RNR_TIMER_002_56] = 2560,
+ [IB_RNR_TIMER_003_84] = 3840,
+ [IB_RNR_TIMER_005_12] = 5120,
+ [IB_RNR_TIMER_007_68] = 7680,
+ [IB_RNR_TIMER_010_24] = 10240,
+ [IB_RNR_TIMER_015_36] = 15360,
+ [IB_RNR_TIMER_020_48] = 20480,
+ [IB_RNR_TIMER_030_72] = 30720,
+ [IB_RNR_TIMER_040_96] = 40960,
+	[IB_RNR_TIMER_061_44]		= 61440,
+ [IB_RNR_TIMER_081_92] = 81920,
+ [IB_RNR_TIMER_122_88] = 122880,
+ [IB_RNR_TIMER_163_84] = 163840,
+ [IB_RNR_TIMER_245_76] = 245760,
+ [IB_RNR_TIMER_327_68] = 327680,
+ [IB_RNR_TIMER_491_52] = 491520,
+};
+
+static inline unsigned long rnrnak_jiffies(u8 timeout)
+{
+ return max_t(unsigned long,
+ usecs_to_jiffies(rnrnak_usec[timeout]), 1);
+}
+
+static enum ib_wc_opcode wr_to_wc_opcode(enum ib_wr_opcode opcode)
+{
+ switch (opcode) {
+ case IB_WR_RDMA_WRITE: return IB_WC_RDMA_WRITE;
+ case IB_WR_RDMA_WRITE_WITH_IMM: return IB_WC_RDMA_WRITE;
+ case IB_WR_SEND: return IB_WC_SEND;
+ case IB_WR_SEND_WITH_IMM: return IB_WC_SEND;
+ case IB_WR_RDMA_READ: return IB_WC_RDMA_READ;
+ case IB_WR_ATOMIC_CMP_AND_SWP: return IB_WC_COMP_SWAP;
+ case IB_WR_ATOMIC_FETCH_AND_ADD: return IB_WC_FETCH_ADD;
+ case IB_WR_LSO: return IB_WC_LSO;
+ case IB_WR_SEND_WITH_INV: return IB_WC_SEND;
+ case IB_WR_RDMA_READ_WITH_INV: return IB_WC_RDMA_READ;
+ case IB_WR_LOCAL_INV: return IB_WC_LOCAL_INV;
+ case IB_WR_FAST_REG_MR: return IB_WC_FAST_REG_MR;
+
+ default:
+ return 0xff;
+ }
+}
+
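+/* retransmit timer callback; if the QP is still valid, flag a timeout and run the completer task */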
+void retransmit_timer(unsigned long data)
+{
+ struct rxe_qp *qp = (struct rxe_qp *)data;
+
+ if (qp->valid) {
+ qp->comp.timeout = 1;
+ rxe_run_task(&qp->comp.task, 1);
+ }
+}
+
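+/* queue a response packet for the completer and run the completer task */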
+void rxe_comp_queue_pkt(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct sk_buff *skb)
+{
+ int must_sched;
+
+ atomic_inc(&rxe->resp_skb_in);
+ skb_queue_tail(&qp->resp_pkts, skb);
+
+ must_sched = skb_queue_len(&qp->resp_pkts) > 1;
+ rxe_run_task(&qp->comp.task, must_sched);
+}
+
+static inline enum comp_state get_wqe(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt,
+ struct rxe_send_wqe **wqe_p)
+{
+ struct rxe_send_wqe *wqe;
+
+ /* we come here whether or not we found a response packet
+ to see if there are any posted WQEs */
+ wqe = queue_head(qp->sq.queue);
+ *wqe_p = wqe;
+
+ /* no WQE or requester has not started it yet */
+ if (!wqe || wqe->state == wqe_state_posted)
+ return pkt ? COMPST_DONE : COMPST_EXIT;
+
+ /* WQE does not require an ack */
+ if (wqe->state == wqe_state_done)
+ return COMPST_COMP_WQE;
+
+ /* WQE caused an error */
+ if (wqe->state == wqe_state_error)
+ return COMPST_ERROR;
+
+ /* we have a WQE, if we also have an ack check its PSN */
+ return pkt ? COMPST_CHECK_PSN : COMPST_EXIT;
+}
+
+static inline void reset_retry_counters(struct rxe_qp *qp)
+{
+ qp->comp.retry_cnt = qp->attr.retry_cnt;
+ qp->comp.rnr_retry = qp->attr.rnr_retry;
+}
+
+static inline enum comp_state check_psn(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt,
+ struct rxe_send_wqe *wqe)
+{
+ s32 diff;
+
+ /* check to see if the response is past the oldest WQE.
+    if it is, complete send/write or error read/atomic */
+ diff = psn_compare(pkt->psn, wqe->last_psn);
+ if (diff > 0) {
+ if (wqe->state == wqe_state_pending) {
+ if (wqe->mask & WR_ATOMIC_OR_READ_MASK)
+ return COMPST_ERROR_RETRY;
+ else {
+ reset_retry_counters(qp);
+ return COMPST_COMP_WQE;
+ }
+ } else
+ return COMPST_DONE;
+ }
+
+ /* compare response packet to expected response */
+ diff = psn_compare(pkt->psn, qp->comp.psn);
+ if (diff < 0) {
+ /* response is most likely a retried packet.
+    if it matches an uncompleted WQE complete it,
+    else ignore it */
+ if (pkt->psn == wqe->last_psn)
+ return COMPST_COMP_ACK;
+ else
+ return COMPST_DONE;
+ } else if ((diff > 0) && (wqe->mask & WR_ATOMIC_OR_READ_MASK))
+ return COMPST_ERROR_RETRY;
+ else
+ return COMPST_CHECK_ACK;
+}
+
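+/* examine a received ack or read response and decide the next completer state for the current WQE */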
+static inline enum comp_state check_ack(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt,
+ struct rxe_send_wqe *wqe)
+{
+ unsigned int mask = pkt->mask;
+ u8 syn;
+
+ /* Check the sequence only */
+ switch (qp->comp.opcode) {
+ case -1:
+ /* Will catch all *_ONLY cases. */
+ if (!(mask & RXE_START_MASK))
+ /* TODO check spec. retry/discard ? */
+ return COMPST_ERROR;
+
+ break;
+
+ case IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST:
+ case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE:
+ if (pkt->opcode != IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE &&
+ pkt->opcode != IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST) {
+ /* TODO check spec. retry/discard ? */
+ return COMPST_ERROR;
+ }
+ break;
+ default:
+ WARN_ON(1);
+ }
+
+ /* Check operation validity. */
+ switch (pkt->opcode) {
+ case IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST:
+ case IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST:
+ case IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY:
+ syn = aeth_syn(pkt);
+
+ if ((syn & AETH_TYPE_MASK) != AETH_ACK)
+ return COMPST_ERROR;
+
+ /* Fall through (IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE
+ * doesn't have an AETH) */
+ case IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE:
+ if (wqe->ibwr.opcode != IB_WR_RDMA_READ &&
+ wqe->ibwr.opcode != IB_WR_RDMA_READ_WITH_INV) {
+ /* TODO check spec. retry/discard ? */
+ return COMPST_ERROR;
+ }
+ reset_retry_counters(qp);
+ return COMPST_READ;
+ break;
+
+ case IB_OPCODE_RC_ATOMIC_ACKNOWLEDGE:
+ syn = aeth_syn(pkt);
+
+ if ((syn & AETH_TYPE_MASK) != AETH_ACK)
+ return COMPST_ERROR;
+
+ if (wqe->ibwr.opcode != IB_WR_ATOMIC_CMP_AND_SWP &&
+ wqe->ibwr.opcode != IB_WR_ATOMIC_FETCH_AND_ADD)
+ /* TODO check spec. retry/discard ? */
+ return COMPST_ERROR;
+ reset_retry_counters(qp);
+ return COMPST_ATOMIC;
+ break;
+
+ case IB_OPCODE_RC_ACKNOWLEDGE:
+ syn = aeth_syn(pkt);
+ switch (syn & AETH_TYPE_MASK) {
+ case AETH_ACK:
+ reset_retry_counters(qp);
+ return COMPST_WRITE_SEND;
+
+ case AETH_RNR_NAK:
+ return COMPST_RNR_RETRY;
+
+ case AETH_NAK:
+ switch (syn) {
+ case AETH_NAK_PSN_SEQ_ERROR:
+ /* a nak implicitly acks all
+ packets with psns before */
+ if (psn_compare(pkt->psn, qp->comp.psn) > 0) {
+ qp->comp.psn = pkt->psn;
+ if (qp->req.wait_psn) {
+ qp->req.wait_psn = 0;
+ rxe_run_task(&qp->req.task, 1);
+ }
+ }
+ return COMPST_ERROR_RETRY;
+
+ case AETH_NAK_INVALID_REQ:
+ wqe->status = IB_WC_REM_INV_REQ_ERR;
+ return COMPST_ERROR;
+
+ case AETH_NAK_REM_ACC_ERR:
+ wqe->status = IB_WC_REM_ACCESS_ERR;
+ return COMPST_ERROR;
+
+ case AETH_NAK_REM_OP_ERR:
+ wqe->status = IB_WC_REM_OP_ERR;
+ return COMPST_ERROR;
+
+ default:
+ pr_warn("unexpected nak %x\n", syn);
+ wqe->status = IB_WC_REM_OP_ERR;
+ return COMPST_ERROR;
+ }
+ return COMPST_ERROR;
+
+ default:
+ return COMPST_ERROR;
+
+ }
+ break;
+
+ default:
+ WARN_ON(1);
+ }
+
+ WARN_ON(1);
+ return COMPST_ERROR;
+}
+
+static inline enum comp_state do_read(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt,
+ struct rxe_send_wqe *wqe)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ int ret;
+
+ ret = copy_data(rxe, qp->pd, IB_ACCESS_LOCAL_WRITE,
+ &wqe->dma, payload_addr(pkt),
+ payload_size(pkt), direction_in, NULL);
+ if (ret)
+ return COMPST_ERROR;
+
+ if (wqe->dma.resid == 0 && (pkt->mask & RXE_END_MASK))
+ return COMPST_COMP_ACK;
+ else
+ return COMPST_UPDATE_COMP;
+}
+
+static inline enum comp_state do_atomic(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt,
+ struct rxe_send_wqe *wqe)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ int ret;
+
+ u64 atomic_orig = atmack_orig(pkt);
+
+ ret = copy_data(rxe, qp->pd, IB_ACCESS_LOCAL_WRITE,
+ &wqe->dma, &atomic_orig,
+ sizeof(u64), direction_in, NULL);
+ if (ret)
+ return COMPST_ERROR;
+ else
+ return COMPST_COMP_ACK;
+}
+
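+/* fill in a CQE for a send WQE; kernel and user space completions use different WC layouts */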
+static void make_send_cqe(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ struct rxe_cqe *cqe)
+{
+ memset(cqe, 0, sizeof(*cqe));
+
+ if (!qp->is_user) {
+ struct ib_wc *wc = &cqe->ibwc;
+
+ wc->wr_id = wqe->ibwr.wr_id;
+ wc->status = wqe->status;
+ wc->opcode = wr_to_wc_opcode(wqe->ibwr.opcode);
+ wc->byte_len = wqe->dma.length;
+ wc->qp = &qp->ibqp;
+ } else {
+ struct ib_uverbs_wc *uwc = &cqe->uibwc;
+
+ uwc->wr_id = wqe->ibwr.wr_id;
+ uwc->status = wqe->status;
+ uwc->opcode = wr_to_wc_opcode(wqe->ibwr.opcode);
+ uwc->byte_len = wqe->dma.length;
+ uwc->qp_num = qp->ibqp.qp_num;
+ }
+}
+
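+/* retire the WQE from the send queue, posting a CQE if the WQE is signaled or the QP signals all WRs */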
+static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
+{
+ struct rxe_cqe cqe;
+
+ if ((qp->sq_sig_type == IB_SIGNAL_ALL_WR) ||
+ (wqe->ibwr.send_flags & IB_SEND_SIGNALED)) {
+ make_send_cqe(qp, wqe, &cqe);
+ rxe_cq_post(qp->scq, &cqe, 0);
+ }
+
+ advance_consumer(qp->sq.queue);
+
+ /*
+ * we completed something so let req run again
+ * if it is trying to fence
+ */
+ if (qp->req.wait_fence) {
+ qp->req.wait_fence = 0;
+ rxe_run_task(&qp->req.task, 1);
+ }
+}
+
+static inline enum comp_state complete_ack(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt,
+ struct rxe_send_wqe *wqe)
+{
+ unsigned long flags;
+
+ if (wqe->has_rd_atomic) {
+ wqe->has_rd_atomic = 0;
+ atomic_inc(&qp->req.rd_atomic);
+ if (qp->req.need_rd_atomic) {
+ qp->comp.timeout_retry = 0;
+ qp->req.need_rd_atomic = 0;
+ rxe_run_task(&qp->req.task, 1);
+ }
+ }
+
+ if (unlikely(qp->req.state == QP_STATE_DRAIN)) {
+ /* state_lock used by requester & completer */
+ spin_lock_irqsave(&qp->state_lock, flags);
+ if ((qp->req.state == QP_STATE_DRAIN) &&
+ (qp->comp.psn == qp->req.psn)) {
+ qp->req.state = QP_STATE_DRAINED;
+ spin_unlock_irqrestore(&qp->state_lock, flags);
+
+ if (qp->ibqp.event_handler) {
+ struct ib_event ev;
+
+ ev.device = qp->ibqp.device;
+ ev.element.qp = &qp->ibqp;
+ ev.event = IB_EVENT_SQ_DRAINED;
+ qp->ibqp.event_handler(&ev,
+ qp->ibqp.qp_context);
+ }
+ } else
+ spin_unlock_irqrestore(&qp->state_lock, flags);
+ }
+
+ do_complete(qp, wqe);
+
+ if (psn_compare(pkt->psn, qp->comp.psn) >= 0)
+ return COMPST_UPDATE_COMP;
+ else
+ return COMPST_DONE;
+}
+
+static inline enum comp_state complete_wqe(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt,
+ struct rxe_send_wqe *wqe)
+{
+ qp->comp.opcode = -1;
+
+ if (pkt) {
+ if (psn_compare(pkt->psn, qp->comp.psn) >= 0)
+ qp->comp.psn = (pkt->psn + 1) & BTH_PSN_MASK;
+
+ if (qp->req.wait_psn) {
+ qp->req.wait_psn = 0;
+ rxe_run_task(&qp->req.task, 1);
+ }
+ }
+
+ do_complete(qp, wqe);
+
+ return COMPST_GET_WQE;
+}
+
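+/* completer task: runs the completion state machine.
+   returns 0 to be called again, -EAGAIN to stop until new work arrives */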
+int rxe_completer(void *arg)
+{
+ struct rxe_qp *qp = (struct rxe_qp *)arg;
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ struct rxe_send_wqe *wqe = wqe;
+ struct sk_buff *skb = NULL;
+ struct rxe_pkt_info *pkt = NULL;
+ enum comp_state state;
+
+ if (!qp->valid) {
+ while ((skb = skb_dequeue(&qp->resp_pkts))) {
+ rxe_drop_ref(qp);
+ kfree_skb(skb);
+ atomic_dec(&rxe->resp_skb_in);
+ }
+ skb = NULL;
+ pkt = NULL;
+
+ while (queue_head(qp->sq.queue))
+ advance_consumer(qp->sq.queue);
+
+ goto exit;
+ }
+
+ if (qp->req.state == QP_STATE_ERROR) {
+ while ((skb = skb_dequeue(&qp->resp_pkts))) {
+ rxe_drop_ref(qp);
+ kfree_skb(skb);
+ atomic_dec(&rxe->resp_skb_in);
+ }
+ skb = NULL;
+ pkt = NULL;
+
+ while ((wqe = queue_head(qp->sq.queue))) {
+ wqe->status = IB_WC_WR_FLUSH_ERR;
+ do_complete(qp, wqe);
+ }
+
+ goto exit;
+ }
+
+ if (qp->req.state == QP_STATE_RESET) {
+ while ((skb = skb_dequeue(&qp->resp_pkts))) {
+ rxe_drop_ref(qp);
+ kfree_skb(skb);
+ atomic_dec(&rxe->resp_skb_in);
+ }
+ skb = NULL;
+ pkt = NULL;
+
+ while (queue_head(qp->sq.queue))
+ advance_consumer(qp->sq.queue);
+
+ goto exit;
+ }
+
+ if (qp->comp.timeout) {
+ qp->comp.timeout_retry = 1;
+ qp->comp.timeout = 0;
+ } else
+ qp->comp.timeout_retry = 0;
+
+ if (qp->req.need_retry)
+ goto exit;
+
+ state = COMPST_GET_ACK;
+
+ while (1) {
+ pr_debug("state = %s\n", comp_state_name[state]);
+ switch (state) {
+ case COMPST_GET_ACK:
+ skb = skb_dequeue(&qp->resp_pkts);
+ if (skb) {
+ pkt = SKB_TO_PKT(skb);
+ qp->comp.timeout_retry = 0;
+ }
+ state = COMPST_GET_WQE;
+ break;
+
+ case COMPST_GET_WQE:
+ state = get_wqe(qp, pkt, &wqe);
+ break;
+
+ case COMPST_CHECK_PSN:
+ state = check_psn(qp, pkt, wqe);
+ break;
+
+ case COMPST_CHECK_ACK:
+ state = check_ack(qp, pkt, wqe);
+ break;
+
+ case COMPST_READ:
+ state = do_read(qp, pkt, wqe);
+ break;
+
+ case COMPST_ATOMIC:
+ state = do_atomic(qp, pkt, wqe);
+ break;
+
+ case COMPST_WRITE_SEND:
+ if (wqe->state == wqe_state_pending &&
+ wqe->last_psn == pkt->psn)
+ state = COMPST_COMP_ACK;
+ else
+ state = COMPST_UPDATE_COMP;
+ break;
+
+ case COMPST_COMP_ACK:
+ state = complete_ack(qp, pkt, wqe);
+ break;
+
+ case COMPST_COMP_WQE:
+ state = complete_wqe(qp, pkt, wqe);
+ break;
+
+ case COMPST_UPDATE_COMP:
+ if (pkt->mask & RXE_END_MASK)
+ qp->comp.opcode = -1;
+ else
+ qp->comp.opcode = pkt->opcode;
+
+ if (psn_compare(pkt->psn, qp->comp.psn) >= 0)
+ qp->comp.psn = (pkt->psn + 1) & BTH_PSN_MASK;
+
+ if (qp->req.wait_psn) {
+ qp->req.wait_psn = 0;
+ rxe_run_task(&qp->req.task, 1);
+ }
+
+ state = COMPST_DONE;
+ break;
+
+ case COMPST_DONE:
+ if (pkt) {
+ rxe_drop_ref(pkt->qp);
+ kfree_skb(skb);
+ atomic_dec(&rxe->resp_skb_in);
+ }
+ goto done;
+
+ case COMPST_EXIT:
+ if (qp->comp.timeout_retry && wqe) {
+ state = COMPST_ERROR_RETRY;
+ break;
+ }
+
+ /* re-arm the retransmit timer if
+ (1) QP is type RC
+ (2) the QP is alive
+ (3) there is a packet sent by the requester that
+ might be acked (we still might get spurious
+ timeouts but try to keep them as few as possible)
+ (4) the timeout parameter is set */
+ if ((qp_type(qp) == IB_QPT_RC)
+ && (qp->req.state == QP_STATE_READY)
+ && (psn_compare(qp->req.psn, qp->comp.psn) > 0)
+ && qp->qp_timeout_jiffies)
+ mod_timer(&qp->retrans_timer,
+ jiffies + qp->qp_timeout_jiffies);
+ goto exit;
+
+ case COMPST_ERROR_RETRY:
+ /* we come here if the retry timer fired and we
+    did not receive a response packet. try to retry
+    the send queue if that makes sense and the limits
+    have not been exceeded. remember that some timeouts
+    are spurious since we do not reset the timer but
+    kick it down the road or let it expire */
+
+ /* there is nothing to retry in this case */
+ if (!wqe || (wqe->state == wqe_state_posted))
+ goto exit;
+
+ if (qp->comp.retry_cnt > 0) {
+ if (qp->comp.retry_cnt != 7)
+ qp->comp.retry_cnt--;
+
+ /* no point in retrying if we have already
+ seen the last ack that the requester
+ could have caused */
+ if (psn_compare(qp->req.psn,
+ qp->comp.psn) > 0) {
+ /* tell the requester to retry the
+    send queue next time around */
+ qp->req.need_retry = 1;
+ rxe_run_task(&qp->req.task, 1);
+ }
+ goto exit;
+ } else {
+ wqe->status = IB_WC_RETRY_EXC_ERR;
+ state = COMPST_ERROR;
+ }
+ break;
+
+ case COMPST_RNR_RETRY:
+ if (qp->comp.rnr_retry > 0) {
+ if (qp->comp.rnr_retry != 7)
+ qp->comp.rnr_retry--;
+
+ qp->req.need_retry = 1;
+ pr_debug("set rnr nak timer\n");
+ mod_timer(&qp->rnr_nak_timer,
+ jiffies + rnrnak_jiffies(aeth_syn(pkt)
+ & ~AETH_TYPE_MASK));
+ goto exit;
+ } else {
+ wqe->status = IB_WC_RNR_RETRY_EXC_ERR;
+ state = COMPST_ERROR;
+ }
+ break;
+
+ case COMPST_ERROR:
+ do_complete(qp, wqe);
+ rxe_qp_error(qp);
+ goto exit;
+ }
+ }
+
+exit:
+ /* we come here if we are done with processing and want the
+ task to exit from the loop calling us */
+ return -EAGAIN;
+
+done:
+ /* we come here if we have processed a packet
+ we want the task to call us again to see
+ if there is anything else to do */
+ return 0;
+}
--
--
* [patch 32/44] rxe_req.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (30 preceding siblings ...)
2011-07-01 13:18 ` [patch 31/44] rxe_comp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 33/44] rxe_resp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (12 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch32 --]
[-- Type: text/plain, Size: 19248 bytes --]
QP request logic.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_req.c | 712 ++++++++++++++++++++++++++++++++++++
1 file changed, 712 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_req.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_req.c
@@ -0,0 +1,712 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/skbuff.h>
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_queue.h"
+#include "rxe_qp.h"
+#include "rxe_mr.h"
+
+static int next_opcode(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ unsigned opcode);
+
+static inline void retry_first_write_send(struct rxe_qp *qp,
+ struct rxe_send_wqe *wqe,
+ unsigned mask, int npsn)
+{
+ int i;
+
+ for (i = 0; i < npsn; i++) {
+ int to_send = (wqe->dma.resid > qp->mtu) ?
+ qp->mtu : wqe->dma.resid;
+
+ qp->req.opcode = next_opcode(qp, wqe,
+ wqe->ibwr.opcode);
+
+ if (wqe->ibwr.send_flags & IB_SEND_INLINE) {
+ wqe->dma.resid -= to_send;
+ wqe->dma.sge_offset += to_send;
+ } else
+ advance_dma_data(&wqe->dma, to_send);
+
+ if (mask & WR_WRITE_MASK)
+ wqe->iova += qp->mtu;
+ }
+}
+
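+/* rewind the requester state to the oldest un-acked WQE so that it and the following WQEs are resent */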
+static void req_retry(struct rxe_qp *qp)
+{
+ struct rxe_send_wqe *wqe;
+ unsigned int wqe_index;
+ unsigned int mask;
+ int npsn;
+ int first = 1;
+
+ wqe = queue_head(qp->sq.queue);
+ npsn = (qp->comp.psn - wqe->first_psn) & BTH_PSN_MASK;
+
+ qp->req.wqe_index = consumer_index(qp->sq.queue);
+ qp->req.psn = qp->comp.psn;
+ qp->req.opcode = -1;
+
+ for (wqe_index = consumer_index(qp->sq.queue);
+ wqe_index != producer_index(qp->sq.queue);
+ wqe_index = next_index(qp->sq.queue, wqe_index)) {
+
+ wqe = addr_from_index(qp->sq.queue, wqe_index);
+ mask = wr_opcode_mask(wqe->ibwr.opcode, qp);
+
+ if (wqe->state == wqe_state_posted)
+ break;
+
+ if (wqe->state == wqe_state_done)
+ continue;
+
+ wqe->iova = (mask & WR_ATOMIC_MASK) ?
+ wqe->ibwr.wr.atomic.remote_addr :
+ wqe->ibwr.wr.rdma.remote_addr;
+
+ if (!first || (mask & WR_READ_MASK) == 0) {
+ wqe->dma.resid = wqe->dma.length;
+ wqe->dma.cur_sge = 0;
+ wqe->dma.sge_offset = 0;
+ }
+
+ if (first) {
+ first = 0;
+
+ if (mask & WR_WRITE_OR_SEND_MASK)
+ retry_first_write_send(qp, wqe, mask, npsn);
+
+ if (mask & WR_READ_MASK)
+ wqe->iova += npsn*qp->mtu;
+ }
+
+ wqe->state = wqe_state_posted;
+ }
+}
+
+void rnr_nak_timer(unsigned long data)
+{
+ struct rxe_qp *qp = (struct rxe_qp *)data;
+ pr_debug("rnr nak timer fired\n");
+ rxe_run_task(&qp->req.task, 1);
+}
+
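+/* return the next send WQE to process, or NULL if the queue is empty,
+   the QP is draining/drained or a fence is pending */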
+static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
+{
+ struct rxe_send_wqe *wqe = queue_head(qp->sq.queue);
+ unsigned long flags;
+
+ if (unlikely(qp->req.state == QP_STATE_DRAIN)) {
+ /* check to see if we are drained;
+ * state_lock used by requester and completer */
+ spin_lock_irqsave(&qp->state_lock, flags);
+ do {
+ if (qp->req.state != QP_STATE_DRAIN) {
+ /* comp just finished */
+ spin_unlock_irqrestore(&qp->state_lock,
+ flags);
+ break;
+ }
+
+ if (wqe && ((qp->req.wqe_index !=
+ consumer_index(qp->sq.queue)) ||
+ (wqe->state != wqe_state_posted))) {
+ /* comp not done yet */
+ spin_unlock_irqrestore(&qp->state_lock,
+ flags);
+ break;
+ }
+
+ qp->req.state = QP_STATE_DRAINED;
+ spin_unlock_irqrestore(&qp->state_lock, flags);
+
+ if (qp->ibqp.event_handler) {
+ struct ib_event ev;
+
+ ev.device = qp->ibqp.device;
+ ev.element.qp = &qp->ibqp;
+ ev.event = IB_EVENT_SQ_DRAINED;
+ qp->ibqp.event_handler(&ev,
+ qp->ibqp.qp_context);
+ }
+ } while (0);
+ }
+
+ if (qp->req.wqe_index == producer_index(qp->sq.queue))
+ return NULL;
+
+ wqe = addr_from_index(qp->sq.queue, qp->req.wqe_index);
+
+ if (qp->req.state == QP_STATE_DRAIN ||
+ qp->req.state == QP_STATE_DRAINED)
+ if (wqe->state != wqe_state_processing)
+ return NULL;
+
+ if ((wqe->ibwr.send_flags & IB_SEND_FENCE) &&
+ (qp->req.wqe_index != consumer_index(qp->sq.queue))) {
+ qp->req.wait_fence = 1;
+ return NULL;
+ }
+
+ wqe->mask = wr_opcode_mask(wqe->ibwr.opcode, qp);
+ return wqe;
+}
+
+static int next_opcode_rc(struct rxe_qp *qp, unsigned opcode, int fits)
+{
+ switch (opcode) {
+ case IB_WR_RDMA_WRITE:
+ if (qp->req.opcode == IB_OPCODE_RC_RDMA_WRITE_FIRST ||
+ qp->req.opcode == IB_OPCODE_RC_RDMA_WRITE_MIDDLE)
+ return fits ?
+ IB_OPCODE_RC_RDMA_WRITE_LAST :
+ IB_OPCODE_RC_RDMA_WRITE_MIDDLE;
+ else
+ return fits ?
+ IB_OPCODE_RC_RDMA_WRITE_ONLY :
+ IB_OPCODE_RC_RDMA_WRITE_FIRST;
+
+ case IB_WR_RDMA_WRITE_WITH_IMM:
+ if (qp->req.opcode == IB_OPCODE_RC_RDMA_WRITE_FIRST ||
+ qp->req.opcode == IB_OPCODE_RC_RDMA_WRITE_MIDDLE)
+ return fits ?
+ IB_OPCODE_RC_RDMA_WRITE_LAST_WITH_IMMEDIATE :
+ IB_OPCODE_RC_RDMA_WRITE_MIDDLE;
+ else
+ return fits ?
+ IB_OPCODE_RC_RDMA_WRITE_ONLY_WITH_IMMEDIATE :
+ IB_OPCODE_RC_RDMA_WRITE_FIRST;
+
+ case IB_WR_SEND:
+ if (qp->req.opcode == IB_OPCODE_RC_SEND_FIRST ||
+ qp->req.opcode == IB_OPCODE_RC_SEND_MIDDLE)
+ return fits ?
+ IB_OPCODE_RC_SEND_LAST :
+ IB_OPCODE_RC_SEND_MIDDLE;
+ else
+ return fits ?
+ IB_OPCODE_RC_SEND_ONLY :
+ IB_OPCODE_RC_SEND_FIRST;
+
+ case IB_WR_SEND_WITH_IMM:
+ if (qp->req.opcode == IB_OPCODE_RC_SEND_FIRST ||
+ qp->req.opcode == IB_OPCODE_RC_SEND_MIDDLE)
+ return fits ?
+ IB_OPCODE_RC_SEND_LAST_WITH_IMMEDIATE :
+ IB_OPCODE_RC_SEND_MIDDLE;
+ else
+ return fits ?
+ IB_OPCODE_RC_SEND_ONLY_WITH_IMMEDIATE :
+ IB_OPCODE_RC_SEND_FIRST;
+
+ case IB_WR_RDMA_READ:
+ return IB_OPCODE_RC_RDMA_READ_REQUEST;
+
+ case IB_WR_ATOMIC_CMP_AND_SWP:
+ return IB_OPCODE_RC_COMPARE_SWAP;
+
+ case IB_WR_ATOMIC_FETCH_AND_ADD:
+ return IB_OPCODE_RC_FETCH_ADD;
+
+ case IB_WR_SEND_WITH_INV:
+ if (qp->req.opcode == IB_OPCODE_RC_SEND_FIRST ||
+ qp->req.opcode == IB_OPCODE_RC_SEND_MIDDLE)
+ return fits ? IB_OPCODE_RC_SEND_LAST_INV :
+ IB_OPCODE_RC_SEND_MIDDLE;
+ else
+ return fits ? IB_OPCODE_RC_SEND_ONLY_INV :
+ IB_OPCODE_RC_SEND_FIRST;
+ }
+
+ return -EINVAL;
+}
+
+static int next_opcode_uc(struct rxe_qp *qp, unsigned opcode, int fits)
+{
+ switch (opcode) {
+ case IB_WR_RDMA_WRITE:
+ if (qp->req.opcode == IB_OPCODE_UC_RDMA_WRITE_FIRST ||
+ qp->req.opcode == IB_OPCODE_UC_RDMA_WRITE_MIDDLE)
+ return fits ?
+ IB_OPCODE_UC_RDMA_WRITE_LAST :
+ IB_OPCODE_UC_RDMA_WRITE_MIDDLE;
+ else
+ return fits ?
+ IB_OPCODE_UC_RDMA_WRITE_ONLY :
+ IB_OPCODE_UC_RDMA_WRITE_FIRST;
+
+ case IB_WR_RDMA_WRITE_WITH_IMM:
+ if (qp->req.opcode == IB_OPCODE_UC_RDMA_WRITE_FIRST ||
+ qp->req.opcode == IB_OPCODE_UC_RDMA_WRITE_MIDDLE)
+ return fits ?
+ IB_OPCODE_UC_RDMA_WRITE_LAST_WITH_IMMEDIATE :
+ IB_OPCODE_UC_RDMA_WRITE_MIDDLE;
+ else
+ return fits ?
+ IB_OPCODE_UC_RDMA_WRITE_ONLY_WITH_IMMEDIATE :
+ IB_OPCODE_UC_RDMA_WRITE_FIRST;
+
+ case IB_WR_SEND:
+ if (qp->req.opcode == IB_OPCODE_UC_SEND_FIRST ||
+ qp->req.opcode == IB_OPCODE_UC_SEND_MIDDLE)
+ return fits ?
+ IB_OPCODE_UC_SEND_LAST :
+ IB_OPCODE_UC_SEND_MIDDLE;
+ else
+ return fits ?
+ IB_OPCODE_UC_SEND_ONLY :
+ IB_OPCODE_UC_SEND_FIRST;
+
+ case IB_WR_SEND_WITH_IMM:
+ if (qp->req.opcode == IB_OPCODE_UC_SEND_FIRST ||
+ qp->req.opcode == IB_OPCODE_UC_SEND_MIDDLE)
+ return fits ?
+ IB_OPCODE_UC_SEND_LAST_WITH_IMMEDIATE :
+ IB_OPCODE_UC_SEND_MIDDLE;
+ else
+ return fits ?
+ IB_OPCODE_UC_SEND_ONLY_WITH_IMMEDIATE :
+ IB_OPCODE_UC_SEND_FIRST;
+ }
+
+ return -EINVAL;
+}
+
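+/* select the wire opcode for the next packet from the QP type, the previous
+   opcode and whether the remaining payload fits in one MTU */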
+static int next_opcode(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ unsigned opcode)
+{
+ int fits = (wqe->dma.resid <= qp->mtu);
+
+ switch (qp_type(qp)) {
+ case IB_QPT_RC:
+ return next_opcode_rc(qp, opcode, fits);
+
+ case IB_QPT_UC:
+ return next_opcode_uc(qp, opcode, fits);
+
+ case IB_QPT_SMI:
+ case IB_QPT_UD:
+ case IB_QPT_GSI:
+ switch (opcode) {
+ case IB_WR_SEND:
+ return IB_OPCODE_UD_SEND_ONLY;
+
+ case IB_WR_SEND_WITH_IMM:
+ return IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE;
+ }
+ break;
+
+ default:
+ break;
+ }
+
+ return -EINVAL;
+}
+
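+/* reserve a read/atomic (initiator depth) credit for this WQE; returns -EAGAIN if none is available */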
+static inline int check_init_depth(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
+{
+ int depth;
+
+ if (wqe->has_rd_atomic)
+ return 0;
+
+ qp->req.need_rd_atomic = 1;
+ depth = atomic_dec_return(&qp->req.rd_atomic);
+
+ if (depth >= 0) {
+ qp->req.need_rd_atomic = 0;
+ wqe->has_rd_atomic = 1;
+ return 0;
+ } else {
+ atomic_inc(&qp->req.rd_atomic);
+ return -EAGAIN;
+ }
+}
+
+static inline int get_mtu(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+
+ if ((qp_type(qp) == IB_QPT_RC) || (qp_type(qp) == IB_QPT_UC)) {
+ return qp->mtu;
+ } else {
+ struct rxe_av *av = &wqe->av;
+ struct rxe_port *port = &rxe->port[av->attr.port_num - 1];
+ return port->mtu_cap;
+ }
+}
+
+static struct rxe_pkt_info *init_req_packet(struct rxe_qp *qp,
+ struct rxe_send_wqe *wqe,
+ int opcode, int payload)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ struct rxe_port *port = &rxe->port[qp->attr.port_num - 1];
+ struct rxe_pkt_info *pkt;
+ struct sk_buff *skb;
+ struct ib_send_wr *ibwr = &wqe->ibwr;
+ struct rxe_av *av;
+ int pad = (-payload) & 0x3;
+ int paylen;
+ int solicited;
+ u16 pkey;
+ u32 qp_num;
+ int ack_req;
+ unsigned int lnh;
+ unsigned int length;
+
+ /* length from start of bth to end of icrc */
+ paylen = rxe_opcode[opcode].length + payload + pad + RXE_ICRC_SIZE;
+
+ /* TODO support APM someday */
+ if (qp_type(qp) == IB_QPT_RC || qp_type(qp) == IB_QPT_UC)
+ av = &qp->pri_av;
+ else
+ av = &wqe->av;
+
+ /* init skb
+ * ifc layer must set lrh/grh flags */
+ skb = rxe->ifc_ops->init_packet(rxe, av, paylen);
+ if (!skb)
+ return NULL;
+
+ atomic_inc(&qp->req_skb_out);
+ atomic_inc(&rxe->req_skb_out);
+
+ /* pkt->hdr, rxe, port_num, paylen and
+ mask are initialized in ifc layer */
+ pkt = SKB_TO_PKT(skb);
+ pkt->opcode = opcode;
+ pkt->qp = qp;
+ pkt->psn = qp->req.psn;
+ pkt->mask |= rxe_opcode[opcode].mask;
+ pkt->paylen = paylen;
+ pkt->offset = 0;
+
+ /* init lrh */
+ if (pkt->mask & RXE_LRH_MASK) {
+ pkt->offset += RXE_LRH_BYTES;
+ lnh = (pkt->mask & RXE_GRH_MASK) ? LRH_LNH_IBA_GBL
+ : LRH_LNH_IBA_LOC;
+ length = paylen + RXE_LRH_BYTES;
+ if (pkt->mask & RXE_GRH_MASK)
+ length += RXE_GRH_BYTES;
+
+ /* TODO currently we do not implement SL->VL
+ table so just force VL=SL */
+ lrh_init(pkt, av->attr.sl, av->attr.sl,
+ lnh, length/4,
+ av->attr.dlid,
+ port->attr.lid);
+ }
+
+ /* init grh */
+ if (pkt->mask & RXE_GRH_MASK) {
+ pkt->offset += RXE_GRH_BYTES;
+
+ /* TODO set class flow hopcount */
+ grh_init(pkt, 0, 0, paylen, 0);
+
+ grh_set_sgid(pkt, port->subnet_prefix,
+ port->guid_tbl[av->attr.grh.sgid_index]);
+ grh_set_dgid(pkt, av->attr.grh.dgid.global.subnet_prefix,
+ av->attr.grh.dgid.global.interface_id);
+ }
+
+ /* init bth */
+ solicited = (ibwr->send_flags & IB_SEND_SOLICITED) &&
+ (pkt->mask & RXE_END_MASK) &&
+ ((pkt->mask & (RXE_SEND_MASK)) ||
+ (pkt->mask & (RXE_WRITE_MASK | RXE_IMMDT_MASK)) ==
+ (RXE_WRITE_MASK | RXE_IMMDT_MASK));
+
+ pkey = (qp_type(qp) == IB_QPT_GSI) ?
+ port->pkey_tbl[ibwr->wr.ud.pkey_index] :
+ port->pkey_tbl[qp->attr.pkey_index];
+
+ qp_num = (pkt->mask & RXE_DETH_MASK) ? ibwr->wr.ud.remote_qpn :
+ qp->attr.dest_qp_num;
+
+ ack_req = ((pkt->mask & RXE_END_MASK) ||
+ (qp->req.noack_pkts++ > rxe_max_pkt_per_ack));
+ if (ack_req)
+ qp->req.noack_pkts = 0;
+
+ bth_init(pkt, pkt->opcode, solicited, 0, pad, pkey, qp_num,
+ ack_req, pkt->psn);
+
+ /* init optional headers */
+ if (pkt->mask & RXE_RETH_MASK) {
+ reth_set_rkey(pkt, ibwr->wr.rdma.rkey);
+ reth_set_va(pkt, wqe->iova);
+ reth_set_len(pkt, wqe->dma.length);
+ }
+
+ if (pkt->mask & RXE_IMMDT_MASK)
+ immdt_set_imm(pkt, ibwr->ex.imm_data);
+
+ if (pkt->mask & RXE_IETH_MASK)
+ ieth_set_rkey(pkt, ibwr->ex.invalidate_rkey);
+
+ if (pkt->mask & RXE_ATMETH_MASK) {
+ atmeth_set_va(pkt, wqe->iova);
+ if (opcode == IB_OPCODE_RC_COMPARE_SWAP ||
+ opcode == IB_OPCODE_RD_COMPARE_SWAP) {
+ atmeth_set_swap_add(pkt, ibwr->wr.atomic.swap);
+ atmeth_set_comp(pkt, ibwr->wr.atomic.compare_add);
+ } else {
+ atmeth_set_swap_add(pkt, ibwr->wr.atomic.compare_add);
+ }
+ atmeth_set_rkey(pkt, ibwr->wr.atomic.rkey);
+ }
+
+ if (pkt->mask & RXE_DETH_MASK) {
+ if (qp->ibqp.qp_num == 1)
+ deth_set_qkey(pkt, GSI_QKEY);
+ else
+ deth_set_qkey(pkt, ibwr->wr.ud.remote_qkey);
+ deth_set_sqp(pkt, qp->ibqp.qp_num);
+ }
+
+ return pkt;
+}
+
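+/* copy the payload into the packet, add pad bytes and append the ICRC */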
+static int fill_packet(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ struct rxe_pkt_info *pkt, int payload)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ static const uint8_t zerobuf[4] = {0,};
+ int pad;
+ u32 crc = 0;
+ u8 *p;
+
+ if (!rxe_crc_disable)
+ crc = rxe_sb8_ib_headers(pkt);
+
+ if (pkt->mask & RXE_WRITE_OR_SEND) {
+ if (wqe->ibwr.send_flags & IB_SEND_INLINE) {
+ p = &wqe->dma.inline_data[wqe->dma.sge_offset];
+ if (!rxe_crc_disable)
+ crc = sb8_copy(payload_addr(pkt),
+ p, payload, crc);
+ else
+ memcpy(payload_addr(pkt), p, payload);
+
+ wqe->dma.resid -= payload;
+ wqe->dma.sge_offset += payload;
+ } else {
+ if (copy_data(rxe, qp->pd, 0, &wqe->dma,
+ payload_addr(pkt), payload,
+ direction_out, !rxe_crc_disable ?
+ &crc : NULL)) {
+ return -1;
+ }
+ }
+ }
+
+ p = payload_addr(pkt) + payload;
+ pad = (-payload) & 0x3;
+ if (pad) {
+ if (!rxe_crc_disable)
+ crc = sb8_copy(p, zerobuf, pad, crc);
+ else
+ memcpy(p, zerobuf, pad);
+ p += pad;
+ }
+
+ *(__be32 *)p = ~crc;
+
+ return 0;
+}
+
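+/* advance the requester PSN and WQE state and re-arm the retransmit timer after building a packet */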
+static void update_state(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ struct rxe_pkt_info *pkt, int payload)
+{
+ /* number of packets left to send including current one */
+ int num_pkt = (wqe->dma.resid + payload + qp->mtu - 1) / qp->mtu;
+
+ /* handle zero length packet case */
+ if (num_pkt == 0)
+ num_pkt = 1;
+
+ if (pkt->mask & RXE_START_MASK) {
+ wqe->first_psn = qp->req.psn;
+ wqe->last_psn = (qp->req.psn + num_pkt - 1) & BTH_PSN_MASK;
+ }
+
+ if (pkt->mask & RXE_READ_MASK)
+ qp->req.psn = (wqe->first_psn + num_pkt) & BTH_PSN_MASK;
+ else
+ qp->req.psn = (qp->req.psn + 1) & BTH_PSN_MASK;
+
+ qp->req.opcode = pkt->opcode;
+
+ if (pkt->mask & RXE_END_MASK) {
+ if (qp_type(qp) == IB_QPT_RC)
+ wqe->state = wqe_state_pending;
+ else
+ wqe->state = wqe_state_done;
+
+ qp->req.wqe_index = next_index(qp->sq.queue,
+ qp->req.wqe_index);
+ } else
+ wqe->state = wqe_state_processing;
+
+ if (qp->qp_timeout_jiffies && !timer_pending(&qp->retrans_timer))
+ mod_timer(&qp->retrans_timer,
+ jiffies + qp->qp_timeout_jiffies);
+}
+
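+/* requester task: builds and queues at most one request packet per call.
+   returns 0 to be called again, -EAGAIN to stop until new work arrives */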
+int rxe_requester(void *arg)
+{
+ struct rxe_qp *qp = (struct rxe_qp *)arg;
+ struct rxe_pkt_info *pkt;
+ struct rxe_send_wqe *wqe;
+ unsigned mask;
+ int payload;
+ int mtu;
+ int opcode;
+
+ if (!qp->valid || qp->req.state == QP_STATE_ERROR)
+ goto exit;
+
+ if (qp->req.state == QP_STATE_RESET) {
+ qp->req.wqe_index = consumer_index(qp->sq.queue);
+ qp->req.opcode = -1;
+ qp->req.need_rd_atomic = 0;
+ qp->req.wait_psn = 0;
+ qp->req.need_retry = 0;
+ goto exit;
+ }
+
+ if (qp->req.need_retry) {
+ req_retry(qp);
+ qp->req.need_retry = 0;
+ }
+
+ wqe = req_next_wqe(qp);
+ if (!wqe)
+ goto exit;
+
+ /* heuristic sliding window algorithm to keep
+ sender from overrunning receiver queues */
+ if (qp_type(qp) == IB_QPT_RC) {
+ if (psn_compare(qp->req.psn, qp->comp.psn +
+ rxe_max_req_comp_gap) > 0) {
+ qp->req.wait_psn = 1;
+ goto exit;
+ }
+ }
+
+ /* heuristic algorithm to prevent sender from
+ overrunning the output queue. to not overrun
+ the receiver's input queue (UC/UD) requires
+ rate control or application level flow control */
+ if (atomic_read(&qp->req_skb_out) > rxe_max_skb_per_qp) {
+ qp->need_req_skb = 1;
+ goto exit;
+ }
+
+ opcode = next_opcode(qp, wqe, wqe->ibwr.opcode);
+ if (opcode < 0) {
+ wqe->status = IB_WC_LOC_QP_OP_ERR;
+ /* TODO there must be more to do here ?? */
+ goto exit;
+ }
+
+ mask = rxe_opcode[opcode].mask;
+ if (unlikely(mask & RXE_READ_OR_ATOMIC)) {
+ if (check_init_depth(qp, wqe))
+ goto exit;
+ }
+
+ mtu = get_mtu(qp, wqe);
+ payload = (mask & RXE_WRITE_OR_SEND) ? wqe->dma.resid : 0;
+ if (payload > mtu) {
+ if (qp_type(qp) == IB_QPT_UD) {
+ /* believe it or not this is
+ what the spec says to do */
+ /* TODO handle > 1 ports */
+
+ /*
+ * fake a successful UD send
+ */
+ wqe->first_psn = qp->req.psn;
+ wqe->last_psn = qp->req.psn;
+ qp->req.psn = (qp->req.psn + 1) & BTH_PSN_MASK;
+ qp->req.opcode = IB_OPCODE_UD_SEND_ONLY;
+ qp->req.wqe_index = next_index(qp->sq.queue,
+ qp->req.wqe_index);
+ wqe->state = wqe_state_done;
+ wqe->status = IB_WC_SUCCESS;
+ goto complete;
+ }
+ payload = mtu;
+ }
+
+ pkt = init_req_packet(qp, wqe, opcode, payload);
+ if (!pkt) {
+ qp->need_req_skb = 1;
+ goto exit;
+ }
+
+ if (fill_packet(qp, wqe, pkt, payload)) {
+ wqe->status = IB_WC_LOC_PROT_ERR;
+ wqe->state = wqe_state_error;
+ goto complete;
+ }
+
+ update_state(qp, wqe, pkt, payload);
+
+ arbiter_skb_queue(to_rdev(qp->ibqp.device), qp, PKT_TO_SKB(pkt));
+
+ if (mask & RXE_END_MASK)
+ goto complete;
+ else
+ goto done;
+
+complete:
+ if (qp_type(qp) != IB_QPT_RC) {
+ while (rxe_completer(qp) == 0)
+ ;
+ }
+done:
+ return 0;
+
+exit:
+ return -EAGAIN;
+}
--
--
* [patch 33/44] rxe_resp.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (31 preceding siblings ...)
2011-07-01 13:18 ` [patch 32/44] rxe_req.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 34/44] rxe_arbiter.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (11 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch33 --]
[-- Type: text/plain, Size: 34451 bytes --]
QP response logic.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_resp.c | 1369 +++++++++++++++++++++++++++++++++++
1 file changed, 1369 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_resp.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_resp.c
@@ -0,0 +1,1369 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/*
+ * implements the qp responder
+ */
+
+#include <linux/skbuff.h>
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_queue.h"
+#include "rxe_cq.h"
+#include "rxe_qp.h"
+#include "rxe_mr.h"
+
+enum resp_states {
+ RESPST_NONE,
+ RESPST_GET_REQ,
+ RESPST_CHK_PSN,
+ RESPST_CHK_OP_SEQ,
+ RESPST_CHK_OP_VALID,
+ RESPST_CHK_RESOURCE,
+ RESPST_CHK_LENGTH,
+ RESPST_CHK_RKEY,
+ RESPST_EXECUTE,
+ RESPST_READ_REPLY,
+ RESPST_COMPLETE,
+ RESPST_ACKNOWLEDGE,
+ RESPST_CLEANUP,
+ RESPST_DUPLICATE_REQUEST,
+ RESPST_ERR_MALFORMED_WQE,
+ RESPST_ERR_UNSUPPORTED_OPCODE,
+ RESPST_ERR_MISALIGNED_ATOMIC,
+ RESPST_ERR_PSN_OUT_OF_SEQ,
+ RESPST_ERR_MISSING_OPCODE_FIRST,
+ RESPST_ERR_MISSING_OPCODE_LAST_C,
+ RESPST_ERR_MISSING_OPCODE_LAST_D1E,
+ RESPST_ERR_TOO_MANY_RDMA_ATM_REQ,
+ RESPST_ERR_RNR,
+ RESPST_ERR_RKEY_VIOLATION,
+ RESPST_ERR_LENGTH,
+ RESPST_ERR_CQ_OVERFLOW,
+ RESPST_ERROR,
+ RESPST_RESET,
+ RESPST_DONE,
+ RESPST_EXIT,
+};
+
+static char *resp_state_name[] = {
+ [RESPST_NONE] = "NONE",
+ [RESPST_GET_REQ] = "GET_REQ",
+ [RESPST_CHK_PSN] = "CHK_PSN",
+ [RESPST_CHK_OP_SEQ] = "CHK_OP_SEQ",
+ [RESPST_CHK_OP_VALID] = "CHK_OP_VALID",
+ [RESPST_CHK_RESOURCE] = "CHK_RESOURCE",
+ [RESPST_CHK_LENGTH] = "CHK_LENGTH",
+ [RESPST_CHK_RKEY] = "CHK_RKEY",
+ [RESPST_EXECUTE] = "EXECUTE",
+ [RESPST_READ_REPLY] = "READ_REPLY",
+ [RESPST_COMPLETE] = "COMPLETE",
+ [RESPST_ACKNOWLEDGE] = "ACKNOWLEDGE",
+ [RESPST_CLEANUP] = "CLEANUP",
+ [RESPST_DUPLICATE_REQUEST] = "DUPLICATE_REQUEST",
+ [RESPST_ERR_MALFORMED_WQE] = "ERR_MALFORMED_WQE",
+ [RESPST_ERR_UNSUPPORTED_OPCODE] = "ERR_UNSUPPORTED_OPCODE",
+ [RESPST_ERR_MISALIGNED_ATOMIC] = "ERR_MISALIGNED_ATOMIC",
+ [RESPST_ERR_PSN_OUT_OF_SEQ] = "ERR_PSN_OUT_OF_SEQ",
+ [RESPST_ERR_MISSING_OPCODE_FIRST] = "ERR_MISSING_OPCODE_FIRST",
+ [RESPST_ERR_MISSING_OPCODE_LAST_C] = "ERR_MISSING_OPCODE_LAST_C",
+ [RESPST_ERR_MISSING_OPCODE_LAST_D1E] = "ERR_MISSING_OPCODE_LAST_D1E",
+ [RESPST_ERR_TOO_MANY_RDMA_ATM_REQ] = "ERR_TOO_MANY_RDMA_ATM_REQ",
+ [RESPST_ERR_RNR] = "ERR_RNR",
+ [RESPST_ERR_RKEY_VIOLATION] = "ERR_RKEY_VIOLATION",
+ [RESPST_ERR_LENGTH] = "ERR_LENGTH",
+ [RESPST_ERR_CQ_OVERFLOW] = "ERR_CQ_OVERFLOW",
+ [RESPST_ERROR] = "ERROR",
+ [RESPST_RESET] = "RESET",
+ [RESPST_DONE] = "DONE",
+ [RESPST_EXIT] = "EXIT",
+};
+
+/* rxe_recv calls here to add a request packet to the input queue */
+void rxe_resp_queue_pkt(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct sk_buff *skb)
+{
+ int must_sched;
+ struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);
+
+ atomic_inc(&qp->req_skb_in);
+ atomic_inc(&rxe->req_skb_in);
+
+ skb_queue_tail(&qp->req_pkts, skb);
+
+ must_sched = (pkt->opcode == IB_OPCODE_RC_RDMA_READ_REQUEST)
+ || (skb_queue_len(&qp->req_pkts) > 1);
+
+ rxe_run_task(&qp->resp.task, must_sched);
+}
+
+static inline enum resp_states get_req(struct rxe_qp *qp,
+ struct rxe_pkt_info **pkt_p)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ struct sk_buff *skb;
+
+ if (qp->resp.state == QP_STATE_ERROR) {
+ skb = skb_dequeue(&qp->req_pkts);
+ if (skb) {
+ /* drain request packet queue */
+ rxe_drop_ref(qp);
+ kfree_skb(skb);
+ atomic_dec(&qp->req_skb_in);
+ atomic_dec(&rxe->req_skb_in);
+ return RESPST_GET_REQ;
+ } else {
+ /* go drain recv wr queue */
+ return RESPST_CHK_RESOURCE;
+ }
+ }
+
+ skb = skb_peek(&qp->req_pkts);
+ if (!skb)
+ return RESPST_EXIT;
+
+ *pkt_p = SKB_TO_PKT(skb);
+
+ return (qp->resp.res) ? RESPST_READ_REPLY : RESPST_CHK_PSN;
+}
+
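+/* compare the packet PSN with the expected PSN and handle out of sequence or duplicate requests */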
+static enum resp_states check_psn(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ int diff = psn_compare(pkt->psn, qp->resp.psn);
+
+ switch (qp_type(qp)) {
+ case IB_QPT_RC:
+ if (diff > 0) {
+ if (qp->resp.sent_psn_nak) {
+ return RESPST_CLEANUP;
+ } else {
+ qp->resp.sent_psn_nak = 1;
+ return RESPST_ERR_PSN_OUT_OF_SEQ;
+ }
+ } else if (diff < 0) {
+ return RESPST_DUPLICATE_REQUEST;
+ } else {
+ if (qp->resp.sent_psn_nak)
+ qp->resp.sent_psn_nak = 0;
+ }
+ break;
+
+ case IB_QPT_UC:
+ if (qp->resp.drop_msg || diff != 0) {
+ if (pkt->mask & RXE_START_MASK) {
+ qp->resp.drop_msg = 0;
+ return RESPST_CHK_OP_SEQ;
+ } else {
+ qp->resp.drop_msg = 1;
+ return RESPST_CLEANUP;
+ }
+ }
+ break;
+ default:
+ break;
+ }
+
+ return RESPST_CHK_OP_SEQ;
+}
+
+static enum resp_states check_op_seq(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ switch (qp_type(qp)) {
+ case IB_QPT_RC:
+ switch (qp->resp.opcode) {
+ case IB_OPCODE_RC_SEND_FIRST:
+ case IB_OPCODE_RC_SEND_MIDDLE:
+ switch (pkt->opcode) {
+ case IB_OPCODE_RC_SEND_MIDDLE:
+ case IB_OPCODE_RC_SEND_LAST:
+ case IB_OPCODE_RC_SEND_LAST_WITH_IMMEDIATE:
+ case IB_OPCODE_RC_SEND_LAST_INV:
+ return RESPST_CHK_OP_VALID;
+ default:
+ return RESPST_ERR_MISSING_OPCODE_LAST_C;
+ }
+
+ case IB_OPCODE_RC_RDMA_WRITE_FIRST:
+ case IB_OPCODE_RC_RDMA_WRITE_MIDDLE:
+ switch (pkt->opcode) {
+ case IB_OPCODE_RC_RDMA_WRITE_MIDDLE:
+ case IB_OPCODE_RC_RDMA_WRITE_LAST:
+ case IB_OPCODE_RC_RDMA_WRITE_LAST_WITH_IMMEDIATE:
+ return RESPST_CHK_OP_VALID;
+ default:
+ return RESPST_ERR_MISSING_OPCODE_LAST_C;
+ }
+
+ default:
+ switch (pkt->opcode) {
+ case IB_OPCODE_RC_SEND_MIDDLE:
+ case IB_OPCODE_RC_SEND_LAST:
+ case IB_OPCODE_RC_SEND_LAST_WITH_IMMEDIATE:
+ case IB_OPCODE_RC_SEND_LAST_INV:
+ case IB_OPCODE_RC_RDMA_WRITE_MIDDLE:
+ case IB_OPCODE_RC_RDMA_WRITE_LAST:
+ case IB_OPCODE_RC_RDMA_WRITE_LAST_WITH_IMMEDIATE:
+ return RESPST_ERR_MISSING_OPCODE_FIRST;
+ default:
+ return RESPST_CHK_OP_VALID;
+ }
+ }
+ break;
+
+ case IB_QPT_UC:
+ switch (qp->resp.opcode) {
+ case IB_OPCODE_UC_SEND_FIRST:
+ case IB_OPCODE_UC_SEND_MIDDLE:
+ switch (pkt->opcode) {
+ case IB_OPCODE_UC_SEND_MIDDLE:
+ case IB_OPCODE_UC_SEND_LAST:
+ case IB_OPCODE_UC_SEND_LAST_WITH_IMMEDIATE:
+ return RESPST_CHK_OP_VALID;
+ default:
+ return RESPST_ERR_MISSING_OPCODE_LAST_D1E;
+ }
+
+ case IB_OPCODE_UC_RDMA_WRITE_FIRST:
+ case IB_OPCODE_UC_RDMA_WRITE_MIDDLE:
+ switch (pkt->opcode) {
+ case IB_OPCODE_UC_RDMA_WRITE_MIDDLE:
+ case IB_OPCODE_UC_RDMA_WRITE_LAST:
+ case IB_OPCODE_UC_RDMA_WRITE_LAST_WITH_IMMEDIATE:
+ return RESPST_CHK_OP_VALID;
+ default:
+ return RESPST_ERR_MISSING_OPCODE_LAST_D1E;
+ }
+
+ default:
+ switch (pkt->opcode) {
+ case IB_OPCODE_UC_SEND_MIDDLE:
+ case IB_OPCODE_UC_SEND_LAST:
+ case IB_OPCODE_UC_SEND_LAST_WITH_IMMEDIATE:
+ case IB_OPCODE_UC_RDMA_WRITE_MIDDLE:
+ case IB_OPCODE_UC_RDMA_WRITE_LAST:
+ case IB_OPCODE_UC_RDMA_WRITE_LAST_WITH_IMMEDIATE:
+ qp->resp.drop_msg = 1;
+ return RESPST_CLEANUP;
+ default:
+ return RESPST_CHK_OP_VALID;
+ }
+ }
+ break;
+
+ default:
+ return RESPST_CHK_OP_VALID;
+ }
+}
+
+static enum resp_states check_op_valid(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ switch (qp_type(qp)) {
+ case IB_QPT_RC:
+ if (((pkt->mask & RXE_READ_MASK) &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_READ)) ||
+ ((pkt->mask & RXE_WRITE_MASK) &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_WRITE)) ||
+ ((pkt->mask & RXE_ATOMIC_MASK) &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_ATOMIC))) {
+ return RESPST_ERR_UNSUPPORTED_OPCODE;
+ }
+
+ if (!pkt->mask)
+ return RESPST_ERR_UNSUPPORTED_OPCODE;
+ break;
+
+ case IB_QPT_UC:
+ if ((pkt->mask & RXE_WRITE_MASK) &&
+ !(qp->attr.qp_access_flags & IB_ACCESS_REMOTE_WRITE)) {
+ qp->resp.drop_msg = 1;
+ return RESPST_CLEANUP;
+ }
+
+ if (!pkt->mask) {
+ qp->resp.drop_msg = 1;
+ return RESPST_CLEANUP;
+ }
+ break;
+
+ case IB_QPT_UD:
+ case IB_QPT_SMI:
+ case IB_QPT_GSI:
+ if (!pkt->mask)
+ return RESPST_CLEANUP;
+ break;
+
+ default:
+ WARN_ON(1);
+ break;
+ }
+
+ return RESPST_CHK_RESOURCE;
+}
+
+static enum resp_states get_srq_wqe(struct rxe_qp *qp)
+{
+ struct rxe_srq *srq = qp->srq;
+ struct rxe_queue *q = srq->rq.queue;
+ struct rxe_recv_wqe *wqe;
+ struct ib_event ev;
+
+ if (srq->error)
+ return RESPST_ERR_RNR;
+
+ spin_lock_bh(&srq->rq.consumer_lock);
+
+ wqe = queue_head(q);
+ if (!wqe) {
+ spin_unlock_bh(&srq->rq.consumer_lock);
+ return RESPST_ERR_RNR;
+ }
+
+ /* note kernel and user space recv wqes have same size */
+ memcpy(&qp->resp.srq_wqe, wqe, sizeof(qp->resp.srq_wqe));
+
+ qp->resp.wqe = &qp->resp.srq_wqe.wqe;
+ advance_consumer(q);
+
+ if (srq->limit && srq->ibsrq.event_handler &&
+ (queue_count(q) < srq->limit)) {
+ srq->limit = 0;
+ goto event;
+ }
+
+ spin_unlock_bh(&srq->rq.consumer_lock);
+ return RESPST_CHK_LENGTH;
+
+event:
+ spin_unlock_bh(&srq->rq.consumer_lock);
+ ev.device = qp->ibqp.device;
+ ev.element.srq = qp->ibqp.srq;
+ ev.event = IB_EVENT_SRQ_LIMIT_REACHED;
+ srq->ibsrq.event_handler(&ev, srq->ibsrq.srq_context);
+ return RESPST_CHK_LENGTH;
+}
+
+static enum resp_states check_resource(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ struct rxe_srq *srq = qp->srq;
+
+ if (qp->resp.state == QP_STATE_ERROR) {
+ if (qp->resp.wqe) {
+ qp->resp.status = IB_WC_WR_FLUSH_ERR;
+ return RESPST_COMPLETE;
+ } else if (!srq) {
+ qp->resp.wqe = queue_head(qp->rq.queue);
+ if (qp->resp.wqe) {
+ qp->resp.status = IB_WC_WR_FLUSH_ERR;
+ return RESPST_COMPLETE;
+ } else
+ return RESPST_EXIT;
+ } else
+ return RESPST_EXIT;
+ }
+
+ if (pkt->mask & RXE_READ_OR_ATOMIC) {
+ /* it is the requester's job not to send
+    too many read/atomic ops; we just
+    recycle the responder resource queue */
+ if (likely(qp->attr.max_rd_atomic > 0))
+ return RESPST_CHK_LENGTH;
+ else
+ return RESPST_ERR_TOO_MANY_RDMA_ATM_REQ;
+ }
+
+ if (pkt->mask & RXE_RWR_MASK) {
+ if (srq) {
+ return get_srq_wqe(qp);
+ } else {
+ qp->resp.wqe = queue_head(qp->rq.queue);
+ return (qp->resp.wqe) ? RESPST_CHK_LENGTH
+ : RESPST_ERR_RNR;
+ }
+ }
+
+ return RESPST_CHK_LENGTH;
+}
+
+static enum resp_states check_length(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ switch (qp_type(qp)) {
+ case IB_QPT_RC:
+ return RESPST_CHK_RKEY;
+
+ case IB_QPT_UC:
+ return RESPST_CHK_RKEY;
+
+ default:
+ return RESPST_CHK_RKEY;
+ }
+}
+
+static enum resp_states check_rkey(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ struct rxe_mem *mem;
+ u64 va;
+ u32 rkey;
+ u32 resid;
+ u32 pktlen;
+ int mtu = qp->mtu;
+ enum resp_states state;
+ int access;
+
+ if (pkt->mask & (RXE_READ_MASK | RXE_WRITE_MASK)) {
+ if (pkt->mask & RXE_RETH_MASK) {
+ qp->resp.va = reth_va(pkt);
+ qp->resp.rkey = reth_rkey(pkt);
+ qp->resp.resid = reth_len(pkt);
+ }
+ access = (pkt->mask & RXE_READ_MASK) ? IB_ACCESS_REMOTE_READ
+ : IB_ACCESS_REMOTE_WRITE;
+ } else if (pkt->mask & RXE_ATOMIC_MASK) {
+ qp->resp.va = atmeth_va(pkt);
+ qp->resp.rkey = atmeth_rkey(pkt);
+ qp->resp.resid = sizeof(u64);
+ access = IB_ACCESS_REMOTE_ATOMIC;
+ } else {
+ return RESPST_EXECUTE;
+ }
+
+ va = qp->resp.va;
+ rkey = qp->resp.rkey;
+ resid = qp->resp.resid;
+ pktlen = payload_size(pkt);
+
+ mem = lookup_mem(qp->pd, access, rkey, lookup_remote);
+ if (!mem) {
+ state = RESPST_ERR_RKEY_VIOLATION;
+ goto err1;
+ }
+
+ if (mem_check_range(mem, va, resid)) {
+ state = RESPST_ERR_RKEY_VIOLATION;
+ goto err2;
+ }
+
+ if (pkt->mask & RXE_WRITE_MASK) {
+ if (resid > mtu) {
+ if (pktlen != mtu || bth_pad(pkt)) {
+ state = RESPST_ERR_LENGTH;
+ goto err2;
+ }
+
+ resid = mtu;
+ } else {
+ if ((pktlen != resid)) {
+ state = RESPST_ERR_LENGTH;
+ goto err2;
+ }
+ if ((bth_pad(pkt) != (0x3 & (-resid)))) {
+ /* This may not be exactly the right error
+  * class, but nothing else fits. */
+ state = RESPST_ERR_LENGTH;
+ goto err2;
+ }
+ }
+ }
+
+ WARN_ON(qp->resp.mr != NULL);
+
+ qp->resp.mr = mem;
+ return RESPST_EXECUTE;
+
+err2:
+ rxe_drop_ref(mem);
+err1:
+ return state;
+}
+
+static enum resp_states send_data_in(struct rxe_qp *qp, void *data_addr,
+ int data_len)
+{
+ int err;
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+
+ err = copy_data(rxe, qp->pd, IB_ACCESS_LOCAL_WRITE, &qp->resp.wqe->dma,
+ data_addr, data_len, direction_in, NULL);
+ if (unlikely(err))
+ return (err == -ENOSPC) ? RESPST_ERR_LENGTH
+ : RESPST_ERR_MALFORMED_WQE;
+
+ return RESPST_NONE;
+}
+
+static enum resp_states write_data_in(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ enum resp_states rc = RESPST_NONE;
+ int err;
+ int data_len = payload_size(pkt);
+
+ err = rxe_mem_copy(qp->resp.mr, qp->resp.va, payload_addr(pkt),
+ data_len, direction_in, NULL);
+ if (err) {
+ rc = RESPST_ERR_RKEY_VIOLATION;
+ goto out;
+ }
+
+ qp->resp.va += data_len;
+ qp->resp.resid -= data_len;
+
+out:
+ return rc;
+}
+
+/* Guarantee atomicity of atomic operations at the machine level. */
+static DEFINE_SPINLOCK(atomic_ops_lock);
+
+static enum resp_states process_atomic(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ u64 iova = atmeth_va(pkt);
+ u64 *vaddr;
+ enum resp_states ret;
+ struct rxe_mem *mr = qp->resp.mr;
+
+ if (mr->state != RXE_MEM_STATE_VALID) {
+ ret = RESPST_ERR_RKEY_VIOLATION;
+ goto out;
+ }
+
+ vaddr = iova_to_vaddr(mr, iova, sizeof(u64));
+
+ /* check that vaddr is 8-byte aligned. */
+ if (!vaddr || (uintptr_t)vaddr & 7) {
+ ret = RESPST_ERR_MISALIGNED_ATOMIC;
+ goto out;
+ }
+
+ spin_lock_bh(&atomic_ops_lock);
+
+ qp->resp.atomic_orig = *vaddr;
+
+ if (pkt->opcode == IB_OPCODE_RC_COMPARE_SWAP ||
+ pkt->opcode == IB_OPCODE_RD_COMPARE_SWAP) {
+ if (*vaddr == atmeth_comp(pkt))
+ *vaddr = atmeth_swap_add(pkt);
+ } else {
+ *vaddr += atmeth_swap_add(pkt);
+ }
+
+ spin_unlock_bh(&atomic_ops_lock);
+
+ ret = RESPST_NONE;
+out:
+ return ret;
+}
+
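+/* allocate an ack or read response packet, building its headers from the
+   request packet with the source and destination addresses swapped */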
+static struct sk_buff *prepare_ack_packet(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt,
+ int opcode,
+ int payload,
+ u32 psn,
+ u8 syndrome)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ struct sk_buff *skb;
+ struct rxe_pkt_info *ack;
+ int paylen;
+ int pad;
+
+ /*
+ * allocate packet
+ */
+ pad = (-payload) & 0x3;
+ paylen = rxe_opcode[opcode].length + payload + pad + RXE_ICRC_SIZE;
+
+ skb = rxe->ifc_ops->init_packet(rxe, &qp->pri_av, paylen);
+ if (!skb)
+ return NULL;
+
+ atomic_inc(&qp->resp_skb_out);
+ atomic_inc(&rxe->resp_skb_out);
+
+ ack = SKB_TO_PKT(skb);
+ ack->qp = qp;
+ ack->opcode = opcode;
+ ack->mask |= rxe_opcode[opcode].mask;
+ ack->offset = pkt->offset;
+ ack->paylen = paylen;
+
+ /* fill in lrh/grh/bth using the request packet headers */
+ memcpy(ack->hdr, pkt->hdr, pkt->offset + RXE_BTH_BYTES);
+
+ /* swap addresses in lrh */
+ if (ack->mask & RXE_LRH_MASK) {
+ __lrh_set_dlid(ack->hdr, __lrh_slid(pkt->hdr));
+ __lrh_set_slid(ack->hdr, __lrh_dlid(pkt->hdr));
+ }
+
+ /* swap addresses in grh */
+ if (ack->mask & RXE_GRH_MASK) {
+ grh_set_paylen(ack, paylen);
+ grh_set_dgid(ack, grh_sgid(pkt)->global.subnet_prefix,
+ grh_sgid(pkt)->global.interface_id);
+ grh_set_sgid(ack, grh_dgid(pkt)->global.subnet_prefix,
+ grh_dgid(pkt)->global.interface_id);
+ }
+
+ bth_set_opcode(ack, opcode);
+ bth_set_qpn(ack, qp->attr.dest_qp_num);
+ bth_set_pad(ack, pad);
+ bth_set_se(ack, 0);
+ bth_set_psn(ack, psn);
+ ack->psn = psn;
+
+ if (ack->mask & RXE_AETH_MASK) {
+ aeth_set_syn(ack, syndrome);
+ aeth_set_msn(ack, qp->resp.msn);
+ }
+
+ if (ack->mask & RXE_ATMACK_MASK)
+ atmack_set_orig(ack, qp->resp.atomic_orig);
+
+ return skb;
+}
+
+/* RDMA read response. If res is not NULL, then we have a current RDMA request
+ * being processed or replayed. */
+static enum resp_states read_reply(struct rxe_qp *qp,
+ struct rxe_pkt_info *req_pkt)
+{
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ struct rxe_pkt_info *pkt;
+ int mtu = qp->mtu;
+ enum resp_states state;
+ int payload;
+ struct sk_buff *skb;
+ int opcode;
+ int err;
+ struct resp_res *res = qp->resp.res;
+ unsigned char *buf;
+
+ if (!res) {
+ /* This is the first time we process this request. Get a
+  * resource. */
+ res = &qp->resp.resources[qp->resp.res_head];
+
+ free_rd_atomic_resource(qp, res);
+ rxe_advance_resp_resource(qp);
+
+ res->type = RXE_READ_MASK;
+
+ res->read.va = qp->resp.va;
+ res->read.va_org = qp->resp.va;
+
+ res->first_psn = req_pkt->psn;
+ res->last_psn = req_pkt->psn +
+ (reth_len(req_pkt) + mtu - 1)/mtu - 1;
+ res->cur_psn = req_pkt->psn;
+
+ res->read.resid = qp->resp.resid;
+ res->read.length = qp->resp.resid;
+ res->read.rkey = qp->resp.rkey;
+
+ /* note res inherits the reference to mr from qp */
+ res->read.mr = qp->resp.mr;
+ qp->resp.mr = NULL;
+
+ qp->resp.res = res;
+ res->state = rdatm_res_state_new;
+ }
+
+ if (res->state == rdatm_res_state_new) {
+ if (res->read.resid <= mtu)
+ opcode = IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY;
+ else
+ opcode = IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST;
+ } else {
+ if (res->read.resid > mtu)
+ opcode = IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE;
+ else
+ opcode = IB_OPCODE_RC_RDMA_READ_RESPONSE_LAST;
+ }
+
+ res->state = rdatm_res_state_next;
+
+ payload = min_t(int, res->read.resid, mtu);
+
+ skb = prepare_ack_packet(qp, req_pkt, opcode, payload,
+ res->cur_psn, AETH_ACK_UNLIMITED);
+ if (!skb)
+ return RESPST_ERR_RNR;
+
+ pkt = SKB_TO_PKT(skb);
+
+ err = rxe_mem_copy(res->read.mr, res->read.va, payload_addr(pkt),
+ payload, direction_out, NULL);
+ if (err)
+ pr_warn("rxe_mem_copy failed TODO ???\n");
+
+ res->read.va += payload;
+ res->read.resid -= payload;
+ res->cur_psn = (res->cur_psn + 1) & BTH_PSN_MASK;
+
+ buf = payload_addr(pkt) + payload;
+ while (payload & 0x3) {
+ *buf++ = 0;
+ payload++;
+ }
+
+ if (!rxe_crc_disable)
+ *(__be32 *)buf = rxe_sb8(pkt);
+
+ arbiter_skb_queue(rxe, qp, skb);
+
+ if (res->read.resid > 0) {
+ state = RESPST_DONE;
+ } else {
+ qp->resp.res = NULL;
+ qp->resp.opcode = -1;
+ qp->resp.psn = res->cur_psn;
+ state = RESPST_CLEANUP;
+ }
+
+ return state;
+}
+
+/* Execute a new request. A retried request never reaches this function (sends
+ * and writes are discarded, and reads and atomics are retried elsewhere). */
+static enum resp_states execute(struct rxe_qp *qp, struct rxe_pkt_info *pkt)
+{
+ enum resp_states err;
+
+ if (pkt->mask & RXE_SEND_MASK) {
+ if (qp_type(qp) == IB_QPT_UD ||
+ qp_type(qp) == IB_QPT_SMI ||
+ qp_type(qp) == IB_QPT_GSI) {
+ /* Copy the GRH */
+ err = send_data_in(qp, pkt->hdr, RXE_GRH_BYTES);
+ if (err)
+ return err;
+ }
+
+ err = send_data_in(qp, payload_addr(pkt), payload_size(pkt));
+ if (err)
+ return err;
+ } else if (pkt->mask & RXE_WRITE_MASK) {
+ err = write_data_in(qp, pkt);
+ if (err)
+ return err;
+ } else if (pkt->mask & RXE_READ_MASK) {
+ /* For RDMA Read we can increment the msn now. See C9-148. */
+ qp->resp.msn++;
+ return RESPST_READ_REPLY;
+ } else if (pkt->mask & RXE_ATOMIC_MASK) {
+ err = process_atomic(qp, pkt);
+ if (err)
+ return err;
+ } else
+ /* Unreachable */
+ WARN_ON(1);
+
+ /* We successfully processed this new request. */
+ qp->resp.msn++;
+
+ /* next expected psn, read handles this separately */
+ qp->resp.psn = (pkt->psn + 1) & BTH_PSN_MASK;
+
+ qp->resp.opcode = pkt->opcode;
+ qp->resp.status = IB_WC_SUCCESS;
+
+ if (pkt->mask & RXE_COMP_MASK)
+ return RESPST_COMPLETE;
+ else if (qp_type(qp) == IB_QPT_RC)
+ return RESPST_ACKNOWLEDGE;
+ else
+ return RESPST_CLEANUP;
+}
+
+static enum resp_states do_complete(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ struct rxe_cqe cqe;
+ struct ib_wc *wc = &cqe.ibwc;
+ struct ib_uverbs_wc *uwc = &cqe.uibwc;
+ struct rxe_recv_wqe *wqe = qp->resp.wqe;
+
+ if (unlikely(!wqe))
+ return RESPST_CLEANUP;
+
+ memset(&cqe, 0, sizeof(cqe));
+
+ wc->wr_id = wqe->wr_id;
+ wc->status = qp->resp.status;
+
+ /* fields after status are not required for errors */
+ if (wc->status == IB_WC_SUCCESS) {
+ wc->opcode = (pkt->mask & RXE_IMMDT_MASK &&
+ pkt->mask & RXE_READ_MASK) ?
+ IB_WC_RECV_RDMA_WITH_IMM : IB_WC_RECV;
+ wc->vendor_err = 0;
+ wc->byte_len = wqe->dma.length - wqe->dma.resid;
+
+ /* fields after byte_len are at different
+    offsets in kernel and user space WCs */
+ if (qp->rcq->is_user) {
+ uwc->wc_flags = IB_WC_GRH;
+
+ if (pkt->mask & RXE_IMMDT_MASK) {
+ uwc->wc_flags |= IB_WC_WITH_IMM;
+ uwc->ex.imm_data =
+ (__u32 __force)immdt_imm(pkt);
+ }
+
+ if (pkt->mask & RXE_IETH_MASK) {
+ uwc->wc_flags |= IB_WC_WITH_INVALIDATE;
+ uwc->ex.invalidate_rkey = ieth_rkey(pkt);
+ }
+
+ uwc->qp_num = qp->ibqp.qp_num;
+
+ if (pkt->mask & RXE_DETH_MASK)
+ uwc->src_qp = deth_sqp(pkt);
+
+ uwc->port_num = qp->attr.port_num;
+ } else {
+ wc->wc_flags = IB_WC_GRH;
+
+ if (pkt->mask & RXE_IMMDT_MASK) {
+ wc->wc_flags |= IB_WC_WITH_IMM;
+ wc->ex.imm_data = immdt_imm(pkt);
+ }
+
+ if (pkt->mask & RXE_IETH_MASK) {
+ wc->wc_flags |= IB_WC_WITH_INVALIDATE;
+ wc->ex.invalidate_rkey = ieth_rkey(pkt);
+ }
+
+ wc->qp = &qp->ibqp;
+
+ if (pkt->mask & RXE_DETH_MASK)
+ wc->src_qp = deth_sqp(pkt);
+
+ wc->port_num = qp->attr.port_num;
+ }
+ }
+
+ /* have copy for srq and reference for !srq */
+ if (!qp->srq)
+ advance_consumer(qp->rq.queue);
+
+ qp->resp.wqe = NULL;
+
+ if (rxe_cq_post(qp->rcq, &cqe, pkt ? bth_se(pkt) : 1))
+ return RESPST_ERR_CQ_OVERFLOW;
+
+ if (qp->resp.state == QP_STATE_ERROR)
+ return RESPST_CHK_RESOURCE;
+
+ if (!pkt)
+ return RESPST_DONE;
+ else if (qp_type(qp) == IB_QPT_RC)
+ return RESPST_ACKNOWLEDGE;
+ else
+ return RESPST_CLEANUP;
+}
+
+static int send_ack(struct rxe_qp *qp, struct rxe_pkt_info *pkt,
+ u8 syndrome, u32 psn)
+{
+ int err = 0;
+ struct sk_buff *skb;
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ struct rxe_pkt_info *ack;
+ __be32 *buf;
+
+ skb = prepare_ack_packet(qp, pkt, IB_OPCODE_RC_ACKNOWLEDGE,
+ 0, psn, syndrome);
+ if (skb == NULL) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ /* CRC */
+ ack = SKB_TO_PKT(skb);
+ buf = payload_addr(ack);
+
+ if (!rxe_crc_disable)
+ *buf = rxe_sb8(ack);
+
+ arbiter_skb_queue(rxe, qp, skb);
+
+err1:
+ return err;
+}
+
+static int send_atomic_ack(struct rxe_qp *qp, struct rxe_pkt_info *pkt,
+ u8 syndrome)
+{
+ int rc = 0;
+ struct sk_buff *skb;
+ struct sk_buff *skb_copy;
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ struct rxe_pkt_info *ack;
+ struct resp_res *res;
+ __be32 *buf;
+
+ skb = prepare_ack_packet(qp, pkt, IB_OPCODE_RC_ATOMIC_ACKNOWLEDGE,
+ 0, pkt->psn, syndrome);
+ if (skb == NULL) {
+ rc = -ENOMEM;
+ goto out;
+ }
+
+ ack = SKB_TO_PKT(skb);
+ buf = payload_addr(ack);
+
+ if (!rxe_crc_disable)
+ *buf = rxe_sb8(ack);
+
+ res = &qp->resp.resources[qp->resp.res_head];
+ free_rd_atomic_resource(qp, res);
+ rxe_advance_resp_resource(qp);
+
+ res->type = RXE_ATOMIC_MASK;
+ res->atomic.skb = skb;
+ res->first_psn = qp->resp.psn;
+ res->last_psn = qp->resp.psn;
+ res->cur_psn = qp->resp.psn;
+
+ skb_copy = skb_clone(skb, GFP_ATOMIC);
+ if (skb_copy) {
+ rxe_add_ref(qp); /* for the new SKB */
+ atomic_inc(&qp->resp_skb_out);
+ atomic_inc(&rxe->resp_skb_out);
+ } else {
+ pr_warn("Could not clone atomic response\n");
+ }
+
+ arbiter_skb_queue(rxe, qp, skb_copy);
+
+out:
+ return rc;
+}
+
+static enum resp_states acknowledge(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ if (qp_type(qp) != IB_QPT_RC)
+ return RESPST_CLEANUP;
+
+ if (qp->resp.aeth_syndrome != AETH_ACK_UNLIMITED)
+ send_ack(qp, pkt, qp->resp.aeth_syndrome, pkt->psn);
+ else if (pkt->mask & RXE_ATOMIC_MASK)
+ send_atomic_ack(qp, pkt, AETH_ACK_UNLIMITED);
+ else if (bth_ack(pkt))
+ send_ack(qp, pkt, AETH_ACK_UNLIMITED, pkt->psn);
+
+ return RESPST_CLEANUP;
+}
+
+static enum resp_states cleanup(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ struct sk_buff *skb;
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+
+ if (pkt) {
+ skb = skb_dequeue(&qp->req_pkts);
+ rxe_drop_ref(qp);
+ kfree_skb(skb);
+ atomic_dec(&qp->req_skb_in);
+ atomic_dec(&rxe->req_skb_in);
+ }
+
+ if (qp->resp.mr) {
+ rxe_drop_ref(qp->resp.mr);
+ qp->resp.mr = NULL;
+ }
+
+ return RESPST_DONE;
+}
+
+static struct resp_res *find_resource(struct rxe_qp *qp, u32 psn)
+{
+ int i;
+
+ for (i = 0; i < qp->attr.max_rd_atomic; i++) {
+ struct resp_res *res = &qp->resp.resources[i];
+ if (res->type == 0)
+ continue;
+
+ if (psn_compare(psn, res->first_psn) >= 0 &&
+ psn_compare(psn, res->last_psn) <= 0) {
+ return res;
+ }
+ }
+
+ return NULL;
+}
+
+static enum resp_states duplicate_request(struct rxe_qp *qp,
+ struct rxe_pkt_info *pkt)
+{
+ enum resp_states rc;
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+
+ if (pkt->mask & RXE_SEND_MASK ||
+ pkt->mask & RXE_WRITE_MASK) {
+		/* SEND or WRITE. Ack again and cleanup. C9-105. */
+ if (bth_ack(pkt))
+ send_ack(qp, pkt, AETH_ACK_UNLIMITED, qp->resp.psn - 1);
+ rc = RESPST_CLEANUP;
+ goto out;
+ } else if (pkt->mask & RXE_READ_MASK) {
+ struct resp_res *res;
+
+ res = find_resource(qp, pkt->psn);
+ if (!res) {
+			/* Resource not found. Class D error.
+			   Drop the request. */
+ rc = RESPST_CLEANUP;
+ goto out;
+ } else {
+ /* Ensure this new request is the same as the previous
+ * one or a subset of it. */
+ u64 iova = reth_va(pkt);
+ u32 resid = reth_len(pkt);
+
+ if (iova < res->read.va_org ||
+ resid > res->read.length ||
+ (iova + resid) > (res->read.va_org +
+ res->read.length)) {
+ rc = RESPST_CLEANUP;
+ goto out;
+ }
+
+ if (reth_rkey(pkt) != res->read.rkey) {
+ rc = RESPST_CLEANUP;
+ goto out;
+ }
+
+ res->cur_psn = pkt->psn;
+ res->state = (pkt->psn == res->first_psn) ?
+ rdatm_res_state_new :
+ rdatm_res_state_replay;
+
+ /* Reset the resource, except length. */
+ res->read.va_org = iova;
+ res->read.va = iova;
+ res->read.resid = resid;
+
+ /* Replay the RDMA read reply. */
+ qp->resp.res = res;
+ rc = RESPST_READ_REPLY;
+ goto out;
+ }
+ } else {
+ struct resp_res *res;
+
+ WARN_ON((pkt->mask & RXE_ATOMIC_MASK) == 0);
+
+ /* Find the operation in our list of responder resources. */
+ res = find_resource(qp, pkt->psn);
+ if (res) {
+ struct sk_buff *skb_copy;
+
+ skb_copy = skb_clone(res->atomic.skb, GFP_ATOMIC);
+ if (skb_copy) {
+ rxe_add_ref(qp); /* for the new SKB */
+ atomic_inc(&qp->resp_skb_out);
+ atomic_inc(&rxe->resp_skb_out);
+ } else {
+ pr_warn("Couldn't clone atomic resp\n");
+ rc = RESPST_CLEANUP;
+ goto out;
+ }
+ bth_set_psn(SKB_TO_PKT(skb_copy),
+ qp->resp.psn - 1);
+ /* Resend the result. */
+ arbiter_skb_queue(to_rdev(qp->ibqp.device), qp,
+ skb_copy);
+ }
+
+		/* Resource not found. Class D error. Drop the request. */
+ rc = RESPST_CLEANUP;
+ goto out;
+ }
+out:
+ return rc;
+}
+
+/* Process a class A or C error. Both are treated the same in this
+ * implementation. */
+static void do_class_ac_error(struct rxe_qp *qp, u8 syndrome,
+ enum ib_wc_status status)
+{
+ qp->resp.aeth_syndrome = syndrome;
+ qp->resp.status = status;
+
+ /* indicate that we should go through the ERROR state */
+ qp->resp.goto_error = 1;
+}
+
+static enum resp_states do_class_d1e_error(struct rxe_qp *qp)
+{
+ /* UC */
+ if (qp->srq) {
+ /* Class E */
+ qp->resp.drop_msg = 1;
+ if (qp->resp.wqe) {
+ qp->resp.status = IB_WC_REM_INV_REQ_ERR;
+ return RESPST_COMPLETE;
+ } else {
+ return RESPST_CLEANUP;
+ }
+ } else {
+		/* Class D1. This packet may be the start of a
+		   new message and could be valid. The previous
+		   message is invalid and ignored. Reset the
+		   recv wr to its original state. */
+ if (qp->resp.wqe) {
+ qp->resp.wqe->dma.resid = qp->resp.wqe->dma.length;
+ qp->resp.wqe->dma.cur_sge = 0;
+ qp->resp.wqe->dma.sge_offset = 0;
+ qp->resp.opcode = -1;
+ }
+
+ if (qp->resp.mr) {
+ rxe_drop_ref(qp->resp.mr);
+ qp->resp.mr = NULL;
+ }
+
+ return RESPST_CHK_OP_SEQ;
+ }
+}
+
+int rxe_responder(void *arg)
+{
+ struct rxe_qp *qp = (struct rxe_qp *)arg;
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ enum resp_states state;
+ struct rxe_pkt_info *pkt = NULL;
+ int ret = 0;
+
+ qp->resp.aeth_syndrome = AETH_ACK_UNLIMITED;
+
+ if (!qp->valid) {
+ ret = -EINVAL;
+ goto done;
+ }
+
+ switch (qp->resp.state) {
+ case QP_STATE_RESET:
+ state = RESPST_RESET;
+ break;
+
+ default:
+ state = RESPST_GET_REQ;
+ break;
+ }
+
+ while (1) {
+ pr_debug("state = %s\n", resp_state_name[state]);
+ switch (state) {
+ case RESPST_GET_REQ:
+ state = get_req(qp, &pkt);
+ break;
+ case RESPST_CHK_PSN:
+ state = check_psn(qp, pkt);
+ break;
+ case RESPST_CHK_OP_SEQ:
+ state = check_op_seq(qp, pkt);
+ break;
+ case RESPST_CHK_OP_VALID:
+ state = check_op_valid(qp, pkt);
+ break;
+ case RESPST_CHK_RESOURCE:
+ state = check_resource(qp, pkt);
+ break;
+ case RESPST_CHK_LENGTH:
+ state = check_length(qp, pkt);
+ break;
+ case RESPST_CHK_RKEY:
+ state = check_rkey(qp, pkt);
+ break;
+ case RESPST_EXECUTE:
+ state = execute(qp, pkt);
+ break;
+ case RESPST_COMPLETE:
+ state = do_complete(qp, pkt);
+ break;
+ case RESPST_READ_REPLY:
+ state = read_reply(qp, pkt);
+ break;
+ case RESPST_ACKNOWLEDGE:
+ state = acknowledge(qp, pkt);
+ break;
+ case RESPST_CLEANUP:
+ state = cleanup(qp, pkt);
+ break;
+ case RESPST_DUPLICATE_REQUEST:
+ state = duplicate_request(qp, pkt);
+ break;
+ case RESPST_ERR_PSN_OUT_OF_SEQ:
+ /* RC only - Class B. Drop packet. */
+ send_ack(qp, pkt, AETH_NAK_PSN_SEQ_ERROR, qp->resp.psn);
+ state = RESPST_CLEANUP;
+ break;
+
+ case RESPST_ERR_TOO_MANY_RDMA_ATM_REQ:
+ case RESPST_ERR_MISSING_OPCODE_FIRST:
+ case RESPST_ERR_MISSING_OPCODE_LAST_C:
+ case RESPST_ERR_UNSUPPORTED_OPCODE:
+ case RESPST_ERR_MISALIGNED_ATOMIC:
+ /* RC Only - Class C. */
+ do_class_ac_error(qp, AETH_NAK_INVALID_REQ,
+ IB_WC_REM_INV_REQ_ERR);
+ state = RESPST_COMPLETE;
+ break;
+
+ case RESPST_ERR_MISSING_OPCODE_LAST_D1E:
+ state = do_class_d1e_error(qp);
+ break;
+ case RESPST_ERR_RNR:
+ if (qp_type(qp) == IB_QPT_RC) {
+ /* RC - class B */
+ send_ack(qp, pkt, AETH_RNR_NAK |
+ (~AETH_TYPE_MASK &
+ qp->attr.min_rnr_timer),
+ pkt->psn);
+ } else {
+ /* UD/UC - class D */
+ qp->resp.drop_msg = 1;
+ }
+ state = RESPST_CLEANUP;
+ break;
+
+ case RESPST_ERR_RKEY_VIOLATION:
+ if (qp_type(qp) == IB_QPT_RC) {
+ /* Class C */
+ do_class_ac_error(qp, AETH_NAK_REM_ACC_ERR,
+ IB_WC_REM_ACCESS_ERR);
+ state = RESPST_COMPLETE;
+ } else {
+ qp->resp.drop_msg = 1;
+ if (qp->srq) {
+ /* UC/SRQ Class D */
+ qp->resp.status = IB_WC_REM_ACCESS_ERR;
+ state = RESPST_COMPLETE;
+ } else {
+ /* UC/non-SRQ Class E. */
+ state = RESPST_CLEANUP;
+ }
+ }
+ break;
+
+ case RESPST_ERR_LENGTH:
+ if (qp_type(qp) == IB_QPT_RC) {
+ /* Class C */
+ do_class_ac_error(qp, AETH_NAK_INVALID_REQ,
+ IB_WC_REM_INV_REQ_ERR);
+ state = RESPST_COMPLETE;
+ } else if (qp->srq) {
+ /* UC/UD - class E */
+ qp->resp.status = IB_WC_REM_INV_REQ_ERR;
+ state = RESPST_COMPLETE;
+ } else {
+ /* UC/UD - class D */
+ qp->resp.drop_msg = 1;
+ state = RESPST_CLEANUP;
+ }
+ break;
+
+ case RESPST_ERR_MALFORMED_WQE:
+ /* All, Class A. */
+ do_class_ac_error(qp, AETH_NAK_REM_OP_ERR,
+ IB_WC_LOC_QP_OP_ERR);
+ state = RESPST_COMPLETE;
+ break;
+
+ case RESPST_ERR_CQ_OVERFLOW:
+ /* All - Class G */
+ state = RESPST_ERROR;
+ break;
+
+ case RESPST_DONE:
+ if (qp->resp.goto_error) {
+ state = RESPST_ERROR;
+ break;
+ } else
+ goto done;
+
+ case RESPST_EXIT:
+ if (qp->resp.goto_error) {
+ state = RESPST_ERROR;
+ break;
+ } else
+ goto exit;
+
+ case RESPST_RESET: {
+ struct sk_buff *skb;
+
+ while ((skb = skb_dequeue(&qp->req_pkts))) {
+ rxe_drop_ref(qp);
+ kfree_skb(skb);
+ atomic_dec(&qp->req_skb_in);
+ atomic_dec(&rxe->req_skb_in);
+ }
+
+ while (!qp->srq && qp->rq.queue &&
+ queue_head(qp->rq.queue))
+ advance_consumer(qp->rq.queue);
+
+ qp->resp.wqe = NULL;
+ goto exit;
+ }
+
+ case RESPST_ERROR:
+ qp->resp.goto_error = 0;
+ pr_warn("qp#%d moved to error state\n", qp_num(qp));
+ rxe_qp_error(qp);
+ goto exit;
+
+ default:
+ WARN_ON(1);
+ }
+ }
+
+exit:
+ ret = -EAGAIN;
+done:
+ return ret;
+}
* [patch 34/44] rxe_arbiter.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (32 preceding siblings ...)
2011-07-01 13:18 ` [patch 33/44] rxe_resp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
[not found] ` <20110701132202.342196794-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
2011-07-01 13:18 ` [patch 35/44] rxe_dma.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (10 subsequent siblings)
44 siblings, 1 reply; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch34 --]
[-- Type: text/plain, Size: 6301 bytes --]
packet output arbitration.
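The arbiter picks the next queue pair with packets queued for output and
applies a simple static rate control: after each transmit it computes a
minimum delay before the next packet may go out,

	new_delay = max((skb->len * nsec_per_kbyte) >> 10, nsec_per_packet)

As a worked example with the default ib_rxe module parameters
(nsec_per_packet = 200, nsec_per_kbyte = 700), a 4096 byte packet yields
(4096 * 700) >> 10 = 2800 ns of delay, while a 64 byte packet computes
only 43 ns and is clamped up to the 200 ns per-packet minimum.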
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_arbiter.c | 192 ++++++++++++++++++++++++++++++++
1 file changed, 192 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_arbiter.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_arbiter.c
@@ -0,0 +1,192 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/skbuff.h>
+
+#include "rxe.h"
+#include "rxe_loc.h"
+#include "rxe_qp.h"
+
+static inline void account_skb(struct rxe_dev *rxe, struct rxe_qp *qp,
+ int is_request)
+{
+ if (is_request & RXE_REQ_MASK) {
+ atomic_dec(&rxe->req_skb_out);
+ atomic_dec(&qp->req_skb_out);
+ if (qp->need_req_skb) {
+ if (atomic_read(&qp->req_skb_out) < rxe_max_skb_per_qp)
+ rxe_run_task(&qp->req.task, 1);
+ }
+ } else {
+ atomic_dec(&rxe->resp_skb_out);
+ atomic_dec(&qp->resp_skb_out);
+ }
+}
+
+static int xmit_one_packet(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct sk_buff *skb)
+{
+ int err;
+ struct timespec time;
+ long new_delay;
+ struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);
+ int is_request = pkt->mask & RXE_REQ_MASK;
+
+ /* drop pkt if qp is in wrong state to send */
+ if (!qp->valid)
+ goto drop;
+
+ if (is_request) {
+ if (qp->req.state != QP_STATE_READY)
+ goto drop;
+ } else {
+ if (qp->resp.state != QP_STATE_READY)
+ goto drop;
+ }
+
+	/* Busy wait for static rate control. We could refine this by
+	   yielding the tasklet for larger delays and waiting out the
+	   small ones. */
+ if (rxe->arbiter.delay)
+ do {
+ getnstimeofday(&time);
+ } while (timespec_compare(&time, &rxe->arbiter.time) < 0);
+
+ new_delay = (skb->len*rxe_nsec_per_kbyte) >> 10;
+ if (new_delay < rxe_nsec_per_packet)
+ new_delay = rxe_nsec_per_packet;
+
+ if (pkt->mask & RXE_LOOPBACK_MASK)
+ err = rxe->ifc_ops->loopback(rxe, skb);
+ else
+ err = rxe->ifc_ops->send(rxe, skb);
+
+ /* we can recover from RXE_QUEUE_STOPPED errors
+ by retrying the packet. In other cases
+ the packet is consumed so move on */
+ if (err == RXE_QUEUE_STOPPED)
+ return err;
+ else if (err)
+ rxe->xmit_errors++;
+
+ rxe->arbiter.delay = new_delay > 0;
+ if (rxe->arbiter.delay) {
+ getnstimeofday(&time);
+ time.tv_nsec += new_delay;
+ while (time.tv_nsec > NSEC_PER_SEC) {
+ time.tv_sec += 1;
+ time.tv_nsec -= NSEC_PER_SEC;
+ }
+ rxe->arbiter.time = time;
+ }
+
+ goto done;
+
+drop:
+ kfree_skb(skb);
+ err = 0;
+done:
+ account_skb(rxe, qp, is_request);
+ return err;
+}
+
+/*
+ * choose one packet for sending
+ */
+int rxe_arbiter(void *arg)
+{
+ int err;
+ unsigned long flags;
+ struct rxe_dev *rxe = (struct rxe_dev *)arg;
+ struct sk_buff *skb;
+ struct list_head *qpl;
+ struct rxe_qp *qp;
+
+ /* get the next qp's send queue */
+ spin_lock_irqsave(&rxe->arbiter.list_lock, flags);
+ if (list_empty(&rxe->arbiter.qp_list)) {
+ spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
+ return 1;
+ }
+
+ qpl = rxe->arbiter.qp_list.next;
+ list_del_init(qpl);
+ qp = list_entry(qpl, struct rxe_qp, arbiter_list);
+ spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
+
+ /* get next packet from queue and try to send it
+ note skb could have already been removed */
+ skb = skb_dequeue(&qp->send_pkts);
+ if (skb) {
+ err = xmit_one_packet(rxe, qp, skb);
+		if (err) {
+			if (err == RXE_QUEUE_STOPPED)
+				skb_queue_head(&qp->send_pkts, skb);
+
+			/* the list lock is no longer held here; retake it
+			   and put the qp back on the arbiter list so any
+			   requeued packet gets retried */
+			spin_lock_irqsave(&rxe->arbiter.list_lock, flags);
+			if (list_empty(qpl) &&
+			    !skb_queue_empty(&qp->send_pkts))
+				list_add_tail(qpl, &rxe->arbiter.qp_list);
+			spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
+
+			rxe_run_task(&rxe->arbiter.task, 1);
+			return 1;
+		}
+	}
+
+ /* if more work in queue put qp back on the list */
+ spin_lock_irqsave(&rxe->arbiter.list_lock, flags);
+
+ if (list_empty(qpl) && !skb_queue_empty(&qp->send_pkts))
+ list_add_tail(qpl, &rxe->arbiter.qp_list);
+
+ spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
+ return 0;
+}
+
+/*
+ * queue a packet for sending from a qp
+ */
+void arbiter_skb_queue(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct sk_buff *skb)
+{
+ int must_sched;
+ unsigned long flags;
+
+ /* add packet to send queue */
+ skb_queue_tail(&qp->send_pkts, skb);
+
+ /* if not already there add qp to arbiter list */
+ spin_lock_irqsave(&rxe->arbiter.list_lock, flags);
+ if (list_empty(&qp->arbiter_list))
+ list_add_tail(&qp->arbiter_list, &rxe->arbiter.qp_list);
+ spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
+
+ /* run the arbiter, use tasklet unless only one packet */
+ must_sched = skb_queue_len(&qp->resp_pkts) > 1;
+ rxe_run_task(&rxe->arbiter.task, must_sched);
+}
* [patch 35/44] rxe_dma.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (33 preceding siblings ...)
2011-07-01 13:18 ` [patch 34/44] rxe_arbiter.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 36/44] gen_sb8tables.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (9 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch35 --]
[-- Type: text/plain, Size: 5401 bytes --]
Dummy DMA processing for the rxe device.
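Since rxe has no DMA engine, these ops simply treat kernel virtual
addresses as "dma addresses". As a minimal illustrative sketch (not code
from this patch), a kernel ULP doing

	u64 dma_addr;

	dma_addr = ib_dma_map_single(ibdev, buf, len, DMA_TO_DEVICE);

lands in rxe_dma_map_single() below and gets back (u64)(uintptr_t)buf;
there is no bounce buffering or IOMMU setup, and the sync and unmap
callbacks are correspondingly no-ops.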
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_dma.c | 178 ++++++++++++++++++++++++++++++++++++
1 file changed, 178 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_dma.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_dma.c
@@ -0,0 +1,178 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ * Copyright (c) 2006 QLogic, Corporation. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "rxe.h"
+#include "rxe_loc.h"
+
+static int rxe_mapping_error(struct ib_device *dev, u64 dma_addr)
+{
+ return dma_addr == 0;
+}
+
+static u64 rxe_dma_map_single(struct ib_device *dev,
+ void *cpu_addr, size_t size,
+ enum dma_data_direction direction)
+{
+ BUG_ON(!valid_dma_direction(direction));
+ return (u64) (uintptr_t) cpu_addr;
+}
+
+static void rxe_dma_unmap_single(struct ib_device *dev,
+ u64 addr, size_t size,
+ enum dma_data_direction direction)
+{
+ BUG_ON(!valid_dma_direction(direction));
+}
+
+static u64 rxe_dma_map_page(struct ib_device *dev,
+ struct page *page,
+ unsigned long offset,
+ size_t size, enum dma_data_direction direction)
+{
+ u64 addr = 0;
+
+ BUG_ON(!valid_dma_direction(direction));
+
+ if (offset + size > PAGE_SIZE)
+ goto out;
+
+ addr = (uintptr_t) page_address(page);
+ if (addr)
+ addr += offset;
+
+out:
+ return addr;
+}
+
+static void rxe_dma_unmap_page(struct ib_device *dev,
+ u64 addr, size_t size,
+ enum dma_data_direction direction)
+{
+ BUG_ON(!valid_dma_direction(direction));
+}
+
+static int rxe_map_sg(struct ib_device *dev, struct scatterlist *sgl,
+ int nents, enum dma_data_direction direction)
+{
+ struct scatterlist *sg;
+ u64 addr;
+ int i;
+ int ret = nents;
+
+ BUG_ON(!valid_dma_direction(direction));
+
+ for_each_sg(sgl, sg, nents, i) {
+ addr = (uintptr_t) page_address(sg_page(sg));
+ /* TODO: handle highmem pages */
+ if (!addr) {
+ ret = 0;
+ break;
+ }
+ }
+
+ return ret;
+}
+
+static void rxe_unmap_sg(struct ib_device *dev,
+ struct scatterlist *sg, int nents,
+ enum dma_data_direction direction)
+{
+ BUG_ON(!valid_dma_direction(direction));
+}
+
+static u64 rxe_sg_dma_address(struct ib_device *dev, struct scatterlist *sg)
+{
+ u64 addr = (uintptr_t) page_address(sg_page(sg));
+
+ if (addr)
+ addr += sg->offset;
+
+ return addr;
+}
+
+static unsigned int rxe_sg_dma_len(struct ib_device *dev,
+ struct scatterlist *sg)
+{
+ return sg->length;
+}
+
+static void rxe_sync_single_for_cpu(struct ib_device *dev,
+ u64 addr,
+ size_t size, enum dma_data_direction dir)
+{
+}
+
+static void rxe_sync_single_for_device(struct ib_device *dev,
+ u64 addr,
+ size_t size, enum dma_data_direction dir)
+{
+}
+
+static void *rxe_dma_alloc_coherent(struct ib_device *dev, size_t size,
+ u64 *dma_handle, gfp_t flag)
+{
+ struct page *p;
+ void *addr = NULL;
+
+ p = alloc_pages(flag, get_order(size));
+ if (p)
+ addr = page_address(p);
+
+ if (dma_handle)
+ *dma_handle = (uintptr_t) addr;
+
+ return addr;
+}
+
+static void rxe_dma_free_coherent(struct ib_device *dev, size_t size,
+ void *cpu_addr, u64 dma_handle)
+{
+ free_pages((unsigned long)cpu_addr, get_order(size));
+}
+
+struct ib_dma_mapping_ops rxe_dma_mapping_ops = {
+ rxe_mapping_error,
+ rxe_dma_map_single,
+ rxe_dma_unmap_single,
+ rxe_dma_map_page,
+ rxe_dma_unmap_page,
+ rxe_map_sg,
+ rxe_unmap_sg,
+ rxe_sg_dma_address,
+ rxe_sg_dma_len,
+ rxe_sync_single_for_cpu,
+ rxe_sync_single_for_device,
+ rxe_dma_alloc_coherent,
+ rxe_dma_free_coherent
+};
* [patch 36/44] gen_sb8tables.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (34 preceding siblings ...)
2011-07-01 13:18 ` [patch 35/44] rxe_dma.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 37/44] rxe_sb8.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (8 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch36 --]
[-- Type: text/plain, Size: 5268 bytes --]
Program to create slice by 8 tables for the CRC32 calculation.
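The tables it emits (t32 through t88) are included by rxe_sb8.c as
sb8_tables.h. As a hedged example of running the generator by hand (the
Makefile rule and the exact flag combination used to produce the in-tree
header are not part of this patch):

	gcc -o gen_sb8tables gen_sb8tables.c
	./gen_sb8tables > sb8_tables.h

The default polynomial is 0xedb88320; -p selects another one, and the
-f/-F/-s/-S options control bit reflection and byte swapping of the
generated table entries.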
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/gen_sb8tables.c | 242 ++++++++++++++++++++++++++++++
1 file changed, 242 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/gen_sb8tables.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/gen_sb8tables.c
@@ -0,0 +1,242 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/* _GNU_SOURCE must be defined before any headers are included
+   for it to take effect */
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <byteswap.h>
+#include <getopt.h>
+
+static int flip;
+static int swap;
+static int swip;
+static int flop;
+
+static uint32_t poly = 0xedb88320; /* crc32 */
+
+static uint32_t t[256];
+
+static int reverse[] = { 7, 6, 5, 4, 3, 2, 1, 0};
+
+static uint8_t flip8(uint8_t x)
+{
+ uint8_t y = 0;
+ int i;
+
+ for (i = 0; i < 8; i++) {
+ if (x & (1 << i))
+ y |= 1 << reverse[i];
+ }
+
+ return y;
+}
+
+static uint32_t flip32(uint32_t x)
+{
+ uint32_t y;
+ uint8_t *p = (uint8_t *)&x;
+ uint8_t *q = (uint8_t *)&y;
+ int i;
+
+ for (i = 0; i < 4; i++)
+ q[i] = flip8(p[i]);
+
+ return y;
+}
+
+static void compute(int n)
+{
+ uint64_t rem;
+ uint64_t p, m;
+ int i;
+ int j;
+ int k;
+ uint32_t ply = poly;
+
+ if (flop)
+ ply = flip32(ply);
+ if (swip)
+ ply = bswap_32(ply);
+
+ for (i = 0; i < 256; i++) {
+ if (flip)
+ rem = flip8(i);
+ else
+ rem = i;
+ rem <<= 32;
+
+ for (k = 0; k <= n; k++) {
+ for (j = 7; j >= 0; j--) {
+ m = 1ULL << (32 + j);
+
+ if (rem & m) {
+ p = ((1ULL << 32) + ply) << j;
+ rem ^= p;
+ }
+ }
+
+ rem <<= 8;
+ }
+
+ rem >>= 8;
+
+ if (flip) {
+ if (swap)
+ t[i] = bswap_32(flip32(rem));
+ else
+ t[i] = flip32(rem);
+ } else {
+ if (swap)
+ t[i] = bswap_32(rem);
+ else
+ t[i] = rem;
+ }
+ }
+}
+
+static void print_table(char *name)
+{
+ int i;
+
+ printf("\nstatic u32 %s[] = {\n", name);
+ for (i = 0; i < 256; i++) {
+ printf("0x%08x,", t[i]);
+ if ((i % 4) == 3)
+ printf("\n");
+ else
+ printf(" ");
+ }
+ printf("};\n");
+}
+
+static void usage(void)
+{
+	printf("usage: gen_sb8tables [-f|--flip] [-F|--flop] [-s|--swap] "
+	       "[-S|--swip] [-p|--poly <polynomial>] [-h|--help]\n");
+}
+
+static int arg_process(int argc, char *argv[])
+{
+ int c;
+ char *opt_string = "sfFShp:";
+	struct option opt_long[] = {
+		{"help", 0, 0, 'h'},
+		{"poly", 1, 0, 'p'},
+		{"flip", 0, 0, 'f'},
+		{"swap", 0, 0, 's'},
+		{"swip", 0, 0, 'S'},
+		{"flop", 0, 0, 'F'},
+		{0, 0, 0, 0},	/* getopt_long requires a zero-filled terminator */
+	};
+
+ while (1) {
+ c = getopt_long(argc, argv, opt_string, opt_long, NULL);
+
+ if (c == -1)
+ break;
+
+ switch (c) {
+ case 'h':
+ usage();
+ return 1;
+
+ case 'p':
+ poly = strtoul(optarg, NULL, 0);
+ break;
+
+ case 'f':
+ flip = 1;
+ break;
+
+ case 'F':
+ flop = 1;
+ break;
+
+ case 's':
+ swap = 1;
+ break;
+
+ case 'S':
+ swip = 1;
+ break;
+
+ default:
+ return 1;
+ }
+ }
+
+ if (optind < argc)
+ return 1;
+
+ return 0;
+}
+
+int main(int argc, char *argv[])
+{
+ if (arg_process(argc, argv)) {
+ usage();
+ return 0;
+ }
+
+ printf("/*\n");
+ printf(" * Slice by 8 tables for CRC polynomial = 0x%08x\n", poly);
+ printf(" * This file is automatically generated\n");
+ printf(" */\n");
+
+ compute(0);
+ print_table("t32");
+
+ compute(1);
+ print_table("t40");
+
+ compute(2);
+ print_table("t48");
+
+ compute(3);
+ print_table("t56");
+
+ compute(4);
+ print_table("t64");
+
+ compute(5);
+ print_table("t72");
+
+ compute(6);
+ print_table("t80");
+
+ compute(7);
+ print_table("t88");
+
+ return 0;
+}
* [patch 37/44] rxe_sb8.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (35 preceding siblings ...)
2011-07-01 13:18 ` [patch 36/44] gen_sb8tables.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
[not found] ` <20110701132202.486466900-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
2011-07-01 13:18 ` [patch 38/44] rxe.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (7 subsequent siblings)
44 siblings, 1 reply; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch37 --]
[-- Type: text/plain, Size: 6472 bytes --]
Slice by 8 implementation of CRC32.
The code is similar to the kernel provided crc32
calculation except it runs about 3X faster, which allows us to
get to ~1GB/sec.
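For comparison, a plain table-driven CRC32 consumes one input byte per
step using the standard reflected-CRC idiom (illustration only, not code
from this patch):

	while (len--)
		crc = t32[(crc ^ *p++) & 0xff] ^ (crc >> 8);

The slice by 8 loops below instead fold eight input bytes per iteration
through eight 256-entry tables (t32..t88), trading table size for fewer
dependent lookups per byte, which is where the roughly 3X speedup comes
from.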
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_sb8.c | 242 ++++++++++++++++++++++++++++++++++++
1 file changed, 242 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_sb8.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_sb8.c
@@ -0,0 +1,242 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "rxe.h"
+
+#include "rxe_loc.h"
+#include "sb8_tables.h"
+
+/* TODO we need big endian versions of sb8 */
+
+/* slice by 8 algorithm for crc32 */
+static __be32 sb8_le(__be32 last_crc, const void *data, const size_t length)
+{
+ const u8 *p8;
+ int init_bytes;
+ size_t qwords;
+ int end_bytes;
+ u32 *p32;
+ size_t i;
+ __be32 crc;
+ u32 q;
+ u8 i1, i2, i3, i4;
+
+ crc = last_crc;
+
+ if (length == 0 || rxe_crc_disable)
+ return crc;
+
+ p8 = data;
+ init_bytes = (-(long)p8) & 7;
+ if (init_bytes > length)
+ init_bytes = length;
+ qwords = (length - init_bytes) >> 3;
+ end_bytes = (length - init_bytes) & 7;
+ p32 = (u32 *)(p8 + init_bytes);
+
+ for (i = 0; i < init_bytes; i++) {
+ i1 = crc ^ *p8++;
+ crc = t32[i1] ^ (crc >> 8);
+ }
+
+ for (i = 0; i < qwords; i++) {
+
+ q = *p32++ ^ crc;
+
+ i1 = q;
+ i2 = q >> 8;
+ i3 = q >> 16;
+ i4 = q >> 24;
+
+ crc = t88[i1] ^ t80[i2] ^ t72[i3] ^ t64[i4];
+
+ q = *p32++;
+
+ i1 = q;
+ i2 = q >> 8;
+ i3 = q >> 16;
+ i4 = q >> 24;
+
+ crc ^= t56[i1] ^ t48[i2] ^ t40[i3] ^ t32[i4];
+ }
+
+ p8 = (u8 *)p32;
+
+ for (i = 0; i < end_bytes; i++) {
+ i1 = crc ^ *p8++;
+ crc = t32[i1] ^ (crc >> 8);
+ }
+
+ return crc;
+}
+
+/* slice by 8 with a copy */
+__be32 sb8_le_copy(void *dest, const void *data,
+ const size_t length, __be32 last_crc)
+{
+ const u8 *p8;
+ u8 *d8;
+ int init_bytes;
+ size_t qwords;
+ int end_bytes;
+ u64 *p64;
+ u64 *d64;
+ size_t i;
+ __be32 crc;
+ u32 q;
+ u64 t;
+ u8 i1, i2, i3, i4;
+
+ crc = last_crc;
+
+ if (rxe_crc_disable) {
+ memcpy(dest, data, length);
+ return crc;
+ }
+
+ p8 = data;
+ d8 = dest;
+ init_bytes = (-(long)p8) & 7;
+ if (init_bytes > length)
+ init_bytes = length;
+ qwords = (length - init_bytes) >> 3;
+ end_bytes = (length - init_bytes) & 7;
+ p64 = (u64 *)(p8 + init_bytes);
+ d64 = (u64 *)(d8 + init_bytes);
+
+ for (i = 0; i < init_bytes; i++) {
+ t = *p8++;
+ *d8++ = t;
+ i1 = crc ^ t;
+ crc = t32[i1] ^ (crc >> 8);
+ }
+
+ for (i = 0; i < qwords; i++) {
+ t = *p64++;
+ *d64++ = t;
+ q = (u32)t ^ crc;
+
+ i1 = q;
+ i2 = q >> 8;
+ i3 = q >> 16;
+ i4 = q >> 24;
+
+ crc = t88[i1] ^ t80[i2] ^ t72[i3] ^ t64[i4];
+
+ t >>= 32;
+ q = (u32)t;
+
+ i1 = q;
+ i2 = q >> 8;
+ i3 = q >> 16;
+ i4 = q >> 24;
+
+ crc ^= t56[i1] ^ t48[i2] ^ t40[i3] ^ t32[i4];
+ }
+
+ p8 = (u8 *)p64;
+ d8 = (u8 *)d64;
+
+ for (i = 0; i < end_bytes; i++) {
+ t = *p8++;
+ *d8++ = t;
+ i1 = crc ^ t;
+ crc = t32[i1] ^ (crc >> 8);
+ }
+
+ return crc;
+}
+
+/* Compute a partial ICRC for all the IB transport headers. */
+__be32 rxe_sb8_ib_headers(struct rxe_pkt_info *pkt)
+{
+ u32 crc;
+ unsigned int length;
+ unsigned int grh_offset;
+ unsigned int bth_offset;
+ u8 tmp[RXE_LRH_BYTES + RXE_GRH_BYTES + RXE_BTH_BYTES];
+
+ /* This seed is the result of computing a CRC with a seed of
+ * 0xfffffff and 8 bytes of 0xff representing a masked LRH. */
+ crc = 0xdebb20e3;
+
+ length = RXE_BTH_BYTES;
+ grh_offset = 0;
+ bth_offset = 0;
+
+ if (pkt->mask & RXE_LRH_MASK) {
+ length += RXE_LRH_BYTES;
+ grh_offset += RXE_LRH_BYTES;
+ bth_offset += RXE_LRH_BYTES;
+ }
+ if (pkt->mask & RXE_GRH_MASK) {
+ length += RXE_GRH_BYTES;
+ bth_offset += RXE_GRH_BYTES;
+ }
+
+ memcpy(tmp, pkt->hdr, length);
+
+ if (pkt->mask & RXE_GRH_MASK) {
+ tmp[grh_offset + 0] |= 0x0f; /* GRH: tclass */
+ tmp[grh_offset + 1] = 0xff;
+ tmp[grh_offset + 2] = 0xff;
+ tmp[grh_offset + 3] = 0xff;
+ tmp[grh_offset + 7] = 0xff;
+ }
+
+ tmp[bth_offset + 4] = 0xff; /* BTH: resv8a */
+
+ crc = sb8_le(crc, tmp + grh_offset, length - grh_offset);
+
+ /* And finish to compute the CRC on the remainder of the headers. */
+ crc = sb8_le(crc, pkt->hdr + length,
+ rxe_opcode[pkt->opcode].length - RXE_BTH_BYTES);
+
+ return crc;
+}
+
+/* Compute the ICRC for a packet (incoming or outgoing). */
+__be32 rxe_sb8(struct rxe_pkt_info *pkt)
+{
+ u32 crc;
+ int size;
+
+ crc = rxe_sb8_ib_headers(pkt);
+
+ /* And finish to compute the CRC on the remainder. */
+ size = pkt->paylen - rxe_opcode[pkt->opcode].length - RXE_ICRC_SIZE;
+ crc = sb8_le(crc, payload_addr(pkt), size);
+ crc = ~crc;
+
+ return crc;
+}
* [patch 38/44] rxe.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (36 preceding siblings ...)
2011-07-01 13:18 ` [patch 37/44] rxe_sb8.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:18 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 39/44] rxe_net.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (6 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:18 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch38 --]
[-- Type: text/plain, Size: 16265 bytes --]
module main for ib_rxe
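The sizing limits and the fast-path/rate-control knobs are exposed as
module parameters. As an illustrative example (values chosen only to show
the interface, not a recommendation), the module could be loaded with:

	modprobe ib_rxe max_qp=1024 crc_disable=1 default_mtu=4096

which caps the number of QPs, disables ICRC generation on outbound
packets and requests a 4096 byte preferred port MTU, subject to the MTU
of the attached Ethernet device (see rxe_set_mtu()).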
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe.c | 549 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 549 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe.c
@@ -0,0 +1,549 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "rxe.h"
+#include "rxe_loc.h"
+
+MODULE_AUTHOR("Bob Pearson, Frank Zago, John Groves");
+MODULE_DESCRIPTION("Soft RDMA transport");
+MODULE_LICENSE("Dual BSD/GPL");
+
+static int rxe_max_ucontext = RXE_MAX_UCONTEXT;
+module_param_named(max_ucontext, rxe_max_ucontext, int, 0644);
+MODULE_PARM_DESC(max_ucontext, "max user contexts per device");
+
+static int rxe_max_qp = RXE_MAX_QP;
+module_param_named(max_qp, rxe_max_qp, int, 0444);
+MODULE_PARM_DESC(max_qp, "max QPs per device");
+
+static int rxe_max_qp_wr = RXE_MAX_QP_WR;
+module_param_named(max_qp_wr, rxe_max_qp_wr, int, 0644);
+MODULE_PARM_DESC(max_qp_wr, "max send or recv WR's per QP");
+
+static int rxe_max_inline_data = RXE_MAX_INLINE_DATA;
+module_param_named(max_inline_data, rxe_max_inline_data, int, 0644);
+MODULE_PARM_DESC(max_inline_data, "max inline data per WR");
+
+static int rxe_max_cq = RXE_MAX_CQ;
+module_param_named(max_cq, rxe_max_cq, int, 0644);
+MODULE_PARM_DESC(max_cq, "max CQs per device");
+
+static int rxe_max_mr = RXE_MAX_MR;
+module_param_named(max_mr, rxe_max_mr, int, 0644);
+MODULE_PARM_DESC(max_mr, "max MRs per device");
+
+static int rxe_max_fmr = RXE_MAX_FMR;
+module_param_named(max_fmr, rxe_max_fmr, int, 0644);
+MODULE_PARM_DESC(max_fmr, "max FMRs per device");
+
+static int rxe_max_mw = RXE_MAX_MW;
+module_param_named(max_mw, rxe_max_mw, int, 0644);
+MODULE_PARM_DESC(max_mw, "max MWs per device");
+
+static int rxe_max_log_cqe = RXE_MAX_LOG_CQE;
+module_param_named(max_log_cqe, rxe_max_log_cqe, int, 0644);
+MODULE_PARM_DESC(max_log_cqe, "Log2 of max CQ entries per CQ");
+
+int rxe_fast_comp = 2;
+module_param_named(fast_comp, rxe_fast_comp, int, 0644);
+MODULE_PARM_DESC(fast_comp, "fast path call to completer "
+ "(0=no, 1=no int context, 2=any context)");
+
+int rxe_fast_resp = 2;
+module_param_named(fast_resp, rxe_fast_resp, int, 0644);
+MODULE_PARM_DESC(fast_resp, "enable fast path call to responder "
+ "(0=no, 1=no int context, 2=any context)");
+
+int rxe_fast_req = 2;
+module_param_named(fast_req, rxe_fast_req, int, 0644);
+MODULE_PARM_DESC(fast_req, "enable fast path call to requester "
+ "(0=no, 1=no int context, 2=any context)");
+
+int rxe_fast_arb = 2;
+module_param_named(fast_arb, rxe_fast_arb, int, 0644);
+MODULE_PARM_DESC(fast_arb, "enable fast path call to arbiter "
+ "(0=no, 1=no int context, 2=any context)");
+
+int rxe_crc_disable;
+EXPORT_SYMBOL(rxe_crc_disable);
+module_param_named(crc_disable, rxe_crc_disable, int, 0644);
+MODULE_PARM_DESC(crc_disable,
+		 "Disable crc32 computation on outbound packets");
+
+int rxe_nsec_per_packet = 200;
+module_param_named(nsec_per_packet, rxe_nsec_per_packet, int, 0644);
+MODULE_PARM_DESC(nsec_per_packet,
+ "minimum output packet delay nsec");
+
+int rxe_nsec_per_kbyte = 700;
+module_param_named(nsec_per_kbyte, rxe_nsec_per_kbyte, int, 0644);
+MODULE_PARM_DESC(nsec_per_kbyte,
+ "minimum output packet delay per kbyte nsec");
+
+int rxe_max_skb_per_qp = 800;
+module_param_named(max_skb_per_qp, rxe_max_skb_per_qp, int, 0644);
+MODULE_PARM_DESC(max_skb_per_qp,
+ "maximum skb's posted for output");
+
+int rxe_max_req_comp_gap = 128;
+module_param_named(max_req_comp_gap, rxe_max_req_comp_gap, int, 0644);
+MODULE_PARM_DESC(max_req_comp_gap,
+ "max difference between req.psn and comp.psn");
+
+int rxe_max_pkt_per_ack = 64;
+module_param_named(max_pkt_per_ack, rxe_max_pkt_per_ack, int, 0644);
+MODULE_PARM_DESC(max_pkt_per_ack,
+ "max packets before an ack will be generated");
+
+int rxe_default_mtu = 1024;
+module_param_named(default_mtu, rxe_default_mtu, int, 0644);
+MODULE_PARM_DESC(default_mtu,
+ "default rxe port mtu");
+
+/* free resources for all ports on a device */
+static void rxe_cleanup_ports(struct rxe_dev *rxe)
+{
+ unsigned int port_num;
+ struct rxe_port *port;
+
+ for (port_num = 1; port_num <= rxe->num_ports; port_num++) {
+ port = &rxe->port[port_num - 1];
+
+ kfree(port->guid_tbl);
+ port->guid_tbl = NULL;
+
+ kfree(port->pkey_tbl);
+ port->pkey_tbl = NULL;
+ }
+
+ kfree(rxe->port);
+ rxe->port = NULL;
+}
+
+/* free resources for a rxe device
+ all objects created for this device
+ must have been destroyed */
+static void rxe_cleanup(struct rxe_dev *rxe)
+{
+ rxe_cleanup_task(&rxe->arbiter.task);
+
+ rxe_pool_cleanup(&rxe->uc_pool);
+ rxe_pool_cleanup(&rxe->pd_pool);
+ rxe_pool_cleanup(&rxe->ah_pool);
+ rxe_pool_cleanup(&rxe->srq_pool);
+ rxe_pool_cleanup(&rxe->qp_pool);
+ rxe_pool_cleanup(&rxe->cq_pool);
+ rxe_pool_cleanup(&rxe->mr_pool);
+ rxe_pool_cleanup(&rxe->fmr_pool);
+ rxe_pool_cleanup(&rxe->mw_pool);
+ rxe_pool_cleanup(&rxe->mc_grp_pool);
+ rxe_pool_cleanup(&rxe->mc_elem_pool);
+
+ rxe_cleanup_ports(rxe);
+}
+
+/* called when all references have been dropped */
+void rxe_release(struct kref *kref)
+{
+ struct rxe_dev *rxe = container_of(kref, struct rxe_dev, ref_cnt);
+
+ rxe_cleanup(rxe);
+ ib_dealloc_device(&rxe->ib_dev);
+ module_put(THIS_MODULE);
+ rxe->ifc_ops->release(rxe);
+}
+
+/* initialize rxe device parameters */
+static int rxe_init_device_param(struct rxe_dev *rxe)
+{
+ rxe->num_ports = RXE_NUM_PORT;
+ rxe->max_inline_data = rxe_max_inline_data;
+
+ rxe->attr.fw_ver = RXE_FW_VER;
+ rxe->attr.max_mr_size = RXE_MAX_MR_SIZE;
+ rxe->attr.page_size_cap = RXE_PAGE_SIZE_CAP;
+ rxe->attr.vendor_id = RXE_VENDOR_ID;
+ rxe->attr.vendor_part_id = RXE_VENDOR_PART_ID;
+ rxe->attr.hw_ver = RXE_HW_VER;
+ rxe->attr.max_qp = rxe_max_qp;
+ rxe->attr.max_qp_wr = rxe_max_qp_wr;
+ rxe->attr.device_cap_flags = RXE_DEVICE_CAP_FLAGS;
+ rxe->attr.max_sge = RXE_MAX_SGE;
+ rxe->attr.max_sge_rd = RXE_MAX_SGE_RD;
+ rxe->attr.max_cq = rxe_max_cq;
+ if ((rxe_max_log_cqe < 0) || (rxe_max_log_cqe > 20))
+ rxe_max_log_cqe = RXE_MAX_LOG_CQE;
+ rxe->attr.max_cqe = (1 << rxe_max_log_cqe) - 1;
+ rxe->attr.max_mr = rxe_max_mr;
+ rxe->attr.max_pd = RXE_MAX_PD;
+ rxe->attr.max_qp_rd_atom = RXE_MAX_QP_RD_ATOM;
+ rxe->attr.max_ee_rd_atom = RXE_MAX_EE_RD_ATOM;
+ rxe->attr.max_res_rd_atom = RXE_MAX_RES_RD_ATOM;
+ rxe->attr.max_qp_init_rd_atom = RXE_MAX_QP_INIT_RD_ATOM;
+ rxe->attr.max_ee_init_rd_atom = RXE_MAX_EE_INIT_RD_ATOM;
+ rxe->attr.atomic_cap = RXE_ATOMIC_CAP;
+ rxe->attr.max_ee = RXE_MAX_EE;
+ rxe->attr.max_rdd = RXE_MAX_RDD;
+ rxe->attr.max_mw = rxe_max_mw;
+ rxe->attr.max_raw_ipv6_qp = RXE_MAX_RAW_IPV6_QP;
+ rxe->attr.max_raw_ethy_qp = RXE_MAX_RAW_ETHY_QP;
+ rxe->attr.max_mcast_grp = RXE_MAX_MCAST_GRP;
+ rxe->attr.max_mcast_qp_attach = RXE_MAX_MCAST_QP_ATTACH;
+ rxe->attr.max_total_mcast_qp_attach = RXE_MAX_TOT_MCAST_QP_ATTACH;
+ rxe->attr.max_ah = RXE_MAX_AH;
+ rxe->attr.max_fmr = rxe_max_fmr;
+ rxe->attr.max_map_per_fmr = RXE_MAX_MAP_PER_FMR;
+ rxe->attr.max_srq = RXE_MAX_SRQ;
+ rxe->attr.max_srq_wr = RXE_MAX_SRQ_WR;
+ rxe->attr.max_srq_sge = RXE_MAX_SRQ_SGE;
+ rxe->attr.max_fast_reg_page_list_len = RXE_MAX_FMR_PAGE_LIST_LEN;
+ rxe->attr.max_pkeys = RXE_MAX_PKEYS;
+ rxe->attr.local_ca_ack_delay = RXE_LOCAL_CA_ACK_DELAY;
+
+ rxe->max_ucontext = rxe_max_ucontext;
+ rxe->pref_mtu = rxe_mtu_int_to_enum(rxe_default_mtu);
+
+ return 0;
+}
+
+/* initialize port attributes */
+static int rxe_init_port_param(struct rxe_dev *rxe, unsigned int port_num)
+{
+ struct rxe_port *port = &rxe->port[port_num - 1];
+
+ port->attr.state = RXE_PORT_STATE;
+ port->attr.max_mtu = RXE_PORT_MAX_MTU;
+ port->attr.active_mtu = RXE_PORT_ACTIVE_MTU;
+ port->attr.gid_tbl_len = RXE_PORT_GID_TBL_LEN;
+ port->attr.port_cap_flags = RXE_PORT_PORT_CAP_FLAGS;
+ port->attr.max_msg_sz = RXE_PORT_MAX_MSG_SZ;
+ port->attr.bad_pkey_cntr = RXE_PORT_BAD_PKEY_CNTR;
+ port->attr.qkey_viol_cntr = RXE_PORT_QKEY_VIOL_CNTR;
+ port->attr.pkey_tbl_len = RXE_PORT_PKEY_TBL_LEN;
+ port->attr.lid = RXE_PORT_LID;
+ port->attr.sm_lid = RXE_PORT_SM_LID;
+ port->attr.lmc = RXE_PORT_LMC;
+ port->attr.max_vl_num = RXE_PORT_MAX_VL_NUM;
+ port->attr.sm_sl = RXE_PORT_SM_SL;
+ port->attr.subnet_timeout = RXE_PORT_SUBNET_TIMEOUT;
+ port->attr.init_type_reply = RXE_PORT_INIT_TYPE_REPLY;
+ port->attr.active_width = RXE_PORT_ACTIVE_WIDTH;
+ port->attr.active_speed = RXE_PORT_ACTIVE_SPEED;
+ port->attr.phys_state = RXE_PORT_PHYS_STATE;
+ port->mtu_cap =
+ rxe_mtu_enum_to_int(RXE_PORT_ACTIVE_MTU);
+ port->subnet_prefix = cpu_to_be64(RXE_PORT_SUBNET_PREFIX);
+
+ return 0;
+}
+
+/* initialize port state, note IB convention
+ that HCA ports are always numbered from 1 */
+static int rxe_init_ports(struct rxe_dev *rxe)
+{
+ int err;
+ unsigned int port_num;
+ struct rxe_port *port;
+
+ rxe->port = kcalloc(rxe->num_ports, sizeof(struct rxe_port),
+ GFP_KERNEL);
+ if (!rxe->port)
+ return -ENOMEM;
+
+ for (port_num = 1; port_num <= rxe->num_ports; port_num++) {
+ port = &rxe->port[port_num - 1];
+
+ rxe_init_port_param(rxe, port_num);
+
+ if (!port->attr.pkey_tbl_len) {
+ err = -EINVAL;
+ goto err1;
+ }
+
+ port->pkey_tbl = kcalloc(port->attr.pkey_tbl_len,
+ sizeof(*port->pkey_tbl), GFP_KERNEL);
+ if (!port->pkey_tbl) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ port->pkey_tbl[0] = 0xffff;
+
+ if (!port->attr.gid_tbl_len) {
+ kfree(port->pkey_tbl);
+ err = -EINVAL;
+ goto err1;
+ }
+
+ port->guid_tbl = kcalloc(port->attr.gid_tbl_len,
+ sizeof(*port->guid_tbl), GFP_KERNEL);
+ if (!port->guid_tbl) {
+ kfree(port->pkey_tbl);
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ port->guid_tbl[0] = rxe->ifc_ops->port_guid(rxe, port_num);
+
+ spin_lock_init(&port->port_lock);
+ }
+
+ return 0;
+
+err1:
+ while (--port_num >= 1) {
+ port = &rxe->port[port_num - 1];
+ kfree(port->pkey_tbl);
+ kfree(port->guid_tbl);
+ }
+
+ kfree(rxe->port);
+ return err;
+}
+
+/* init pools of managed objects */
+static int rxe_init_pools(struct rxe_dev *rxe)
+{
+ int err;
+
+ err = rxe_pool_init(rxe, &rxe->uc_pool, RXE_TYPE_UC,
+ rxe->max_ucontext);
+ if (err)
+ goto err1;
+
+ err = rxe_pool_init(rxe, &rxe->pd_pool, RXE_TYPE_PD,
+ rxe->attr.max_pd);
+ if (err)
+ goto err2;
+
+ err = rxe_pool_init(rxe, &rxe->ah_pool, RXE_TYPE_AH,
+ rxe->attr.max_ah);
+ if (err)
+ goto err3;
+
+ err = rxe_pool_init(rxe, &rxe->srq_pool, RXE_TYPE_SRQ,
+ rxe->attr.max_srq);
+ if (err)
+ goto err4;
+
+ err = rxe_pool_init(rxe, &rxe->qp_pool, RXE_TYPE_QP,
+ rxe->attr.max_qp);
+ if (err)
+ goto err5;
+
+ err = rxe_pool_init(rxe, &rxe->cq_pool, RXE_TYPE_CQ,
+ rxe->attr.max_cq);
+ if (err)
+ goto err6;
+
+ err = rxe_pool_init(rxe, &rxe->mr_pool, RXE_TYPE_MR,
+ rxe->attr.max_mr);
+ if (err)
+ goto err7;
+
+ err = rxe_pool_init(rxe, &rxe->fmr_pool, RXE_TYPE_FMR,
+ rxe->attr.max_fmr);
+ if (err)
+ goto err8;
+
+ err = rxe_pool_init(rxe, &rxe->mw_pool, RXE_TYPE_MW,
+ rxe->attr.max_mw);
+ if (err)
+ goto err9;
+
+ err = rxe_pool_init(rxe, &rxe->mc_grp_pool, RXE_TYPE_MC_GRP,
+ rxe->attr.max_mcast_grp);
+ if (err)
+ goto err10;
+
+ err = rxe_pool_init(rxe, &rxe->mc_elem_pool, RXE_TYPE_MC_ELEM,
+ rxe->attr.max_total_mcast_qp_attach);
+ if (err)
+ goto err11;
+
+ return 0;
+
+err11:
+ rxe_pool_cleanup(&rxe->mc_grp_pool);
+err10:
+ rxe_pool_cleanup(&rxe->mw_pool);
+err9:
+ rxe_pool_cleanup(&rxe->fmr_pool);
+err8:
+ rxe_pool_cleanup(&rxe->mr_pool);
+err7:
+ rxe_pool_cleanup(&rxe->cq_pool);
+err6:
+ rxe_pool_cleanup(&rxe->qp_pool);
+err5:
+ rxe_pool_cleanup(&rxe->srq_pool);
+err4:
+ rxe_pool_cleanup(&rxe->ah_pool);
+err3:
+ rxe_pool_cleanup(&rxe->pd_pool);
+err2:
+ rxe_pool_cleanup(&rxe->uc_pool);
+err1:
+ return err;
+}
+
+/* initialize rxe device state */
+static int rxe_init(struct rxe_dev *rxe)
+{
+ int err;
+
+ /* init default device parameters */
+ rxe_init_device_param(rxe);
+
+ err = rxe_init_ports(rxe);
+ if (err)
+ goto err1;
+
+ err = rxe_init_pools(rxe);
+ if (err)
+ goto err2;
+
+ /* init packet counters */
+ atomic_set(&rxe->req_skb_in, 0);
+ atomic_set(&rxe->resp_skb_in, 0);
+ atomic_set(&rxe->req_skb_out, 0);
+ atomic_set(&rxe->resp_skb_out, 0);
+
+ /* init pending mmap list */
+ spin_lock_init(&rxe->mmap_offset_lock);
+ spin_lock_init(&rxe->pending_lock);
+ INIT_LIST_HEAD(&rxe->pending_mmaps);
+
+ /* init arbiter */
+ spin_lock_init(&rxe->arbiter.list_lock);
+ INIT_LIST_HEAD(&rxe->arbiter.qp_list);
+ rxe_init_task(rxe, &rxe->arbiter.task, &rxe_fast_arb,
+ rxe, rxe_arbiter, "arb");
+
+ return 0;
+
+err2:
+ rxe_cleanup_ports(rxe);
+err1:
+ return err;
+}
+
+int rxe_set_mtu(struct rxe_dev *rxe, unsigned int ndev_mtu,
+ unsigned int port_num)
+{
+ struct rxe_port *port = &rxe->port[port_num - 1];
+ enum rxe_mtu mtu;
+
+ mtu = eth_mtu_int_to_enum(ndev_mtu);
+ if (!mtu)
+ return -EINVAL;
+
+ /* Set the port mtu to min(feasible, preferred) */
+ mtu = min_t(enum rxe_mtu, mtu, rxe->pref_mtu);
+
+ port->attr.active_mtu = (enum ib_mtu __force)mtu;
+ port->mtu_cap = rxe_mtu_enum_to_int(mtu);
+
+ return 0;
+}
+EXPORT_SYMBOL(rxe_set_mtu);
+
+/* called by ifc layer to create new rxe device
+ caller should allocate memory for rxe by calling
+ ib_alloc_device */
+int rxe_add(struct rxe_dev *rxe, unsigned int mtu)
+{
+ int err;
+ unsigned port_num = 1;
+
+ __module_get(THIS_MODULE);
+
+ kref_init(&rxe->ref_cnt);
+
+ err = rxe_init(rxe);
+ if (err)
+ goto err1;
+
+ err = rxe_set_mtu(rxe, mtu, port_num);
+ if (err)
+ goto err2;
+
+ err = rxe_register_device(rxe);
+ if (err)
+ goto err2;
+
+ return 0;
+
+err2:
+ rxe_cleanup(rxe);
+err1:
+ kref_put(&rxe->ref_cnt, rxe_release);
+ module_put(THIS_MODULE);
+ return err;
+}
+EXPORT_SYMBOL(rxe_add);
+
+/* called by the ifc layer to remove a device */
+void rxe_remove(struct rxe_dev *rxe)
+{
+ rxe_unregister_device(rxe);
+
+ kref_put(&rxe->ref_cnt, rxe_release);
+}
+EXPORT_SYMBOL(rxe_remove);
+
+static int __init rxe_module_init(void)
+{
+ int err;
+
+ /* initialize slab caches for managed objects */
+ err = rxe_cache_init();
+ if (err) {
+ pr_err("rxe: unable to init object pools\n");
+ return err;
+ }
+
+ pr_info("rxe: loaded\n");
+
+ return 0;
+}
+
+static void __exit rxe_module_exit(void)
+{
+ rxe_cache_exit();
+
+ pr_info("rxe: unloaded\n");
+}
+
+module_init(rxe_module_init);
+module_exit(rxe_module_exit);
* [patch 39/44] rxe_net.h
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (37 preceding siblings ...)
2011-07-01 13:18 ` [patch 38/44] rxe.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:19 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 40/44] rxe_net.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (5 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:19 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch39 --]
[-- Type: text/plain, Size: 3447 bytes --]
Common declarations for ib_rxe_net module.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_net.h | 87 ++++++++++++++++++++++++++++++++++++
1 file changed, 87 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_net.h
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_net.h
@@ -0,0 +1,87 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef RXE_NET_H
+#define RXE_NET_H
+
+#include <net/sock.h>
+#include <net/if_inet6.h>
+
+/*
+ * this should be defined in .../include/linux/if_ether.h
+ */
+#define ETH_P_RXE (0x8915)
+
+/*
+ * this should be defined in .../include/linux/netfilter.h
+ * to a specific value
+ */
+#define NFPROTO_RXE (0)
+
+/*
+ * these should be defined in .../include/linux/netfilter_rxe.h
+ */
+#define NF_RXE_IN (0)
+#define NF_RXE_OUT (1)
+
+/* Should probably move to something other than an array...these can be big */
+#define RXE_MAX_IF_INDEX (384)
+
+struct rxe_net_info {
+ struct rxe_dev *rxe;
+ u8 port;
+ struct net_device *ndev;
+ int status;
+};
+
+extern struct rxe_net_info net_info[RXE_MAX_IF_INDEX];
+extern spinlock_t net_info_lock;
+
+/* caller must hold net_dev_lock */
+static inline struct rxe_dev *net_to_rxe(struct net_device *ndev)
+{
+ return (ndev->ifindex >= RXE_MAX_IF_INDEX) ?
+ NULL : net_info[ndev->ifindex].rxe;
+}
+
+static inline u8 net_to_port(struct net_device *ndev)
+{
+ return net_info[ndev->ifindex].port;
+}
+
+void rxe_net_add(struct net_device *ndev);
+void rxe_net_remove(int net_index);
+void rxe_net_up(struct net_device *ndev);
+void rxe_net_down(struct net_device *ndev);
+
+#endif /* RXE_NET_H */
* [patch 40/44] rxe_net.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (38 preceding siblings ...)
2011-07-01 13:19 ` [patch 39/44] rxe_net.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:19 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 41/44] rxe_net_sysfs.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (4 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:19 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch40 --]
[-- Type: text/plain, Size: 15194 bytes --]
Implements the kernel module that provides the interface between
ib_rxe and the netdev stack.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_net.c | 598 ++++++++++++++++++++++++++++++++++++
1 file changed, 598 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_net.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_net.c
@@ -0,0 +1,598 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/skbuff.h>
+#include <linux/netdevice.h>
+#include <linux/if.h>
+#include <linux/if_vlan.h>
+#include <net/sch_generic.h>
+#include <linux/netfilter.h>
+#include <rdma/ib_addr.h>
+
+#include "rxe.h"
+#include "rxe_net.h"
+
+MODULE_AUTHOR("Bob Pearson, Frank Zago, John Groves");
+MODULE_DESCRIPTION("RDMA transport over Converged Enhanced Ethernet");
+MODULE_LICENSE("Dual BSD/GPL");
+
+static int rxe_eth_proto_id = ETH_P_RXE;
+module_param_named(eth_proto_id, rxe_eth_proto_id, int, 0644);
+MODULE_PARM_DESC(eth_proto_id, "Ethernet protocol ID (default/correct=0x8915)");
+
+static int rxe_xmit_shortcut;
+module_param_named(xmit_shortcut, rxe_xmit_shortcut, int, 0644);
+MODULE_PARM_DESC(xmit_shortcut,
+ "Shortcut transmit (EXPERIMENTAL)");
+
+static int rxe_loopback_mad_grh_fix = 1;
+module_param_named(loopback_mad_grh_fix, rxe_loopback_mad_grh_fix, int, 0644);
+MODULE_PARM_DESC(loopback_mad_grh_fix, "Allow MADs to self without GRH");
+
+/*
+ * note: this table is a replacement for a protocol specific pointer
+ * in struct net_device which exists for other ethertypes
+ * this allows us to not have to patch that data structure
+ * eventually we want to get our own when we're famous
+ */
+struct rxe_net_info net_info[RXE_MAX_IF_INDEX];
+spinlock_t net_info_lock;
+
+static int rxe_net_rcv(struct sk_buff *skb,
+ struct net_device *ndev,
+ struct packet_type *ptype,
+ struct net_device *orig_dev);
+
+static __be64 rxe_mac_to_eui64(struct net_device *ndev)
+{
+ unsigned char *mac_addr = ndev->dev_addr;
+ __be64 eui64;
+ unsigned char *dst = (unsigned char *)&eui64;
+
+ dst[0] = mac_addr[0] ^ 2;
+ dst[1] = mac_addr[1];
+ dst[2] = mac_addr[2];
+ dst[3] = 0xff;
+ dst[4] = 0xfe;
+ dst[5] = mac_addr[3];
+ dst[6] = mac_addr[4];
+ dst[7] = mac_addr[5];
+
+ return eui64;
+}
+
+/* callback when rxe gets released */
+static void release(struct rxe_dev *rxe)
+{
+ module_put(THIS_MODULE);
+}
+
+static __be64 node_guid(struct rxe_dev *rxe)
+{
+ return rxe_mac_to_eui64(rxe->ndev);
+}
+
+static __be64 port_guid(struct rxe_dev *rxe, unsigned int port_num)
+{
+ return rxe_mac_to_eui64(rxe->ndev);
+}
+
+static struct device *dma_device(struct rxe_dev *rxe)
+{
+ struct net_device *ndev;
+
+ ndev = rxe->ndev;
+
+ if (ndev->priv_flags & IFF_802_1Q_VLAN)
+ ndev = vlan_dev_real_dev(ndev);
+
+ return ndev->dev.parent;
+}
+
+static int mcast_add(struct rxe_dev *rxe, union ib_gid *mgid)
+{
+ int err;
+ unsigned char ll_addr[ETH_ALEN];
+
+ ipv6_eth_mc_map((struct in6_addr *)mgid->raw, ll_addr);
+ err = dev_mc_add(rxe->ndev, ll_addr);
+
+ return err;
+}
+
+static int mcast_delete(struct rxe_dev *rxe, union ib_gid *mgid)
+{
+ int err;
+ unsigned char ll_addr[ETH_ALEN];
+
+ ipv6_eth_mc_map((struct in6_addr *)mgid->raw, ll_addr);
+ err = dev_mc_del(rxe->ndev, ll_addr);
+
+ return err;
+}
+
+static inline int queue_deactivated(struct sk_buff *skb)
+{
+ const struct net_device_ops *ops = skb->dev->netdev_ops;
+ u16 queue_index = 0;
+ struct netdev_queue *txq;
+
+ if (ops->ndo_select_queue)
+ queue_index = ops->ndo_select_queue(skb->dev, skb);
+ else if (skb->dev->real_num_tx_queues > 1)
+ queue_index = skb_tx_hash(skb->dev, skb);
+
+ txq = netdev_get_tx_queue(skb->dev, queue_index);
+ return txq->qdisc->state & 2;
+}
+
+static int send_finish(struct sk_buff *skb)
+{
+ if (!rxe_xmit_shortcut)
+ return dev_queue_xmit(skb);
+ else
+ return skb->dev->netdev_ops->ndo_start_xmit(skb, skb->dev);
+}
+
+static int send(struct rxe_dev *rxe, struct sk_buff *skb)
+{
+ if (queue_deactivated(skb))
+ return RXE_QUEUE_STOPPED;
+
+ if (netif_queue_stopped(skb->dev))
+ return RXE_QUEUE_STOPPED;
+
+ return NF_HOOK(NFPROTO_RXE, NF_RXE_OUT, skb, rxe->ndev, NULL,
+ send_finish);
+}
+
+static int loopback_finish(struct sk_buff *skb)
+{
+ struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);
+ struct packet_type *ptype = NULL;
+ struct net_device *orig_dev = pkt->rxe->ndev;
+
+ return rxe_net_rcv(skb, pkt->rxe->ndev, ptype, orig_dev);
+}
+
+static int loopback(struct rxe_dev *rxe, struct sk_buff *skb)
+{
+ return NF_HOOK(NFPROTO_RXE, NF_RXE_OUT, skb, rxe->ndev, NULL,
+ loopback_finish);
+}
+
+static inline int addr_same(struct rxe_dev *rxe, struct rxe_av *av)
+{
+ int port_num = 1;
+ return rxe->port[port_num - 1].guid_tbl[0]
+ == av->attr.grh.dgid.global.interface_id;
+}
+
+static struct sk_buff *init_packet(struct rxe_dev *rxe, struct rxe_av *av,
+ int paylen)
+{
+ struct sk_buff *skb;
+ struct rxe_pkt_info *pkt;
+
+
+ skb = alloc_skb(paylen + RXE_GRH_BYTES + LL_RESERVED_SPACE(rxe->ndev),
+ GFP_ATOMIC);
+ if (!skb)
+ return NULL;
+
+ skb_reserve(skb, LL_RESERVED_SPACE(rxe->ndev));
+ skb_reset_network_header(skb);
+
+ skb->dev = rxe->ndev;
+ skb->protocol = htons(rxe_eth_proto_id);
+
+ pkt = SKB_TO_PKT(skb);
+ pkt->rxe = rxe;
+ pkt->port_num = 1;
+ pkt->hdr = skb_put(skb, RXE_GRH_BYTES + paylen);
+ pkt->mask = RXE_GRH_MASK;
+
+ dev_hard_header(skb, rxe->ndev, rxe_eth_proto_id,
+ av->ll_addr, rxe->ndev->dev_addr, skb->len);
+
+ if (addr_same(rxe, av))
+ pkt->mask |= RXE_LOOPBACK_MASK;
+
+ return skb;
+}
+
+static int init_av(struct rxe_dev *rxe, struct ib_ah_attr *attr,
+ struct rxe_av *av)
+{
+ struct in6_addr *in6 = (struct in6_addr *)attr->grh.dgid.raw;
+
+ /* grh required for rxe_net */
+ if ((attr->ah_flags & IB_AH_GRH) == 0) {
+ if (rxe_loopback_mad_grh_fix) {
+ /* temporary fix so that we can handle MADs to self
+ without a GRH included: add a GRH pointing to self */
+ attr->ah_flags |= IB_AH_GRH;
+ attr->grh.dgid.global.subnet_prefix
+ = rxe->port[0].subnet_prefix;
+ attr->grh.dgid.global.interface_id
+ = rxe->port[0].guid_tbl[0];
+ av->attr = *attr;
+ } else {
+ pr_info("rxe_net: attempting to init av without grh\n");
+ return -EINVAL;
+ }
+ }
+
+ if (rdma_link_local_addr(in6))
+ rdma_get_ll_mac(in6, av->ll_addr);
+ else if (rdma_is_multicast_addr(in6))
+ rdma_get_mcast_mac(in6, av->ll_addr);
+ else {
+ int i;
+ char addr[64];
+
+ for (i = 0; i < 16; i++)
+ sprintf(addr+2*i, "%02x", attr->grh.dgid.raw[i]);
+
+ pr_info("rxe_net: non local subnet address not supported %s\n",
+ addr);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+/*
+ * this is required by rxe_cfg to match rxe devices in
+ * /sys/class/infiniband up with their underlying ethernet devices
+ */
+static char *parent_name(struct rxe_dev *rxe, unsigned int port_num)
+{
+ return rxe->ndev->name;
+}
+
+static enum rdma_link_layer link_layer(struct rxe_dev *rxe,
+ unsigned int port_num)
+{
+ return IB_LINK_LAYER_ETHERNET;
+}
+
+static struct rxe_ifc_ops ifc_ops = {
+ .release = release,
+ .node_guid = node_guid,
+ .port_guid = port_guid,
+ .dma_device = dma_device,
+ .mcast_add = mcast_add,
+ .mcast_delete = mcast_delete,
+ .send = send,
+ .loopback = loopback,
+ .init_packet = init_packet,
+ .init_av = init_av,
+ .parent_name = parent_name,
+ .link_layer = link_layer,
+};
+
+/* Caller must hold net_info_lock */
+void rxe_net_add(struct net_device *ndev)
+{
+ int err;
+ struct rxe_dev *rxe;
+ unsigned port_num;
+
+ __module_get(THIS_MODULE);
+
+ rxe = (struct rxe_dev *)ib_alloc_device(sizeof(*rxe));
+ if (!rxe) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ /* for now we always assign port = 1 */
+ port_num = 1;
+
+ rxe->ifc_ops = &ifc_ops;
+
+ rxe->ndev = ndev;
+
+ err = rxe_add(rxe, ndev->mtu);
+ if (err)
+ goto err2;
+
+ pr_info("rxe_net: added %s to %s\n",
+ rxe->ib_dev.name, ndev->name);
+
+ net_info[ndev->ifindex].rxe = rxe;
+ net_info[ndev->ifindex].port = port_num;
+ net_info[ndev->ifindex].ndev = ndev;
+ return;
+
+err2:
+ ib_dealloc_device(&rxe->ib_dev);
+err1:
+ module_put(THIS_MODULE);
+
+ return;
+}
+
+/* Caller must hold net_info_lock */
+void rxe_net_remove(int net_index)
+{
+ struct rxe_dev *rxe = net_info[net_index].rxe;
+ struct net_device *ndev = net_info[net_index].ndev;
+
+ net_info[net_index].rxe = NULL;
+ spin_unlock_bh(&net_info_lock);
+
+ pr_info("rxe_net: remove %s from %s\n",
+ rxe->ib_dev.name, ndev->name);
+
+ rxe_remove(rxe);
+
+ spin_lock_bh(&net_info_lock);
+ return;
+}
+
+/* Caller must hold net_info_lock */
+void rxe_net_up(struct net_device *ndev)
+{
+ struct rxe_dev *rxe;
+ struct rxe_port *port;
+ u8 port_num;
+
+ if (ndev->ifindex >= RXE_MAX_IF_INDEX)
+ goto out;
+
+ net_info[ndev->ifindex].status = IB_PORT_ACTIVE;
+
+ rxe = net_to_rxe(ndev);
+ if (!rxe)
+ goto out;
+
+ port_num = net_to_port(ndev);
+ port = &rxe->port[port_num-1];
+ port->attr.state = IB_PORT_ACTIVE;
+ port->attr.phys_state = IB_PHYS_STATE_LINK_UP;
+
+ pr_info("rxe_net: set %s active for %s\n",
+ rxe->ib_dev.name, ndev->name);
+out:
+ return;
+}
+
+/* Caller must hold net_info_lock */
+void rxe_net_down(struct net_device *ndev)
+{
+ struct rxe_dev *rxe;
+ struct rxe_port *port;
+ u8 port_num;
+
+ if (ndev->ifindex >= RXE_MAX_IF_INDEX)
+ goto out;
+
+ net_info[ndev->ifindex].status = IB_PORT_DOWN;
+
+ rxe = net_to_rxe(ndev);
+ if (!rxe)
+ goto out;
+
+ port_num = net_to_port(ndev);
+ port = &rxe->port[port_num-1];
+ port->attr.state = IB_PORT_DOWN;
+ port->attr.phys_state = 3;
+
+ pr_info("rxe_net: set %s down for %s\n",
+ rxe->ib_dev.name, ndev->name);
+out:
+ return;
+}
+
+static int can_support_rxe(struct net_device *ndev)
+{
+ int rc = 0;
+
+ if (ndev->ifindex >= RXE_MAX_IF_INDEX) {
+ pr_debug("%s index %d: too large for rxe ndev table\n",
+ ndev->name, ndev->ifindex);
+ goto out;
+ }
+
+ /* Let's say we support all ethX devices */
+ if (strncmp(ndev->name, "eth", 3) == 0)
+ rc = 1;
+
+out:
+ return rc;
+}
+
+static int rxe_notify(struct notifier_block *not_blk,
+ unsigned long event,
+ void *arg)
+{
+ struct net_device *ndev = arg;
+ struct rxe_dev *rxe;
+
+ if (!can_support_rxe(ndev))
+ goto out;
+
+ spin_lock_bh(&net_info_lock);
+ switch (event) {
+ case NETDEV_REGISTER:
+ /* Keep a record of this NIC. */
+ net_info[ndev->ifindex].status = IB_PORT_DOWN;
+ net_info[ndev->ifindex].rxe = NULL;
+ net_info[ndev->ifindex].port = 1;
+ net_info[ndev->ifindex].ndev = ndev;
+ break;
+
+ case NETDEV_UNREGISTER:
+ if (net_info[ndev->ifindex].rxe) {
+ rxe = net_info[ndev->ifindex].rxe;
+ net_info[ndev->ifindex].rxe = NULL;
+ spin_unlock_bh(&net_info_lock);
+ rxe_remove(rxe);
+ spin_lock_bh(&net_info_lock);
+ }
+ net_info[ndev->ifindex].status = 0;
+ net_info[ndev->ifindex].port = 0;
+ net_info[ndev->ifindex].ndev = NULL;
+ break;
+
+ case NETDEV_UP:
+ rxe_net_up(ndev);
+ break;
+
+ case NETDEV_DOWN:
+ rxe_net_down(ndev);
+ break;
+
+ case NETDEV_CHANGEMTU:
+ rxe = net_to_rxe(ndev);
+ if (rxe) {
+ pr_info("rxe_net: %s changed mtu to %d\n",
+ ndev->name, ndev->mtu);
+ rxe_set_mtu(rxe, ndev->mtu, net_to_port(ndev));
+ }
+ break;
+
+ case NETDEV_REBOOT:
+ case NETDEV_CHANGE:
+ case NETDEV_GOING_DOWN:
+ case NETDEV_CHANGEADDR:
+ case NETDEV_CHANGENAME:
+ case NETDEV_FEAT_CHANGE:
+ default:
+ pr_info("rxe_net: ignoring netdev event = %ld for %s\n",
+ event, ndev->name);
+ break;
+ }
+ spin_unlock_bh(&net_info_lock);
+
+out:
+ return NOTIFY_OK;
+}
+
+static int rxe_net_rcv(struct sk_buff *skb,
+ struct net_device *ndev,
+ struct packet_type *ptype,
+ struct net_device *orig_dev)
+{
+ struct rxe_dev *rxe = net_to_rxe(ndev);
+ struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);
+ int rc = 0;
+
+ if (!rxe)
+ goto drop;
+
+ /* TODO: We can receive packets in fragments. For now we
+ * linearize and it's costly because we may copy a lot of
+ * data. We should handle that case better. */
+ if (skb_linearize(skb))
+ goto drop;
+
+#if 0
+ /* Error injector */
+ {
+ static int x = 8;
+ static int counter;
+ counter++;
+
+ if (counter == x) {
+ x = 13-x; /* 8 or 5 */
+ counter = 0;
+ pr_debug("dropping one packet\n");
+ goto drop;
+ }
+ }
+#endif
+
+ skb = skb_share_check(skb, GFP_ATOMIC);
+ if (!skb) {
+ /* still return null */
+ goto out;
+ }
+
+ /* set required fields in pkt */
+ pkt->rxe = rxe;
+ pkt->port_num = net_to_port(ndev);
+ pkt->hdr = skb_network_header(skb);
+ pkt->mask = RXE_GRH_MASK;
+
+ rc = NF_HOOK(NFPROTO_RXE, NF_RXE_IN, skb, ndev, NULL, rxe_rcv);
+out:
+ return rc;
+
+drop:
+ kfree_skb(skb);
+ return 0;
+}
+
+static struct packet_type rxe_packet_type = {
+ .func = rxe_net_rcv,
+};
+
+static struct notifier_block rxe_net_notifier = {
+ .notifier_call = rxe_notify,
+};
+
+static int __init rxe_net_init(void)
+{
+ int err;
+
+ spin_lock_init(&net_info_lock);
+
+ if (rxe_eth_proto_id != ETH_P_RXE)
+ pr_info("rxe_net: protoid set to 0x%x\n",
+ rxe_eth_proto_id);
+
+ rxe_packet_type.type = cpu_to_be16(rxe_eth_proto_id);
+ dev_add_pack(&rxe_packet_type);
+
+ err = register_netdevice_notifier(&rxe_net_notifier);
+
+ pr_info("rxe_net: loaded\n");
+
+ return err;
+}
+
+static void __exit rxe_net_exit(void)
+{
+ unregister_netdevice_notifier(&rxe_net_notifier);
+ dev_remove_pack(&rxe_packet_type);
+
+ pr_info("rxe_net: unloaded\n");
+
+ return;
+}
+
+module_init(rxe_net_init);
+module_exit(rxe_net_exit);
* [patch 41/44] rxe_net_sysfs.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (39 preceding siblings ...)
2011-07-01 13:19 ` [patch 40/44] rxe_net.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:19 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 42/44] rxe_sample.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (3 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:19 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch41 --]
[-- Type: text/plain, Size: 6292 bytes --]
sysfs interface for ib_rxe_net.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_net_sysfs.c | 220 ++++++++++++++++++++++++++++++
1 file changed, 220 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_net_sysfs.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_net_sysfs.c
@@ -0,0 +1,220 @@
+/*
+ * Copyright (c) 2009-2011 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "rxe.h"
+#include "rxe_net.h"
+
+/* Copy argument and remove trailing CR. Return the new length. */
+static int sanitize_arg(const char *val, char *intf, int intf_len)
+{
+ int len;
+
+ if (!val)
+ return 0;
+
+ /* Remove newline. */
+ for (len = 0; len < intf_len - 1 && val[len] && val[len] != '\n'; len++)
+ intf[len] = val[len];
+ intf[len] = 0;
+
+ if (len == 0 || (val[len] != 0 && val[len] != '\n'))
+ return 0;
+
+ return len;
+}
+
+/* Caller must hold net_info_lock */
+static void rxe_set_port_state(struct net_device *ndev)
+{
+ struct rxe_dev *rxe;
+
+ if (ndev->ifindex >= RXE_MAX_IF_INDEX)
+ goto out;
+
+ rxe = net_to_rxe(ndev);
+ if (!rxe)
+ goto out;
+
+ if (net_info[ndev->ifindex].status == IB_PORT_ACTIVE)
+ rxe_net_up(ndev);
+ else
+ rxe_net_down(ndev); /* down for unknown state */
+out:
+ return;
+}
+
+static int rxe_param_set_add(const char *val, struct kernel_param *kp)
+{
+ int i, len;
+ char intf[32];
+
+ len = sanitize_arg(val, intf, sizeof(intf));
+ if (!len) {
+ pr_err("rxe_net: add: invalid interface name\n");
+ return -EINVAL;
+ }
+
+ spin_lock_bh(&net_info_lock);
+ for (i = 0; i < RXE_MAX_IF_INDEX; i++) {
+ struct net_device *ndev = net_info[i].ndev;
+
+ if (ndev && (0 == strncmp(intf, ndev->name, len))) {
+ spin_unlock_bh(&net_info_lock);
+ if (net_info[i].rxe)
+ pr_info("rxe_net: already configured on %s\n",
+ intf);
+ else {
+ rxe_net_add(ndev);
+ if (net_info[i].rxe) {
+ rxe_set_port_state(ndev);
+ } else
+ pr_err("rxe_net: add appears to have "
+ "failed for %s (index %d)\n",
+ intf, i);
+ }
+ return 0;
+ }
+ }
+ spin_unlock_bh(&net_info_lock);
+
+ pr_warning("interface %s not found\n", intf);
+
+ return 0;
+}
+
+static void rxe_remove_all(void)
+{
+ int i;
+
+ spin_lock_bh(&net_info_lock);
+ for (i = 0; i < RXE_MAX_IF_INDEX; i++) {
+ if (net_info[i].rxe)
+ rxe_net_remove(i);
+ }
+ spin_unlock_bh(&net_info_lock);
+}
+
+static int rxe_param_set_remove(const char *val, struct kernel_param *kp)
+{
+ int i, len;
+ char intf[32];
+
+ len = sanitize_arg(val, intf, sizeof(intf));
+ if (!len) {
+ pr_err("rxe_net: remove: invalid interface name\n");
+ return -EINVAL;
+ }
+
+ if (strncmp("all", intf, len) == 0) {
+ pr_info("rxe_sys: remove all\n");
+ rxe_remove_all();
+ return 0;
+ }
+
+ spin_lock_bh(&net_info_lock);
+ for (i = 0; i < RXE_MAX_IF_INDEX; i++) {
+ if (!net_info[i].rxe || !net_info[i].ndev)
+ continue;
+
+ if (0 == strncmp(intf, net_info[i].rxe->ib_dev.name, len)) {
+ rxe_net_remove(i);
+ spin_unlock_bh(&net_info_lock);
+ return 0;
+ }
+ }
+ spin_unlock_bh(&net_info_lock);
+ pr_warning("rxe_sys: instance %s not found\n", intf);
+
+ return 0;
+}
+
+module_param_call(add, rxe_param_set_add, NULL, NULL, 0200);
+module_param_call(remove, rxe_param_set_remove, NULL, NULL, 0200);
+
+static int rxe_param_set_mtu(const char *val, struct kernel_param *kp)
+{
+ int i, rc, len;
+ int do_all = 0;
+ char intf[32];
+ char cmd[32];
+ int tmp_mtu;
+
+ len = sanitize_arg(val, intf, sizeof(intf));
+ if (!len)
+ return -EINVAL;
+
+ rc = sscanf(intf, "%s %d", cmd, &tmp_mtu);
+ if (rc != 2) {
+ pr_warning("rxe_net: mtu bogus input (%s)\n", intf);
+ goto out;
+ }
+
+ pr_info("set_mtu: %s %d\n", cmd, tmp_mtu);
+
+ if (!(is_power_of_2(tmp_mtu)
+ && (tmp_mtu >= 256)
+ && (tmp_mtu <= 4096))) {
+ pr_warning("rxe_net: bogus mtu (%s - %d pow2 %d)\n",
+ intf, tmp_mtu, is_power_of_2(tmp_mtu));
+ goto out;
+ }
+
+ tmp_mtu = rxe_mtu_int_to_enum(tmp_mtu);
+
+ if (strcmp("all", cmd) == 0)
+ do_all = 1;
+
+ spin_lock_bh(&net_info_lock);
+ for (i = 0; i < RXE_MAX_IF_INDEX; i++) {
+ if (net_info[i].rxe && net_info[i].ndev) {
+ if (do_all
+ || (0 == strncmp(cmd,
+ net_info[i].rxe->ib_dev.name,
+ len))) {
+ net_info[i].rxe->pref_mtu = tmp_mtu;
+
+ rxe_set_mtu(net_info[i].rxe,
+ net_info[i].ndev->mtu, 1);
+
+ if (!do_all)
+ break;
+ }
+ }
+ }
+ spin_unlock_bh(&net_info_lock);
+
+out:
+ return 0;
+}
+
+module_param_call(mtu, rxe_param_set_mtu, NULL, NULL, 0200);
* [patch 42/44] rxe_sample.c
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (40 preceding siblings ...)
2011-07-01 13:19 ` [patch 41/44] rxe_net_sysfs.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:19 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 43/44] Makefile rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (2 subsequent siblings)
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:19 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch42 --]
[-- Type: text/plain, Size: 5957 bytes --]
module that implements a soft IB device using ib_rxe in loopback.
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/rxe_sample.c | 226 +++++++++++++++++++++++++++++++++
1 file changed, 226 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/rxe_sample.c
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/rxe_sample.c
@@ -0,0 +1,226 @@
+/*
+ * Copyright (c) 2009-2011 System Fabric Works, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/*
+ * sample driver for IB transport over rxe
+ * implements a simple loopback device on module load
+ */
+
+#include <linux/skbuff.h>
+
+#include <linux/device.h>
+
+#include "rxe.h"
+
+MODULE_AUTHOR("Bob Pearson");
+MODULE_DESCRIPTION("RDMA transport over Converged Enhanced Ethernet");
+MODULE_LICENSE("Dual BSD/GPL");
+
+static __be64 node_guid(struct rxe_dev *rxe)
+{
+ return 0x3333333333333333ULL;
+}
+
+static __be64 port_guid(struct rxe_dev *rxe, unsigned int port_num)
+{
+ return 0x4444444444444444ULL;
+}
+
+/*
+ * the ofed core requires that we provide a valid device
+ * object for registration
+ */
+static struct class *my_class;
+static struct device *my_dev;
+
+static struct device *dma_device(struct rxe_dev *rxe)
+{
+ return my_dev;
+}
+
+static int mcast_add(struct rxe_dev *rxe, union ib_gid *mgid)
+{
+ return 0;
+}
+
+static int mcast_delete(struct rxe_dev *rxe, union ib_gid *mgid)
+{
+ return 0;
+}
+
+/* just loopback packet */
+static int send(struct rxe_dev *rxe, struct sk_buff *skb)
+{
+ struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);
+
+ pkt->rxe = rxe;
+ pkt->mask = RXE_LRH_MASK;
+
+ return rxe_rcv(skb);
+}
+
+static struct sk_buff *init_packet(struct rxe_dev *rxe, struct rxe_av *av,
+ int paylen)
+{
+ struct sk_buff *skb;
+ struct rxe_pkt_info *pkt;
+
+ paylen += RXE_LRH_BYTES;
+
+ if (av->attr.ah_flags & IB_AH_GRH)
+ paylen += RXE_GRH_BYTES;
+
+ skb = alloc_skb(paylen, GFP_ATOMIC);
+ if (!skb)
+ return NULL;
+
+ skb->dev = NULL;
+ skb->protocol = 0;
+
+ pkt = SKB_TO_PKT(skb);
+ pkt->rxe = rxe;
+ pkt->port_num = 1;
+ pkt->hdr = skb_put(skb, paylen);
+ pkt->mask = RXE_LRH_MASK;
+ if (av->attr.ah_flags & IB_AH_GRH)
+ pkt->mask |= RXE_GRH_MASK;
+
+ return skb;
+}
+
+static int init_av(struct rxe_dev *rxe, struct ib_ah_attr *attr,
+ struct rxe_av *av)
+{
+ if (!av->attr.dlid)
+ av->attr.dlid = 1;
+ return 0;
+}
+
+static char *parent_name(struct rxe_dev *rxe, unsigned int port_num)
+{
+ return "sample";
+}
+
+static enum rdma_link_layer link_layer(struct rxe_dev *rxe,
+ unsigned int port_num)
+{
+ return IB_LINK_LAYER_INFINIBAND;
+}
+
+static struct rxe_ifc_ops ifc_ops = {
+ .node_guid = node_guid,
+ .port_guid = port_guid,
+ .dma_device = dma_device,
+ .mcast_add = mcast_add,
+ .mcast_delete = mcast_delete,
+ .send = send,
+ .init_packet = init_packet,
+ .init_av = init_av,
+ .parent_name = parent_name,
+ .link_layer = link_layer,
+};
+
+static struct rxe_dev *rxe_sample;
+
+static int rxe_sample_add(void)
+{
+ int err;
+ struct rxe_port *port;
+
+ rxe_sample = (struct rxe_dev *)ib_alloc_device(sizeof(*rxe_sample));
+ if (!rxe_sample) {
+ err = -ENOMEM;
+ goto err1;
+ }
+
+ rxe_sample->ifc_ops = &ifc_ops;
+
+ err = rxe_add(rxe_sample, 4500);
+ if (err)
+ goto err2;
+
+ /* bit of a hack */
+ port = &rxe_sample->port[0];
+ port->attr.state = IB_PORT_ACTIVE;
+ port->attr.phys_state = 5;
+ port->attr.max_mtu = IB_MTU_4096;
+ port->attr.active_mtu = IB_MTU_4096;
+ port->mtu_cap = IB_MTU_4096;
+ port->attr.lid = 1;
+
+ pr_info("rxe_sample: added %s\n",
+ rxe_sample->ib_dev.name);
+ return 0;
+
+err2:
+ ib_dealloc_device(&rxe_sample->ib_dev);
+err1:
+ return err;
+}
+
+static void rxe_sample_remove(void)
+{
+ if (!rxe_sample)
+ goto done;
+
+ rxe_remove(rxe_sample);
+
+ pr_info("rxe_sample: removed %s\n",
+ rxe_sample->ib_dev.name);
+done:
+ return;
+}
+
+static int __init rxe_sample_init(void)
+{
+ int err;
+
+ rxe_crc_disable = 1;
+
+ my_class = class_create(THIS_MODULE, "foo");
+ my_dev = device_create(my_class, NULL, 0, NULL, "bar");
+
+ err = rxe_sample_add();
+ return err;
+}
+
+static void __exit rxe_sample_exit(void)
+{
+ rxe_sample_remove();
+
+ device_destroy(my_class, my_dev->devt);
+ class_destroy(my_class);
+ return;
+}
+
+module_init(rxe_sample_init);
+module_exit(rxe_sample_exit);
* [patch 43/44] Makefile
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (41 preceding siblings ...)
2011-07-01 13:19 ` [patch 42/44] rxe_sample.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:19 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 44/44] Kconfig rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
[not found] ` <20110701131821.928693424-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
44 siblings, 0 replies; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:19 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch43 --]
[-- Type: text/plain, Size: 1489 bytes --]
Makefile
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/hw/rxe/Makefile | 46 +++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/Makefile
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/Makefile
@@ -0,0 +1,46 @@
+
+EXTRA_PRE_CFLAGS := -D__KERNEL__ $(BACKPORTS) $(DEBUG) \
+ -I/usr/src/ofa_kernel/include
+
+obj-m += ib_rxe.o ib_rxe_net.o ib_rxe_sample.o
+
+ib_rxe-y := \
+ rxe.o \
+ rxe_comp.o \
+ rxe_req.o \
+ rxe_resp.o \
+ rxe_recv.o \
+ rxe_pool.o \
+ rxe_queue.o \
+ rxe_verbs.o \
+ rxe_av.o \
+ rxe_srq.o \
+ rxe_qp.o \
+ rxe_cq.o \
+ rxe_mr.o \
+ rxe_dma.o \
+ rxe_opcode.o \
+ rxe_mmap.o \
+ rxe_arbiter.o \
+ rxe_sb8.o \
+ rxe_mcast.o \
+ rxe_task.o
+
+ib_rxe_net-y := \
+ rxe_net.o \
+ rxe_net_sysfs.o
+
+ib_rxe_sample-y := \
+ rxe_sample.o
+
+# Generation of SB8 tables for Ethernet CRC32
+hostprogs-y := gen_sb8tables
+clean-files := sb8_tables.h
+
+$(obj)/rxe_sb8.o: $(obj)/sb8_tables.h
+
+quiet_cmd_sb8 = GEN $@
+ cmd_sb8 = $< --flop --swip --flip --swap --poly=0xedb88320 > $@
+
+$(obj)/sb8_tables.h: $(obj)/gen_sb8tables
+ $(call cmd,sb8)
* [patch 44/44] Kconfig
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
` (42 preceding siblings ...)
2011-07-01 13:19 ` [patch 43/44] Makefile rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
@ 2011-07-01 13:19 ` rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
[not found] ` <20110701132202.824114065-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
[not found] ` <20110701131821.928693424-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
44 siblings, 1 reply; 53+ messages in thread
From: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5 @ 2011-07-01 13:19 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Bob Pearson
[-- Attachment #1: patch44 --]
[-- Type: text/plain, Size: 2771 bytes --]
Kconfig file
Signed-off-by: Bob Pearson <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
---
drivers/infiniband/Kconfig | 1 +
drivers/infiniband/Makefile | 1 +
drivers/infiniband/hw/rxe/Kconfig | 26 ++++++++++++++++++++++++++
3 files changed, 28 insertions(+)
Index: infiniband/drivers/infiniband/hw/rxe/Kconfig
===================================================================
--- /dev/null
+++ infiniband/drivers/infiniband/hw/rxe/Kconfig
@@ -0,0 +1,26 @@
+config INFINIBAND_RXE
+ tristate "Software RDMA driver"
+ depends on 64BIT && NET
+ ---help---
+ This is a driver for a software implementation of IBTA
+ RDMA transport.
+
+ There are three kernel modules:
+
+ ib_rxe - device independent transport driver.
+ ib_rxe_net - connects transport to the netdev stack
+ and runs on Ethernet devices. Follows the
+ RoCE protocol i.e. GRH and no LRH.
+ ib_rxe_sample - sample module that connects the transport
+ to itself as a loopback and follows the
+ InfiniBand protocol i.e. uses LRH. This
+ could be used as a start for other experimental
+ software implementations of InfiniBand.
+
+ Normal use is to load ib_rxe and ib_rxe_net after loading ib_core.
+ There is a script rxe_cfg that automates the configuration of rxe.
+
+ This driver supports kernel and user space ULPs. For user space
+ verbs applications you must install librxe with libibverbs.
+
+ If you don't know what to use this for, you don't need it.
Index: infiniband/drivers/infiniband/Kconfig
===================================================================
--- infiniband.orig/drivers/infiniband/Kconfig
+++ infiniband/drivers/infiniband/Kconfig
@@ -51,6 +51,7 @@ source "drivers/infiniband/hw/cxgb3/Kcon
source "drivers/infiniband/hw/cxgb4/Kconfig"
source "drivers/infiniband/hw/mlx4/Kconfig"
source "drivers/infiniband/hw/nes/Kconfig"
+source "drivers/infiniband/hw/rxe/Kconfig"
source "drivers/infiniband/ulp/ipoib/Kconfig"
Index: infiniband/drivers/infiniband/Makefile
===================================================================
--- infiniband.orig/drivers/infiniband/Makefile
+++ infiniband/drivers/infiniband/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_INFINIBAND_CXGB3) += hw/cx
obj-$(CONFIG_INFINIBAND_CXGB4) += hw/cxgb4/
obj-$(CONFIG_MLX4_INFINIBAND) += hw/mlx4/
obj-$(CONFIG_INFINIBAND_NES) += hw/nes/
+obj-$(CONFIG_INFINIBAND_RXE) += hw/rxe/
obj-$(CONFIG_INFINIBAND_IPOIB) += ulp/ipoib/
obj-$(CONFIG_INFINIBAND_SRP) += ulp/srp/
obj-$(CONFIG_INFINIBAND_ISER) += ulp/iser/
* Re: [patch 37/44] rxe_sb8.c
[not found] ` <20110701132202.486466900-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
@ 2011-07-01 22:03 ` Roland Dreier
[not found] ` <CAL1RGDWLRw26RUr=WjUgzrn=6aGBukghoyOm5ordJv72mSsTBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 53+ messages in thread
From: Roland Dreier @ 2011-07-01 22:03 UTC (permalink / raw)
To: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Fri, Jul 1, 2011 at 6:18 AM, <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org> wrote:
> Slice by 8 implementation of CRC32.
> The code is similar to the kernel-provided crc32
> calculation except it runs about 3X faster, which allows us to
> get to ~1GB/sec.
Wouldn't the sane thing to do be to fix lib/crc32.c instead?
- R.
* RE: [patch 37/44] rxe_sb8.c
[not found] ` <CAL1RGDWLRw26RUr=WjUgzrn=6aGBukghoyOm5ordJv72mSsTBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-07-01 23:09 ` Bob Pearson
2011-07-02 16:55 ` David Dillow
0 siblings, 1 reply; 53+ messages in thread
From: Bob Pearson @ 2011-07-01 23:09 UTC (permalink / raw)
To: 'Roland Dreier'; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hi Roland,
Perhaps. We have some ICRC-specific APIs as well as a plain-Jane CRC
calculation that includes a copy at the same time.
If I knew who maintained crc32.c I would be happy to talk to them and see
what they think. The reason I wrote this is that crc32.c takes about
6 clocks per byte and this one takes < 2 clocks per byte. We couldn't get
over 200-300 MB/sec with crc32.c, while this algorithm was hitting
900 MB/sec. Sandy Bridge has a generic CRC instruction that should be able
to reduce the time to nothing on top of a copy, I have heard.
Bob
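(For readers who have not met the term: "slice by 8" simply means consuming
eight input bytes per loop iteration, using eight precomputed 256-entry
lookup tables so that the eight lookups are independent of each other and
can issue in parallel. The sketch below is an illustration of that idea
only; it is not the rxe_sb8.c code, and the type, table and function names
are made up.)

#include <stddef.h>
#include <stdint.h>

/* illustrative slice-by-8 CRC32, LE (reflected) bit order */
static uint32_t crc32_le_sb8(uint32_t crc, const uint8_t *p, size_t len,
			     const uint32_t tab[8][256])
{
	while (len >= 8) {
		/* fold the running crc into the first four data bytes */
		uint32_t lo = crc ^ (p[0] | (uint32_t)p[1] << 8 |
				     (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
		uint32_t hi = p[4] | (uint32_t)p[5] << 8 |
			      (uint32_t)p[6] << 16 | (uint32_t)p[7] << 24;

		/* eight independent table lookups, one per input byte */
		crc = tab[7][lo & 0xff] ^ tab[6][(lo >> 8) & 0xff] ^
		      tab[5][(lo >> 16) & 0xff] ^ tab[4][lo >> 24] ^
		      tab[3][hi & 0xff] ^ tab[2][(hi >> 8) & 0xff] ^
		      tab[1][(hi >> 16) & 0xff] ^ tab[0][hi >> 24];
		p += 8;
		len -= 8;
	}

	/* tail: classic byte-at-a-time loop using the base table */
	while (len--)
		crc = tab[0][(crc ^ *p++) & 0xff] ^ (crc >> 8);

	return crc;
}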
-----Original Message-----
From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Roland Dreier
Sent: Friday, July 01, 2011 5:04 PM
To: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [patch 37/44] rxe_sb8.c
On Fri, Jul 1, 2011 at 6:18 AM, <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org> wrote:
> Slice by 8 implementation of CRC32.
> The code is similar to the kernel-provided crc32 calculation
> except it runs about 3X faster, which allows us to get to ~1GB/sec.
Wouldn't the sane thing to do be to fix lib/crc32.c instead?
- R.
* Re: [patch 00/44] RDMA over Ethernet
[not found] ` <20110701131821.928693424-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
@ 2011-07-02 9:42 ` Bart Van Assche
0 siblings, 0 replies; 53+ messages in thread
From: Bart Van Assche @ 2011-07-02 9:42 UTC (permalink / raw)
To: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Fri, Jul 1, 2011 at 3:18 PM, <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org> wrote:
> Copies of the user space library and tools for 'upstream' and a tar file of
> these patches are available at support.systemfabricworks.com/downloads/rxe.
I have found a user space library at the mentioned location, but no
tar file with patches?
Bart.
* RE: [patch 37/44] rxe_sb8.c
2011-07-01 23:09 ` Bob Pearson
@ 2011-07-02 16:55 ` David Dillow
[not found] ` <1309625729.20982.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
0 siblings, 1 reply; 53+ messages in thread
From: David Dillow @ 2011-07-02 16:55 UTC (permalink / raw)
To: Bob Pearson; +Cc: 'Roland Dreier', linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Fri, 2011-07-01 at 18:09 -0500, Bob Pearson wrote:
> Hi Roland,
>
> Perhaps. We have some ICRC specific APIs as well as plain Jane CRC
> calculation that includes a copy at the same time.
> If I knew who maintained crc32.c I would be happy to talk to them and see
> what they think.
No one is really the maintainer there. 'git log' says Joakim Tjernlund
has touched it quite a bit, but your best bet is to post patches to
LKML, and perhaps Andrew Morton.
* Re: [patch 34/44] rxe_arbiter.c
[not found] ` <20110701132202.342196794-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
@ 2011-07-03 7:52 ` Bart Van Assche
[not found] ` <CAO+b5-oh8Qavo1q5n1-KKa9XVmLcJ0s7ZZswC1Efp+gOgOAqQw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 53+ messages in thread
From: Bart Van Assche @ 2011-07-03 7:52 UTC (permalink / raw)
To: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Fri, Jul 1, 2011 at 3:18 PM, <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org> wrote:
>
> +int rxe_arbiter(void *arg)
> +{
> + int err;
> + unsigned long flags;
> + struct rxe_dev *rxe = (struct rxe_dev *)arg;
> + struct sk_buff *skb;
> + struct list_head *qpl;
> + struct rxe_qp *qp;
> +
> + /* get the next qp's send queue */
> + spin_lock_irqsave(&rxe->arbiter.list_lock, flags);
> + if (list_empty(&rxe->arbiter.qp_list)) {
> + spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
> + return 1;
> + }
> +
> + qpl = rxe->arbiter.qp_list.next;
> + list_del_init(qpl);
> + qp = list_entry(qpl, struct rxe_qp, arbiter_list);
> + spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
> +
> + /* get next packet from queue and try to send it
> + note skb could have already been removed */
> + skb = skb_dequeue(&qp->send_pkts);
> + if (skb) {
> + err = xmit_one_packet(rxe, qp, skb);
> + if (err) {
> + if (err == RXE_QUEUE_STOPPED)
> + skb_queue_head(&qp->send_pkts, skb);
> + rxe_run_task(&rxe->arbiter.task, 1);
> + spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
> + return 1;
> + }
> + }
Can you please run these patches through sparse? Sparse complains
about the above code:
$ make C=2 M=drivers/infiniband/hw/rxe
[ ... ]
drivers/infiniband/hw/rxe/rxe_arbiter.c:155:25: warning: context
imbalance in 'rxe_arbiter' - unexpected unlock
[ ... ]
Also, several complaints are reported with endianness checking enabled
(make CF=-D__CHECK_ENDIAN__ C=2 M=drivers/infiniband/hw/rxe).
Checkpatch is complaining too:
$ git show HEAD | scripts/checkpatch.pl - -nosignoff
WARNING: suspect code indent for conditional statements (8, 10)
#1061: FILE: drivers/infiniband/hw/rxe/rxe_arbiter.c:69:
+ if (is_request) {
+ if (qp->req.state != QP_STATE_READY)
WARNING: suspect code indent for conditional statements (10, 12)
#1062: FILE: drivers/infiniband/hw/rxe/rxe_arbiter.c:70:
+ if (qp->req.state != QP_STATE_READY)
+ goto drop;
WARNING: suspect code indent for conditional statements (10, 12)
#1065: FILE: drivers/infiniband/hw/rxe/rxe_arbiter.c:73:
+ if (qp->resp.state != QP_STATE_READY)
+ goto drop;
WARNING: msleep < 20ms can sleep for up to 20ms; see
Documentation/timers/timers-howto.txt
+ msleep(1);
total: 0 errors, 4 warnings, 14803 lines checked
Bart.
* RE: [patch 34/44] rxe_arbiter.c
[not found] ` <CAO+b5-oh8Qavo1q5n1-KKa9XVmLcJ0s7ZZswC1Efp+gOgOAqQw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-07-03 15:21 ` Bob Pearson
0 siblings, 0 replies; 53+ messages in thread
From: Bob Pearson @ 2011-07-03 15:21 UTC (permalink / raw)
To: 'Bart Van Assche'; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
A regression slipped in. We had been clean on sparse and checkpatch except
for the msleep, which is not critical to performance.
CHECK_ENDIAN is new to me. That will help.
-----Original Message-----
From: bart.vanassche-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org [mailto:bart.vanassche-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org] On Behalf
Of Bart Van Assche
Sent: Sunday, July 03, 2011 2:53 AM
To: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [patch 34/44] rxe_arbiter.c
On Fri, Jul 1, 2011 at 3:18 PM, <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org> wrote:
>
> +int rxe_arbiter(void *arg)
> +{
> + int err;
> + unsigned long flags;
> + struct rxe_dev *rxe = (struct rxe_dev *)arg;
> + struct sk_buff *skb;
> + struct list_head *qpl;
> + struct rxe_qp *qp;
> +
> + /* get the next qp's send queue */
> + spin_lock_irqsave(&rxe->arbiter.list_lock, flags);
> + if (list_empty(&rxe->arbiter.qp_list)) {
> + spin_unlock_irqrestore(&rxe->arbiter.list_lock,
> + flags);
> + return 1;
> + }
> +
> + qpl = rxe->arbiter.qp_list.next;
> + list_del_init(qpl);
> + qp = list_entry(qpl, struct rxe_qp, arbiter_list);
> + spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
> +
> + /* get next packet from queue and try to send it
> + note skb could have already been removed */
> + skb = skb_dequeue(&qp->send_pkts);
> + if (skb) {
> + err = xmit_one_packet(rxe, qp, skb);
> + if (err) {
> + if (err == RXE_QUEUE_STOPPED)
> + skb_queue_head(&qp->send_pkts, skb);
> + rxe_run_task(&rxe->arbiter.task, 1);
> +
> + spin_unlock_irqrestore(&rxe->arbiter.list_lock, flags);
> + return 1;
> + }
> + }
Can you please run these patches through sparse ? Sparse complains about the
above code:
$ make C=2 M=drivers/infiniband/hw/rxe
[ ... ]
drivers/infiniband/hw/rxe/rxe_arbiter.c:155:25: warning: context imbalance
in 'rxe_arbiter' - unexpected unlock [ ... ]
Also, several complaints are reported with endianness checking enabled (make
CF=-D__CHECK_ENDIAN__ C=2 M=drivers/infiniband/hw/rxe).
Checkpatch is complaining too:
$ git show HEAD | scripts/checkpatch.pl - -nosignoff
WARNING: suspect code indent for conditional statements (8, 10)
#1061: FILE: drivers/infiniband/hw/rxe/rxe_arbiter.c:69:
+ if (is_request) {
+ if (qp->req.state != QP_STATE_READY)
WARNING: suspect code indent for conditional statements (10, 12)
#1062: FILE: drivers/infiniband/hw/rxe/rxe_arbiter.c:70:
+ if (qp->req.state != QP_STATE_READY)
+ goto drop;
WARNING: suspect code indent for conditional statements (10, 12)
#1065: FILE: drivers/infiniband/hw/rxe/rxe_arbiter.c:73:
+ if (qp->resp.state != QP_STATE_READY)
+ goto drop;
WARNING: msleep < 20ms can sleep for up to 20ms; see
Documentation/timers/timers-howto.txt
+ msleep(1);
total: 0 errors, 4 warnings, 14803 lines checked
Bart.
* RE: [patch 37/44] rxe_sb8.c
[not found] ` <1309625729.20982.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2011-07-14 5:03 ` Bob Pearson
0 siblings, 0 replies; 53+ messages in thread
From: Bob Pearson @ 2011-07-14 5:03 UTC (permalink / raw)
To: 'David Dillow', 'Roland Dreier',
linux-rdma-u79uwXL29TY76Z2rM5mHXA
David, Roland,
After some travel I got back to looking at the CRC32 calculation. I have
three test systems: x86_64, x86_32 and Sparc/64. If I am going to replace
lib/crc32.c I need to implement both the LE and BE (bit, *not* byte, order)
APIs. Ethernet and InfiniBand use the LE bit order; a minority of device
drivers (e.g. ATM and FireWire) use the BE bit order.
It turns out that the algorithm in lib/crc32.c has improved since the last
time I looked and is now, in effect, a slice-by-4 equivalent of the
slice-by-8 algorithm. On the x86_64 system with current gcc -O3 I measure
2.8 cycles per byte with lib/crc32.c and 1.5 cycles per byte with rxe_sb8.c.
I should have replacements for the standard routines in a day or two. Once
they are working and tested I'll try to touch base with Andrew.
Bob
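(Again purely as an illustration of the LE bit order being discussed, and
not the gen_sb8tables code: the base 256-entry table for the reflected
polynomial 0xedb88320 (the same value the Makefile passes to gen_sb8tables)
and the derived slice tables can be built roughly as below. Names are made
up.)

#include <stdint.h>

/* base byte-at-a-time table for the reflected (LE bit order) polynomial */
static void crc32_le_base_table(uint32_t tab0[256])
{
	uint32_t crc;
	int i, j;

	for (i = 0; i < 256; i++) {
		crc = i;
		for (j = 0; j < 8; j++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320u : 0);
		tab0[i] = crc;
	}
}

/* slice k advances slice k-1 by one extra zero byte */
static void crc32_le_slice_tables(uint32_t tab[8][256])
{
	int s, i;

	crc32_le_base_table(tab[0]);
	for (s = 1; s < 8; s++)
		for (i = 0; i < 256; i++)
			tab[s][i] = tab[0][tab[s - 1][i] & 0xff] ^
				    (tab[s - 1][i] >> 8);
}

The BE bit-order variant is the mirror image: shift left, use the
non-reflected polynomial 0x04c11db7, and index by the top byte instead of
the bottom one.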
-----Original Message-----
From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of David Dillow
Sent: Saturday, July 02, 2011 11:55 AM
To: Bob Pearson
Cc: 'Roland Dreier'; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: RE: [patch 37/44] rxe_sb8.c
On Fri, 2011-07-01 at 18:09 -0500, Bob Pearson wrote:
> Hi Roland,
>
> Perhaps. We have some ICRC specific APIs as well as plain Jane CRC
> calculation that includes a copy at the same time.
> If I knew who maintained crc32.c I would be happy to talk to them and
> see what they think.
No one is really the maintainer there. 'git log' says Joakim Tjernlund has touched it quite a bit, but your best bet is to post patches to LKML, and perhaps Andrew Morton.
* Re: [patch 44/44] Kconfig
[not found] ` <20110701132202.824114065-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
@ 2011-07-22 13:32 ` Bart Van Assche
0 siblings, 0 replies; 53+ messages in thread
From: Bart Van Assche @ 2011-07-22 13:32 UTC (permalink / raw)
To: rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Fri, Jul 1, 2011 at 3:19 PM, <rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org> wrote:
> +++ infiniband/drivers/infiniband/hw/rxe/Kconfig
> @@ -0,0 +1,26 @@
> +config INFINIBAND_RXE
> + tristate "Software RDMA driver"
> + depends on 64BIT && NET
> + ---help---
> + This is a driver for a software implementation of IBTA
> + RDMA transport.
> +
> + There are three kernel modules:
> +
> + ib_rxe - device independant transport driver.
> + ib_rxe_net - connects transport to the netdev stack
> + and runs on Ethernet devices. Follows the
> + RoCE protocol i.e. GRH and no LRH.
> + ib_rxe_sample - sample module that connects the transport
> + to itself as a loopback and follows the
> + InfiniBand protocol i.e. uses LRH. This
> + could be used as a start for other experimantal
> + software implementations of InfiniBand.
> +
> + Normal use is to load ib_rxe and ib_rxe_net after loading ib_core.
> + There is a script rxe_cfg that automates the configuration of rxe.
> +
> + This driver supports kernel and user space ULPs. For user space
> + verbs applications you must install librxe with libibverbs.
> +
> + If you don't know what to use this for, you don't need it.
Maybe it's a good idea to add more information about RoCE to the above
text, and to change it so that people are invited to learn more about
RDMA? As an example, this is what the PPPOL2TP Kconfig entry looks like:
config PPPOL2TP
tristate "PPP over L2TP (EXPERIMENTAL)"
depends on EXPERIMENTAL && L2TP && PPP
help
Support for PPP-over-L2TP socket family. L2TP is a protocol
used by ISPs and enterprises to tunnel PPP traffic over UDP
tunnels. L2TP is replacing PPTP for VPN uses.
Bart.
Thread overview: 53+ messages
2011-07-01 13:18 [patch 00/44] RDMA over Ethernet rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 01/44] ib_pack.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 02/44] rxe_hdr.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 03/44] rxe_opcode.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 04/44] rxe_opcode.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 05/44] rxe_param.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 06/44] rxe.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 07/44] rxe_loc.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 08/44] rxe_mmap.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 09/44] rxe_mmap.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 10/44] rxe_queue.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 11/44] rxe_queue.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 12/44] rxe_verbs.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 13/44] rxe_verbs.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 14/44] rxe_pool.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 15/44] rxe_pool.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 16/44] rxe_task.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 17/44] rxe_task.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 18/44] rxe_av.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 19/44] rxe_av.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 20/44] rxe_srq.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 21/44] rxe_srq.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 22/44] rxe_cq.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 23/44] rxe_cq.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 24/44] rxe_qp.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 25/44] rxe_qp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 26/44] rxe_mr.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 27/44] rxe_mr.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 28/44] rxe_mcast.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 29/44] rxe_mcast.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 30/44] rxe_recv.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 31/44] rxe_comp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 32/44] rxe_req.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 33/44] rxe_resp.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 34/44] rxe_arbiter.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
[not found] ` <20110701132202.342196794-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
2011-07-03 7:52 ` Bart Van Assche
[not found] ` <CAO+b5-oh8Qavo1q5n1-KKa9XVmLcJ0s7ZZswC1Efp+gOgOAqQw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-07-03 15:21 ` Bob Pearson
2011-07-01 13:18 ` [patch 35/44] rxe_dma.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 36/44] gen_sb8tables.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:18 ` [patch 37/44] rxe_sb8.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
[not found] ` <20110701132202.486466900-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
2011-07-01 22:03 ` Roland Dreier
[not found] ` <CAL1RGDWLRw26RUr=WjUgzrn=6aGBukghoyOm5ordJv72mSsTBA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-07-01 23:09 ` Bob Pearson
2011-07-02 16:55 ` David Dillow
[not found] ` <1309625729.20982.2.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2011-07-14 5:03 ` Bob Pearson
2011-07-01 13:18 ` [patch 38/44] rxe.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 39/44] rxe_net.h rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 40/44] rxe_net.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 41/44] rxe_net_sysfs.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 42/44] rxe_sample.c rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 43/44] Makefile rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
2011-07-01 13:19 ` [patch 44/44] Kconfig rpearson-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5
[not found] ` <20110701132202.824114065-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
2011-07-22 13:32 ` Bart Van Assche
[not found] ` <20110701131821.928693424-klaOcWyJdxkshyMvu7JE4pqQE7yCjDx5@public.gmane.org>
2011-07-02 9:42 ` [patch 00/44] RDMA over Ethernet Bart Van Assche