* [PATCH v2 10/11] IB/IPoIB: Modify ipoib_get_net_dev_by_params to lookup gid table
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
ipoib_get_net_dev_by_params compares the incoming GID with local_gid,
which is the GID at index 0 of the GID table. OPA devices using larger
LIDs may have a GID format that differs from what is set up in the
local_gid field. Search the GID table in those cases.
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
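Reader note, not part of the commit: a minimal userspace sketch of the
lookup order described in the commit message. The names sketch_gid and
sketch_check_gid are hypothetical, the table is a plain array rather
than the result of ib_query_port()/ib_query_gid(), and the query-failure
paths of the real helper are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's union ib_gid (16 raw bytes). */
struct sketch_gid {
	unsigned char raw[16];
};

/*
 * Same decision order as ipoib_check_gid() in this patch: a NULL gid
 * matches anything, index 0 (the cached local GID) is tried first, and
 * the rest of the table is scanned only on an OPA-capable port.
 */
static bool sketch_check_gid(const struct sketch_gid *table, size_t len,
			     bool opa_capable, const struct sketch_gid *gid)
{
	size_t i;

	if (!gid)
		return true;
	if (len == 0)
		return false;
	if (!memcmp(gid, &table[0], sizeof(*gid)))
		return true;
	if (!opa_capable)
		return false;
	/* Index 0 was already compared, so start at 1 as the patch does. */
	for (i = 1; i < len; i++)
		if (!memcmp(gid, &table[i], sizeof(*gid)))
			return true;
	return false;
}
```

In this sketch a GID stored at a non-zero index matches only when
opa_capable is true, which is the behavioural change the patch aims for.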
drivers/infiniband/ulp/ipoib/ipoib_main.c | 37 +++++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 474c3bf..3e135df 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -328,6 +328,40 @@ static struct net_device *ipoib_get_net_dev_match_addr(
return result;
}
+/* returns true if the incoming gid is assigned to the IPoIB
+ * netdev interface
+ *
+ * OPA devices may have the incoming GID in the OPA GID
+ * format which might not necessarily be assigned to the
+ * netdev interface. This necessitates searching the GID
+ * table to match this OPA GID.
+ */
+static bool ipoib_check_gid(struct ipoib_dev_priv *priv,
+ const union ib_gid *gid)
+{
+ bool is_local_gid;
+ struct ib_port_attr attr;
+ union ib_gid port_gid;
+ int i;
+
+ if (!gid)
+ return true;
+
+ is_local_gid = !memcmp(gid, &priv->local_gid, sizeof(*gid));
+
+ if (!rdma_cap_opa_ah(priv->ca, priv->port) || is_local_gid)
+ return is_local_gid;
+
+ if (ib_query_port(priv->ca, priv->port, &attr))
+ return false;
+ for (i = 1; i < attr.gid_tbl_len; i++) {
+ if (ib_query_gid(priv->ca, priv->port, i, &port_gid, NULL))
+ return false;
+ if (!memcmp(gid, &port_gid, sizeof(*gid)))
+ return true;
+ }
+ return false;
+}
/* returns the number of IPoIB netdevs on top a given ipoib device matching a
* pkey_index and address, if one exists.
*
@@ -344,8 +378,7 @@ static int ipoib_match_gid_pkey_addr(struct ipoib_dev_priv *priv,
struct net_device *net_dev = NULL;
int matches = 0;
- if (priv->pkey_index == pkey_index &&
- (!gid || !memcmp(gid, &priv->local_gid, sizeof(*gid)))) {
+ if (priv->pkey_index == pkey_index && ipoib_check_gid(priv, gid)) {
if (!addr) {
net_dev = ipoib_get_master_net_dev(priv->dev);
} else {
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* [PATCH v2 09/11] IB/IPoIB: Retrieve 32 bit LIDs from path records when running on OPA devices
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Path record responses will contain the 32-bit LID information in the
SGID and DGID fields of the responses. Modify IPoIB to use these extended
LIDs in datagram and connected mode communication.
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
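Reader note, not part of the commit: a minimal userspace sketch of how
the cached path LID is chosen, assuming the interface ID has already
been converted to host byte order. The sketch_* names are hypothetical;
the OUI value and the 32-bit mask are taken from opa_addr.h in this
series.

```c
#include <stdbool.h>
#include <stdint.h>

/* OUI that marks a GID as carrying an extended LID (OPA_SPECIAL_OUI). */
#define SKETCH_OPA_OUI 0x00066AULL

/* True when the top 24 bits of a host-order interface ID hold the OUI. */
static bool sketch_is_opa_id(uint64_t interface_id)
{
	return (interface_id >> 40) == SKETCH_OPA_OUI;
}

/*
 * Pick the LID to cache for a path: the low 32 bits of the interface ID
 * when the GID is in OPA format on an OPA port, otherwise the 16-bit
 * LID from the path record.
 */
static uint32_t sketch_path_lid(bool opa_port, uint64_t interface_id,
				uint16_t pathrec_lid)
{
	if (opa_port && sketch_is_opa_id(interface_id))
		return (uint32_t)(interface_id & 0xFFFFFFFFULL);
	return pathrec_lid;
}
```

The fallback branch corresponds to the be16_to_cpu(pathrec->slid) and
be16_to_cpu(pathrec->dlid) assignments in path_rec_completion().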
drivers/infiniband/ulp/ipoib/ipoib.h | 4 +++-
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 11 +++++++++++
drivers/infiniband/ulp/ipoib/ipoib_main.c | 26 ++++++++++++++++++++++++++
drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 2 +-
include/rdma/opa_addr.h | 12 ++++++++++++
5 files changed, 53 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 7b8d2d9..fad4560 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -352,7 +352,7 @@ struct ipoib_dev_priv {
u32 qkey;
union ib_gid local_gid;
- u16 local_lid;
+ u32 local_lid;
unsigned int admin_mtu;
unsigned int mcast_mtu;
@@ -421,6 +421,8 @@ struct ipoib_path {
struct rb_node rb_node;
struct list_head list;
int valid;
+ u32 dlid;
+ u32 slid;
};
struct ipoib_neigh {
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 4ad297d..c27df76 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -38,6 +38,7 @@
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/moduleparam.h>
+#include <rdma/opa_addr.h>
#include "ipoib.h"
@@ -1356,6 +1357,16 @@ static void ipoib_cm_tx_start(struct work_struct *work)
}
memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);
+ if (rdma_cap_opa_ah(priv->ca, priv->port)) {
+ if (p->path->dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE))
+ pathrec.dgid.global.interface_id =
+ OPA_MAKE_ID(p->path->dlid);
+
+ if (p->path->slid >= be16_to_cpu(IB_MULTICAST_LID_BASE))
+ pathrec.sgid.global.interface_id =
+ OPA_MAKE_ID(p->path->slid);
+ }
+
spin_unlock_irqrestore(&priv->lock, flags);
netif_tx_unlock_bh(dev);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 5636fc3..474c3bf 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -52,6 +52,7 @@
#include <linux/inetdevice.h>
#include <rdma/ib_cache.h>
#include <linux/pci.h>
+#include <rdma/opa_addr.h>
#define DRV_VERSION "1.0.0"
@@ -766,6 +767,31 @@ static void path_rec_completion(int status,
spin_lock_irqsave(&priv->lock, flags);
if (!IS_ERR_OR_NULL(ah)) {
+ /*
+ * Extended LIDs might get programmed into GIDs in the
+ * case of OPA devices. Since we have created the ah
+ * above which would have made use of the lids, now is
+ * a good time to change them back to regular GIDs after
+ * saving the extended LIDs.
+ */
+ if (rdma_cap_opa_ah(priv->ca, priv->port) &&
+ ib_is_opa_gid(&pathrec->sgid)) {
+ path->slid = opa_get_lid_from_gid(&pathrec->sgid);
+ pathrec->sgid = path->pathrec.sgid;
+ } else {
+ path->slid = be16_to_cpu(pathrec->slid);
+ }
+
+ if (rdma_cap_opa_ah(priv->ca, priv->port) &&
+ ib_is_opa_gid(&pathrec->dgid)) {
+ path->dlid = opa_get_lid_from_gid(&pathrec->dgid);
+ pathrec->dgid = path->pathrec.dgid;
+ } else {
+ path->dlid = be16_to_cpu(pathrec->dlid);
+ }
+ ipoib_dbg(priv, "PathRec SGID %pI6 DGID %pI6\n",
+ pathrec->sgid.raw, pathrec->dgid.raw);
+
path->pathrec = *pathrec;
old_ah = path->ah;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index bff73b5..d3394b6 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -581,7 +581,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
port_attr.state);
return;
}
- priv->local_lid = (u16)port_attr.lid;
+ priv->local_lid = port_attr.lid;
netif_addr_lock_bh(dev);
if (!test_bit(IPOIB_FLAG_DEV_ADDR_SET, &priv->flags)) {
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index 5c713bc..d0a37d0 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -53,4 +53,16 @@ static inline bool ib_is_opa_gid(union ib_gid *gid)
return ((be64_to_cpu(gid->global.interface_id) >> 40) ==
OPA_SPECIAL_OUI);
}
+
+/**
+ * opa_get_lid_from_gid: Returns the last 32 bits of the gid.
+ * OPA devices use one of the gids in the gid table to also
+ * store the lid.
+ *
+ * @gid: The Global identifier
+ */
+static inline u32 opa_get_lid_from_gid(union ib_gid *gid)
+{
+ return be64_to_cpu(gid->global.interface_id) & 0xFFFFFFFF;
+}
#endif /* OPA_ADDR_H */
--
1.8.3.1
* [PATCH v2 08/11] IB/SA: Program extended LID in SM Address handle
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reuse the port_attr.grh_required field to let IB SA know that extended LID
information needs to be set in the SM Address handle for OPA devices.
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
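Reader note, not part of the commit: a minimal userspace sketch of the
interface ID that OPA_MAKE_ID() produces, assuming a 32-bit LID. The
sketch_* names are hypothetical, and the big-endian store is written
out by hand instead of using cpu_to_be64() so that it builds outside
the kernel.

```c
#include <stdint.h>

#define SKETCH_OPA_OUI 0x00066AULL

/* Host-order equivalent of OPA_MAKE_ID() before the cpu_to_be64() step. */
static uint64_t sketch_make_id(uint32_t lid)
{
	return (SKETCH_OPA_OUI << 40) | lid;
}

/* Store a 64-bit value in big-endian (wire) order on any host. */
static void sketch_store_be64(uint64_t v, unsigned char out[8])
{
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(v >> (56 - 8 * i));
}
```

For LID 0x00012345 the eight bytes come out as 00 06 6a 00 00 01 23 45:
the OUI occupies the top 24 bits and the LID the low 32 bits.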
drivers/infiniband/core/sa_query.c | 8 +++++++-
include/rdma/opa_addr.h | 1 +
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 81b742c..d59ee67 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -50,6 +50,7 @@
#include <uapi/rdma/ib_user_sa.h>
#include <rdma/ib_marshall.h>
#include <rdma/ib_addr.h>
+#include <rdma/opa_addr.h>
#include "sa.h"
#include "core_priv.h"
@@ -964,7 +965,12 @@ static void update_sm_ah(struct work_struct *work)
if (port_attr.grh_required) {
ah_attr.ah_flags = IB_AH_GRH;
ah_attr.grh.dgid.global.subnet_prefix = cpu_to_be64(port_attr.subnet_prefix);
- ah_attr.grh.dgid.global.interface_id = cpu_to_be64(IB_SA_WELL_KNOWN_GUID);
+ if (rdma_cap_opa_ah(port->agent->device, port->port_num))
+ ah_attr.grh.dgid.global.interface_id =
+ OPA_MAKE_ID(ah_attr.dlid);
+ else
+ ah_attr.grh.dgid.global.interface_id =
+ cpu_to_be64(IB_SA_WELL_KNOWN_GUID);
}
new_ah->ah = ib_create_ah(port->agent->qp->pd, &ah_attr);
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index 3e22937..5c713bc 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -38,6 +38,7 @@
#define OPA_TO_IB_UCAST_LID(x) (((x) >= be16_to_cpu(IB_MULTICAST_LID_BASE)) \
? 0 : x)
#define OPA_SPECIAL_OUI (0x00066AULL)
+#define OPA_MAKE_ID(x) (cpu_to_be64(OPA_SPECIAL_OUI << 40 | (x)))
/**
* ib_is_opa_gid: Returns true if the top 24 bits of the gid
--
1.8.3.1
* [PATCH v2 07/11] IB/mad: Change slid in RMPP recv from 16 to 32 bits
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
OPA devices may report larger LIDs in wc.slid, which is now 32 bits
wide. This change ensures the RMPP handler retrieves and compares
the full LID.
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
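Reader note, not part of the commit: a minimal userspace sketch of why
the RMPP match needs the full-width LID. The struct and function names
are hypothetical and only three of the matched fields are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the fields used to match an RMPP receive. */
struct sketch_rmpp_key {
	uint64_t tid;
	uint32_t src_qp;
	uint32_t slid;	/* 32 bits wide after this patch */
};

/* Full-width comparison, as in the patched code. */
static bool sketch_match(const struct sketch_rmpp_key *a,
			 const struct sketch_rmpp_key *b)
{
	return a->tid == b->tid && a->src_qp == b->src_qp &&
	       a->slid == b->slid;
}

/* Comparison with the LID truncated to 16 bits, as before this patch. */
static bool sketch_match_truncated(const struct sketch_rmpp_key *a,
				   const struct sketch_rmpp_key *b)
{
	return a->tid == b->tid && a->src_qp == b->src_qp &&
	       (uint16_t)a->slid == (uint16_t)b->slid;
}
```

Two senders with LIDs 0x00010005 and 0x00020005 are distinct under
sketch_match() but indistinguishable under sketch_match_truncated().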
drivers/infiniband/core/mad_rmpp.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 8a076e1..057c1ec 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -64,7 +64,7 @@ struct mad_rmpp_recv {
__be64 tid;
u32 src_qp;
- u16 slid;
+ u32 slid;
u8 mgmt_class;
u8 class_version;
u8 method;
@@ -316,7 +316,7 @@ static void recv_cleanup_handler(struct work_struct *work)
mad_hdr = &mad_recv_wc->recv_buf.mad->mad_hdr;
rmpp_recv->tid = mad_hdr->tid;
rmpp_recv->src_qp = mad_recv_wc->wc->src_qp;
- rmpp_recv->slid = (u16)mad_recv_wc->wc->slid;
+ rmpp_recv->slid = mad_recv_wc->wc->slid;
rmpp_recv->mgmt_class = mad_hdr->mgmt_class;
rmpp_recv->class_version = mad_hdr->class_version;
rmpp_recv->method = mad_hdr->method;
@@ -337,7 +337,7 @@ static void recv_cleanup_handler(struct work_struct *work)
list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
if (rmpp_recv->tid == mad_hdr->tid &&
rmpp_recv->src_qp == mad_recv_wc->wc->src_qp &&
- rmpp_recv->slid == (u16)mad_recv_wc->wc->slid &&
+ rmpp_recv->slid == mad_recv_wc->wc->slid &&
rmpp_recv->mgmt_class == mad_hdr->mgmt_class &&
rmpp_recv->class_version == mad_hdr->class_version &&
rmpp_recv->method == mad_hdr->method)
@@ -870,7 +870,7 @@ static int init_newwin(struct ib_mad_send_wr_private *mad_send_wr)
if (ib_query_ah(mad_send_wr->send_buf.ah, &ah_attr))
continue;
- if (rmpp_recv->slid == (u16)ah_attr.dlid) {
+ if (rmpp_recv->slid == ah_attr.dlid) {
newwin = rmpp_recv->repwin;
break;
}
--
1.8.3.1
* [PATCH v2 06/11] IB/mad: Ensure DR MADs are correctly specified when using OPA devices
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Pure DR MADs do not need OPA GIDs to be specified in the GRH since
they do not rely on LID information.
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
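Reader note, not part of the commit: a minimal userspace sketch of the
GRH rules that verify_mad_ah() applies to SMPs with the OPA class
version. All sketch_* names are hypothetical, LIDs are taken in host
byte order, and the permissive value 0xFFFFFFFF is an assumption; the
0xc000 boundary comes from the comment in the patch. The non-OPA class
version branch is not modelled.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_LID_PERMISSIVE 0xFFFFFFFFu	/* assumed host-order value */
#define SKETCH_MCAST_LID_BASE 0xC000u		/* 0xc000 in the patch comment */

/* Returns 0 when the GRH usage is acceptable, -EINVAL otherwise. */
static int sketch_verify_dr_grh(uint32_t dr_slid, uint32_t dr_dlid,
				bool grh_present, bool sgid_is_opa,
				bool dgid_is_opa)
{
	bool slid_perm = dr_slid == SKETCH_LID_PERMISSIVE;
	bool dlid_perm = dr_dlid == SKETCH_LID_PERMISSIVE;
	bool slid_ucast = dr_slid < SKETCH_MCAST_LID_BASE;
	bool dlid_ucast = dr_dlid < SKETCH_MCAST_LID_BASE;
	bool slid_ext = !slid_ucast && !slid_perm;
	bool dlid_ext = !dlid_ucast && !dlid_perm;

	/* Without a GRH there is nowhere to carry an extended LID. */
	if (!grh_present)
		return (slid_ext || dlid_ext) ? -EINVAL : 0;

	/* Pure DR or plain IB unicast LIDs must not use two OPA GIDs. */
	if (((slid_perm && dlid_perm) || (slid_ucast && dlid_ucast)) &&
	    sgid_is_opa && dgid_is_opa)
		return -EINVAL;
	/* An extended LID requires the matching GID in OPA format. */
	if (slid_ext && !sgid_is_opa)
		return -EINVAL;
	if (dlid_ext && !dgid_is_opa)
		return -EINVAL;
	return 0;
}
```

A pure DR SMP without a GRH passes, while an extended dr_slid without a
GRH, or with a non-OPA source GID, is rejected.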
drivers/infiniband/core/mad.c | 104 +++++++++++++++++++++++++++++++++++++-----
include/rdma/opa_addr.h | 17 +++++++
2 files changed, 109 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 40cbd6b..c0ee997 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -41,6 +41,7 @@
#include <linux/slab.h>
#include <linux/module.h>
#include <rdma/ib_cache.h>
+#include <rdma/opa_addr.h>
#include "mad_priv.h"
#include "mad_rmpp.h"
@@ -731,6 +732,80 @@ static size_t mad_priv_dma_size(const struct ib_mad_private *mp)
return sizeof(struct ib_grh) + mp->mad_size;
}
+static int verify_mad_ah(struct ib_mad_agent_private *mad_agent_priv,
+ struct ib_mad_send_wr_private *mad_send_wr)
+{
+ struct ib_device *ib_dev = mad_agent_priv->qp_info->port_priv->device;
+ u8 port = mad_agent_priv->qp_info->port_priv->port_num;
+ struct ib_smp *smp = mad_send_wr->send_buf.mad;
+ struct opa_smp *opa_smp = (struct opa_smp *)smp;
+ u32 opa_drslid = be32_to_cpu(opa_smp->route.dr.dr_slid);
+ u32 opa_drdlid = be32_to_cpu(opa_smp->route.dr.dr_dlid);
+
+ bool dr_slid_is_permissive = (OPA_LID_PERMISSIVE ==
+ opa_smp->route.dr.dr_slid) ? true : false;
+ bool dr_dlid_is_permissive = (OPA_LID_PERMISSIVE ==
+ opa_smp->route.dr.dr_dlid) ? true : false;
+ bool drslid_is_ib_ucast = (opa_drslid <
+ be16_to_cpu(IB_MULTICAST_LID_BASE)) ?
+ true : false;
+ bool drdlid_is_ib_ucast = (opa_drdlid <
+ be16_to_cpu(IB_MULTICAST_LID_BASE)) ?
+ true : false;
+ bool drslid_is_ext = !drslid_is_ib_ucast && !dr_slid_is_permissive;
+ bool drdlid_is_ext = !drdlid_is_ib_ucast && !dr_dlid_is_permissive;
+ bool grh_present = false;
+ struct ib_ah_attr attr;
+ union ib_gid sgid;
+ int ret = 0;
+
+ ret = ib_query_ah(mad_send_wr->send_buf.ah, &attr);
+ if (ret)
+ return ret;
+ grh_present = (attr.ah_flags & IB_AH_GRH);
+ if (grh_present) {
+ ret = ib_query_gid(ib_dev, port, attr.grh.sgid_index,
+ &sgid, NULL);
+ if (ret)
+ return ret;
+ }
+
+ if (smp->class_version == OPA_SMP_CLASS_VERSION) {
+ /*
+ * Conditions when GRH info should not be specified
+ * 1. both dr_slid and dr_dlid are permissive (Pure DR)
+ * 2. both dr_slid and dr_dlid are less than 0xc000.
+ *
+ * Conditions when GRH info should be specified
+ * 1. dr_dlid is not permissive and above 0xbfff
+ * OR
+ * 2. dr_slid is not permissive and above 0xbfff
+ */
+ if (grh_present) {
+ if ((dr_slid_is_permissive &&
+ dr_dlid_is_permissive) ||
+ (drslid_is_ib_ucast && drdlid_is_ib_ucast))
+ if (ib_is_opa_gid(&attr.grh.dgid) &&
+ ib_is_opa_gid(&sgid))
+ return -EINVAL;
+ if (drslid_is_ext && !ib_is_opa_gid(&sgid))
+ return -EINVAL;
+ if (drdlid_is_ext &&
+ !ib_is_opa_gid(&attr.grh.dgid))
+ return -EINVAL;
+ } else { /* There is no GRH */
+ if (drslid_is_ext || drdlid_is_ext)
+ return -EINVAL;
+ }
+ } else {
+ if (grh_present)
+ if (ib_is_opa_gid(&attr.grh.dgid) &&
+ ib_is_opa_gid(&sgid))
+ return -EINVAL;
+ }
+ return ret;
+}
+
/*
* Return 0 if SMP is to be sent
* Return 1 if SMP was consumed locally (whether or not solicited)
@@ -754,8 +829,12 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
size_t mad_size = port_mad_size(mad_agent_priv->qp_info->port_priv);
u16 out_mad_pkey_index = 0;
u16 drslid;
- bool opa = rdma_cap_opa_mad(mad_agent_priv->qp_info->port_priv->device,
- mad_agent_priv->qp_info->port_priv->port_num);
+ bool opa_mad =
+ rdma_cap_opa_mad(mad_agent_priv->qp_info->port_priv->device,
+ mad_agent_priv->qp_info->port_priv->port_num);
+ bool opa_ah =
+ rdma_cap_opa_ah(mad_agent_priv->qp_info->port_priv->device,
+ mad_agent_priv->qp_info->port_priv->port_num);
if (rdma_cap_ib_switch(device) &&
smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
@@ -763,13 +842,21 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
else
port_num = mad_agent_priv->agent.port_num;
+ if (opa_mad && opa_ah) {
+ ret = verify_mad_ah(mad_agent_priv, mad_send_wr);
+ if (ret) {
+ dev_err(&device->dev,
+ "Error verifying MAD format\n");
+ goto out;
+ }
+ }
/*
* Directed route handling starts if the initial LID routed part of
* a request or the ending LID routed part of a response is empty.
* If we are at the start of the LID routed part, don't update the
* hop_ptr or hop_cnt. See section 14.2.2, Vol 1 IB spec.
*/
- if (opa && smp->class_version == OPA_SMP_CLASS_VERSION) {
+ if (opa_mad && smp->class_version == OPA_SMP_CLASS_VERSION) {
u32 opa_drslid;
if ((opa_get_smp_direction(opa_smp)
@@ -783,13 +870,6 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
goto out;
}
opa_drslid = be32_to_cpu(opa_smp->route.dr.dr_slid);
- if (opa_drslid != be32_to_cpu(OPA_LID_PERMISSIVE) &&
- opa_drslid & 0xffff0000) {
- ret = -EINVAL;
- dev_err(&device->dev, "OPA Invalid dr_slid 0x%x\n",
- opa_drslid);
- goto out;
- }
drslid = (u16)(opa_drslid & 0x0000ffff);
/* Check to post send on QP or process locally */
@@ -834,7 +914,7 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
send_wr->pkey_index,
send_wr->port_num, &mad_wc);
- if (opa && smp->base_version == OPA_MGMT_BASE_VERSION) {
+ if (opa_mad && smp->base_version == OPA_MGMT_BASE_VERSION) {
mad_wc.byte_len = mad_send_wr->send_buf.hdr_len
+ mad_send_wr->send_buf.data_len
+ sizeof(struct ib_grh);
@@ -891,7 +971,7 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
}
local->mad_send_wr = mad_send_wr;
- if (opa) {
+ if (opa_mad) {
local->mad_send_wr->send_wr.pkey_index = out_mad_pkey_index;
local->return_wc_byte_len = mad_size;
}
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index 142b327..3e22937 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -33,6 +33,23 @@
#if !defined(OPA_ADDR_H)
#define OPA_ADDR_H
+#include <rdma/ib_verbs.h>
+
#define OPA_TO_IB_UCAST_LID(x) (((x) >= be16_to_cpu(IB_MULTICAST_LID_BASE)) \
? 0 : x)
+#define OPA_SPECIAL_OUI (0x00066AULL)
+
+/**
+ * ib_is_opa_gid: Returns true if the top 24 bits of the gid
+ * contain the OPA_SPECIAL_OUI identifier. This identifies that
+ * the provided gid is a special purpose GID meant to carry
+ * extended LID information.
+ *
+ * @gid: The Global identifier
+ */
+static inline bool ib_is_opa_gid(union ib_gid *gid)
+{
+ return ((be64_to_cpu(gid->global.interface_id) >> 40) ==
+ OPA_SPECIAL_OUI);
+}
#endif /* OPA_ADDR_H */
--
1.8.3.1
* [PATCH v2 05/11] IB/core: Change wc.slid from 16 to 32 bits
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The slid field in ib_wc is widened to 32 bits. This enables core
components to use the larger addresses if needed.
The user ABI is unchanged and returns 16-bit values when queried.
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
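Reader note, not part of the commit: a minimal userspace sketch of how
a widened in-kernel field can coexist with an unchanged user ABI. Both
structs and the copy helper are hypothetical and model only two fields;
the explicit cast mirrors the (u16)wc->slid in copy_wc_to_user().

```c
#include <stdint.h>

/* Hypothetical kernel-side completion: slid is 32 bits after this patch. */
struct sketch_wc {
	uint32_t src_qp;
	uint32_t slid;
};

/* Hypothetical user-visible layout: slid stays 16 bits wide. */
struct sketch_user_wc {
	uint32_t src_qp;
	uint16_t slid;
};

/* Copy a completion to the user layout, narrowing slid explicitly. */
static void sketch_copy_wc_to_user(struct sketch_user_wc *dst,
				   const struct sketch_wc *src)
{
	dst->src_qp = src->src_qp;
	dst->slid = (uint16_t)src->slid;	/* upper 16 bits are dropped */
}
```

In this sketch a LID that fits in 16 bits reaches user space unchanged,
while a larger one is reduced to its low 16 bits.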
drivers/infiniband/core/cm.c | 4 ++--
drivers/infiniband/core/mad_rmpp.c | 4 ++--
drivers/infiniband/core/user_mad.c | 2 +-
drivers/infiniband/core/uverbs_cmd.c | 2 +-
drivers/infiniband/hw/hfi1/mad.c | 2 +-
drivers/infiniband/hw/hfi1/rc.c | 2 +-
drivers/infiniband/hw/hfi1/ruc.c | 2 +-
drivers/infiniband/hw/hfi1/uc.c | 2 +-
drivers/infiniband/hw/mlx4/mad.c | 6 +++---
drivers/infiniband/hw/mlx4/mcg.c | 2 +-
drivers/infiniband/hw/mlx5/mad.c | 2 +-
drivers/infiniband/hw/mthca/mthca_cmd.c | 4 ++--
drivers/infiniband/hw/mthca/mthca_mad.c | 2 +-
drivers/infiniband/hw/qib/qib_rc.c | 2 +-
drivers/infiniband/hw/qib/qib_ruc.c | 2 +-
drivers/infiniband/hw/qib/qib_uc.c | 2 +-
drivers/infiniband/sw/rdmavt/cq.c | 2 +-
include/rdma/ib_verbs.h | 2 +-
18 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index c995255..137c4c2 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1576,7 +1576,7 @@ static void cm_process_routed_req(struct cm_req_msg *req_msg, struct ib_wc *wc)
{
if (!cm_req_get_primary_subnet_local(req_msg)) {
if (req_msg->primary_local_lid == IB_LID_PERMISSIVE) {
- req_msg->primary_local_lid = cpu_to_be16(wc->slid);
+ req_msg->primary_local_lid = cpu_to_be16((u16)wc->slid);
cm_req_set_primary_sl(req_msg, wc->sl);
}
@@ -1586,7 +1586,7 @@ static void cm_process_routed_req(struct cm_req_msg *req_msg, struct ib_wc *wc)
if (!cm_req_get_alt_subnet_local(req_msg)) {
if (req_msg->alt_local_lid == IB_LID_PERMISSIVE) {
- req_msg->alt_local_lid = cpu_to_be16(wc->slid);
+ req_msg->alt_local_lid = cpu_to_be16((u16)wc->slid);
cm_req_set_alt_sl(req_msg, wc->sl);
}
diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index c34dca3..8a076e1 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -316,7 +316,7 @@ static void recv_cleanup_handler(struct work_struct *work)
mad_hdr = &mad_recv_wc->recv_buf.mad->mad_hdr;
rmpp_recv->tid = mad_hdr->tid;
rmpp_recv->src_qp = mad_recv_wc->wc->src_qp;
- rmpp_recv->slid = mad_recv_wc->wc->slid;
+ rmpp_recv->slid = (u16)mad_recv_wc->wc->slid;
rmpp_recv->mgmt_class = mad_hdr->mgmt_class;
rmpp_recv->class_version = mad_hdr->class_version;
rmpp_recv->method = mad_hdr->method;
@@ -337,7 +337,7 @@ static void recv_cleanup_handler(struct work_struct *work)
list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
if (rmpp_recv->tid == mad_hdr->tid &&
rmpp_recv->src_qp == mad_recv_wc->wc->src_qp &&
- rmpp_recv->slid == mad_recv_wc->wc->slid &&
+ rmpp_recv->slid == (u16)mad_recv_wc->wc->slid &&
rmpp_recv->mgmt_class == mad_hdr->mgmt_class &&
rmpp_recv->class_version == mad_hdr->class_version &&
rmpp_recv->method == mad_hdr->method)
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 415a318..2a0b928 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -229,7 +229,7 @@ static void recv_handler(struct ib_mad_agent *agent,
packet->mad.hdr.status = 0;
packet->mad.hdr.length = hdr_size(file) + mad_recv_wc->mad_len;
packet->mad.hdr.qpn = cpu_to_be32(mad_recv_wc->wc->src_qp);
- packet->mad.hdr.lid = cpu_to_be16(mad_recv_wc->wc->slid);
+ packet->mad.hdr.lid = cpu_to_be16((u16)mad_recv_wc->wc->slid);
packet->mad.hdr.sl = mad_recv_wc->wc->sl;
packet->mad.hdr.path_bits = mad_recv_wc->wc->dlid_path_bits;
packet->mad.hdr.pkey_index = mad_recv_wc->wc->pkey_index;
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index e135c08..a203cf2 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1619,7 +1619,7 @@ static int copy_wc_to_user(void __user *dest, struct ib_wc *wc)
tmp.src_qp = wc->src_qp;
tmp.wc_flags = wc->wc_flags;
tmp.pkey_index = wc->pkey_index;
- tmp.slid = wc->slid;
+ tmp.slid = (u16)wc->slid;
tmp.sl = wc->sl;
tmp.dlid_path_bits = wc->dlid_path_bits;
tmp.port_num = wc->port_num;
diff --git a/drivers/infiniband/hw/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
index 9487c9b..c76e546 100644
--- a/drivers/infiniband/hw/hfi1/mad.c
+++ b/drivers/infiniband/hw/hfi1/mad.c
@@ -3974,7 +3974,7 @@ static int opa_local_smp_check(struct hfi1_ibport *ibp,
const struct ib_wc *in_wc)
{
struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
- u16 slid = in_wc->slid;
+ u16 slid = (u16)in_wc->slid;
u16 pkey;
if (in_wc->pkey_index >= ARRAY_SIZE(ppd->pkeys))
diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index caca6f5..40ae502 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -2306,7 +2306,7 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
wc.opcode = IB_WC_RECV;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = (u16)qp->remote_ah_attr.dlid;
+ wc.slid = qp->remote_ah_attr.dlid;
/*
* It seems that IB mandates the presence of an SL in a
* work completion only for the UD transport (see section
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index 2fe2b2f..777cfc8 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -591,7 +591,7 @@ static void ruc_loopback(struct rvt_qp *sqp)
wc.byte_len = wqe->length;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = (u16)qp->remote_ah_attr.dlid;
+ wc.slid = qp->remote_ah_attr.dlid;
wc.sl = qp->remote_ah_attr.sl;
wc.port_num = 1;
/* Signal completion event if the solicited bit is set. */
diff --git a/drivers/infiniband/hw/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
index 0572dc7..5e6d1ba 100644
--- a/drivers/infiniband/hw/hfi1/uc.c
+++ b/drivers/infiniband/hw/hfi1/uc.c
@@ -451,7 +451,7 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
wc.status = IB_WC_SUCCESS;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = (u16)qp->remote_ah_attr.dlid;
+ wc.slid = qp->remote_ah_attr.dlid;
/*
* It seems that IB mandates the presence of an SL in a
* work completion only for the UD transport (see section
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 404ec4e..8e50de0 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -167,7 +167,7 @@ int mlx4_MAD_IFC(struct mlx4_ib_dev *dev, int mad_ifc_flags,
op_modifier |= 0x4;
- in_modifier |= in_wc->slid << 16;
+ in_modifier |= (u16)in_wc->slid << 16;
}
err = mlx4_cmd_box(dev->dev, inmailbox->dma, outmailbox->dma, in_modifier,
@@ -599,7 +599,7 @@ int mlx4_ib_send_to_slave(struct mlx4_ib_dev *dev, int slave, u8 port,
memcpy((char *)&tun_mad->hdr.slid_mac_47_32, &(wc->smac[4]), 2);
} else {
tun_mad->hdr.sl_vid = cpu_to_be16(((u16)(wc->sl)) << 12);
- tun_mad->hdr.slid_mac_47_32 = cpu_to_be16(wc->slid);
+ tun_mad->hdr.slid_mac_47_32 = cpu_to_be16((u16)wc->slid);
}
ib_dma_sync_single_for_device(&dev->ib_dev,
@@ -787,7 +787,7 @@ static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
}
}
- slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+ slid = in_wc ? (u16)in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP && slid == 0) {
forward_trap(to_mdev(ibdev), port_num, in_mad);
diff --git a/drivers/infiniband/hw/mlx4/mcg.c b/drivers/infiniband/hw/mlx4/mcg.c
index d46a847..a21d37f 100644
--- a/drivers/infiniband/hw/mlx4/mcg.c
+++ b/drivers/infiniband/hw/mlx4/mcg.c
@@ -244,7 +244,7 @@ static int send_mad_to_slave(int slave, struct mlx4_ib_demux_ctx *ctx,
wc.sl = 0;
wc.dlid_path_bits = 0;
wc.port_num = ctx->port;
- wc.slid = (u16)ah_attr.dlid; /* opensm lid */
+ wc.slid = ah_attr.dlid; /* opensm lid */
wc.src_qp = 1;
return mlx4_ib_send_to_slave(dev, slave, ctx->port, IB_QPT_GSI, &wc, NULL, mad);
}
diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c
index 39e5848..f0323b7 100644
--- a/drivers/infiniband/hw/mlx5/mad.c
+++ b/drivers/infiniband/hw/mlx5/mad.c
@@ -66,7 +66,7 @@ static int process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
u16 slid;
int err;
- slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+ slid = in_wc ? (u16)in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP && slid == 0)
return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED;
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index c7f49bb..d07f389 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -1913,7 +1913,7 @@ int mthca_MAD_IFC(struct mthca_dev *dev, int ignore_mkey, int ignore_bkey,
(in_wc->wc_flags & IB_WC_GRH ? 0x80 : 0);
MTHCA_PUT(inbox, val, MAD_IFC_G_PATH_OFFSET);
- MTHCA_PUT(inbox, in_wc->slid, MAD_IFC_RLID_OFFSET);
+ MTHCA_PUT(inbox, (u16)in_wc->slid, MAD_IFC_RLID_OFFSET);
MTHCA_PUT(inbox, in_wc->pkey_index, MAD_IFC_PKEY_OFFSET);
if (in_grh)
@@ -1921,7 +1921,7 @@ int mthca_MAD_IFC(struct mthca_dev *dev, int ignore_mkey, int ignore_bkey,
op_modifier |= 0x4;
- in_modifier |= in_wc->slid << 16;
+ in_modifier |= (u16)in_wc->slid << 16;
}
err = mthca_cmd_box(dev, inmailbox->dma, outmailbox->dma,
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index b503d160..e9a7dd0 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -204,7 +204,7 @@ int mthca_process_mad(struct ib_device *ibdev,
u16 *out_mad_pkey_index)
{
int err;
- u16 slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+ u16 slid = in_wc ? (u16)in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
u16 prev_lid = 0;
struct ib_port_attr pattr;
const struct ib_mad *in_mad = (const struct ib_mad *)in;
diff --git a/drivers/infiniband/hw/qib/qib_rc.c b/drivers/infiniband/hw/qib/qib_rc.c
index 91f0d08..9d0b2bc 100644
--- a/drivers/infiniband/hw/qib/qib_rc.c
+++ b/drivers/infiniband/hw/qib/qib_rc.c
@@ -2018,7 +2018,7 @@ void qib_rc_rcv(struct qib_ctxtdata *rcd, struct ib_header *hdr,
wc.opcode = IB_WC_RECV;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = (u16)qp->remote_ah_attr.dlid;
+ wc.slid = qp->remote_ah_attr.dlid;
wc.sl = qp->remote_ah_attr.sl;
/* zero fields that are N/A */
wc.vendor_err = 0;
diff --git a/drivers/infiniband/hw/qib/qib_ruc.c b/drivers/infiniband/hw/qib/qib_ruc.c
index 0b0620f..588b4ae 100644
--- a/drivers/infiniband/hw/qib/qib_ruc.c
+++ b/drivers/infiniband/hw/qib/qib_ruc.c
@@ -566,7 +566,7 @@ static void qib_ruc_loopback(struct rvt_qp *sqp)
wc.byte_len = wqe->length;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = (u16)qp->remote_ah_attr.dlid;
+ wc.slid = qp->remote_ah_attr.dlid;
wc.sl = qp->remote_ah_attr.sl;
wc.port_num = 1;
/* Signal completion event if the solicited bit is set. */
diff --git a/drivers/infiniband/hw/qib/qib_uc.c b/drivers/infiniband/hw/qib/qib_uc.c
index 7a10748..5b2d483 100644
--- a/drivers/infiniband/hw/qib/qib_uc.c
+++ b/drivers/infiniband/hw/qib/qib_uc.c
@@ -403,7 +403,7 @@ void qib_uc_rcv(struct qib_ibport *ibp, struct ib_header *hdr,
wc.status = IB_WC_SUCCESS;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = (u16)qp->remote_ah_attr.dlid;
+ wc.slid = qp->remote_ah_attr.dlid;
wc.sl = qp->remote_ah_attr.sl;
/* zero fields that are N/A */
wc.vendor_err = 0;
diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 6d9904a..a03b240 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -105,7 +105,7 @@ void rvt_cq_enter(struct rvt_cq *cq, struct ib_wc *entry, bool solicited)
wc->uqueue[head].src_qp = entry->src_qp;
wc->uqueue[head].wc_flags = entry->wc_flags;
wc->uqueue[head].pkey_index = entry->pkey_index;
- wc->uqueue[head].slid = entry->slid;
+ wc->uqueue[head].slid = (u16)entry->slid;
wc->uqueue[head].sl = entry->sl;
wc->uqueue[head].dlid_path_bits = entry->dlid_path_bits;
wc->uqueue[head].port_num = entry->port_num;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 3d80720..b3f9130 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -896,7 +896,7 @@ struct ib_wc {
u32 src_qp;
int wc_flags;
u16 pkey_index;
- u16 slid;
+ u32 slid;
u8 sl;
u8 dlid_path_bits;
u8 port_num; /* valid only for DR SMPs on switches */
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* [PATCH v2 04/11] IB/core: Change port_attr.lid size from 16 to 32 bits
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The lid field in port_attr is widened from 16 to 32 bits. This enables core
components to use the larger addresses if needed.
The user ABI is unchanged and still returns 16-bit values when queried.
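For readers who want to see the narrowing in isolation, here is a minimal userspace sketch of what the query_port response ends up doing with the widened fields. The struct names, fill_resp() and the function form of OPA_TO_IB_UCAST_LID are hypothetical stand-ins, not kernel API; only the 0xC000 multicast base and the two branches are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-order value of IB_MULTICAST_LID_BASE (assumed constant for this sketch). */
#define IB_MCAST_LID_BASE_HOST 0xC000u

/* Hypothetical stand-ins for the kernel structures, reduced to the LID fields. */
struct port_attr_sketch { uint32_t lid; uint32_t sm_lid; };
struct query_port_resp_sketch { uint16_t lid; uint16_t sm_lid; };

/* Function form of the OPA_TO_IB_UCAST_LID mapping: LIDs at or above the
 * IB multicast base cannot be reported as a 16-bit unicast LID, so they
 * are reported as 0. */
static uint16_t opa_to_ib_ucast_lid(uint32_t lid)
{
	return lid >= IB_MCAST_LID_BASE_HOST ? 0 : (uint16_t)lid;
}

/* Mirrors the two branches in ib_uverbs_query_port(): OPA ports go through
 * the mapping, all other ports simply truncate to 16 bits. */
static void fill_resp(struct query_port_resp_sketch *resp,
		      const struct port_attr_sketch *attr, bool opa_port)
{
	if (opa_port) {
		resp->lid = opa_to_ib_ucast_lid(attr->lid);
		resp->sm_lid = opa_to_ib_ucast_lid(attr->sm_lid);
	} else {
		resp->lid = (uint16_t)attr->lid;
		resp->sm_lid = (uint16_t)attr->sm_lid;
	}
}
```

Note the difference between the branches: the OPA branch reports 0 for a LID that does not fit, while the plain cast keeps only the low 16 bits.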
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
drivers/infiniband/hw/mlx4/alias_GUID.c | 2 +-
drivers/infiniband/hw/mlx4/mad.c | 2 +-
drivers/infiniband/hw/mthca/mthca_mad.c | 2 +-
drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 2 +-
drivers/infiniband/ulp/srpt/ib_srpt.c | 2 +-
include/rdma/ib_verbs.h | 2 +-
7 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 36f5bc1..e135c08 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -515,11 +515,13 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
resp.bad_pkey_cntr = attr.bad_pkey_cntr;
resp.qkey_viol_cntr = attr.qkey_viol_cntr;
resp.pkey_tbl_len = attr.pkey_tbl_len;
- resp.lid = attr.lid;
- if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
+ if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
- else
+ resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
+ } else {
resp.sm_lid = (u16)attr.sm_lid;
+ resp.lid = (u16)attr.lid;
+ }
resp.lmc = attr.lmc;
resp.max_vl_num = attr.max_vl_num;
resp.sm_sl = attr.sm_sl;
diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index 5e99390..7fa64e6 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -527,7 +527,7 @@ static int set_guid_rec(struct ib_device *ibdev,
memset(&guid_info_rec, 0, sizeof (struct ib_sa_guidinfo_rec));
- guid_info_rec.lid = cpu_to_be16(attr.lid);
+ guid_info_rec.lid = cpu_to_be16((u16)attr.lid);
guid_info_rec.block_num = index;
memcpy(guid_info_rec.guid_info_list, rec_det->all_recs,
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 1672907..404ec4e 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -821,7 +821,7 @@ static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
in_mad->mad_hdr.method == IB_MGMT_METHOD_SET &&
in_mad->mad_hdr.attr_id == IB_SMP_ATTR_PORT_INFO &&
!ib_query_port(ibdev, port_num, &pattr))
- prev_lid = pattr.lid;
+ prev_lid = (u16)pattr.lid;
err = mlx4_MAD_IFC(to_mdev(ibdev),
(mad_flags & IB_MAD_IGNORE_MKEY ? MLX4_MAD_IFC_IGNORE_MKEY : 0) |
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 9139405..b503d160 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -255,7 +255,7 @@ int mthca_process_mad(struct ib_device *ibdev,
in_mad->mad_hdr.method == IB_MGMT_METHOD_SET &&
in_mad->mad_hdr.attr_id == IB_SMP_ATTR_PORT_INFO &&
!ib_query_port(ibdev, port_num, &pattr))
- prev_lid = pattr.lid;
+ prev_lid = (u16)pattr.lid;
err = mthca_MAD_IFC(to_mdev(ibdev),
mad_flags & IB_MAD_IGNORE_MKEY,
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index d3394b6..bff73b5 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -581,7 +581,7 @@ void ipoib_mcast_join_task(struct work_struct *work)
port_attr.state);
return;
}
- priv->local_lid = port_attr.lid;
+ priv->local_lid = (u16)port_attr.lid;
netif_addr_lock_bh(dev);
if (!test_bit(IPOIB_FLAG_DEV_ADDR_SET, &priv->flags)) {
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index c6d0c47..4dc66ba 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -515,7 +515,7 @@ static int srpt_refresh_port(struct srpt_port *sport)
goto err_query_port;
sport->sm_lid = (u16)port_attr.sm_lid;
- sport->lid = port_attr.lid;
+ sport->lid = (u16)port_attr.lid;
ret = ib_query_gid(sport->sdev->device, sport->port, 0, &sport->gid,
NULL);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 294d3ed..3d80720 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -520,7 +520,7 @@ struct ib_port_attr {
u32 bad_pkey_cntr;
u32 qkey_viol_cntr;
u16 pkey_tbl_len;
- u16 lid;
+ u32 lid;
u32 sm_lid;
u8 lmc;
u8 max_vl_num;
--
1.8.3.1
* [PATCH v2 03/11] IB/core: Change ah_attr.dlid from 16 to 32 bits
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The dlid field in ah_attr is widened from 16 to 32 bits. This
enables core components to use the larger addresses if needed.
The user ABI is unchanged, and userspace applications can still use
16-bit LIDs when creating and modifying address handles.
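Several hunks below add a (u16) cast in front of an LMC mask, e.g. in the loopback checks of hfi1_do_send() and qib_do_send(). A standalone sketch of that comparison, with a hypothetical helper name and plain stdint types in place of the kernel ones, looks roughly like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the destination LID, narrowed to
 * 16 bits and with its LMC path bits cleared, equals the port's base LID.
 * lmc is the LID mask control value (0..7 in IB). */
static bool dlid_is_local(uint32_t dlid, uint16_t port_lid, uint8_t lmc)
{
	uint16_t path_mask = (uint16_t)((1u << lmc) - 1);
	uint16_t base = (uint16_t)dlid & (uint16_t)~path_mask;

	return base == port_lid;
}
```

With lmc = 2 and a port LID of 0x0010, the destinations 0x0010 through 0x0013 all count as local. The cast also means that a dlid above 0xFFFF is compared by its low 16 bits only, which is a property of this sketch worth keeping in mind when reading the hunks.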
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/mad_rmpp.c | 2 +-
drivers/infiniband/core/sa_query.c | 2 +-
drivers/infiniband/core/uverbs_cmd.c | 11 +++++++++--
drivers/infiniband/core/uverbs_marshall.c | 2 +-
drivers/infiniband/hw/hfi1/driver.c | 4 ++--
drivers/infiniband/hw/hfi1/rc.c | 4 ++--
drivers/infiniband/hw/hfi1/ruc.c | 21 +++++++++++----------
drivers/infiniband/hw/hfi1/uc.c | 2 +-
drivers/infiniband/hw/hfi1/ud.c | 10 +++++-----
drivers/infiniband/hw/hfi1/verbs.c | 4 ++--
drivers/infiniband/hw/mlx4/ah.c | 2 +-
drivers/infiniband/hw/mlx4/mcg.c | 2 +-
drivers/infiniband/hw/mlx4/qp.c | 2 +-
drivers/infiniband/hw/mlx5/ah.c | 2 +-
drivers/infiniband/hw/mthca/mthca_av.c | 2 +-
drivers/infiniband/hw/mthca/mthca_qp.c | 2 +-
drivers/infiniband/hw/ocrdma/ocrdma_ah.c | 2 +-
drivers/infiniband/hw/qib/qib_rc.c | 4 ++--
drivers/infiniband/hw/qib/qib_ruc.c | 11 ++++++-----
drivers/infiniband/hw/qib/qib_uc.c | 2 +-
drivers/infiniband/hw/qib/qib_ud.c | 8 ++++----
include/rdma/ib_verbs.h | 2 +-
22 files changed, 56 insertions(+), 47 deletions(-)
diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 382941b..c34dca3 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -870,7 +870,7 @@ static int init_newwin(struct ib_mad_send_wr_private *mad_send_wr)
if (ib_query_ah(mad_send_wr->send_buf.ah, &ah_attr))
continue;
- if (rmpp_recv->slid == ah_attr.dlid) {
+ if (rmpp_recv->slid == (u16)ah_attr.dlid) {
newwin = rmpp_recv->repwin;
break;
}
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 0b0dc43..81b742c 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -958,7 +958,7 @@ static void update_sm_ah(struct work_struct *work)
pr_err("Couldn't find index for default PKey\n");
memset(&ah_attr, 0, sizeof ah_attr);
- ah_attr.dlid = (u16)port_attr.sm_lid;
+ ah_attr.dlid = port_attr.sm_lid;
ah_attr.sl = port_attr.sm_sl;
ah_attr.port_num = port->port_num;
if (port_attr.grh_required) {
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 7630c92..36f5bc1 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -2281,7 +2281,10 @@ ssize_t ib_uverbs_query_qp(struct ib_uverbs_file *file,
resp.dest.sgid_index = attr->ah_attr.grh.sgid_index;
resp.dest.hop_limit = attr->ah_attr.grh.hop_limit;
resp.dest.traffic_class = attr->ah_attr.grh.traffic_class;
- resp.dest.dlid = attr->ah_attr.dlid;
+ if (rdma_cap_opa_ah(ib_dev, attr->ah_attr.port_num))
+ resp.dest.dlid = OPA_TO_IB_UCAST_LID(attr->ah_attr.dlid);
+ else
+ resp.dest.dlid = (u16)attr->ah_attr.dlid;
resp.dest.sl = attr->ah_attr.sl;
resp.dest.src_path_bits = attr->ah_attr.src_path_bits;
resp.dest.static_rate = attr->ah_attr.static_rate;
@@ -2293,7 +2296,11 @@ ssize_t ib_uverbs_query_qp(struct ib_uverbs_file *file,
resp.alt_dest.sgid_index = attr->alt_ah_attr.grh.sgid_index;
resp.alt_dest.hop_limit = attr->alt_ah_attr.grh.hop_limit;
resp.alt_dest.traffic_class = attr->alt_ah_attr.grh.traffic_class;
- resp.alt_dest.dlid = attr->alt_ah_attr.dlid;
+ if (rdma_cap_opa_ah(ib_dev, attr->alt_ah_attr.port_num))
+ resp.alt_dest.dlid =
+ OPA_TO_IB_UCAST_LID(attr->alt_ah_attr.dlid);
+ else
+ resp.alt_dest.dlid = (u16)attr->alt_ah_attr.dlid;
resp.alt_dest.sl = attr->alt_ah_attr.sl;
resp.alt_dest.src_path_bits = attr->alt_ah_attr.src_path_bits;
resp.alt_dest.static_rate = attr->alt_ah_attr.static_rate;
diff --git a/drivers/infiniband/core/uverbs_marshall.c b/drivers/infiniband/core/uverbs_marshall.c
index af020f8..f8c9008 100644
--- a/drivers/infiniband/core/uverbs_marshall.c
+++ b/drivers/infiniband/core/uverbs_marshall.c
@@ -42,7 +42,7 @@ void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
dst->grh.hop_limit = src->grh.hop_limit;
dst->grh.traffic_class = src->grh.traffic_class;
memset(&dst->grh.reserved, 0, sizeof(dst->grh.reserved));
- dst->dlid = src->dlid;
+ dst->dlid = (u16)src->dlid;
dst->sl = src->sl;
dst->src_path_bits = src->src_path_bits;
dst->static_rate = src->static_rate;
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 6563e4d..4195ba8 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -473,12 +473,12 @@ void hfi1_process_ecn_slowpath(struct rvt_qp *qp, struct hfi1_packet *pkt,
(dlid != be16_to_cpu(IB_LID_PERMISSIVE));
break;
case IB_QPT_UC:
- rlid = qp->remote_ah_attr.dlid;
+ rlid = (u16)qp->remote_ah_attr.dlid;
rqpn = qp->remote_qpn;
svc_type = IB_CC_SVCTYPE_UC;
break;
case IB_QPT_RC:
- rlid = qp->remote_ah_attr.dlid;
+ rlid = (u16)qp->remote_ah_attr.dlid;
rqpn = qp->remote_qpn;
svc_type = IB_CC_SVCTYPE_RC;
break;
diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index 8bc5013..caca6f5 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -890,7 +890,7 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
pbc_flags |= ((!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT);
lrh0 |= (sc5 & 0xf) << 12 | (qp->remote_ah_attr.sl & 0xf) << 4;
hdr.lrh[0] = cpu_to_be16(lrh0);
- hdr.lrh[1] = cpu_to_be16(qp->remote_ah_attr.dlid);
+ hdr.lrh[1] = cpu_to_be16((u16)qp->remote_ah_attr.dlid);
hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
hdr.lrh[3] = cpu_to_be16(ppd->lid | qp->remote_ah_attr.src_path_bits);
ohdr->bth[0] = cpu_to_be32(bth0);
@@ -2306,7 +2306,7 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
wc.opcode = IB_WC_RECV;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = qp->remote_ah_attr.dlid;
+ wc.slid = (u16)qp->remote_ah_attr.dlid;
/*
* It seems that IB mandates the presence of an SL in a
* work completion only for the UD transport (see section
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index a1576ae..2fe2b2f 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -297,7 +297,7 @@ int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct ib_header *hdr,
goto err;
}
/* Validate the SLID. See Ch. 9.6.1.5 and 17.2.8 */
- if (be16_to_cpu(hdr->lrh[3]) != qp->alt_ah_attr.dlid ||
+ if (be16_to_cpu(hdr->lrh[3]) != (u16)qp->alt_ah_attr.dlid ||
ppd_from_ibp(ibp)->port != qp->alt_ah_attr.port_num)
goto err;
spin_lock_irqsave(&qp->s_lock, flags);
@@ -332,7 +332,7 @@ int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct ib_header *hdr,
goto err;
}
/* Validate the SLID. See Ch. 9.6.1.5 */
- if (be16_to_cpu(hdr->lrh[3]) != qp->remote_ah_attr.dlid ||
+ if (be16_to_cpu(hdr->lrh[3]) != (u16)qp->remote_ah_attr.dlid ||
ppd_from_ibp(ibp)->port != qp->port_num)
goto err;
if (qp->s_mig_state == IB_MIG_REARM &&
@@ -591,7 +591,7 @@ static void ruc_loopback(struct rvt_qp *sqp)
wc.byte_len = wqe->length;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = qp->remote_ah_attr.dlid;
+ wc.slid = (u16)qp->remote_ah_attr.dlid;
wc.sl = qp->remote_ah_attr.sl;
wc.port_num = 1;
/* Signal completion event if the solicited bit is set. */
@@ -812,7 +812,8 @@ void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
else
qp->s_flags &= ~RVT_S_AHG_VALID;
ps->s_txreq->phdr.hdr.lrh[0] = cpu_to_be16(lrh0);
- ps->s_txreq->phdr.hdr.lrh[1] = cpu_to_be16(qp->remote_ah_attr.dlid);
+ ps->s_txreq->phdr.hdr.lrh[1] =
+ cpu_to_be16((u16)qp->remote_ah_attr.dlid);
ps->s_txreq->phdr.hdr.lrh[2] =
cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
ps->s_txreq->phdr.hdr.lrh[3] = cpu_to_be16(ppd_from_ibp(ibp)->lid |
@@ -864,9 +865,9 @@ void hfi1_do_send(struct rvt_qp *qp)
switch (qp->ibqp.qp_type) {
case IB_QPT_RC:
- if (!loopback && ((qp->remote_ah_attr.dlid & ~((1 << ps.ppd->lmc
- ) - 1)) ==
- ps.ppd->lid)) {
+ if (!loopback && (((u16)qp->remote_ah_attr.dlid &
+ ~((1 << ps.ppd->lmc) - 1)) ==
+ ps.ppd->lid)) {
ruc_loopback(qp);
return;
}
@@ -874,9 +875,9 @@ void hfi1_do_send(struct rvt_qp *qp)
timeout_int = (qp->timeout_jiffies);
break;
case IB_QPT_UC:
- if (!loopback && ((qp->remote_ah_attr.dlid & ~((1 << ps.ppd->lmc
- ) - 1)) ==
- ps.ppd->lid)) {
+ if (!loopback && (((u16)qp->remote_ah_attr.dlid &
+ ~((1 << ps.ppd->lmc) - 1)) ==
+ ps.ppd->lid)) {
ruc_loopback(qp);
return;
}
diff --git a/drivers/infiniband/hw/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
index 5e6d1ba..0572dc7 100644
--- a/drivers/infiniband/hw/hfi1/uc.c
+++ b/drivers/infiniband/hw/hfi1/uc.c
@@ -451,7 +451,7 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
wc.status = IB_WC_SUCCESS;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = qp->remote_ah_attr.dlid;
+ wc.slid = (u16)qp->remote_ah_attr.dlid;
/*
* It seems that IB mandates the presence of an SL in a
* work completion only for the UD transport (see section
diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
index 97ae24b..8a27ab1 100644
--- a/drivers/infiniband/hw/hfi1/ud.c
+++ b/drivers/infiniband/hw/hfi1/ud.c
@@ -113,7 +113,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
hfi1_bad_pqkey(ibp, OPA_TRAP_BAD_P_KEY, pkey,
ah_attr->sl,
sqp->ibqp.qp_num, qp->ibqp.qp_num,
- slid, ah_attr->dlid);
+ slid, (u16)ah_attr->dlid);
goto drop;
}
}
@@ -137,7 +137,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
ah_attr->sl,
sqp->ibqp.qp_num, qp->ibqp.qp_num,
lid,
- ah_attr->dlid);
+ (u16)ah_attr->dlid);
goto drop;
}
}
@@ -248,7 +248,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
if (wc.slid == 0 && sqp->ibqp.qp_type == IB_QPT_GSI)
wc.slid = be16_to_cpu(IB_LID_PERMISSIVE);
wc.sl = ah_attr->sl;
- wc.dlid_path_bits = ah_attr->dlid & ((1 << ppd->lmc) - 1);
+ wc.dlid_path_bits = (u16)ah_attr->dlid & ((1 << ppd->lmc) - 1);
wc.port_num = qp->port_num;
/* Signal completion event if the solicited bit is set. */
rvt_cq_enter(ibcq_to_rvtcq(qp->ibqp.recv_cq), &wc,
@@ -321,7 +321,7 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
if (ah_attr->dlid < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
ah_attr->dlid == be16_to_cpu(IB_LID_PERMISSIVE)) {
- lid = ah_attr->dlid & ~((1 << ppd->lmc) - 1);
+ lid = (u16)ah_attr->dlid & ~((1 << ppd->lmc) - 1);
if (unlikely(!loopback &&
(lid == ppd->lid ||
(lid == be16_to_cpu(IB_LID_PERMISSIVE) &&
@@ -402,7 +402,7 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
ps->s_txreq->psc = priv->s_sendcontext;
ps->s_txreq->phdr.hdr.lrh[0] = cpu_to_be16(lrh0);
- ps->s_txreq->phdr.hdr.lrh[1] = cpu_to_be16(ah_attr->dlid);
+ ps->s_txreq->phdr.hdr.lrh[1] = cpu_to_be16((u16)ah_attr->dlid);
ps->s_txreq->phdr.hdr.lrh[2] =
cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
if (ah_attr->dlid == be16_to_cpu(IB_LID_PERMISSIVE)) {
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 4b7a16c..a13bfdf 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1781,12 +1781,12 @@ void hfi1_cnp_rcv(struct hfi1_packet *packet)
switch (packet->qp->ibqp.qp_type) {
case IB_QPT_UC:
- rlid = qp->remote_ah_attr.dlid;
+ rlid = (u16)qp->remote_ah_attr.dlid;
rqpn = qp->remote_qpn;
svc_type = IB_CC_SVCTYPE_UC;
break;
case IB_QPT_RC:
- rlid = qp->remote_ah_attr.dlid;
+ rlid = (u16)qp->remote_ah_attr.dlid;
rqpn = qp->remote_qpn;
svc_type = IB_CC_SVCTYPE_RC;
break;
diff --git a/drivers/infiniband/hw/mlx4/ah.c b/drivers/infiniband/hw/mlx4/ah.c
index 5fc6233..772789f 100644
--- a/drivers/infiniband/hw/mlx4/ah.c
+++ b/drivers/infiniband/hw/mlx4/ah.c
@@ -58,7 +58,7 @@ static struct ib_ah *create_ib_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr,
memcpy(ah->av.ib.dgid, ah_attr->grh.dgid.raw, 16);
}
- ah->av.ib.dlid = cpu_to_be16(ah_attr->dlid);
+ ah->av.ib.dlid = cpu_to_be16((u16)ah_attr->dlid);
if (ah_attr->static_rate) {
ah->av.ib.stat_rate = ah_attr->static_rate + MLX4_STAT_RATE_OFFSET;
while (ah->av.ib.stat_rate > IB_RATE_2_5_GBPS + MLX4_STAT_RATE_OFFSET &&
diff --git a/drivers/infiniband/hw/mlx4/mcg.c b/drivers/infiniband/hw/mlx4/mcg.c
index a21d37f..d46a847 100644
--- a/drivers/infiniband/hw/mlx4/mcg.c
+++ b/drivers/infiniband/hw/mlx4/mcg.c
@@ -244,7 +244,7 @@ static int send_mad_to_slave(int slave, struct mlx4_ib_demux_ctx *ctx,
wc.sl = 0;
wc.dlid_path_bits = 0;
wc.port_num = ctx->port;
- wc.slid = ah_attr.dlid; /* opensm lid */
+ wc.slid = (u16)ah_attr.dlid; /* opensm lid */
wc.src_qp = 1;
return mlx4_ib_send_to_slave(dev, slave, ctx->port, IB_QPT_GSI, &wc, NULL, mad);
}
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 570bc86..ada1ecc 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1396,7 +1396,7 @@ static int _mlx4_set_path(struct mlx4_ib_dev *dev, const struct ib_ah_attr *ah,
path->grh_mylmc = ah->src_path_bits & 0x7f;
- path->rlid = cpu_to_be16(ah->dlid);
+ path->rlid = cpu_to_be16((u16)ah->dlid);
if (ah->static_rate) {
path->static_rate = ah->static_rate + MLX4_STAT_RATE_OFFSET;
while (path->static_rate > IB_RATE_2_5_GBPS + MLX4_STAT_RATE_OFFSET &&
diff --git a/drivers/infiniband/hw/mlx5/ah.c b/drivers/infiniband/hw/mlx5/ah.c
index 745efa4..6248542 100644
--- a/drivers/infiniband/hw/mlx5/ah.c
+++ b/drivers/infiniband/hw/mlx5/ah.c
@@ -56,7 +56,7 @@ static struct ib_ah *create_ib_ah(struct mlx5_ib_dev *dev,
ah_attr->grh.sgid_index);
ah->av.stat_rate_sl |= (ah_attr->sl & 0x7) << 1;
} else {
- ah->av.rlid = cpu_to_be16(ah_attr->dlid);
+ ah->av.rlid = cpu_to_be16((u16)ah_attr->dlid);
ah->av.fl_mlid = ah_attr->src_path_bits & 0x7f;
ah->av.stat_rate_sl |= (ah_attr->sl & 0xf);
}
diff --git a/drivers/infiniband/hw/mthca/mthca_av.c b/drivers/infiniband/hw/mthca/mthca_av.c
index bcac294..9e0c5c8 100644
--- a/drivers/infiniband/hw/mthca/mthca_av.c
+++ b/drivers/infiniband/hw/mthca/mthca_av.c
@@ -200,7 +200,7 @@ int mthca_create_ah(struct mthca_dev *dev,
av->port_pd = cpu_to_be32(pd->pd_num | (ah_attr->port_num << 24));
av->g_slid = ah_attr->src_path_bits;
- av->dlid = cpu_to_be16(ah_attr->dlid);
+ av->dlid = cpu_to_be16((u16)ah_attr->dlid);
av->msg_sr = (3 << 4) | /* 2K message */
mthca_get_rate(dev, ah_attr->static_rate, ah_attr->port_num);
av->sl_tclass_flowlabel = cpu_to_be32(ah_attr->sl << 28);
diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c
index 96e5fb9..32d000f 100644
--- a/drivers/infiniband/hw/mthca/mthca_qp.c
+++ b/drivers/infiniband/hw/mthca/mthca_qp.c
@@ -516,7 +516,7 @@ static int mthca_path_set(struct mthca_dev *dev, const struct ib_ah_attr *ah,
struct mthca_qp_path *path, u8 port)
{
path->g_mylmc = ah->src_path_bits & 0x7f;
- path->rlid = cpu_to_be16(ah->dlid);
+ path->rlid = cpu_to_be16((u16)ah->dlid);
path->static_rate = mthca_get_rate(dev, ah->static_rate, port);
if (ah->ah_flags & IB_AH_GRH) {
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
index 797362a..ea90304 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
@@ -215,7 +215,7 @@ struct ib_ah *ocrdma_create_ah(struct ib_pd *ibpd, struct ib_ah_attr *attr)
/* if pd is for the user process, pass the ah_id to user space */
if ((pd->uctx) && (pd->uctx->ah_tbl.va)) {
- ahid_addr = pd->uctx->ah_tbl.va + attr->dlid;
+ ahid_addr = pd->uctx->ah_tbl.va + (u16)attr->dlid;
*ahid_addr = 0;
*ahid_addr |= ah->id & OCRDMA_AH_ID_MASK;
if (ocrdma_is_udp_encap_supported(dev)) {
diff --git a/drivers/infiniband/hw/qib/qib_rc.c b/drivers/infiniband/hw/qib/qib_rc.c
index 2097512..91f0d08 100644
--- a/drivers/infiniband/hw/qib/qib_rc.c
+++ b/drivers/infiniband/hw/qib/qib_rc.c
@@ -665,7 +665,7 @@ void qib_send_rc_ack(struct rvt_qp *qp)
lrh0 |= ibp->sl_to_vl[qp->remote_ah_attr.sl] << 12 |
qp->remote_ah_attr.sl << 4;
hdr.lrh[0] = cpu_to_be16(lrh0);
- hdr.lrh[1] = cpu_to_be16(qp->remote_ah_attr.dlid);
+ hdr.lrh[1] = cpu_to_be16((u16)qp->remote_ah_attr.dlid);
hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
hdr.lrh[3] = cpu_to_be16(ppd->lid | qp->remote_ah_attr.src_path_bits);
ohdr->bth[0] = cpu_to_be32(bth0);
@@ -2018,7 +2018,7 @@ void qib_rc_rcv(struct qib_ctxtdata *rcd, struct ib_header *hdr,
wc.opcode = IB_WC_RECV;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = qp->remote_ah_attr.dlid;
+ wc.slid = (u16)qp->remote_ah_attr.dlid;
wc.sl = qp->remote_ah_attr.sl;
/* zero fields that are N/A */
wc.vendor_err = 0;
diff --git a/drivers/infiniband/hw/qib/qib_ruc.c b/drivers/infiniband/hw/qib/qib_ruc.c
index de1bde5..0b0620f 100644
--- a/drivers/infiniband/hw/qib/qib_ruc.c
+++ b/drivers/infiniband/hw/qib/qib_ruc.c
@@ -297,7 +297,7 @@ int qib_ruc_check_hdr(struct qib_ibport *ibp, struct ib_header *hdr,
goto err;
}
/* Validate the SLID. See Ch. 9.6.1.5 and 17.2.8 */
- if (be16_to_cpu(hdr->lrh[3]) != qp->alt_ah_attr.dlid ||
+ if (be16_to_cpu(hdr->lrh[3]) != (u16)qp->alt_ah_attr.dlid ||
ppd_from_ibp(ibp)->port != qp->alt_ah_attr.port_num)
goto err;
spin_lock_irqsave(&qp->s_lock, flags);
@@ -330,7 +330,7 @@ int qib_ruc_check_hdr(struct qib_ibport *ibp, struct ib_header *hdr,
goto err;
}
/* Validate the SLID. See Ch. 9.6.1.5 */
- if (be16_to_cpu(hdr->lrh[3]) != qp->remote_ah_attr.dlid ||
+ if (be16_to_cpu(hdr->lrh[3]) != (u16)qp->remote_ah_attr.dlid ||
ppd_from_ibp(ibp)->port != qp->port_num)
goto err;
if (qp->s_mig_state == IB_MIG_REARM &&
@@ -566,7 +566,7 @@ static void qib_ruc_loopback(struct rvt_qp *sqp)
wc.byte_len = wqe->length;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = qp->remote_ah_attr.dlid;
+ wc.slid = (u16)qp->remote_ah_attr.dlid;
wc.sl = qp->remote_ah_attr.sl;
wc.port_num = 1;
/* Signal completion event if the solicited bit is set. */
@@ -702,7 +702,7 @@ void qib_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
lrh0 |= ibp->sl_to_vl[qp->remote_ah_attr.sl] << 12 |
qp->remote_ah_attr.sl << 4;
priv->s_hdr->lrh[0] = cpu_to_be16(lrh0);
- priv->s_hdr->lrh[1] = cpu_to_be16(qp->remote_ah_attr.dlid);
+ priv->s_hdr->lrh[1] = cpu_to_be16((u16)qp->remote_ah_attr.dlid);
priv->s_hdr->lrh[2] =
cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
priv->s_hdr->lrh[3] = cpu_to_be16(ppd_from_ibp(ibp)->lid |
@@ -744,7 +744,8 @@ void qib_do_send(struct rvt_qp *qp)
if ((qp->ibqp.qp_type == IB_QPT_RC ||
qp->ibqp.qp_type == IB_QPT_UC) &&
- (qp->remote_ah_attr.dlid & ~((1 << ppd->lmc) - 1)) == ppd->lid) {
+ (((u16)qp->remote_ah_attr.dlid) & ~((1 << ppd->lmc) - 1))
+ == ppd->lid) {
qib_ruc_loopback(qp);
return;
}
diff --git a/drivers/infiniband/hw/qib/qib_uc.c b/drivers/infiniband/hw/qib/qib_uc.c
index 5b2d483..7a10748 100644
--- a/drivers/infiniband/hw/qib/qib_uc.c
+++ b/drivers/infiniband/hw/qib/qib_uc.c
@@ -403,7 +403,7 @@ void qib_uc_rcv(struct qib_ibport *ibp, struct ib_header *hdr,
wc.status = IB_WC_SUCCESS;
wc.qp = &qp->ibqp;
wc.src_qp = qp->remote_qpn;
- wc.slid = qp->remote_ah_attr.dlid;
+ wc.slid = (u16)qp->remote_ah_attr.dlid;
wc.sl = qp->remote_ah_attr.sl;
/* zero fields that are N/A */
wc.vendor_err = 0;
diff --git a/drivers/infiniband/hw/qib/qib_ud.c b/drivers/infiniband/hw/qib/qib_ud.c
index f45cad1..37cd128 100644
--- a/drivers/infiniband/hw/qib/qib_ud.c
+++ b/drivers/infiniband/hw/qib/qib_ud.c
@@ -98,7 +98,7 @@ static void qib_ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
ah_attr->sl,
sqp->ibqp.qp_num, qp->ibqp.qp_num,
cpu_to_be16(lid),
- cpu_to_be16(ah_attr->dlid));
+ cpu_to_be16((u16)ah_attr->dlid));
goto drop;
}
}
@@ -122,7 +122,7 @@ static void qib_ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
ah_attr->sl,
sqp->ibqp.qp_num, qp->ibqp.qp_num,
cpu_to_be16(lid),
- cpu_to_be16(ah_attr->dlid));
+ cpu_to_be16((u16)ah_attr->dlid));
goto drop;
}
}
@@ -296,7 +296,7 @@ int qib_make_ud_req(struct rvt_qp *qp, unsigned long *flags)
this_cpu_inc(ibp->pmastats->n_unicast_xmit);
} else {
this_cpu_inc(ibp->pmastats->n_unicast_xmit);
- lid = ah_attr->dlid & ~((1 << ppd->lmc) - 1);
+ lid = ((u16)ah_attr->dlid) & ~((1 << ppd->lmc) - 1);
if (unlikely(lid == ppd->lid)) {
unsigned long tflags = *flags;
/*
@@ -363,7 +363,7 @@ int qib_make_ud_req(struct rvt_qp *qp, unsigned long *flags)
else
lrh0 |= ibp->sl_to_vl[ah_attr->sl] << 12;
priv->s_hdr->lrh[0] = cpu_to_be16(lrh0);
- priv->s_hdr->lrh[1] = cpu_to_be16(ah_attr->dlid); /* DEST LID */
+ priv->s_hdr->lrh[1] = cpu_to_be16((u16)ah_attr->dlid); /* DEST LID */
priv->s_hdr->lrh[2] =
cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
lid = ppd->lid;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 52216f6..294d3ed 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -814,7 +814,7 @@ struct ib_mr_status {
struct ib_ah_attr {
struct ib_global_route grh;
- u16 dlid;
+ u32 dlid;
u8 sl;
u8 src_path_bits;
u8 static_rate;
--
1.8.3.1
* [PATCH v2 02/11] IB/core: Change port_attr.sm_lid from 16 to 32 bits
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The sm_lid field in port_attr is widened from 16 to 32 bits. This
enables core components to use the larger addresses if needed.
The user ABI is unchanged and still returns 16-bit values when queried.
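The new OPA_TO_IB_UCAST_LID macro compares against be16_to_cpu(IB_MULTICAST_LID_BASE) because the kernel defines that constant in big-endian form. A userspace sketch of the same comparison, using htons()/ntohs() as stand-ins for cpu_to_be16()/be16_to_cpu() and hypothetical function names, shows that the threshold is 0xC000 in host order regardless of endianness:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Round-trips the multicast base through network byte order, as the
 * kernel does with cpu_to_be16() and be16_to_cpu(). */
static uint16_t mcast_lid_base_host(void)
{
	uint16_t wire = htons(0xC000);	/* analogous to cpu_to_be16(0xC000) */

	return ntohs(wire);		/* analogous to be16_to_cpu() */
}

/* Function form of the macro: unicast LIDs below the multicast base pass
 * through unchanged, everything else is reported as 0. */
static uint32_t opa_to_ib_ucast_lid(uint32_t lid)
{
	return lid >= mcast_lid_base_host() ? 0 : lid;
}
```

Writing the mapping as a function rather than a macro also evaluates the argument once, which the macro form does not guarantee.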
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/sa_query.c | 2 +-
drivers/infiniband/core/uverbs_cmd.c | 6 +++++-
drivers/infiniband/ulp/srpt/ib_srpt.c | 2 +-
include/rdma/ib_verbs.h | 2 +-
include/rdma/opa_addr.h | 38 +++++++++++++++++++++++++++++++++++
5 files changed, 46 insertions(+), 4 deletions(-)
create mode 100644 include/rdma/opa_addr.h
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 81b742c..0b0dc43 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -958,7 +958,7 @@ static void update_sm_ah(struct work_struct *work)
pr_err("Couldn't find index for default PKey\n");
memset(&ah_attr, 0, sizeof ah_attr);
- ah_attr.dlid = port_attr.sm_lid;
+ ah_attr.dlid = (u16)port_attr.sm_lid;
ah_attr.sl = port_attr.sm_sl;
ah_attr.port_num = port->port_num;
if (port_attr.grh_required) {
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index cb3f515a..7630c92 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -39,6 +39,7 @@
#include <linux/sched.h>
#include <asm/uaccess.h>
+#include <rdma/opa_addr.h>
#include "uverbs.h"
#include "core_priv.h"
@@ -515,7 +516,10 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
resp.qkey_viol_cntr = attr.qkey_viol_cntr;
resp.pkey_tbl_len = attr.pkey_tbl_len;
resp.lid = attr.lid;
- resp.sm_lid = attr.sm_lid;
+ if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
+ resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
+ else
+ resp.sm_lid = (u16)attr.sm_lid;
resp.lmc = attr.lmc;
resp.max_vl_num = attr.max_vl_num;
resp.sm_sl = attr.sm_sl;
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 0b1f69e..c6d0c47 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -514,7 +514,7 @@ static int srpt_refresh_port(struct srpt_port *sport)
if (ret)
goto err_query_port;
- sport->sm_lid = port_attr.sm_lid;
+ sport->sm_lid = (u16)port_attr.sm_lid;
sport->lid = port_attr.lid;
ret = ib_query_gid(sport->sdev->device, sport->port, 0, &sport->gid,
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 30fa96e..52216f6 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -521,7 +521,7 @@ struct ib_port_attr {
u32 qkey_viol_cntr;
u16 pkey_tbl_len;
u16 lid;
- u16 sm_lid;
+ u32 sm_lid;
u8 lmc;
u8 max_vl_num;
u8 sm_sl;
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
new file mode 100644
index 0000000..142b327
--- /dev/null
+++ b/include/rdma/opa_addr.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (c) 2016 Intel Corporation. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#if !defined(OPA_ADDR_H)
+#define OPA_ADDR_H
+
+#define OPA_TO_IB_UCAST_LID(x) (((x) >= be16_to_cpu(IB_MULTICAST_LID_BASE)) \
+ ? 0 : x)
+#endif /* OPA_ADDR_H */
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v2 01/11] IB/core: Add rdma_cap_opa_ah to expose opa address handles
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
In-Reply-To: <1479843532-47496-1-git-send-email-dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
rdma_cap_opa_ah() enables core components to check whether the
corresponding port supports extended addresses.
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
include/rdma/ib_verbs.h | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5ad43a4..30fa96e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -479,6 +479,7 @@ static inline struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
/* Address format 0x000FF000 */
#define RDMA_CORE_CAP_AF_IB 0x00001000
#define RDMA_CORE_CAP_ETH_AH 0x00002000
+#define RDMA_CORE_CAP_OPA_AH 0x00004000
/* Protocol 0xFFF00000 */
#define RDMA_CORE_CAP_PROT_IB 0x00100000
@@ -2472,6 +2473,26 @@ static inline bool rdma_cap_eth_ah(const struct ib_device *device, u8 port_num)
}
/**
+ * rdma_cap_opa_ah - Check if the port of device has the capability
+ * OPA Address handle
+ * @device: Device to check
+ * @port_num: Port number to check
+ *
+ * OPA Address handles enable use of 32 bit LIDs by using a specially
+ * formatted GID field to carry the LID. This check enables kernel
+ * components to identify such a scheme so that they can then try
+ * to make use of the LID in the GID field.
+ *
+ * Return: true if we are running as an OPA device which enables
+ * 32 bit LIDs to be used in the fabric.
+ */
+static inline bool rdma_cap_opa_ah(struct ib_device *device, u8 port_num)
+{
+ return (device->port_immutable[port_num].core_cap_flags &
+ RDMA_CORE_CAP_OPA_AH) == RDMA_CORE_CAP_OPA_AH;
+}
+
+/**
* rdma_max_mad_size - Return the max MAD size required by this RDMA Port.
*
* @device: Device
--
1.8.3.1
^ permalink raw reply related
* [PATCH v2 00/11] IB/core: Add 32 bit LID support
From: Dasaratharaman Chandramouli @ 2016-11-22 19:38 UTC (permalink / raw)
To: Dasaratharaman Chandramouli, Ira Weiny, Don Hiatt, linux-rdma,
Doug Ledford
OPA devices can support more than 48K LIDs in the fabric. A LID greater than
0xbfff is called an 'extended LID'. In order to support verbs with extended
LIDs it is necessary to modify some of the RDMA data structures where LIDs are
currently only 16 bits in length.
This patch series follows on what was presented at the OFA Workshop. Rather
than breaking the current UABI we propose to extend the LID address space by
sending a 'special' GID value down the verbs stack that has the 32-bit LID
programmed in it. By having a means to differentiate a regular GID from our
'special' GID, the underlying OPA device driver is able to retrieve the 32-bit
LIDs from the GID fields instead of picking them up from the 16 bit lid fields.
Internal to the kernel, data structures such as struct ib_wc, struct
ib_port_attr and related ones have been modified to use 32 bit LID fields.
These changes are specific to the kernel and do not break the current UABI.
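As a rough illustration of how a widened kernel-internal LID can still be
reported through an unchanged 16-bit UABI field, the sketch below mirrors the
idea behind OPA_TO_IB_UCAST_LID() from the sm_lid patch in this series: values
at or above the IB multicast base are reported as 0. The function name and the
standalone 0xC000 constant are assumptions made for this example, not code
from the series.

```c
#include <stdint.h>

/* Assumed for this sketch: IB multicast LIDs start at 0xC000 (host order). */
#define DEMO_IB_MULTICAST_LID_BASE 0xC000u

/* Hypothetical helper: fold a 32-bit kernel LID into the 16-bit UABI field. */
uint16_t demo_opa_to_ib_ucast_lid(uint32_t lid)
{
	/* LIDs that do not fit the 16-bit unicast range are reported as 0. */
	return (lid >= DEMO_IB_MULTICAST_LID_BASE) ? 0 : (uint16_t)lid;
}
```

With this shape, a caller filling a 16-bit response field would see 0xBFFF
unchanged and any extended LID as 0, which is why limitation 4 below warns
against treating a 0 LID as an error.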
Node <-> SM interaction in getting extended LID information
----------------------------------------------------------------------------
1. Source application determines the GID of the destination through standard
means and sends a pathrecord query to the SM.
2. SM (which is OPA specific) recognizes that one or more nodes in the
pathrecord request use extended LIDs.
3. SM issues a pathrecord response. The SGID and DGID fields in the pathrecord
response are the specially formulated GIDs.
4. Additionally, SM sets the hoplimit field of the pathrecord to 1.
5. Source receives the response and can determine the actual LID of the
destination, if needed, from the response.
Source Node <-> Destination Node interaction in using extended LID information
-------------------------------------------------------------------------------
1. Source uses the pathrecord response from the SM to create an address handle
to the destination (either at user or kernel space).
2. Since the hoplimit field in the pathrecord is > 0, GRH fields are enabled in
the address handle.
3. Address handle information is now passed down through the RDMA stack and
reaches the driver.
4. Driver looks at the GRH fields in the address handle and determines that the
GID in the GRH is actually a special GID.
5. Driver retrieves LID from GID field and uses 16B packets to send data
on the wire.
6. Driver at the receiving side determines that a GRH needs to be added to the
address handle before passing it on to the destination application.
7. Destination now receives the packet and can send back the response using the
same address handle information.
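Steps 4 and 5 above (recognizing a special GID and pulling the LID out of it)
could look roughly like the sketch below. The encoding is NOT specified in
this cover letter: the 3-byte marker value, its position in the interface ID,
and every demo_* name are placeholders invented for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* 128-bit GID as raw bytes in network byte order. */
struct demo_gid {
	uint8_t raw[16];
};

/*
 * ASSUMPTION: a 3-byte marker at the top of the interface ID (bytes 8..10)
 * tags the GID as "special". The value is a placeholder for this example.
 */
static const uint8_t demo_marker[3] = { 0xAA, 0xBB, 0xCC };

/* Build a special GID carrying a 32-bit LID in the low 4 bytes. */
void demo_make_special_gid(struct demo_gid *gid, uint32_t lid)
{
	memset(gid, 0, sizeof(*gid));
	gid->raw[0] = 0xFE;	/* link-local subnet prefix fe80::/64 */
	gid->raw[1] = 0x80;
	memcpy(&gid->raw[8], demo_marker, sizeof(demo_marker));
	gid->raw[12] = (uint8_t)(lid >> 24);
	gid->raw[13] = (uint8_t)(lid >> 16);
	gid->raw[14] = (uint8_t)(lid >> 8);
	gid->raw[15] = (uint8_t)lid;
}

/* Step 4: does the GID in the GRH carry the marker? */
bool demo_is_special_gid(const struct demo_gid *gid)
{
	return memcmp(&gid->raw[8], demo_marker, sizeof(demo_marker)) == 0;
}

/* Step 5: recover the 32-bit LID (big-endian) from the low 4 bytes. */
uint32_t demo_lid_from_gid(const struct demo_gid *gid)
{
	return ((uint32_t)gid->raw[12] << 24) | ((uint32_t)gid->raw[13] << 16) |
	       ((uint32_t)gid->raw[14] << 8) | (uint32_t)gid->raw[15];
}
```

Working on raw bytes keeps the sketch independent of host endianness; a real
driver would presumably use the kernel's union ib_gid and byte-order helpers
instead.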
There are some obvious limitations with this scheme:
----------------------------------------------------
1. Multicast packets which always need a GRH cannot use this scheme.
Essentially multicast LIDs cannot be extended.
2. Subnet routed packets which also need a GRH cannot fully use this scheme.
Specifically the LID of the router itself cannot be extended.
The actual destination can still be extended.
3. Applications will need to use pathrecords to get destination address
information. Any other out-of-band mechanisms are not guaranteed to work.
4. As an extension to 3, applications that 'validate' pathrecord responses need
to be careful not to treat a LID field of 0 as an error condition.
Changes from V1:
1. Increase ah_attr.dlid from 16 to 32 bits
Dasaratharaman Chandramouli (9):
IB/core: Add rdma_cap_opa_ah to expose opa address handles
IB/core: Change port_attr.sm_lid from 16 to 32 bits
IB/core: Change ah_attr.dlid from 16 to 32 bits
IB/core: Change port_attr.lid size from 16 to 32 bits
IB/mad: Change slid in RMPP recv from 16 to 32 bits
IB/SA: Program extended LID in SM Address handle
IB/IPoIB: Retrieve 32 bit LIDs from path records when running on OPA
devices
IB/IPoIB: Modify ipoib_get_net_dev_by_params to lookup gid table
IB/srpt: Increase lid and sm_lid to 32 bits
Don Hiatt (2):
IB/core: Change wc.slid from 16 to 32 bits
IB/mad: Ensure DR MADs are correctly specified when using OPA devices
drivers/infiniband/core/cm.c | 4 +-
drivers/infiniband/core/mad.c | 104 ++++++++++++++++++++++++++----
drivers/infiniband/core/mad_rmpp.c | 2 +-
drivers/infiniband/core/sa_query.c | 8 ++-
drivers/infiniband/core/user_mad.c | 2 +-
drivers/infiniband/core/uverbs_cmd.c | 23 +++++--
drivers/infiniband/core/uverbs_marshall.c | 2 +-
drivers/infiniband/hw/hfi1/driver.c | 4 +-
drivers/infiniband/hw/hfi1/mad.c | 2 +-
drivers/infiniband/hw/hfi1/rc.c | 2 +-
drivers/infiniband/hw/hfi1/ruc.c | 19 +++---
drivers/infiniband/hw/hfi1/ud.c | 10 +--
drivers/infiniband/hw/hfi1/verbs.c | 4 +-
drivers/infiniband/hw/mlx4/ah.c | 2 +-
drivers/infiniband/hw/mlx4/alias_GUID.c | 2 +-
drivers/infiniband/hw/mlx4/mad.c | 8 +--
drivers/infiniband/hw/mlx4/qp.c | 2 +-
drivers/infiniband/hw/mlx5/ah.c | 2 +-
drivers/infiniband/hw/mlx5/mad.c | 2 +-
drivers/infiniband/hw/mthca/mthca_av.c | 2 +-
drivers/infiniband/hw/mthca/mthca_cmd.c | 4 +-
drivers/infiniband/hw/mthca/mthca_mad.c | 4 +-
drivers/infiniband/hw/mthca/mthca_qp.c | 2 +-
drivers/infiniband/hw/ocrdma/ocrdma_ah.c | 2 +-
drivers/infiniband/hw/qib/qib_rc.c | 2 +-
drivers/infiniband/hw/qib/qib_ruc.c | 9 +--
drivers/infiniband/hw/qib/qib_ud.c | 8 +--
drivers/infiniband/sw/rdmavt/cq.c | 2 +-
drivers/infiniband/ulp/ipoib/ipoib.h | 4 +-
drivers/infiniband/ulp/ipoib/ipoib_cm.c | 11 ++++
drivers/infiniband/ulp/ipoib/ipoib_main.c | 63 +++++++++++++++++-
drivers/infiniband/ulp/srpt/ib_srpt.h | 4 +-
include/rdma/ib_verbs.h | 29 +++++++--
include/rdma/opa_addr.h | 68 +++++++++++++++++++
34 files changed, 340 insertions(+), 78 deletions(-)
create mode 100644 include/rdma/opa_addr.h
--
1.8.3.1
^ permalink raw reply
* Re: Enabling peer to peer device transactions for PCIe devices
From: Serguei Sagalovitch @ 2016-11-22 18:59 UTC (permalink / raw)
To: Dan Williams, Deucher, Alexander
Cc: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Kuehling, Felix, Bridgman, John,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
Koenig, Christian, Sander, Ben, Suthikulpanit, Suravee,
Blinzer, Paul,
Linux-media-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <CAPcyv4i_5r2RVuV4F6V3ETbpKsf8jnMyQviZ7Legz3N4-v+9Og-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Dan,
I personally like the "device-DAX" idea but my concerns are:
- How well will it co-exist with the DRM infrastructure /
implementations, in the part dealing with CPU pointers?
- How well will we be able to handle the case when we need to
"move"/"evict" memory/data to a new location, so that the CPU pointer
should point to the new physical location/address
(which may not be in PCI device memory at all)?
Sincerely yours,
Serguei Sagalovitch
On 2016-11-22 01:11 PM, Dan Williams wrote:
> On Mon, Nov 21, 2016 at 12:36 PM, Deucher, Alexander
> <Alexander.Deucher-5C7GfCeVMHo@public.gmane.org> wrote:
>> This is certainly not the first time this has been brought up, but I'd like to try and get some consensus on the best way to move this forward. Allowing devices to talk directly improves performance and reduces latency by avoiding the use of staging buffers in system memory. Also in cases where both devices are behind a switch, it avoids the CPU entirely. Most current APIs (DirectGMA, PeerDirect, CUDA, HSA) that deal with this are pointer based. Ideally we'd be able to take a CPU virtual address and be able to get to a physical address taking into account IOMMUs, etc. Having struct pages for the memory would allow it to work more generally and wouldn't require as much explicit support in drivers that wanted to use it.
>>
>> Some use cases:
>> 1. Storage devices streaming directly to GPU device memory
>> 2. GPU device memory to GPU device memory streaming
>> 3. DVB/V4L/SDI devices streaming directly to GPU device memory
>> 4. DVB/V4L/SDI devices streaming directly to storage devices
>>
>> Here is a relatively simple example of how this could work for testing. This is obviously not a complete solution.
>> - Device memory will be registered with Linux memory sub-system by creating corresponding struct page structures for device memory
>> - get_user_pages_fast() will return corresponding struct pages when CPU address points to the device memory
>> - put_page() will deal with struct pages for device memory
>>
> [..]
>> 4. iopmem
>> iopmem : A block device for PCIe memory (https://lwn.net/Articles/703895/)
> The change I suggest for this particular approach is to switch to
> "device-DAX" [1]. I.e. a character device for establishing DAX
> mappings rather than a block device plus a DAX filesystem. The pro of
> this approach is standard user pointers and struct pages rather than a
> new construct. The con is that this is done via an interface separate
> from the existing gpu and storage device. For example it would require
> a /dev/dax instance alongside a /dev/nvme interface, but I don't see
> that as a significant blocking concern.
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2016-October/007496.html
^ permalink raw reply
* Re: Enabling peer to peer device transactions for PCIe devices
From: Dan Williams @ 2016-11-22 18:11 UTC (permalink / raw)
To: Deucher, Alexander
Cc: Sagalovitch, Serguei,
linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Kuehling, Felix, Bridgman, John,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
Koenig, Christian, Sander, Ben, Suthikulpanit, Suravee,
Blinzer, Paul,
Linux-media-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <MWHPR12MB169484839282E2D56124FA02F7B50-Gy0DoCVfaSW4WA4dJ5YXGAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
On Mon, Nov 21, 2016 at 12:36 PM, Deucher, Alexander
<Alexander.Deucher-5C7GfCeVMHo@public.gmane.org> wrote:
> This is certainly not the first time this has been brought up, but I'd like to try and get some consensus on the best way to move this forward. Allowing devices to talk directly improves performance and reduces latency by avoiding the use of staging buffers in system memory. Also in cases where both devices are behind a switch, it avoids the CPU entirely. Most current APIs (DirectGMA, PeerDirect, CUDA, HSA) that deal with this are pointer based. Ideally we'd be able to take a CPU virtual address and be able to get to a physical address taking into account IOMMUs, etc. Having struct pages for the memory would allow it to work more generally and wouldn't require as much explicit support in drivers that wanted to use it.
>
> Some use cases:
> 1. Storage devices streaming directly to GPU device memory
> 2. GPU device memory to GPU device memory streaming
> 3. DVB/V4L/SDI devices streaming directly to GPU device memory
> 4. DVB/V4L/SDI devices streaming directly to storage devices
>
> Here is a relatively simple example of how this could work for testing. This is obviously not a complete solution.
> - Device memory will be registered with Linux memory sub-system by creating corresponding struct page structures for device memory
> - get_user_pages_fast() will return corresponding struct pages when CPU address points to the device memory
> - put_page() will deal with struct pages for device memory
>
[..]
> 4. iopmem
> iopmem : A block device for PCIe memory (https://lwn.net/Articles/703895/)
The change I suggest for this particular approach is to switch to
"device-DAX" [1]. I.e. a character device for establishing DAX
mappings rather than a block device plus a DAX filesystem. The pro of
this approach is standard user pointers and struct pages rather than a
new construct. The con is that this is done via an interface separate
from the existing gpu and storage device. For example it would require
a /dev/dax instance alongside a /dev/nvme interface, but I don't see
that as a significant blocking concern.
[1]: https://lists.01.org/pipermail/linux-nvdimm/2016-October/007496.html
^ permalink raw reply
* RE: [RFC] Avoid running out of local port in RDMA_CM
From: Hefty, Sean @ 2016-11-22 18:04 UTC (permalink / raw)
To: Moni Shoua; +Cc: linux-rdma, Doug Ledford
In-Reply-To: <CAG9sBKMfGWv7-wgoSNxVqMs7TktcJ3a4J0QnL+TmPSm1kaQ28g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
> > I believe that any solution here should mimic the TCP/IP stack as
> > closely as possible. So I would rule out the re-use of a single port
> > for all active connections.
> >
> > I think TCP matches on the full tuple <src port, src ip, dst port,
> > dst ip>. We should be safe to re-use port numbers as long as some
> > other portion of the tuple changes. Maybe that can be added as part of
> > the port reservation/checking?
> >
>
> At first the thought was to reuse ports as long as the dest IP between
> rdma_id is different but is this complication really necessary?
> RDMA_CM mimics socket API but wire protocol is different and source
> port has no role in transporting a packet from QP to QP. Do you see a
> real risk in reusing a port unconditionally?
The port number is carried in the packet and provided as the source address to the other side. By re-using a single port number, a server will see multiple connections all reporting the same source address (port + src IP).
The proposal will break iWarp.
^ permalink raw reply
* Re: [RFC] Avoid running out of local port in RDMA_CM
From: Moni Shoua @ 2016-11-22 17:52 UTC (permalink / raw)
To: Hefty, Sean; +Cc: linux-rdma, Doug Ledford
In-Reply-To: <1828884A29C6694DAF28B7E6B8A82373AB0B611D-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
On Tue, Nov 22, 2016 at 7:38 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> I believe that any solution here should mimic the TCP/IP stack as closely as possible. So I would rule out the re-use of a single port for all active connections.
>
> I think TCP matches on the full tuple <src port, src ip, dst port, dst ip>. We should be safe to re-use port numbers as long as some other portion of the tuple changes. Maybe that can be added as part of the port reservation/checking?
>
At first the thought was to reuse ports as long as the dest IP differs
between rdma_ids, but is this complication really necessary?
RDMA_CM mimics the socket API but the wire protocol is different and the
source port has no role in transporting a packet from QP to QP. Do you see a
real risk in reusing a port unconditionally?
^ permalink raw reply
* RE: [RFC] Avoid running out of local port in RDMA_CM
From: Hefty, Sean @ 2016-11-22 17:38 UTC (permalink / raw)
To: Moni Shoua; +Cc: linux-rdma, Doug Ledford
In-Reply-To: <CAG9sBKMMuj5Qf7_jgL_EAwOAVAFZCp5Wf=NytqsHAezrys97pA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
I believe that any solution here should mimic the TCP/IP stack as closely as possible. So I would rule out the re-use of a single port for all active connections.
I think TCP matches on the full tuple <src port, src ip, dst port, dst ip>. We should be safe to re-use port numbers as long as some other portion of the tuple changes. Maybe that can be added as part of the port reservation/checking?
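To make the tuple idea concrete, here is a minimal sketch of a reservation
check that allows a source port to be re-used as long as the full 4-tuple is
unique. The struct, the function names, and the flat array of active
connections are all hypothetical; rdma_cm's real port bookkeeping is not shown
in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection key: the full tuple TCP matches on. */
struct demo_tuple {
	uint32_t src_ip;
	uint16_t src_port;
	uint32_t dst_ip;
	uint16_t dst_port;
};

static bool demo_tuple_equal(const struct demo_tuple *a,
			     const struct demo_tuple *b)
{
	return a->src_ip == b->src_ip && a->src_port == b->src_port &&
	       a->dst_ip == b->dst_ip && a->dst_port == b->dst_port;
}

/*
 * A source port may be re-used for a new connection as long as no
 * existing connection has the identical 4-tuple.
 */
bool demo_can_use_tuple(const struct demo_tuple *active, size_t n,
			const struct demo_tuple *candidate)
{
	for (size_t i = 0; i < n; i++)
		if (demo_tuple_equal(&active[i], candidate))
			return false;
	return true;
}
```

A linear scan is used only to keep the example short; a hash keyed on the
tuple would be the more likely choice in practice.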
> Introduction
> --------------------------------------------------------------------------------
> Like TCP/IP sockets, an RDMA_CM connection identifier (rdma_id) is associated
> with local and remote addresses and local and remote ports. Values for these
> attributes are assigned during the life cycle of the rdma_id. In this RFC we
> focus on the value that is given to the local port of the rdma_id.
> While in the TCP/IP protocol port numbers are part of the transport header, in
> InfiniBand they don't have a place. The way for an application to connect to
> a specific service is to use the communication manager and use a known Service
> ID (see Chapter 12: Communication Management in the InfiniBand Architecture
> Specification Volume 1). The RDMA IP CM Service, which provides support for
> a socket-like connection model for RDMA-aware ULPs, replaces the Service ID
> with a 16 bit port number which is used as an identifier for a service.
>
> The problem
> --------------------------------------------------------------------------------
> RDMA_CM requires binding of a connection identifier (rdma_id) to a local port.
> The passive side, the one calling rdma_listen(), usually binds explicitly to a
> well-known port. The active side, the one calling rdma_connect(), binds
> implicitly to a random port that rdma_cm chooses for it. This makes sense if we
> keep in mind that the port number is a way to identify a service. Binding to a
> port removes it from the pool of available ports until the rdma_id is
> destroyed, at which time the port is returned to the pool. The problem starts
> when the number of rdma_ids is larger than the number of available ports. The
> most likely scenario for this to happen is a node with many clients trying to
> connect to remote services. When the available port pool is empty the call to
> rdma_resolve_addr() fails when a free port number is requested from the pool.
>
> Suggested Solution
> --------------------------------------------------------------------------------
> Extending the size of the pool is out of the question since we must keep the
> 16 bit width of the port number to avoid backward compatibility issues:
> 1. Port number is a parameter to the function that generates Service ID.
> 2. Port number is part of the private data of the request MAD (see Annex
>    A11: RDMA IP CM Service in the InfiniBand Architecture Specification
>    Volume 1).
> The other alternative is to reuse port numbers. Since port numbers are not
> part of the InfiniBand transport header we don't need to worry about wire
> protocol issues. Also, since binding to a port on the active side doesn't
> create a conflict in service identification (since no one listens to active
> side rdma_id), it is safe to reuse a port number there when the port pool is
> empty.
> The suggested solution is to reserve one port as a global port for reuse and
> assign it under the following conditions:
> 1. The request for binding is implicit and for any port (via
>    cma_alloc_any_port())
> 2. The pool is empty
> 3. The ULP allows it
>
> Risks
> --------------------------------------------------------------------------------
> RDMA_CM puts the local port number in the private data section of the CM
> request MAD. If this field is observed by an application or a traffic analyzer
> there might be confusion. A way to minimize the risk is to reuse a port only
> if the application allows it (say by setting an option on the rdma_id).
* Re: [PATCH 6/7] IB/rxe: Avoid missed completions in the CM/MAD
From: Moni Shoua @ 2016-11-22 17:14 UTC (permalink / raw)
To: Andrew Boyer; +Cc: linux-rdma
In-Reply-To: <1479479809-10798-6-git-send-email-andrew.boyer-8PEkshWhKlo@public.gmane.org>
On Fri, Nov 18, 2016 at 4:36 PM, Andrew Boyer <andrew.boyer-8PEkshWhKlo@public.gmane.org> wrote:
> The MAD code uses the IB_CQ_REPORT_MISSED_EVENTS flag to avoid a
> race between posting CQEs and arming the CQ. Without this fix, the
> last completion might be left on the CQ, hanging the kthread
> waiting on MAD to complete.
> See ib_cq_poll_work().
>
Looks OK but I would edit the commit message a bit. This fix is
relevant not only for MAD and not only for the workqueue polling context.
For example, iSER allocates its CQ with the SOFTIRQ polling context and is
also exposed to this bug (see ib_poll_handler()).
^ permalink raw reply
* Re: [RFC 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) Bus driver
From: Jason Gunthorpe @ 2016-11-22 17:04 UTC (permalink / raw)
To: Vishwanathapura, Niranjana
Cc: Doug Ledford, linux-rdma, netdev, Dennis Dalessandro
In-Reply-To: <20161122015304.GB67988@knc-06.sc.intel.com>
On Mon, Nov 21, 2016 at 05:53:04PM -0800, Vishwanathapura, Niranjana wrote:
> There are many example drivers in kernel which are using bus_register() in
> an initcall.
There really are not, certainly not in major subsystems.
> We could add a custom Interface between HFI1 driver and hfi_vnic drivers
> without involving a bus.
hfi is already registering on the infiniband class, just use that.
> But using the existing bus model gave a lot of in-built flexibility in
> decoupling devices from the drivers.
If you want to have your own bus then you need your own hfi
subsystem. drivers/infiniband is not a dumping ground..
Jason
^ permalink raw reply
* RE: [PATCH 1/3] IB/mad: Fix an array index check
From: Weiny, Ira @ 2016-11-22 16:48 UTC (permalink / raw)
To: Bart Van Assche, Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Hefty, Sean
In-Reply-To: <396b3fe2-6a84-efed-07de-3e6381009ad1-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
>
> The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS
> (80) elements. Hence compare the array index with that value instead of with
> IB_MGMT_MAX_METHODS (128). This patch avoids that Coverity reports the
> following:
>
> Overrunning array class->method_table of 80 8-byte elements at element index
> 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class)
> (which evaluates to 127).
>
> Fixes: commit b7ab0b19a85f ("IB/mad: Verify mgmt class in received MADs")
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Sean Hefty <sean.hefty@intel.com>
> Cc: <stable@vger.kernel.org>
Thanks!
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> ---
> drivers/infiniband/core/mad.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 40cbd6b..2395fe2 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -1746,7 +1746,7 @@ find_mad_agent(struct ib_mad_port_private *port_priv,
> if (!class)
> goto out;
> if (convert_mgmt_class(mad_hdr->mgmt_class) >=
> - IB_MGMT_MAX_METHODS)
> + ARRAY_SIZE(class->method_table))
> goto out;
> method = class->method_table[convert_mgmt_class(
> mad_hdr->mgmt_class)];
> --
> 2.10.2
>
>
^ permalink raw reply
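The idea behind the quoted fix, bounding an index by the array it actually
indexes rather than by an unrelated constant, can be sketched in isolation as
follows. The sizes (80 and 128) come from the commit message; the struct and
all DEMO_/demo_ names are hypothetical stand-ins for the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Sizes taken from the commit message; the struct itself is hypothetical. */
#define DEMO_MAX_MGMT_CLASS	80
#define DEMO_MAX_METHODS	128

struct demo_class_table {
	void *method_table[DEMO_MAX_MGMT_CLASS];
};

/* Bound the index by the array being indexed, not an unrelated limit. */
bool demo_class_index_ok(const struct demo_class_table *t, size_t idx)
{
	return idx < DEMO_ARRAY_SIZE(t->method_table);
}
```

Checking against DEMO_MAX_METHODS instead would accept indices 80..127, which
is the overrun Coverity reported.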
* Re: [PATCH 4/7] IB/rxe: Unblock loopback by moving skb_out increment
From: Moni Shoua @ 2016-11-22 16:13 UTC (permalink / raw)
To: Andrew Boyer, Doug Ledford; +Cc: linux-rdma
In-Reply-To: <1479479809-10798-4-git-send-email-andrew.boyer-8PEkshWhKlo@public.gmane.org>
Acked-by: Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
On Fri, Nov 18, 2016 at 4:36 PM, Andrew Boyer <andrew.boyer-8PEkshWhKlo@public.gmane.org> wrote:
> skb_out is decremented in rxe_skb_tx_dtor(), which is not called in the
> loopback() path. Move the increment to the send() path rather than
> rxe_xmit_packet().
>
> Signed-off-by: Andrew Boyer <andrew.boyer-8PEkshWhKlo@public.gmane.org>
> ---
> drivers/infiniband/sw/rxe/rxe_loc.h | 2 --
> drivers/infiniband/sw/rxe/rxe_net.c | 2 ++
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
> index 73849a5a..efe4c6a 100644
> --- a/drivers/infiniband/sw/rxe/rxe_loc.h
> +++ b/drivers/infiniband/sw/rxe/rxe_loc.h
> @@ -266,8 +266,6 @@ static inline int rxe_xmit_packet(struct rxe_dev *rxe, struct rxe_qp *qp,
> return err;
> }
>
> - atomic_inc(&qp->skb_out);
> -
> if ((qp_type(qp) != IB_QPT_RC) &&
> (pkt->mask & RXE_END_MASK)) {
> pkt->wqe->state = wqe_state_done;
> diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
> index ffff5a5..332ce52 100644
> --- a/drivers/infiniband/sw/rxe/rxe_net.c
> +++ b/drivers/infiniband/sw/rxe/rxe_net.c
> @@ -455,6 +455,8 @@ static int send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt,
> return -EAGAIN;
> }
>
> + if (pkt->qp)
> + atomic_inc(&pkt->qp->skb_out);
> kfree_skb(skb);
>
> return 0;
> --
> 1.8.3.1
>
^ permalink raw reply
* [PATCH] i40iw: Fix incorrect assignment of SQ head
From: Henry Orosco @ 2016-11-22 15:44 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
e1000-rdma-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Henry Orosco
The SQ head is incorrectly incremented when the number
of WQEs required is greater than the number available.
The fix is to use the I40IW_RING_MOVE_HEAD_BY_COUNT
macro. This checks for the SQ full condition first and
moves the head only if the SQ has room for the request.
Signed-off-by: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Henry Orosco <henry.orosco-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/i40iw/i40iw_uk.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_uk.c b/drivers/infiniband/hw/i40iw/i40iw_uk.c
index 4d28c3c..3987cd8 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_uk.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_uk.c
@@ -175,12 +175,10 @@ u64 *i40iw_qp_get_next_send_wqe(struct i40iw_qp_uk *qp,
if (!*wqe_idx)
qp->swqe_polarity = !qp->swqe_polarity;
}
-
- for (i = 0; i < wqe_size / I40IW_QP_WQE_MIN_SIZE; i++) {
- I40IW_RING_MOVE_HEAD(qp->sq_ring, ret_code);
- if (ret_code)
- return NULL;
- }
+ I40IW_RING_MOVE_HEAD_BY_COUNT(qp->sq_ring,
+ wqe_size / I40IW_QP_WQE_MIN_SIZE, ret_code);
+ if (ret_code)
+ return NULL;
wqe = qp->sq_base[*wqe_idx].elem;
--
1.8.3.1
^ permalink raw reply related
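The difference between the removed per-slot loop and the by-count move in the
i40iw patch above can be sketched with a toy ring. This is a simplified model
under assumptions: the demo_ring layout, the "one slot kept free" full
condition, and the -1 error code are invented here and may not match the real
I40IW_RING_* macros.

```c
#include <stdint.h>

/* Hypothetical ring: head/tail indices into a buffer of `size` slots. */
struct demo_ring {
	uint32_t head;
	uint32_t tail;
	uint32_t size;
};

/* Slots currently in use. */
static uint32_t demo_ring_used(const struct demo_ring *r)
{
	return (r->head + r->size - r->tail) % r->size;
}

/* One-slot advance; fails with -1 when the ring is full. */
int demo_ring_move_head(struct demo_ring *r)
{
	if (demo_ring_used(r) == r->size - 1)
		return -1;
	r->head = (r->head + 1) % r->size;
	return 0;
}

/* All-or-nothing advance: check capacity first, then move by `count`. */
int demo_ring_move_head_by_count(struct demo_ring *r, uint32_t count)
{
	if (demo_ring_used(r) + count >= r->size)
		return -1;
	r->head = (r->head + count) % r->size;
	return 0;
}
```

Calling demo_ring_move_head() in a loop can fail partway and leave the head
already advanced, whereas demo_ring_move_head_by_count() leaves the head
untouched when the whole request does not fit.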
* Re: [PATCH 2/7] IB/rxe: Advance the consumer pointer before posting the CQE
From: Yonatan Cohen @ 2016-11-22 15:24 UTC (permalink / raw)
To: Andrew Boyer, monis-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1479479809-10798-2-git-send-email-andrew.boyer-8PEkshWhKlo@public.gmane.org>
On 11/18/2016 4:36 PM, Andrew Boyer wrote:
> A simple userspace application might poll the CQ, find a completion,
> and then attempt to post a new WQE to the SQ. A spurious error can
> occur if the userspace application detects a full SQ in the instant
> before the kernel is able to advance the SQ consumer pointer.
>
> This is noticeable when using single-entry SQs with ibv_rc_pingpong()
> if lots of kernel and userspace library debugging is enabled.
>
> Signed-off-by: Andrew Boyer <andrew.boyer-8PEkshWhKlo@public.gmane.org>
> ---
> drivers/infiniband/sw/rxe/rxe_comp.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
> index 6c5e29d..d46c49b 100644
> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
> @@ -420,11 +420,12 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
> (wqe->wr.send_flags & IB_SEND_SIGNALED) ||
> (qp->req.state == QP_STATE_ERROR)) {
> make_send_cqe(qp, wqe, &cqe);
> + advance_consumer(qp->sq.queue);
> rxe_cq_post(qp->scq, &cqe, 0);
> + } else {
> + advance_consumer(qp->sq.queue);
> }
>
> - advance_consumer(qp->sq.queue);
> -
> /*
> * we completed something so let req run again
> * if it is trying to fence
>
Reviewed-by: Yonatan Cohen <yonatanc-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
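
The ordering issue described in the quoted commit message can be modelled
with a small single-threaded sketch. It is only an illustration under stated
assumptions: struct sq, try_post() and complete() are hypothetical names, and
the "userspace reacts to the CQE immediately" step is modelled as a direct
call rather than a real concurrent poller.

```c
#include <stdbool.h>

/* Hypothetical send queue with free-running producer/consumer counters. */
struct sq {
	unsigned int prod;	/* WQEs posted so far */
	unsigned int cons;	/* WQEs consumed so far */
	unsigned int depth;	/* queue capacity */
};

static bool sq_full(const struct sq *q)
{
	return q->prod - q->cons == q->depth;
}

/* Models userspace posting a new WQE as soon as it sees a completion. */
static bool try_post(struct sq *q)
{
	if (sq_full(q))
		return false;
	q->prod++;
	return true;
}

/*
 * Models completing one WQE. The call to try_post() stands in for the
 * moment the CQE becomes visible to userspace. Returns whether the
 * follow-up post succeeded.
 */
static bool complete(struct sq *q, bool advance_first)
{
	bool posted;

	if (advance_first) {
		q->cons++;		/* free the slot, then publish the CQE */
		posted = try_post(q);
	} else {
		posted = try_post(q);	/* CQE visible while slot still held */
		q->cons++;
	}
	return posted;
}
```

With a single-entry queue that is already full, complete(&q, false) reports a
failed post while complete(&q, true) succeeds, which mirrors the spurious
"SQ full" error the patch is meant to avoid.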
^ permalink raw reply
* Re: [PATCH 1/7] IB/rxe: Allocate enough space for an IPv6 addr
From: Yonatan Cohen @ 2016-11-22 15:21 UTC (permalink / raw)
To: Andrew Boyer, monis-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1479479809-10798-1-git-send-email-andrew.boyer-8PEkshWhKlo@public.gmane.org>
On 11/18/2016 4:36 PM, Andrew Boyer wrote:
> Avoid smashing the stack when an ICRC error occurs on an IPv6 network.
>
> Signed-off-by: Andrew Boyer <andrew.boyer-8PEkshWhKlo@public.gmane.org>
> ---
> drivers/infiniband/sw/rxe/rxe_recv.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
> index 46f0628..b40ab8d 100644
> --- a/drivers/infiniband/sw/rxe/rxe_recv.c
> +++ b/drivers/infiniband/sw/rxe/rxe_recv.c
> @@ -391,7 +391,7 @@ int rxe_rcv(struct sk_buff *skb)
> payload_size(pkt));
> calc_icrc = cpu_to_be32(~calc_icrc);
> if (unlikely(calc_icrc != pack_icrc)) {
> - char saddr[sizeof(struct in6_addr)];
> + char saddr[64];
>
> if (skb->protocol == htons(ETH_P_IPV6))
> sprintf(saddr, "%pI6", &ipv6_hdr(skb)->saddr);
>
You fixed a bug here, but I think the following would be better
than hard-coding 64 bytes on the stack:
--- a/drivers/infiniband/sw/rxe/rxe_recv.c
+++ b/drivers/infiniband/sw/rxe/rxe_recv.c
@@ -391,16 +391,14 @@ int rxe_rcv(struct sk_buff *skb)
payload_size(pkt));
calc_icrc = cpu_to_be32(~calc_icrc);
if (unlikely(calc_icrc != pack_icrc)) {
- char saddr[sizeof(struct in6_addr)];
if (skb->protocol == htons(ETH_P_IPV6))
- sprintf(saddr, "%pI6", &ipv6_hdr(skb)->saddr);
+ pr_warn_ratelimited("bad ICRC from %pI6\n",
+ &ipv6_hdr(skb)->saddr);
else if (skb->protocol == htons(ETH_P_IP))
- sprintf(saddr, "%pI4", &ip_hdr(skb)->saddr);
+ pr_warn_ratelimited("bad ICRC from %pI4\n",
+ &ip_hdr(skb)->saddr);
else
- sprintf(saddr, "unknown");
+ pr_warn_ratelimited("bad ICRC from unknown\n");
- pr_warn_ratelimited("bad ICRC from %s\n", saddr);
goto drop;
}
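
To make the size mismatch behind the original bug concrete, here is a
userspace sketch. struct in6_addr holds the 16-byte binary address, while the
full textual form (8 groups of 4 hex digits plus 7 colons) needs 39
characters and a terminating NUL. format_ipv6_full() and IPV6_FULL_STRLEN are
hypothetical names used only for this illustration; the kernel's %pI6
specifier is not available in userspace, so snprintf() stands in for it.

```c
#include <netinet/in.h>
#include <stddef.h>
#include <stdio.h>

/* Full textual form: 8 groups of 4 hex digits plus 7 colons, without NUL. */
#define IPV6_FULL_STRLEN 39

/*
 * Hypothetical helper: write the uncompressed textual form of an IPv6
 * address into buf. Like snprintf(), it returns the number of characters
 * the full string needs (excluding the NUL), so a caller can detect
 * truncation by comparing the result against len.
 */
static int format_ipv6_full(char *buf, size_t len, const struct in6_addr *addr)
{
	const unsigned char *b = addr->s6_addr;

	return snprintf(buf, len,
			"%02x%02x:%02x%02x:%02x%02x:%02x%02x:"
			"%02x%02x:%02x%02x:%02x%02x:%02x%02x",
			b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
			b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
}
```

A buffer of sizeof(struct in6_addr) bytes is therefore too small for the
text, and an unbounded sprintf() into it writes past the end. Passing the
address straight to the print call, as suggested above, avoids the
intermediate buffer altogether.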
^ permalink raw reply
* [RFC] Avoid running out of local port in RDMA_CM
From: Moni Shoua @ 2016-11-22 10:12 UTC (permalink / raw)
To: Sean Hefty; +Cc: linux-rdma, Doug Ledford
Introduction
--------------------------------------------------------------------------------
Like TCP/IP sockets, an RDMA_CM connection identifier (rdma_id) is associated
with local and remote addresses and local and remote ports. Values for these
attributes are assigned during the life cycle of the rdma_id. In this RFC we
focus on the value that is given to the local port of the rdma_id.
While in the TCP/IP protocol port numbers are part of the transport header, in
InfiniBand they don't have a place. The way for an application to connect to
a specific service is to use the communication manager and use a known Service
ID (see Chapter 12: Communication Management in the InfiniBand(TM) Architecture
Specification Volume 1). The RDMA IP CM Service, which provides support for
a socket-like connection model for RDMA-aware ULPs, replaces the Service ID
with a 16-bit port number which is used as an identifier for a service.
The problem
--------------------------------------------------------------------------------
RDMA_CM requires binding of a connection identifier (rdma_id) to a local port.
The passive side, the one calling rdma_listen(), usually binds explicitly to a
well-known port. The active side, the one calling rdma_connect(), binds
implicitly to a random port that rdma_cm chooses for it. This makes sense if we
keep in mind that the port number is a way to identify a service. Binding to a
port removes it from the pool of available ports until the rdma_id is
destroyed, at which time the port is returned to the pool. The problem starts
when the number of rdma_ids is larger than the number of available ports. The
most likely scenario for this is a node with many clients trying to connect to
remote services. When the available port pool is empty, the call to
rdma_resolve_addr() fails at the point where a free port number is requested
from the pool.
Suggested Solution
--------------------------------------------------------------------------------
Extending the size of the pool is out of the question since we must keep the
16-bit width of the port number to avoid backward compatibility issues in the
two places where it is used:
1. Port number is a parameter to the function that generates the Service ID.
2. Port number is part of the private data of the request MAD (see Annex
A11: RDMA IP CM Service in the InfiniBand(TM) Architecture Specification
Volume 1)
The other alternative is to reuse port numbers. Since port numbers are not
part of the InfiniBand transport header we don't need to worry about wire
protocol issues. Also, since binding to a port on the active side doesn't
create a conflict in service identification (no one listens on an active side
rdma_id), it is safe to reuse a port number there when the port pool is empty.
The suggested solution is to reserve one port as a global port for reuse and
assign it under the following conditions:
1. The request for binding is implicit and for any port (via
cma_alloc_any_port())
2. The pool is empty
3. The ULP allows it
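
As a rough sketch of the three conditions above (not a proposed rdma_cm
implementation), the fallback could look like the following. Everything here
is hypothetical and chosen for illustration: struct port_pool,
alloc_any_port(), free_port(), the tiny POOL_SIZE, and the POOL_BASE and
SHARED_PORT values. The allow_reuse argument stands in for whatever option
the ULP would set on the rdma_id.

```c
#include <stdbool.h>
#include <stdint.h>

#define POOL_SIZE	4	/* tiny pool, for illustration only */
#define POOL_BASE	50000	/* hypothetical first port in the pool */
#define SHARED_PORT	49999	/* hypothetical port reserved for reuse */

struct port_pool {
	bool in_use[POOL_SIZE];
};

/*
 * Pick a local port for an implicit (active side) bind.
 * Returns the port number, or 0 if none could be assigned.
 */
static uint16_t alloc_any_port(struct port_pool *p, bool allow_reuse)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = true;
			return (uint16_t)(POOL_BASE + i);
		}
	}
	/* Pool is empty: hand out the shared port only if the ULP opted in. */
	return allow_reuse ? SHARED_PORT : 0;
}

static void free_port(struct port_pool *p, uint16_t port)
{
	/* The shared port is never owned by one id, so nothing to release. */
	if (port >= POOL_BASE && port < POOL_BASE + POOL_SIZE)
		p->in_use[port - POOL_BASE] = false;
}
```

Once the pool is exhausted, callers that did not opt in still get a failure,
while callers that did opt in all receive the same shared port. An explicit
bind to a specific port would not go through this path at all.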
Risks
--------------------------------------------------------------------------------
RDMA_CM puts the local port number in the private data section of the CM request
MAD. If this field is observed by an application or a traffic analyzer, there
might be confusion. A way to minimize the risk is to reuse a port only if the
application allows it (say, by setting an option on the rdma_id).
^ permalink raw reply
* Re: [PATCH 2/3] IPoIB: Avoid reading an uninitialized member variable
From: Leon Romanovsky @ 2016-11-22 7:15 UTC (permalink / raw)
To: Bart Van Assche
Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Erez Shitrit
In-Reply-To: <fd4a2913-545e-bf2f-e352-a47fd50a954f-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
On Mon, Nov 21, 2016 at 10:21:41AM -0800, Bart Van Assche wrote:
> This patch avoids that Coverity reports the following:
>
> Using uninitialized value port_attr.state when calling printk
>
> Fixes: commit 94232d9ce817 ("IPoIB: Start multicast join process only on active ports")
> Signed-off-by: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> Cc: Erez Shitrit <erezsh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Except that it doesn't print the reason why ib_query_port failed,
it looks good.
Reviewed-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Thanks
> ---
> drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> index 1909dd2..fddff40 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> @@ -575,8 +575,11 @@ void ipoib_mcast_join_task(struct work_struct *work)
> if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
> return;
>
> - if (ib_query_port(priv->ca, priv->port, &port_attr) ||
> - port_attr.state != IB_PORT_ACTIVE) {
> + if (ib_query_port(priv->ca, priv->port, &port_attr)) {
> + ipoib_dbg(priv, "ib_query_port() failed\n");
> + return;
> + }
> + if (port_attr.state != IB_PORT_ACTIVE) {
> ipoib_dbg(priv, "port state is not ACTIVE (state = %d) suspending join task\n",
> port_attr.state);
> return;
> --
> 2.10.2
>
^ permalink raw reply