* [PATCH for-next 0/4] IP based RoCE GID Addressing
@ 2013-06-13 15:01 Or Gerlitz
[not found] ` <1371135704-5712-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Or Gerlitz @ 2013-06-13 15:01 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
matanb-VPRAkNaXOzVWk0Htik3J/w, Or Gerlitz
Currently, the IB stack (core + drivers) handles RoCE (IBoE) GIDs as
encoding the MAC address of the related Ethernet net-device interface,
and possibly a VLAN id.
This series changes RoCE GIDs to encode the IP addresses (IPv4 + IPv6)
of that Ethernet interface, for the following reasons:
1. There are environments where the compute entity that runs the RoCE
stack is not aware that its traffic is vlan-tagged. This causes that
node to create/assume GIDs that are wrong from the viewpoint of a peer
node which is aware of the vlans.
Note that "node" here can be a physical node connected to an Ethernet switch port
acting in access mode, talking to another node which does vlan insertion/stripping
by itself. Another example is an SRIOV Virtual Function configured to work in "VST"
mode (Virtual-Switch-Tagging), where the hypervisor configures the HW eSwitch
to do vlan insertion for the vPORT representing that function.
2. RoCE traffic may be inspected (mirrored/trapped) in Ethernet switches for
monitoring and security purposes. It is much more natural for both humans and
automated utilities (...) to observe IP addresses at a known offset into the RoCE
frame's L3 header than MAC/VLANs (which remain in the L2 header of that
frame anyway, so they are not lost by this change).
3. Some advanced Bonding/Teaming modes such as balance-alb and balance-tlb
use multiple underlying devices in parallel, hence packets always
carry the bond IP address but different streams have different source MACs.
The approach brought by this series is part of what would allow supporting
that for RoCE traffic too.
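To make the addressing change concrete, here is a small userspace sketch (illustration only, not code from the series): under the new scheme an interface IPv4 address is placed into the 16-byte RoCE GID as an IPv4-mapped IPv6 address (::ffff:a.b.c.d), so the GID no longer depends on the interface MAC or VLAN id.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustration only: place an IPv4 address (network byte order) into a
 * 16-byte RoCE GID as an IPv4-mapped IPv6 address, ::ffff:a.b.c.d. */
static void ip4_to_gid(const uint8_t ip[4], uint8_t gid[16])
{
	memset(gid, 0, 16);
	gid[10] = 0xff;          /* bytes 10-11 carry the v4-mapped prefix */
	gid[11] = 0xff;
	memcpy(gid + 12, ip, 4); /* the IPv4 address fills the low 4 bytes */
}
```

A peer that is unaware of vlan tagging would derive the very same GID, which is the point of reason 1 above.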
The 1st patch modifies the IB core to cope with the new scheme, and the 2nd does the same
for the mlx4_ib driver. The 3rd patch sets the foundation for extending uverbs with
the verbs extensions scheme which was introduced lately, and the fourth patch adds two
extended uCMA commands and two extended uVERBS commands which are now exported to user
space. These extended verbs will allow user space libraries to be enhanced so that they
work correctly over the modified scheme. RC applications using librdmacm will not need
to be modified at all, since the change will be encapsulated within that library.
The ocrdma driver needs to go through a similar patch to the mlx4_ib one; we can
surely do that patch, we just need to dig there a little further.
Or.
Igor Ivanov (1):
IB/core: Infra-structure to support verbs extensions through uverbs
Matan Barak (1):
IB/core: Add RoCE IP based addressing extensions towards user space
Moni Shoua (2):
IB/core: RoCE IP based GID addressing
IB/mlx4: RoCE IP based GID addressing
drivers/infiniband/core/cm.c | 3 +
drivers/infiniband/core/cma.c | 39 ++-
drivers/infiniband/core/sa_query.c | 5 +
drivers/infiniband/core/ucma.c | 190 +++++++++++--
drivers/infiniband/core/uverbs.h | 2 +
drivers/infiniband/core/uverbs_cmd.c | 330 ++++++++++++++++-----
drivers/infiniband/core/uverbs_main.c | 33 ++-
drivers/infiniband/core/uverbs_marshall.c | 94 ++++++-
drivers/infiniband/core/verbs.c | 7 +
drivers/infiniband/hw/mlx4/ah.c | 21 +-
drivers/infiniband/hw/mlx4/cq.c | 5 +
drivers/infiniband/hw/mlx4/main.c | 461 ++++++++++++++++++++---------
drivers/infiniband/hw/mlx4/mlx4_ib.h | 3 +
drivers/infiniband/hw/mlx4/qp.c | 19 +-
include/linux/mlx4/cq.h | 14 +-
include/rdma/ib_addr.h | 45 ++--
include/rdma/ib_marshall.h | 12 +
include/rdma/ib_sa.h | 3 +
include/rdma/ib_verbs.h | 4 +
include/uapi/rdma/ib_user_sa.h | 34 ++-
include/uapi/rdma/ib_user_verbs.h | 130 ++++++++-
include/uapi/rdma/rdma_user_cm.h | 21 ++-
22 files changed, 1157 insertions(+), 318 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* [PATCH for-next 1/4] IB/core: RoCE IP based GID addressing
[not found] ` <1371135704-5712-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-13 15:01 ` Or Gerlitz
2013-06-13 15:01 ` [PATCH for-next 2/4] IB/mlx4: " Or Gerlitz
` (3 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Or Gerlitz @ 2013-06-13 15:01 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
matanb-VPRAkNaXOzVWk0Htik3J/w, Moni Shoua, Or Gerlitz
From: Moni Shoua <monis-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
Currently, the IB core assumes RoCE (IBoE) GIDs encode the MAC address
of the related Ethernet netdevice interface, and possibly a VLAN id.
Change GIDs to be treated as encoding the interface IP address.
Since Ethernet layer 2 address parameters are no longer encoded within GIDs,
we had to extend the InfiniBand address structures (e.g. ib_ah_attr) with
layer 2 address parameters, namely mac and vlan.
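As a userspace sketch of the address-to-GID mapping this patch introduces (an illustration mirroring the patch's rdma_ip2gid() helper, not the kernel code itself): AF_INET addresses become IPv4-mapped GIDs, and AF_INET6 addresses are copied into the GID verbatim.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Userspace illustration of the IP-to-GID mapping: v4 addresses map to
 * ::ffff:a.b.c.d, v6 addresses are used as-is; other families fail. */
static int ip2gid(const struct sockaddr *addr, uint8_t gid[16])
{
	switch (addr->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *in4 =
			(const struct sockaddr_in *)addr;
		memset(gid, 0, 16);
		gid[10] = 0xff;
		gid[11] = 0xff;
		memcpy(gid + 12, &in4->sin_addr, 4);
		return 0;
	}
	case AF_INET6:
		memcpy(gid,
		       &((const struct sockaddr_in6 *)addr)->sin6_addr, 16);
		return 0;
	default:
		return -1;
	}
}
```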
Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/core/cm.c | 3 ++
drivers/infiniband/core/cma.c | 39 ++++++++++++++++++++++--------
drivers/infiniband/core/sa_query.c | 5 ++++
drivers/infiniband/core/ucma.c | 18 +++-----------
drivers/infiniband/core/verbs.c | 7 +++++
include/rdma/ib_addr.h | 45 ++++++++++++++++++++----------------
include/rdma/ib_sa.h | 3 ++
include/rdma/ib_verbs.h | 4 +++
8 files changed, 79 insertions(+), 45 deletions(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 784b97c..7af618f 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1557,6 +1557,9 @@ static int cm_req_handler(struct cm_work *work)
cm_process_routed_req(req_msg, work->mad_recv_wc->wc);
cm_format_paths_from_req(req_msg, &work->path[0], &work->path[1]);
+
+ memcpy(work->path[0].dmac, cm_id_priv->av.ah_attr.dmac, 6);
+ work->path[0].vlan = cm_id_priv->av.ah_attr.vlan;
ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av);
if (ret) {
ib_get_cached_gid(work->port->cm_dev->ib_device,
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 71c2c71..ba217c9 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -373,7 +373,9 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
return -EINVAL;
mutex_lock(&lock);
- iboe_addr_get_sgid(dev_addr, &iboe_gid);
+ rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.src_addr,
+ &iboe_gid);
+
memcpy(&gid, dev_addr->src_dev_addr +
rdma_addr_gid_offset(dev_addr), sizeof gid);
list_for_each_entry(cma_dev, &dev_list, list) {
@@ -1803,7 +1805,7 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
struct sockaddr_in *src_addr = (struct sockaddr_in *)&route->addr.src_addr;
struct sockaddr_in *dst_addr = (struct sockaddr_in *)&route->addr.dst_addr;
struct net_device *ndev = NULL;
- u16 vid;
+
if (src_addr->sin_family != dst_addr->sin_family)
return -EINVAL;
@@ -1830,10 +1832,13 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
goto err2;
}
- vid = rdma_vlan_dev_vlan_id(ndev);
+ route->path_rec->vlan = rdma_vlan_dev_vlan_id(ndev);
+ memcpy(route->path_rec->dmac, addr->dev_addr.dst_dev_addr, 6);
- iboe_mac_vlan_to_ll(&route->path_rec->sgid, addr->dev_addr.src_dev_addr, vid);
- iboe_mac_vlan_to_ll(&route->path_rec->dgid, addr->dev_addr.dst_dev_addr, vid);
+ rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.src_addr,
+ &route->path_rec->sgid);
+ rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.dst_addr,
+ &route->path_rec->dgid);
route->path_rec->hop_limit = 1;
route->path_rec->reversible = 1;
@@ -1970,6 +1975,8 @@ static void addr_handler(int status, struct sockaddr *src_addr,
RDMA_CM_ADDR_RESOLVED))
goto out;
+ memcpy(&id_priv->id.route.addr.src_addr, src_addr,
+ ip_addr_size(src_addr));
if (!status && !id_priv->cma_dev)
status = cma_acquire_dev(id_priv);
@@ -1979,11 +1986,8 @@ static void addr_handler(int status, struct sockaddr *src_addr,
goto out;
event.event = RDMA_CM_EVENT_ADDR_ERROR;
event.status = status;
- } else {
- memcpy(&id_priv->id.route.addr.src_addr, src_addr,
- ip_addr_size(src_addr));
+ } else
event.event = RDMA_CM_EVENT_ADDR_RESOLVED;
- }
if (id_priv->id.event_handler(&id_priv->id, &event)) {
cma_exch(id_priv, RDMA_CM_DESTROYING);
@@ -2381,6 +2385,7 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
if (ret)
goto err1;
+ memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
if (!cma_any_addr(addr)) {
ret = rdma_translate_ip(addr, &id->route.addr.dev_addr);
if (ret)
@@ -2391,7 +2396,6 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
goto err1;
}
- memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
if (!(id_priv->options & (1 << CMA_OPTION_AFONLY))) {
if (addr->sa_family == AF_INET)
id_priv->afonly = 1;
@@ -2951,9 +2955,13 @@ static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast)
struct rdma_id_private *id_priv;
struct cma_multicast *mc = multicast->context;
struct rdma_cm_event event;
+ struct rdma_dev_addr *dev_addr;
int ret;
+ struct net_device *ndev = NULL;
+ u16 vlan;
id_priv = mc->id_priv;
+ dev_addr = &id_priv->id.route.addr.dev_addr;
if (cma_disable_callback(id_priv, RDMA_CM_ADDR_BOUND) &&
cma_disable_callback(id_priv, RDMA_CM_ADDR_RESOLVED))
return 0;
@@ -2967,11 +2975,19 @@ static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast)
memset(&event, 0, sizeof event);
event.status = status;
event.param.ud.private_data = mc->context;
+ ndev = dev_get_by_index(&init_net, dev_addr->bound_dev_if);
+ if (!ndev) {
+ status = -ENODEV;
+ } else {
+ vlan = rdma_vlan_dev_vlan_id(ndev);
+ dev_put(ndev);
+ }
if (!status) {
event.event = RDMA_CM_EVENT_MULTICAST_JOIN;
ib_init_ah_from_mcmember(id_priv->id.device,
id_priv->id.port_num, &multicast->rec,
&event.param.ud.ah_attr);
+ event.param.ud.ah_attr.vlan = vlan;
event.param.ud.qp_num = 0xFFFFFF;
event.param.ud.qkey = be32_to_cpu(multicast->rec.qkey);
} else
@@ -3138,7 +3154,8 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
err = -EINVAL;
goto out2;
}
- iboe_addr_get_sgid(dev_addr, &mc->multicast.ib->rec.port_gid);
+ rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.src_addr,
+ &mc->multicast.ib->rec.port_gid);
work->id = id_priv;
work->mc = mc;
INIT_WORK(&work->work, iboe_mcast_work_handler);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 934f45e..d813075 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -556,6 +556,11 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
ah_attr->grh.hop_limit = rec->hop_limit;
ah_attr->grh.traffic_class = rec->traffic_class;
}
+ if (force_grh) {
+ memcpy(ah_attr->dmac, rec->dmac, 6);
+ ah_attr->vlan = rec->vlan;
+ }
+
return 0;
}
EXPORT_SYMBOL(ib_init_ah_from_path);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 5ca44cd..bc2cb5d 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -602,24 +602,14 @@ static void ucma_copy_ib_route(struct rdma_ucm_query_route_resp *resp,
static void ucma_copy_iboe_route(struct rdma_ucm_query_route_resp *resp,
struct rdma_route *route)
{
- struct rdma_dev_addr *dev_addr;
- struct net_device *dev;
- u16 vid = 0;
resp->num_paths = route->num_paths;
switch (route->num_paths) {
case 0:
- dev_addr = &route->addr.dev_addr;
- dev = dev_get_by_index(&init_net, dev_addr->bound_dev_if);
- if (dev) {
- vid = rdma_vlan_dev_vlan_id(dev);
- dev_put(dev);
- }
-
- iboe_mac_vlan_to_ll((union ib_gid *) &resp->ib_route[0].dgid,
- dev_addr->dst_dev_addr, vid);
- iboe_addr_get_sgid(dev_addr,
- (union ib_gid *) &resp->ib_route[0].sgid);
+ rdma_ip2gid((struct sockaddr *)&route->addr.dst_addr,
+ (union ib_gid *)&resp->ib_route[0].dgid);
+ rdma_ip2gid((struct sockaddr *)&route->addr.src_addr,
+ (union ib_gid *)&resp->ib_route[0].sgid);
resp->ib_route[0].pkey = cpu_to_be16(0xffff);
break;
case 2:
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 22192de..936ec87 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -189,8 +189,15 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
u32 flow_class;
u16 gid_index;
int ret;
+ int is_eth = (rdma_port_get_link_layer(device, port_num) ==
+ IB_LINK_LAYER_ETHERNET);
memset(ah_attr, 0, sizeof *ah_attr);
+ if (is_eth) {
+ memcpy(ah_attr->dmac, wc->smac, 6);
+ ah_attr->vlan = wc->vlan;
+ }
+
ah_attr->dlid = wc->slid;
ah_attr->sl = wc->sl;
ah_attr->src_path_bits = wc->dlid_path_bits;
diff --git a/include/rdma/ib_addr.h b/include/rdma/ib_addr.h
index 9996539..b38f837 100644
--- a/include/rdma/ib_addr.h
+++ b/include/rdma/ib_addr.h
@@ -38,8 +38,12 @@
#include <linux/in6.h>
#include <linux/if_arp.h>
#include <linux/netdevice.h>
+#include <linux/inetdevice.h>
#include <linux/socket.h>
#include <linux/if_vlan.h>
+#include <net/ipv6.h>
+#include <net/if_inet6.h>
+#include <net/ip.h>
#include <rdma/ib_verbs.h>
#include <rdma/ib_pack.h>
@@ -130,41 +134,42 @@ static inline int rdma_addr_gid_offset(struct rdma_dev_addr *dev_addr)
return dev_addr->dev_type == ARPHRD_INFINIBAND ? 4 : 0;
}
-static inline void iboe_mac_vlan_to_ll(union ib_gid *gid, u8 *mac, u16 vid)
-{
- memset(gid->raw, 0, 16);
- *((__be32 *) gid->raw) = cpu_to_be32(0xfe800000);
- if (vid < 0x1000) {
- gid->raw[12] = vid & 0xff;
- gid->raw[11] = vid >> 8;
- } else {
- gid->raw[12] = 0xfe;
- gid->raw[11] = 0xff;
- }
- memcpy(gid->raw + 13, mac + 3, 3);
- memcpy(gid->raw + 8, mac, 3);
- gid->raw[8] ^= 2;
-}
-
static inline u16 rdma_vlan_dev_vlan_id(const struct net_device *dev)
{
return dev->priv_flags & IFF_802_1Q_VLAN ?
vlan_dev_vlan_id(dev) : 0xffff;
}
+static inline int rdma_ip2gid(struct sockaddr *addr, union ib_gid *gid)
+{
+ switch (addr->sa_family) {
+ case AF_INET:
+ ipv6_addr_set_v4mapped(((struct sockaddr_in *)addr)->sin_addr.s_addr,
+ (struct in6_addr *)gid);
+ break;
+ case AF_INET6:
+ memcpy(gid->raw, &((struct sockaddr_in6 *)addr)->sin6_addr, 16);
+ break;
+ default:
+ return -EINVAL;
+ }
+ return 0;
+}
+
static inline void iboe_addr_get_sgid(struct rdma_dev_addr *dev_addr,
union ib_gid *gid)
{
struct net_device *dev;
- u16 vid = 0xffff;
+ struct in_device *ip4;
dev = dev_get_by_index(&init_net, dev_addr->bound_dev_if);
if (dev) {
- vid = rdma_vlan_dev_vlan_id(dev);
+ ip4 = (struct in_device *)dev->ip_ptr;
+ if (ip4 && ip4->ifa_list && ip4->ifa_list->ifa_address)
+ ipv6_addr_set_v4mapped(ip4->ifa_list->ifa_address,
+ (struct in6_addr *)gid);
dev_put(dev);
}
-
- iboe_mac_vlan_to_ll(gid, dev_addr->src_dev_addr, vid);
}
static inline void rdma_addr_get_sgid(struct rdma_dev_addr *dev_addr, union ib_gid *gid)
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index 8275e53..0a9207e 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -154,6 +154,9 @@ struct ib_sa_path_rec {
u8 packet_life_time_selector;
u8 packet_life_time;
u8 preference;
+ u8 smac[6];
+ u8 dmac[6];
+ __be16 vlan;
};
#define IB_SA_MCMEMBER_REC_MGID IB_SA_COMP_MASK( 0)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 98cc4b2..ef1f332 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -469,6 +469,8 @@ struct ib_ah_attr {
u8 static_rate;
u8 ah_flags;
u8 port_num;
+ u8 dmac[6];
+ u16 vlan;
};
enum ib_wc_status {
@@ -541,6 +543,8 @@ struct ib_wc {
u8 sl;
u8 dlid_path_bits;
u8 port_num; /* valid only for DR SMPs on switches */
+ u8 smac[6];
+ u16 vlan;
};
enum ib_cq_notify_flags {
--
1.7.1
* [PATCH for-next 2/4] IB/mlx4: RoCE IP based GID addressing
[not found] ` <1371135704-5712-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-13 15:01 ` [PATCH for-next 1/4] IB/core: RoCE IP based GID addressing Or Gerlitz
@ 2013-06-13 15:01 ` Or Gerlitz
2013-06-13 15:01 ` [PATCH for-next 3/4] IB/core: Infra-structure to support verbs extensions through uverbs Or Gerlitz
` (2 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Or Gerlitz @ 2013-06-13 15:01 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
matanb-VPRAkNaXOzVWk0Htik3J/w, Moni Shoua, Or Gerlitz
From: Moni Shoua <monis-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
Currently, the mlx4 driver sets RoCE (IBoE) GIDs to encode the MAC
address of the related Ethernet netdevice interface, and possibly a VLAN id.
Change this scheme such that GIDs encode interface IP addresses
(both IPv4 and IPv6).
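Since GIDs no longer carry the MAC, the AH creation path in this patch detects a multicast destination GID and derives the Ethernet multicast MAC directly from it. A hedged userspace sketch of that derivation, following the standard 33:33 + low-32-bits Ethernet mapping for IPv6-style multicast addresses (RFC 2464):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustration: derive the Ethernet multicast MAC from a multicast GID,
 * as the iboe AH creation path does when the destination GID is
 * multicast: 33:33 prefix followed by the low 32 bits of the GID. */
static void mcast_gid_to_mac(const uint8_t gid[16], uint8_t mac[6])
{
	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(mac + 2, gid + 12, 4); /* low 32 bits of the GID */
}
```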
Signed-off-by: Moni Shoua <monis-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx4/ah.c | 21 +-
drivers/infiniband/hw/mlx4/cq.c | 5 +
drivers/infiniband/hw/mlx4/main.c | 461 +++++++++++++++++++++++-----------
drivers/infiniband/hw/mlx4/mlx4_ib.h | 3 +
drivers/infiniband/hw/mlx4/qp.c | 19 +-
include/linux/mlx4/cq.h | 14 +-
6 files changed, 354 insertions(+), 169 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/ah.c b/drivers/infiniband/hw/mlx4/ah.c
index a251bec..3941700 100644
--- a/drivers/infiniband/hw/mlx4/ah.c
+++ b/drivers/infiniband/hw/mlx4/ah.c
@@ -92,21 +92,18 @@ static struct ib_ah *create_iboe_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr
{
struct mlx4_ib_dev *ibdev = to_mdev(pd->device);
struct mlx4_dev *dev = ibdev->dev;
- union ib_gid sgid;
- u8 mac[6];
- int err;
int is_mcast;
+ struct in6_addr in6;
u16 vlan_tag;
- err = mlx4_ib_resolve_grh(ibdev, ah_attr, mac, &is_mcast, ah_attr->port_num);
- if (err)
- return ERR_PTR(err);
-
- memcpy(ah->av.eth.mac, mac, 6);
- err = ib_get_cached_gid(pd->device, ah_attr->port_num, ah_attr->grh.sgid_index, &sgid);
- if (err)
- return ERR_PTR(err);
- vlan_tag = rdma_get_vlan_id(&sgid);
+ memcpy(&in6, ah_attr->grh.dgid.raw, sizeof(in6));
+ if (rdma_is_multicast_addr(&in6)) {
+ is_mcast = 1;
+ rdma_get_mcast_mac(&in6, ah->av.eth.mac);
+ } else {
+ memcpy(ah->av.eth.mac, ah_attr->dmac, 6);
+ }
+ vlan_tag = ah_attr->vlan;
if (vlan_tag < 0x1000)
vlan_tag |= (ah_attr->sl & 7) << 13;
ah->av.eth.port_pd = cpu_to_be32(to_mpd(pd)->pdn | (ah_attr->port_num << 24));
diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index d5e60f4..ba3f85b 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -793,6 +793,11 @@ repoll:
wc->sl = be16_to_cpu(cqe->sl_vid) >> 13;
else
wc->sl = be16_to_cpu(cqe->sl_vid) >> 12;
+ if (be32_to_cpu(cqe->vlan_my_qpn) & MLX4_CQE_VLAN_PRESENT_MASK)
+ wc->vlan = be16_to_cpu(cqe->sl_vid) & MLX4_CQE_VID_MASK;
+ else
+ wc->vlan = 0xffff;
+ memcpy(wc->smac, cqe->smac, 6);
}
return 0;
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 23d7343..8879b41 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -39,6 +39,8 @@
#include <linux/inetdevice.h>
#include <linux/rtnetlink.h>
#include <linux/if_vlan.h>
+#include <net/ipv6.h>
+#include <net/addrconf.h>
#include <rdma/ib_smi.h>
#include <rdma/ib_user_verbs.h>
@@ -767,7 +769,6 @@ static int add_gid_entry(struct ib_qp *ibqp, union ib_gid *gid)
int mlx4_ib_add_mc(struct mlx4_ib_dev *mdev, struct mlx4_ib_qp *mqp,
union ib_gid *gid)
{
- u8 mac[6];
struct net_device *ndev;
int ret = 0;
@@ -781,11 +782,7 @@ int mlx4_ib_add_mc(struct mlx4_ib_dev *mdev, struct mlx4_ib_qp *mqp,
spin_unlock(&mdev->iboe.lock);
if (ndev) {
- rdma_get_mcast_mac((struct in6_addr *)gid, mac);
- rtnl_lock();
- dev_mc_add(mdev->iboe.netdevs[mqp->port - 1], mac);
ret = 1;
- rtnl_unlock();
dev_put(ndev);
}
@@ -805,6 +802,8 @@ static int mlx4_ib_mcg_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
struct mlx4_ib_qp *mqp = to_mqp(ibqp);
u64 reg_id;
struct mlx4_ib_steering *ib_steering = NULL;
+ enum mlx4_protocol prot = (gid->raw[1] == 0x0e) ?
+ MLX4_PROT_IB_IPV4 : MLX4_PROT_IB_IPV6;
if (mdev->dev->caps.steering_mode ==
MLX4_STEERING_MODE_DEVICE_MANAGED) {
@@ -816,7 +815,7 @@ static int mlx4_ib_mcg_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
err = mlx4_multicast_attach(mdev->dev, &mqp->mqp, gid->raw, mqp->port,
!!(mqp->flags &
MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK),
- MLX4_PROT_IB_IPV6, &reg_id);
+ prot, &reg_id);
if (err)
goto err_malloc;
@@ -835,7 +834,7 @@ static int mlx4_ib_mcg_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
err_add:
mlx4_multicast_detach(mdev->dev, &mqp->mqp, gid->raw,
- MLX4_PROT_IB_IPV6, reg_id);
+ prot, reg_id);
err_malloc:
kfree(ib_steering);
@@ -863,10 +862,11 @@ static int mlx4_ib_mcg_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
int err;
struct mlx4_ib_dev *mdev = to_mdev(ibqp->device);
struct mlx4_ib_qp *mqp = to_mqp(ibqp);
- u8 mac[6];
struct net_device *ndev;
struct mlx4_ib_gid_entry *ge;
u64 reg_id = 0;
+ enum mlx4_protocol prot = (gid->raw[1] == 0x0e) ?
+ MLX4_PROT_IB_IPV4 : MLX4_PROT_IB_IPV6;
if (mdev->dev->caps.steering_mode ==
MLX4_STEERING_MODE_DEVICE_MANAGED) {
@@ -889,7 +889,7 @@ static int mlx4_ib_mcg_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
}
err = mlx4_multicast_detach(mdev->dev, &mqp->mqp, gid->raw,
- MLX4_PROT_IB_IPV6, reg_id);
+ prot, reg_id);
if (err)
return err;
@@ -901,13 +901,8 @@ static int mlx4_ib_mcg_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
if (ndev)
dev_hold(ndev);
spin_unlock(&mdev->iboe.lock);
- rdma_get_mcast_mac((struct in6_addr *)gid, mac);
- if (ndev) {
- rtnl_lock();
- dev_mc_del(mdev->iboe.netdevs[ge->port - 1], mac);
- rtnl_unlock();
+ if (ndev)
dev_put(ndev);
- }
list_del(&ge->list);
kfree(ge);
} else
@@ -1003,20 +998,6 @@ static struct device_attribute *mlx4_class_attributes[] = {
&dev_attr_board_id
};
-static void mlx4_addrconf_ifid_eui48(u8 *eui, u16 vlan_id, struct net_device *dev)
-{
- memcpy(eui, dev->dev_addr, 3);
- memcpy(eui + 5, dev->dev_addr + 3, 3);
- if (vlan_id < 0x1000) {
- eui[3] = vlan_id >> 8;
- eui[4] = vlan_id & 0xff;
- } else {
- eui[3] = 0xff;
- eui[4] = 0xfe;
- }
- eui[0] ^= 2;
-}
-
static void update_gids_task(struct work_struct *work)
{
struct update_gid_work *gw = container_of(work, struct update_gid_work, work);
@@ -1039,161 +1020,303 @@ static void update_gids_task(struct work_struct *work)
MLX4_CMD_WRAPPED);
if (err)
pr_warn("set port command failed\n");
- else {
- memcpy(gw->dev->iboe.gid_table[gw->port - 1], gw->gids, sizeof gw->gids);
+ else
mlx4_ib_dispatch_event(gw->dev, gw->port, IB_EVENT_GID_CHANGE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ kfree(gw);
+}
+
+static void reset_gids_task(struct work_struct *work)
+{
+ struct update_gid_work *gw =
+ container_of(work, struct update_gid_work, work);
+ struct mlx4_cmd_mailbox *mailbox;
+ union ib_gid *gids;
+ int err;
+ struct mlx4_dev *dev = gw->dev->dev;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ pr_warn("reset gid table failed\n");
+ goto free;
}
+ gids = mailbox->buf;
+ memcpy(gids, gw->gids, sizeof(gw->gids));
+
+ err = mlx4_cmd(dev, mailbox->dma, MLX4_SET_PORT_GID_TABLE << 8 | 1,
+ 1, MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+ if (err)
+ pr_warn(KERN_WARNING "set port 1 command failed\n");
+
+ err = mlx4_cmd(dev, mailbox->dma, MLX4_SET_PORT_GID_TABLE << 8 | 2,
+ 1, MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+ if (err)
+ pr_warn(KERN_WARNING "set port 2 command failed\n");
+
mlx4_free_cmd_mailbox(dev, mailbox);
+free:
kfree(gw);
}
-static int update_ipv6_gids(struct mlx4_ib_dev *dev, int port, int clear)
+static int update_gid_table(struct mlx4_ib_dev *dev, int port,
+ union ib_gid *gid, int clear)
{
- struct net_device *ndev = dev->iboe.netdevs[port - 1];
struct update_gid_work *work;
- struct net_device *tmp;
int i;
- u8 *hits;
- int ret;
- union ib_gid gid;
- int free;
- int found;
int need_update = 0;
- u16 vid;
+ int free = -1;
+ int found = -1;
+ int max_gids;
+
+ max_gids = dev->dev->caps.gid_table_len[port];
+ for (i = 0; i < max_gids; ++i) {
+ if (!memcmp(&dev->iboe.gid_table[port - 1][i], gid,
+ sizeof(*gid)))
+ found = i;
+
+ if (clear) {
+ if (found >= 0) {
+ need_update = 1;
+ dev->iboe.gid_table[port - 1][found] = zgid;
+ break;
+ }
+ } else {
+ if (found >= 0)
+ break;
+
+ if (free < 0 && !memcmp(&dev->iboe.gid_table[port - 1][i], &zgid,
+ sizeof(*gid)))
+ free = i;
+ }
+ }
+
+ if (found == -1 && !clear && free >= 0) {
+ dev->iboe.gid_table[port - 1][free] = *gid;
+ need_update = 1;
+ }
+
+ if (!need_update)
+ return 0;
work = kzalloc(sizeof *work, GFP_ATOMIC);
if (!work)
return -ENOMEM;
- hits = kzalloc(128, GFP_ATOMIC);
- if (!hits) {
- ret = -ENOMEM;
- goto out;
- }
+ memcpy(work->gids, dev->iboe.gid_table[port - 1], sizeof(work->gids));
+ INIT_WORK(&work->work, update_gids_task);
+ work->port = port;
+ work->dev = dev;
+ queue_work(wq, &work->work);
- rcu_read_lock();
- for_each_netdev_rcu(&init_net, tmp) {
- if (ndev && (tmp == ndev || rdma_vlan_dev_real_dev(tmp) == ndev)) {
- gid.global.subnet_prefix = cpu_to_be64(0xfe80000000000000LL);
- vid = rdma_vlan_dev_vlan_id(tmp);
- mlx4_addrconf_ifid_eui48(&gid.raw[8], vid, ndev);
- found = 0;
- free = -1;
- for (i = 0; i < 128; ++i) {
- if (free < 0 &&
- !memcmp(&dev->iboe.gid_table[port - 1][i], &zgid, sizeof zgid))
- free = i;
- if (!memcmp(&dev->iboe.gid_table[port - 1][i], &gid, sizeof gid)) {
- hits[i] = 1;
- found = 1;
- break;
- }
- }
-
- if (!found) {
- if (tmp == ndev &&
- (memcmp(&dev->iboe.gid_table[port - 1][0],
- &gid, sizeof gid) ||
- !memcmp(&dev->iboe.gid_table[port - 1][0],
- &zgid, sizeof gid))) {
- dev->iboe.gid_table[port - 1][0] = gid;
- ++need_update;
- hits[0] = 1;
- } else if (free >= 0) {
- dev->iboe.gid_table[port - 1][free] = gid;
- hits[free] = 1;
- ++need_update;
- }
- }
- }
- }
- rcu_read_unlock();
+ return 0;
+}
- for (i = 0; i < 128; ++i)
- if (!hits[i]) {
- if (memcmp(&dev->iboe.gid_table[port - 1][i], &zgid, sizeof zgid))
- ++need_update;
- dev->iboe.gid_table[port - 1][i] = zgid;
- }
+static int reset_gid_table(struct mlx4_ib_dev *dev)
+{
+ struct update_gid_work *work;
- if (need_update) {
- memcpy(work->gids, dev->iboe.gid_table[port - 1], sizeof work->gids);
- INIT_WORK(&work->work, update_gids_task);
- work->port = port;
- work->dev = dev;
- queue_work(wq, &work->work);
- } else
- kfree(work);
- kfree(hits);
+ work = kzalloc(sizeof(*work), GFP_ATOMIC);
+ if (!work)
+ return -ENOMEM;
+ memset(dev->iboe.gid_table, 0, sizeof(dev->iboe.gid_table));
+ memset(work->gids, 0, sizeof(work->gids));
+ INIT_WORK(&work->work, reset_gids_task);
+ work->dev = dev;
+ queue_work(wq, &work->work);
return 0;
-
-out:
- kfree(work);
- return ret;
}
-static void handle_en_event(struct mlx4_ib_dev *dev, int port, unsigned long event)
+static int mlx4_ib_addr_event(int event, struct net_device *event_netdev,
+ struct mlx4_ib_dev *ibdev, union ib_gid *gid)
{
- switch (event) {
- case NETDEV_UP:
- case NETDEV_CHANGEADDR:
- update_ipv6_gids(dev, port, 0);
- break;
+ struct mlx4_ib_iboe *iboe;
+ int port = 0;
+ struct net_device *real_dev = rdma_vlan_dev_real_dev(event_netdev) ?
+ rdma_vlan_dev_real_dev(event_netdev) : event_netdev;
+
+ if (event != NETDEV_DOWN && event != NETDEV_UP)
+ return 0;
+
+ if ((real_dev != event_netdev) &&
+ (event == NETDEV_DOWN) &&
+ rdma_link_local_addr((struct in6_addr *)gid))
+ return 0;
+
+ iboe = &ibdev->iboe;
+ spin_lock(&iboe->lock);
+
+ for (port = 1; port <= MLX4_MAX_PORTS; ++port)
+ if ((netif_is_bond_master(real_dev) && (real_dev == iboe->masters[port - 1])) ||
+ (!netif_is_bond_master(real_dev) && (real_dev == iboe->netdevs[port - 1])))
+ update_gid_table(ibdev, port, gid, event == NETDEV_DOWN);
+
+ spin_unlock(&iboe->lock);
+ return 0;
- case NETDEV_DOWN:
- update_ipv6_gids(dev, port, 1);
- dev->iboe.netdevs[port - 1] = NULL;
- }
}
-static void netdev_added(struct mlx4_ib_dev *dev, int port)
+static u8 mlx4_ib_get_dev_port(struct net_device *dev,
+ struct mlx4_ib_dev *ibdev)
{
- update_ipv6_gids(dev, port, 0);
+ u8 port = 0;
+ struct mlx4_ib_iboe *iboe;
+ struct net_device *real_dev = rdma_vlan_dev_real_dev(dev) ?
+ rdma_vlan_dev_real_dev(dev) : dev;
+
+ iboe = &ibdev->iboe;
+ spin_lock(&iboe->lock);
+
+ for (port = 1; port <= MLX4_MAX_PORTS; ++port)
+ if ((netif_is_bond_master(real_dev) && (real_dev == iboe->masters[port - 1])) ||
+ (!netif_is_bond_master(real_dev) && (real_dev == iboe->netdevs[port - 1])))
+ break;
+
+ spin_unlock(&iboe->lock);
+
+ if ((port == 0) || (port > MLX4_MAX_PORTS))
+ return 0;
+ else
+ return port;
}
-static void netdev_removed(struct mlx4_ib_dev *dev, int port)
+static int mlx4_ib_inet_event(struct notifier_block *this, unsigned long event,
+ void *ptr)
{
- update_ipv6_gids(dev, port, 1);
+ struct mlx4_ib_dev *ibdev;
+ struct in_ifaddr *ifa = ptr;
+ union ib_gid gid;
+ struct net_device *event_netdev = ifa->ifa_dev->dev;
+
+ ipv6_addr_set_v4mapped(ifa->ifa_address, (struct in6_addr *)&gid);
+
+ ibdev = container_of(this, struct mlx4_ib_dev, iboe.nb_inet);
+
+ mlx4_ib_addr_event(event, event_netdev, ibdev, &gid);
+ return NOTIFY_DONE;
}
-static int mlx4_ib_netdev_event(struct notifier_block *this, unsigned long event,
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+static int mlx4_ib_inet6_event(struct notifier_block *this, unsigned long event,
void *ptr)
{
- struct net_device *dev = ptr;
struct mlx4_ib_dev *ibdev;
- struct net_device *oldnd;
+ struct inet6_ifaddr *ifa = ptr;
+ union ib_gid *gid = (union ib_gid *)&ifa->addr;
+ struct net_device *event_netdev = ifa->idev->dev;
+
+ ibdev = container_of(this, struct mlx4_ib_dev, iboe.nb_inet6);
+
+ mlx4_ib_addr_event(event, event_netdev, ibdev, gid);
+ return NOTIFY_DONE;
+}
+#endif
+
+static void mlx4_ib_get_dev_addr(struct net_device *dev, struct mlx4_ib_dev *ibdev, u8 port)
+{
+ struct in_device *in_dev;
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+ struct inet6_dev *in6_dev;
+ union ib_gid *pgid;
+ struct inet6_ifaddr *ifp;
+#endif
+ union ib_gid gid;
+
+
+ if ((port == 0) || (port > MLX4_MAX_PORTS))
+ return;
+
+ /* IPv4 gids */
+ in_dev = in_dev_get(dev);
+ if (in_dev) {
+ for_ifa(in_dev) {
+ /*ifa->ifa_address;*/
+ ipv6_addr_set_v4mapped(ifa->ifa_address, (struct in6_addr *)&gid);
+ update_gid_table(ibdev, port, &gid, 0);
+ }
+ endfor_ifa(in_dev);
+ in_dev_put(in_dev);
+ }
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+ /* IPv6 gids */
+ in6_dev = in6_dev_get(dev);
+ if (in6_dev) {
+ read_lock_bh(&in6_dev->lock);
+ list_for_each_entry(ifp, &in6_dev->addr_list, if_list) {
+ pgid = (union ib_gid *)&ifp->addr;
+ update_gid_table(ibdev, port, pgid, 0);
+ }
+ read_unlock_bh(&in6_dev->lock);
+ in6_dev_put(in6_dev);
+ }
+#endif
+}
+
+int mlx4_ib_init_gid_table(struct mlx4_ib_dev *ibdev)
+{
+ struct net_device *dev;
+
+ if (reset_gid_table(ibdev))
+ return -1;
+
+ read_lock(&dev_base_lock);
+
+ for_each_netdev(&init_net, dev) {
+ u8 port = mlx4_ib_get_dev_port(dev, ibdev);
+ if (port)
+ mlx4_ib_get_dev_addr(dev, ibdev, port);
+ }
+
+ read_unlock(&dev_base_lock);
+
+ return 0;
+}
+
+static void mlx4_ib_scan_netdevs(struct mlx4_ib_dev *ibdev)
+{
struct mlx4_ib_iboe *iboe;
int port;
- if (!net_eq(dev_net(dev), &init_net))
- return NOTIFY_DONE;
-
- ibdev = container_of(this, struct mlx4_ib_dev, iboe.nb);
iboe = &ibdev->iboe;
spin_lock(&iboe->lock);
mlx4_foreach_ib_transport_port(port, ibdev->dev) {
- oldnd = iboe->netdevs[port - 1];
+ struct net_device *old_master = iboe->masters[port - 1];
+ struct net_device *curr_master;
iboe->netdevs[port - 1] =
mlx4_get_protocol_dev(ibdev->dev, MLX4_PROT_ETH, port);
- if (oldnd != iboe->netdevs[port - 1]) {
- if (iboe->netdevs[port - 1])
- netdev_added(ibdev, port);
- else
- netdev_removed(ibdev, port);
+
+ if (iboe->netdevs[port - 1] && netif_is_bond_slave(iboe->netdevs[port - 1])) {
+ rtnl_lock();
+ iboe->masters[port - 1] = netdev_master_upper_dev_get(iboe->netdevs[port - 1]);
+ rtnl_unlock();
}
- }
+ curr_master = iboe->masters[port - 1];
- if (dev == iboe->netdevs[0] ||
- (iboe->netdevs[0] && rdma_vlan_dev_real_dev(dev) == iboe->netdevs[0]))
- handle_en_event(ibdev, 1, event);
- else if (dev == iboe->netdevs[1]
- || (iboe->netdevs[1] && rdma_vlan_dev_real_dev(dev) == iboe->netdevs[1]))
- handle_en_event(ibdev, 2, event);
+ /* if bonding is used, it is possible that we add it to masters only
+ * after an IP address is assigned to the bonding interface */
+ if (curr_master && (old_master != curr_master))
+ mlx4_ib_get_dev_addr(curr_master, ibdev, port);
+ }
spin_unlock(&iboe->lock);
+}
+
+static int mlx4_ib_netdev_event(struct notifier_block *this, unsigned long event,
+ void *ptr)
+{
+ struct net_device *dev = ptr;
+ struct mlx4_ib_dev *ibdev;
+
+ if (!net_eq(dev_net(dev), &init_net))
+ return NOTIFY_DONE;
+
+ ibdev = container_of(this, struct mlx4_ib_dev, iboe.nb);
+ mlx4_ib_scan_netdevs(ibdev);
return NOTIFY_DONE;
}
@@ -1490,11 +1613,35 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
if (mlx4_ib_init_sriov(ibdev))
goto err_mad;
- if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE && !iboe->nb.notifier_call) {
- iboe->nb.notifier_call = mlx4_ib_netdev_event;
- err = register_netdevice_notifier(&iboe->nb);
- if (err)
- goto err_sriov;
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) {
+ if (!iboe->nb.notifier_call) {
+ iboe->nb.notifier_call = mlx4_ib_netdev_event;
+ err = register_netdevice_notifier(&iboe->nb);
+ if (err) {
+ iboe->nb.notifier_call = NULL;
+ goto err_notif;
+ }
+ }
+ if (!iboe->nb_inet.notifier_call) {
+ iboe->nb_inet.notifier_call = mlx4_ib_inet_event;
+ err = register_inetaddr_notifier(&iboe->nb_inet);
+ if (err) {
+ iboe->nb_inet.notifier_call = NULL;
+ goto err_notif;
+ }
+ }
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+ if (!iboe->nb_inet6.notifier_call) {
+ iboe->nb_inet6.notifier_call = mlx4_ib_inet6_event;
+ err = register_inet6addr_notifier(&iboe->nb_inet6);
+ if (err) {
+ iboe->nb_inet6.notifier_call = NULL;
+ goto err_notif;
+ }
+ }
+#endif
+ mlx4_ib_scan_netdevs(ibdev);
+ mlx4_ib_init_gid_table(ibdev);
}
for (j = 0; j < ARRAY_SIZE(mlx4_class_attributes); ++j) {
@@ -1520,11 +1667,25 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
return ibdev;
err_notif:
- if (unregister_netdevice_notifier(&ibdev->iboe.nb))
- pr_warn("failure unregistering notifier\n");
+ if (ibdev->iboe.nb.notifier_call) {
+ if (unregister_netdevice_notifier(&ibdev->iboe.nb))
+ pr_warn("failure unregistering notifier\n");
+ ibdev->iboe.nb.notifier_call = NULL;
+ }
+ if (ibdev->iboe.nb_inet.notifier_call) {
+ if (unregister_inetaddr_notifier(&ibdev->iboe.nb_inet))
+ pr_warn("failure unregistering notifier\n");
+ ibdev->iboe.nb_inet.notifier_call = NULL;
+ }
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+ if (ibdev->iboe.nb_inet6.notifier_call) {
+ if (unregister_inet6addr_notifier(&ibdev->iboe.nb_inet6))
+ pr_warn("failure unregistering notifier\n");
+ ibdev->iboe.nb_inet6.notifier_call = NULL;
+ }
+#endif
flush_workqueue(wq);
-err_sriov:
mlx4_ib_close_sriov(ibdev);
err_mad:
@@ -1566,6 +1727,18 @@ static void mlx4_ib_remove(struct mlx4_dev *dev, void *ibdev_ptr)
pr_warn("failure unregistering notifier\n");
ibdev->iboe.nb.notifier_call = NULL;
}
+ if (ibdev->iboe.nb_inet.notifier_call) {
+ if (unregister_inetaddr_notifier(&ibdev->iboe.nb_inet))
+ pr_warn("failure unregistering notifier\n");
+ ibdev->iboe.nb_inet.notifier_call = NULL;
+ }
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+ if (ibdev->iboe.nb_inet6.notifier_call) {
+ if (unregister_inet6addr_notifier(&ibdev->iboe.nb_inet6))
+ pr_warn("failure unregistering notifier\n");
+ ibdev->iboe.nb_inet6.notifier_call = NULL;
+ }
+#endif
iounmap(ibdev->uar_map);
for (p = 0; p < ibdev->num_ports; ++p)
if (ibdev->counters[p] != -1)
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index f61ec26..0c98417 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -422,7 +422,10 @@ struct mlx4_ib_sriov {
struct mlx4_ib_iboe {
spinlock_t lock;
struct net_device *netdevs[MLX4_MAX_PORTS];
+ struct net_device *masters[MLX4_MAX_PORTS];
struct notifier_block nb;
+ struct notifier_block nb_inet;
+ struct notifier_block nb_inet6;
union ib_gid gid_table[MLX4_MAX_PORTS][128];
};
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 4f10af2..ddf5a1a 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1147,11 +1147,8 @@ static void mlx4_set_sched(struct mlx4_qp_path *path, u8 port)
static int mlx4_set_path(struct mlx4_ib_dev *dev, const struct ib_ah_attr *ah,
struct mlx4_qp_path *path, u8 port)
{
- int err;
int is_eth = rdma_port_get_link_layer(&dev->ib_dev, port) ==
IB_LINK_LAYER_ETHERNET;
- u8 mac[6];
- int is_mcast;
u16 vlan_tag;
int vidx;
@@ -1188,16 +1185,12 @@ static int mlx4_set_path(struct mlx4_ib_dev *dev, const struct ib_ah_attr *ah,
if (!(ah->ah_flags & IB_AH_GRH))
return -1;
- err = mlx4_ib_resolve_grh(dev, ah, mac, &is_mcast, port);
- if (err)
- return err;
-
- memcpy(path->dmac, mac, 6);
+ memcpy(path->dmac, ah->dmac, 6);
path->ackto = MLX4_IB_LINK_TYPE_ETH;
/* use index 0 into MAC table for IBoE */
path->grh_mylmc &= 0x80;
- vlan_tag = rdma_get_vlan_id(&dev->iboe.gid_table[port - 1][ah->grh.sgid_index]);
+ vlan_tag = ah->vlan;
if (vlan_tag < 0x1000) {
if (mlx4_find_cached_vlan(dev->dev, port, vlan_tag, &vidx))
return -ENOENT;
@@ -1236,6 +1229,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
enum mlx4_qp_optpar optpar = 0;
int sqd_event;
int err = -EINVAL;
+ int is_eth;
context = kzalloc(sizeof *context, GFP_KERNEL);
if (!context)
@@ -1464,6 +1458,13 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
context->pri_path.ackto = (context->pri_path.ackto & 0xf8) |
MLX4_IB_LINK_TYPE_ETH;
+ if (ibqp->qp_type == IB_QPT_UD && is_eth && new_state == IB_QPS_RTR) {
+ context->pri_path.ackto = MLX4_IB_LINK_TYPE_ETH;
+ optpar |= MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH;
+ }
+
if (cur_state == IB_QPS_RTS && new_state == IB_QPS_SQD &&
attr_mask & IB_QP_EN_SQD_ASYNC_NOTIFY && attr->en_sqd_async_notify)
sqd_event = 1;
diff --git a/include/linux/mlx4/cq.h b/include/linux/mlx4/cq.h
index 98fa492..72ba0a9 100644
--- a/include/linux/mlx4/cq.h
+++ b/include/linux/mlx4/cq.h
@@ -43,10 +43,15 @@ struct mlx4_cqe {
__be32 immed_rss_invalid;
__be32 g_mlpath_rqpn;
__be16 sl_vid;
- __be16 rlid;
- __be16 status;
- u8 ipv6_ext_mask;
- u8 badfcs_enc;
+ union {
+ struct {
+ __be16 rlid;
+ __be16 status;
+ u8 ipv6_ext_mask;
+ u8 badfcs_enc;
+ };
+ u8 smac[6];
+ };
__be32 byte_cnt;
__be16 wqe_index;
__be16 checksum;
@@ -83,6 +88,7 @@ struct mlx4_ts_cqe {
enum {
MLX4_CQE_VLAN_PRESENT_MASK = 1 << 29,
MLX4_CQE_QPN_MASK = 0xffffff,
+ MLX4_CQE_VID_MASK = 0xfff,
};
enum {
--
1.7.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH for-next 3/4] IB/core: Infra-structure to support verbs extensions through uverbs
[not found] ` <1371135704-5712-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-13 15:01 ` [PATCH for-next 1/4] IB/core: RoCE IP based GID addressing Or Gerlitz
2013-06-13 15:01 ` [PATCH for-next 2/4] IB/mlx4: " Or Gerlitz
@ 2013-06-13 15:01 ` Or Gerlitz
2013-06-13 15:01 ` [PATCH for-next 4/4] IB/core: Add RoCE IP based addressing extensions towards user space Or Gerlitz
2013-06-13 17:00 ` [PATCH for-next 0/4] IP based RoCE GID Addressing Jason Gunthorpe
4 siblings, 0 replies; 10+ messages in thread
From: Or Gerlitz @ 2013-06-13 15:01 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
matanb-VPRAkNaXOzVWk0Htik3J/w, Igor Ivanov, Hadar Hen Zion,
Or Gerlitz
From: Igor Ivanov <Igor.Ivanov-wN0M4riKYwLQT0dZR+AlfA@public.gmane.org>
Add infrastructure to support extended uverbs capabilities in a forward/backward
compatible manner. Uverbs command opcodes that follow the verbs extensions approach
must be greater than or equal to IB_USER_VERBS_CMD_THRESHOLD. They carry a new
header format and are processed slightly differently.
Signed-off-by: Igor Ivanov <Igor.Ivanov-wN0M4riKYwLQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Hadar Hen Zion <hadarh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/core/uverbs_main.c | 29 ++++++++++++++++++++++++-----
include/uapi/rdma/ib_user_verbs.h | 10 ++++++++++
2 files changed, 34 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 2c6f0f2..e4e7b24 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -583,9 +583,6 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
if (copy_from_user(&hdr, buf, sizeof hdr))
return -EFAULT;
- if (hdr.in_words * 4 != count)
- return -EINVAL;
-
if (hdr.command >= ARRAY_SIZE(uverbs_cmd_table) ||
!uverbs_cmd_table[hdr.command])
return -EINVAL;
@@ -597,8 +594,30 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
if (!(file->device->ib_dev->uverbs_cmd_mask & (1ull << hdr.command)))
return -ENOSYS;
- return uverbs_cmd_table[hdr.command](file, buf + sizeof hdr,
- hdr.in_words * 4, hdr.out_words * 4);
+ if (hdr.command >= IB_USER_VERBS_CMD_THRESHOLD) {
+ struct ib_uverbs_cmd_hdr_ex hdr_ex;
+
+ if (copy_from_user(&hdr_ex, buf, sizeof(hdr_ex)))
+ return -EFAULT;
+
+ if (((hdr_ex.in_words + hdr_ex.provider_in_words) * 4) != count)
+ return -EINVAL;
+
+ return uverbs_cmd_table[hdr.command](file,
+ buf + sizeof(hdr_ex),
+ (hdr_ex.in_words +
+ hdr_ex.provider_in_words) * 4,
+ (hdr_ex.out_words +
+ hdr_ex.provider_out_words) * 4);
+ } else {
+ if (hdr.in_words * 4 != count)
+ return -EINVAL;
+
+ return uverbs_cmd_table[hdr.command](file,
+ buf + sizeof(hdr),
+ hdr.in_words * 4,
+ hdr.out_words * 4);
+ }
}
static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 805711e..61535aa 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -43,6 +43,7 @@
* compatibility are made.
*/
#define IB_USER_VERBS_ABI_VERSION 6
+#define IB_USER_VERBS_CMD_THRESHOLD 50
enum {
IB_USER_VERBS_CMD_GET_CONTEXT,
@@ -123,6 +124,15 @@ struct ib_uverbs_cmd_hdr {
__u16 out_words;
};
+struct ib_uverbs_cmd_hdr_ex {
+ __u32 command;
+ __u16 in_words;
+ __u16 out_words;
+ __u16 provider_in_words;
+ __u16 provider_out_words;
+ __u32 cmd_hdr_reserved;
+};
+
struct ib_uverbs_get_context {
__u64 response;
__u64 driver_data[0];
--
1.7.1
* [PATCH for-next 4/4] IB/core: Add RoCE IP based addressing extensions towards user space
[not found] ` <1371135704-5712-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (2 preceding siblings ...)
2013-06-13 15:01 ` [PATCH for-next 3/4] IB/core: Infra-structure to support verbs extensions through uverbs Or Gerlitz
@ 2013-06-13 15:01 ` Or Gerlitz
[not found] ` <1371135704-5712-5-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-13 17:00 ` [PATCH for-next 0/4] IP based RoCE GID Addressing Jason Gunthorpe
4 siblings, 1 reply; 10+ messages in thread
From: Or Gerlitz @ 2013-06-13 15:01 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
matanb-VPRAkNaXOzVWk0Htik3J/w, Or Gerlitz
From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Add support for RoCE (IBoE) IP based addressing extensions towards user space.
Extend the INIT_QP_ATTR and QUERY_ROUTE ucma commands.
Extend the MODIFY_QP and CREATE_AH uverbs commands.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/core/ucma.c | 172 +++++++++++++++-
drivers/infiniband/core/uverbs.h | 2 +
drivers/infiniband/core/uverbs_cmd.c | 330 ++++++++++++++++++++++-------
drivers/infiniband/core/uverbs_main.c | 4 +-
drivers/infiniband/core/uverbs_marshall.c | 94 ++++++++-
include/rdma/ib_marshall.h | 12 +
include/uapi/rdma/ib_user_sa.h | 34 +++-
include/uapi/rdma/ib_user_verbs.h | 120 +++++++++++-
include/uapi/rdma/rdma_user_cm.h | 21 ++-
9 files changed, 690 insertions(+), 99 deletions(-)
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index bc2cb5d..c7dfd99 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -599,6 +599,35 @@ static void ucma_copy_ib_route(struct rdma_ucm_query_route_resp *resp,
}
}
+static void ucma_copy_ib_route_ex(struct rdma_ucm_query_route_resp_ex *resp,
+ struct rdma_route *route)
+{
+ struct rdma_dev_addr *dev_addr;
+
+ resp->num_paths = route->num_paths;
+ switch (route->num_paths) {
+ case 0:
+ dev_addr = &route->addr.dev_addr;
+ rdma_addr_get_dgid(dev_addr,
+ (union ib_gid *)&resp->ib_route[0].dgid);
+ rdma_addr_get_sgid(dev_addr,
+ (union ib_gid *)&resp->ib_route[0].sgid);
+ resp->ib_route[0].pkey =
+ cpu_to_be16(ib_addr_get_pkey(dev_addr));
+ break;
+ case 2:
+ ib_copy_path_rec_to_user_ex(&resp->ib_route[1],
+ &route->path_rec[1]);
+ /* fall through */
+ case 1:
+ ib_copy_path_rec_to_user_ex(&resp->ib_route[0],
+ &route->path_rec[0]);
+ break;
+ default:
+ break;
+ }
+}
+
static void ucma_copy_iboe_route(struct rdma_ucm_query_route_resp *resp,
struct rdma_route *route)
{
@@ -625,14 +654,39 @@ static void ucma_copy_iboe_route(struct rdma_ucm_query_route_resp *resp,
}
}
-static void ucma_copy_iw_route(struct rdma_ucm_query_route_resp *resp,
+static void ucma_copy_iboe_route_ex(struct rdma_ucm_query_route_resp_ex *resp,
+ struct rdma_route *route)
+{
+ resp->num_paths = route->num_paths;
+ switch (route->num_paths) {
+ case 0:
+ rdma_ip2gid((struct sockaddr *)&route->addr.dst_addr,
+ (union ib_gid *)&resp->ib_route[0].dgid);
+ rdma_ip2gid((struct sockaddr *)&route->addr.src_addr,
+ (union ib_gid *)&resp->ib_route[0].sgid);
+ resp->ib_route[0].pkey = cpu_to_be16(0xffff);
+ break;
+ case 2:
+ ib_copy_path_rec_to_user_ex(&resp->ib_route[1],
+ &route->path_rec[1]);
+ /* fall through */
+ case 1:
+ ib_copy_path_rec_to_user_ex(&resp->ib_route[0],
+ &route->path_rec[0]);
+ break;
+ default:
+ break;
+ }
+}
+
+static void ucma_copy_iw_route(struct ib_user_path_rec *resp_path,
struct rdma_route *route)
{
struct rdma_dev_addr *dev_addr;
dev_addr = &route->addr.dev_addr;
- rdma_addr_get_dgid(dev_addr, (union ib_gid *) &resp->ib_route[0].dgid);
- rdma_addr_get_sgid(dev_addr, (union ib_gid *) &resp->ib_route[0].sgid);
+ rdma_addr_get_dgid(dev_addr, (union ib_gid *)&resp_path->dgid);
+ rdma_addr_get_sgid(dev_addr, (union ib_gid *)&resp_path->sgid);
}
static ssize_t ucma_query_route(struct ucma_file *file,
@@ -684,7 +738,74 @@ static ssize_t ucma_query_route(struct ucma_file *file,
}
break;
case RDMA_TRANSPORT_IWARP:
- ucma_copy_iw_route(&resp, &ctx->cm_id->route);
+ ucma_copy_iw_route(&resp.ib_route[0], &ctx->cm_id->route);
+ break;
+ default:
+ break;
+ }
+
+out:
+ if (copy_to_user((void __user *)(unsigned long)cmd.response,
+ &resp, sizeof(resp)))
+ ret = -EFAULT;
+
+ ucma_put_ctx(ctx);
+ return ret;
+}
+
+static ssize_t ucma_query_route_ex(struct ucma_file *file,
+ const char __user *inbuf,
+ int in_len, int out_len)
+{
+ struct rdma_ucm_query_route_ex cmd;
+ struct rdma_ucm_query_route_resp_ex resp;
+ struct ucma_context *ctx;
+ struct sockaddr *addr;
+ int ret = 0;
+
+ if (out_len < sizeof(resp))
+ return -ENOSPC;
+
+ if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+ return -EFAULT;
+
+ ctx = ucma_get_ctx(file, cmd.id);
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
+
+ memset(&resp, 0, sizeof(resp));
+ addr = (struct sockaddr *)&ctx->cm_id->route.addr.src_addr;
+ memcpy(&resp.src_addr, addr, addr->sa_family == AF_INET ?
+ sizeof(struct sockaddr_in) :
+ sizeof(struct sockaddr_in6));
+ addr = (struct sockaddr *)&ctx->cm_id->route.addr.dst_addr;
+ memcpy(&resp.dst_addr, addr, addr->sa_family == AF_INET ?
+ sizeof(struct sockaddr_in) :
+ sizeof(struct sockaddr_in6));
+ if (!ctx->cm_id->device)
+ goto out;
+
+ resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
+ resp.port_num = ctx->cm_id->port_num;
+ switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
+ case RDMA_TRANSPORT_IB:
+ switch (rdma_port_get_link_layer(ctx->cm_id->device,
+ ctx->cm_id->port_num)) {
+ case IB_LINK_LAYER_INFINIBAND:
+ ucma_copy_ib_route_ex(&resp, &ctx->cm_id->route);
+ break;
+ case IB_LINK_LAYER_ETHERNET:
+ ucma_copy_iboe_route_ex(&resp, &ctx->cm_id->route);
+ break;
+ default:
+ break;
+ }
+ break;
+ case RDMA_TRANSPORT_IWARP:
+ ucma_copy_iw_route((struct ib_user_path_rec *)
+ ((void *)&resp.ib_route[0] +
+ sizeof(resp.ib_route[0].comp_mask)),
+ &ctx->cm_id->route);
break;
default:
break;
@@ -862,6 +983,43 @@ out:
return ret;
}
+static ssize_t ucma_init_qp_attr_ex(struct ucma_file *file,
+ const char __user *inbuf,
+ int in_len, int out_len)
+{
+ struct rdma_ucm_init_qp_attr cmd;
+ struct ib_uverbs_qp_attr_ex resp;
+ struct ucma_context *ctx;
+ struct ib_qp_attr qp_attr;
+ int ret;
+
+ if (out_len < sizeof(resp))
+ return -ENOSPC;
+
+ if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+ return -EFAULT;
+
+ ctx = ucma_get_ctx(file, cmd.id);
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
+
+ resp.qp_attr_mask = 0;
+ memset(&qp_attr, 0, sizeof(qp_attr));
+ qp_attr.qp_state = cmd.qp_state;
+ ret = rdma_init_qp_attr(ctx->cm_id, &qp_attr, &resp.qp_attr_mask);
+ if (ret)
+ goto out;
+
+ ib_copy_qp_attr_to_user_ex(&resp, &qp_attr);
+ if (copy_to_user((void __user *)(unsigned long)cmd.response,
+ &resp, sizeof(resp)))
+ ret = -EFAULT;
+
+out:
+ ucma_put_ctx(ctx);
+ return ret;
+}
+
static int ucma_set_option_id(struct ucma_context *ctx, int optname,
void *optval, size_t optlen)
{
@@ -1229,7 +1387,9 @@ static ssize_t (*ucma_cmd_table[])(struct ucma_file *file,
[RDMA_USER_CM_CMD_NOTIFY] = ucma_notify,
[RDMA_USER_CM_CMD_JOIN_MCAST] = ucma_join_multicast,
[RDMA_USER_CM_CMD_LEAVE_MCAST] = ucma_leave_multicast,
- [RDMA_USER_CM_CMD_MIGRATE_ID] = ucma_migrate_id
+ [RDMA_USER_CM_CMD_MIGRATE_ID] = ucma_migrate_id,
+ [RDMA_USER_CM_CMD_QUERY_ROUTE_EX] = ucma_query_route_ex,
+ [RDMA_USER_CM_CMD_INIT_QP_ATTR_EX] = ucma_init_qp_attr_ex
};
static ssize_t ucma_write(struct file *filp, const char __user *buf,
@@ -1245,6 +1405,8 @@ static ssize_t ucma_write(struct file *filp, const char __user *buf,
if (copy_from_user(&hdr, buf, sizeof(hdr)))
return -EFAULT;
+ pr_info("UCMA: HDR_CMD: %d\n", hdr.cmd);
+
if (hdr.cmd >= ARRAY_SIZE(ucma_cmd_table))
return -EINVAL;
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 0fcd7aa..1ec4850 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -200,11 +200,13 @@ IB_UVERBS_DECLARE_CMD(create_qp);
IB_UVERBS_DECLARE_CMD(open_qp);
IB_UVERBS_DECLARE_CMD(query_qp);
IB_UVERBS_DECLARE_CMD(modify_qp);
+IB_UVERBS_DECLARE_CMD(modify_qp_ex);
IB_UVERBS_DECLARE_CMD(destroy_qp);
IB_UVERBS_DECLARE_CMD(post_send);
IB_UVERBS_DECLARE_CMD(post_recv);
IB_UVERBS_DECLARE_CMD(post_srq_recv);
IB_UVERBS_DECLARE_CMD(create_ah);
+IB_UVERBS_DECLARE_CMD(create_ah_ex);
IB_UVERBS_DECLARE_CMD(destroy_ah);
IB_UVERBS_DECLARE_CMD(attach_mcast);
IB_UVERBS_DECLARE_CMD(detach_mcast);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index a7d00f6..eb3e7e6 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1891,6 +1891,58 @@ static int modify_qp_mask(enum ib_qp_type qp_type, int mask)
}
}
+static void ib_uverbs_modify_qp_assign(struct ib_uverbs_modify_qp *cmd,
+ struct ib_qp_attr *attr) {
+ attr->qp_state = cmd->qp_state;
+ attr->cur_qp_state = cmd->cur_qp_state;
+ attr->path_mtu = cmd->path_mtu;
+ attr->path_mig_state = cmd->path_mig_state;
+ attr->qkey = cmd->qkey;
+ attr->rq_psn = cmd->rq_psn;
+ attr->sq_psn = cmd->sq_psn;
+ attr->dest_qp_num = cmd->dest_qp_num;
+ attr->qp_access_flags = cmd->qp_access_flags;
+ attr->pkey_index = cmd->pkey_index;
+ attr->alt_pkey_index = cmd->alt_pkey_index;
+ attr->en_sqd_async_notify = cmd->en_sqd_async_notify;
+ attr->max_rd_atomic = cmd->max_rd_atomic;
+ attr->max_dest_rd_atomic = cmd->max_dest_rd_atomic;
+ attr->min_rnr_timer = cmd->min_rnr_timer;
+ attr->port_num = cmd->port_num;
+ attr->timeout = cmd->timeout;
+ attr->retry_cnt = cmd->retry_cnt;
+ attr->rnr_retry = cmd->rnr_retry;
+ attr->alt_port_num = cmd->alt_port_num;
+ attr->alt_timeout = cmd->alt_timeout;
+
+ memcpy(attr->ah_attr.grh.dgid.raw, cmd->dest.dgid, 16);
+ attr->ah_attr.grh.flow_label = cmd->dest.flow_label;
+ attr->ah_attr.grh.sgid_index = cmd->dest.sgid_index;
+ attr->ah_attr.grh.hop_limit = cmd->dest.hop_limit;
+ attr->ah_attr.grh.traffic_class = cmd->dest.traffic_class;
+ attr->ah_attr.dlid = cmd->dest.dlid;
+ attr->ah_attr.sl = cmd->dest.sl;
+ attr->ah_attr.src_path_bits = cmd->dest.src_path_bits;
+ attr->ah_attr.static_rate = cmd->dest.static_rate;
+ attr->ah_attr.ah_flags = cmd->dest.is_global ?
+ IB_AH_GRH : 0;
+ attr->ah_attr.port_num = cmd->dest.port_num;
+
+ memcpy(attr->alt_ah_attr.grh.dgid.raw, cmd->alt_dest.dgid, 16);
+ attr->alt_ah_attr.grh.flow_label = cmd->alt_dest.flow_label;
+ attr->alt_ah_attr.grh.sgid_index = cmd->alt_dest.sgid_index;
+ attr->alt_ah_attr.grh.hop_limit = cmd->alt_dest.hop_limit;
+ attr->alt_ah_attr.grh.traffic_class = cmd->alt_dest.traffic_class;
+ attr->alt_ah_attr.dlid = cmd->alt_dest.dlid;
+ attr->alt_ah_attr.sl = cmd->alt_dest.sl;
+ attr->alt_ah_attr.src_path_bits = cmd->alt_dest.src_path_bits;
+ attr->alt_ah_attr.static_rate = cmd->alt_dest.static_rate;
+ attr->alt_ah_attr.ah_flags = cmd->alt_dest.is_global
+ ? IB_AH_GRH : 0;
+ attr->alt_ah_attr.port_num = cmd->alt_dest.port_num;
+}
+
+
ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
const char __user *buf, int in_len,
int out_len)
@@ -1917,51 +1969,11 @@ ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
goto out;
}
- attr->qp_state = cmd.qp_state;
- attr->cur_qp_state = cmd.cur_qp_state;
- attr->path_mtu = cmd.path_mtu;
- attr->path_mig_state = cmd.path_mig_state;
- attr->qkey = cmd.qkey;
- attr->rq_psn = cmd.rq_psn;
- attr->sq_psn = cmd.sq_psn;
- attr->dest_qp_num = cmd.dest_qp_num;
- attr->qp_access_flags = cmd.qp_access_flags;
- attr->pkey_index = cmd.pkey_index;
- attr->alt_pkey_index = cmd.alt_pkey_index;
- attr->en_sqd_async_notify = cmd.en_sqd_async_notify;
- attr->max_rd_atomic = cmd.max_rd_atomic;
- attr->max_dest_rd_atomic = cmd.max_dest_rd_atomic;
- attr->min_rnr_timer = cmd.min_rnr_timer;
- attr->port_num = cmd.port_num;
- attr->timeout = cmd.timeout;
- attr->retry_cnt = cmd.retry_cnt;
- attr->rnr_retry = cmd.rnr_retry;
- attr->alt_port_num = cmd.alt_port_num;
- attr->alt_timeout = cmd.alt_timeout;
-
- memcpy(attr->ah_attr.grh.dgid.raw, cmd.dest.dgid, 16);
- attr->ah_attr.grh.flow_label = cmd.dest.flow_label;
- attr->ah_attr.grh.sgid_index = cmd.dest.sgid_index;
- attr->ah_attr.grh.hop_limit = cmd.dest.hop_limit;
- attr->ah_attr.grh.traffic_class = cmd.dest.traffic_class;
- attr->ah_attr.dlid = cmd.dest.dlid;
- attr->ah_attr.sl = cmd.dest.sl;
- attr->ah_attr.src_path_bits = cmd.dest.src_path_bits;
- attr->ah_attr.static_rate = cmd.dest.static_rate;
- attr->ah_attr.ah_flags = cmd.dest.is_global ? IB_AH_GRH : 0;
- attr->ah_attr.port_num = cmd.dest.port_num;
-
- memcpy(attr->alt_ah_attr.grh.dgid.raw, cmd.alt_dest.dgid, 16);
- attr->alt_ah_attr.grh.flow_label = cmd.alt_dest.flow_label;
- attr->alt_ah_attr.grh.sgid_index = cmd.alt_dest.sgid_index;
- attr->alt_ah_attr.grh.hop_limit = cmd.alt_dest.hop_limit;
- attr->alt_ah_attr.grh.traffic_class = cmd.alt_dest.traffic_class;
- attr->alt_ah_attr.dlid = cmd.alt_dest.dlid;
- attr->alt_ah_attr.sl = cmd.alt_dest.sl;
- attr->alt_ah_attr.src_path_bits = cmd.alt_dest.src_path_bits;
- attr->alt_ah_attr.static_rate = cmd.alt_dest.static_rate;
- attr->alt_ah_attr.ah_flags = cmd.alt_dest.is_global ? IB_AH_GRH : 0;
- attr->alt_ah_attr.port_num = cmd.alt_dest.port_num;
+ ib_uverbs_modify_qp_assign(&cmd, attr);
+ memset(attr->ah_attr.dmac, 0, sizeof(attr->ah_attr.dmac));
+ attr->ah_attr.vlan = 0xFFFF;
+ memset(attr->alt_ah_attr.dmac, 0, sizeof(attr->alt_ah_attr.dmac));
+ attr->alt_ah_attr.vlan = 0xFFFF;
if (qp->real_qp == qp) {
ret = qp->device->modify_qp(qp, attr,
@@ -1983,6 +1995,80 @@ out:
return ret;
}
+ssize_t ib_uverbs_modify_qp_ex(struct ib_uverbs_file *file,
+ const char __user *buf, int in_len,
+ int out_len)
+{
+ struct ib_uverbs_modify_qp_ex cmd;
+ struct ib_udata udata;
+ struct ib_qp *qp;
+ struct ib_qp_attr *attr;
+ int ret;
+
+ if (copy_from_user(&cmd, buf, sizeof(cmd)))
+ return -EFAULT;
+
+ INIT_UDATA(&udata, buf + sizeof(cmd), NULL, in_len - sizeof(cmd),
+ out_len);
+
+ attr = kmalloc(sizeof(*attr), GFP_KERNEL);
+ if (!attr)
+ return -ENOMEM;
+
+ qp = idr_read_qp(cmd.qp_handle, file->ucontext);
+ if (!qp) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ ib_uverbs_modify_qp_assign((struct ib_uverbs_modify_qp *)((void *)&cmd +
+ sizeof(cmd.comp_mask)), attr);
+
+ if (cmd.comp_mask & IB_UVERBS_MODIFY_QP_EX_DEST_EX_FLAGS) {
+ if (cmd.dest_ex.comp_mask & IBV_QP_DEST_EX_DMAC)
+ memcpy(attr->ah_attr.dmac, cmd.dest_ex.dmac,
+ sizeof(attr->ah_attr.dmac));
+ else
+ memset(attr->ah_attr.dmac, 0,
+ sizeof(attr->ah_attr.dmac));
+ if (cmd.dest_ex.comp_mask & IBV_QP_DEST_EX_VID)
+ attr->ah_attr.vlan = cmd.dest_ex.vid;
+ else
+ attr->ah_attr.vlan = 0xFFFF;
+ }
+ if (cmd.comp_mask & IB_UVERBS_MODIFY_QP_EX_ALT_DEST_EX_FLAGS) {
+ if (cmd.alt_dest_ex.comp_mask & IBV_QP_DEST_EX_DMAC)
+ memcpy(attr->alt_ah_attr.dmac, cmd.alt_dest_ex.dmac,
+ sizeof(attr->alt_ah_attr.dmac));
+ else
+ memset(attr->alt_ah_attr.dmac, 0,
+ sizeof(attr->alt_ah_attr.dmac));
+ if (cmd.alt_dest_ex.comp_mask & IBV_QP_DEST_EX_VID)
+ attr->alt_ah_attr.vlan = cmd.alt_dest_ex.vid;
+ else
+ attr->alt_ah_attr.vlan = 0xFFFF;
+ }
+
+ if (qp->real_qp == qp) {
+ ret = qp->device->modify_qp(qp, attr,
+ modify_qp_mask(qp->qp_type, cmd.attr_mask), &udata);
+ } else {
+ ret = ib_modify_qp(qp, attr,
+ modify_qp_mask(qp->qp_type, cmd.attr_mask));
+ }
+
+ put_qp_read(qp);
+
+ if (ret)
+ goto out;
+
+ ret = in_len;
+
+out:
+ kfree(attr);
+
+ return ret;
+}
ssize_t ib_uverbs_destroy_qp(struct ib_uverbs_file *file,
const char __user *buf, int in_len,
int out_len)
@@ -2377,48 +2463,51 @@ out:
return ret ? ret : in_len;
}
-ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
- const char __user *buf, int in_len,
- int out_len)
+struct ib_uobject *ib_uverbs_create_ah_assign(
+ struct ib_uverbs_create_ah_ex *cmd,
+ struct ib_uverbs_ah_attr_ex *src_attr,
+ struct ib_uverbs_file *file)
{
- struct ib_uverbs_create_ah cmd;
- struct ib_uverbs_create_ah_resp resp;
- struct ib_uobject *uobj;
struct ib_pd *pd;
struct ib_ah *ah;
struct ib_ah_attr attr;
- int ret;
-
- if (out_len < sizeof resp)
- return -ENOSPC;
-
- if (copy_from_user(&cmd, buf, sizeof cmd))
- return -EFAULT;
+ struct ib_uobject *uobj;
+ long ret;
- uobj = kmalloc(sizeof *uobj, GFP_KERNEL);
+ uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
if (!uobj)
- return -ENOMEM;
+ return ERR_PTR(-ENOMEM);
- init_uobj(uobj, cmd.user_handle, file->ucontext, &ah_lock_class);
+ init_uobj(uobj, cmd->user_handle, file->ucontext, &ah_lock_class);
down_write(&uobj->mutex);
- pd = idr_read_pd(cmd.pd_handle, file->ucontext);
+ pd = idr_read_pd(cmd->pd_handle, file->ucontext);
if (!pd) {
ret = -EINVAL;
goto err;
}
- attr.dlid = cmd.attr.dlid;
- attr.sl = cmd.attr.sl;
- attr.src_path_bits = cmd.attr.src_path_bits;
- attr.static_rate = cmd.attr.static_rate;
- attr.ah_flags = cmd.attr.is_global ? IB_AH_GRH : 0;
- attr.port_num = cmd.attr.port_num;
- attr.grh.flow_label = cmd.attr.grh.flow_label;
- attr.grh.sgid_index = cmd.attr.grh.sgid_index;
- attr.grh.hop_limit = cmd.attr.grh.hop_limit;
- attr.grh.traffic_class = cmd.attr.grh.traffic_class;
- memcpy(attr.grh.dgid.raw, cmd.attr.grh.dgid, 16);
+ attr.dlid = src_attr->dlid;
+ attr.sl = src_attr->sl;
+ attr.src_path_bits = src_attr->src_path_bits;
+ attr.static_rate = src_attr->static_rate;
+ attr.ah_flags = src_attr->is_global ? IB_AH_GRH : 0;
+ attr.port_num = src_attr->port_num;
+ attr.grh.flow_label = src_attr->grh.flow_label;
+ attr.grh.sgid_index = src_attr->grh.sgid_index;
+ attr.grh.hop_limit = src_attr->grh.hop_limit;
+ attr.grh.traffic_class = src_attr->grh.traffic_class;
+ memcpy(attr.grh.dgid.raw, src_attr->grh.dgid, 16);
+
+ if (src_attr->comp_mask & IB_UVERBS_AH_ATTR_DMAC)
+ memcpy(attr.dmac, src_attr->dmac, sizeof(attr.dmac));
+ else
+ memset(attr.dmac, 0, sizeof(attr.dmac));
+
+ if (src_attr->comp_mask & IB_UVERBS_AH_ATTR_VID)
+ attr.vlan = src_attr->vlan;
+ else
+ attr.vlan = 0xFFFF;
ah = ib_create_ah(pd, &attr);
if (IS_ERR(ah)) {
@@ -2427,22 +2516,62 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
}
ah->uobject = uobj;
+
uobj->object = ah;
ret = idr_add_uobj(&ib_uverbs_ah_idr, uobj);
if (ret)
goto err_destroy;
+ put_pd_read(pd);
+
+ return uobj;
+
+err_destroy:
+ ib_destroy_ah(ah);
+err_put:
+ put_pd_read(pd);
+err:
+ put_uobj_write(uobj);
+ return ERR_PTR(ret);
+}
+
+ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
+ const char __user *buf, int in_len,
+ int out_len)
+{
+ struct ib_uverbs_create_ah_ex cmd_ex;
+ struct ib_uverbs_create_ah *cmd = (struct ib_uverbs_create_ah *)
+ ((void *)&cmd_ex +
+ sizeof(cmd_ex.comp_mask));
+ struct ib_uverbs_ah_attr_ex attr_ex;
+ struct ib_uverbs_create_ah_resp resp;
+ struct ib_uobject *uobj;
+ int ret;
+
+ if (out_len < sizeof(resp))
+ return -ENOSPC;
+
+ cmd_ex.comp_mask = 0;
+ if (copy_from_user(cmd, buf, sizeof(*cmd)))
+ return -EFAULT;
+
+ attr_ex.comp_mask = 0;
+ memcpy(((void *)&attr_ex) + sizeof(attr_ex.comp_mask),
+ &cmd->attr, sizeof(cmd->attr));
+
+ uobj = ib_uverbs_create_ah_assign(&cmd_ex, &attr_ex, file);
+ if (IS_ERR(uobj))
+ return PTR_ERR(uobj);
+
resp.ah_handle = uobj->id;
- if (copy_to_user((void __user *) (unsigned long) cmd.response,
+ if (copy_to_user((void __user *)(unsigned long) cmd->response,
&resp, sizeof resp)) {
ret = -EFAULT;
goto err_copy;
}
- put_pd_read(pd);
-
mutex_lock(&file->mutex);
list_add_tail(&uobj->list, &file->ucontext->ah_list);
mutex_unlock(&file->mutex);
@@ -2455,15 +2584,54 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
err_copy:
idr_remove_uobj(&ib_uverbs_ah_idr, uobj);
+ ib_destroy_ah(uobj->object);
+ put_uobj_write(uobj);
-err_destroy:
- ib_destroy_ah(ah);
+ return ret;
+}
-err_put:
- put_pd_read(pd);
+ssize_t ib_uverbs_create_ah_ex(struct ib_uverbs_file *file,
+ const char __user *buf, int in_len,
+ int out_len)
+{
+ struct ib_uverbs_create_ah_ex cmd_ex;
+ struct ib_uverbs_create_ah_resp resp;
+ struct ib_uobject *uobj;
+ int ret;
-err:
+ if (out_len < sizeof(resp))
+ return -ENOSPC;
+
+ if (copy_from_user(&cmd_ex, buf, sizeof(cmd_ex)))
+ return -EFAULT;
+
+ uobj = ib_uverbs_create_ah_assign(&cmd_ex, &cmd_ex.attr, file);
+ if (IS_ERR(uobj))
+ return PTR_ERR(uobj);
+
+ resp.ah_handle = uobj->id;
+
+ if (copy_to_user((void __user *)(unsigned long)cmd_ex.response,
+ &resp, sizeof(resp))) {
+ ret = -EFAULT;
+ goto err_copy;
+ }
+
+ mutex_lock(&file->mutex);
+ list_add_tail(&uobj->list, &file->ucontext->ah_list);
+ mutex_unlock(&file->mutex);
+
+ uobj->live = 1;
+
+ up_write(&uobj->mutex);
+
+ return in_len;
+
+err_copy:
+ idr_remove_uobj(&ib_uverbs_ah_idr, uobj);
+ ib_destroy_ah(uobj->object);
put_uobj_write(uobj);
+
return ret;
}
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index e4e7b24..93264c8 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -113,7 +113,9 @@ static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file,
[IB_USER_VERBS_CMD_OPEN_XRCD] = ib_uverbs_open_xrcd,
[IB_USER_VERBS_CMD_CLOSE_XRCD] = ib_uverbs_close_xrcd,
[IB_USER_VERBS_CMD_CREATE_XSRQ] = ib_uverbs_create_xsrq,
- [IB_USER_VERBS_CMD_OPEN_QP] = ib_uverbs_open_qp
+ [IB_USER_VERBS_CMD_OPEN_QP] = ib_uverbs_open_qp,
+ [IB_USER_VERBS_CMD_MODIFY_QP_EX] = ib_uverbs_modify_qp_ex,
+ [IB_USER_VERBS_CMD_CREATE_AH_EX] = ib_uverbs_create_ah_ex,
};
static void ib_uverbs_add_one(struct ib_device *device);
diff --git a/drivers/infiniband/core/uverbs_marshall.c b/drivers/infiniband/core/uverbs_marshall.c
index e7bee46..0470407 100644
--- a/drivers/infiniband/core/uverbs_marshall.c
+++ b/drivers/infiniband/core/uverbs_marshall.c
@@ -33,6 +33,9 @@
#include <linux/export.h>
#include <rdma/ib_marshall.h>
+#define UVERB_EX_TO_UVERB(uverb_ex) ((void *)(uverb_ex) + \
+ sizeof(uverb_ex->comp_mask))
+
void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
struct ib_ah_attr *src)
{
@@ -52,9 +55,20 @@ void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
}
EXPORT_SYMBOL(ib_copy_ah_attr_to_user);
-void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
- struct ib_qp_attr *src)
+void ib_copy_ah_attr_to_user_ex(struct ib_uverbs_ah_attr_ex *dst,
+ struct ib_ah_attr *src)
{
+ ib_copy_ah_attr_to_user((struct ib_uverbs_ah_attr *)
+ UVERB_EX_TO_UVERB(dst), src);
+ dst->comp_mask = IB_UVERBS_AH_ATTR_DMAC;
+ memcpy(dst->dmac, src->dmac, sizeof(dst->dmac));
+ dst->comp_mask |= IB_UVERBS_AH_ATTR_VID;
+ dst->vlan = src->vlan;
+}
+EXPORT_SYMBOL(ib_copy_ah_attr_to_user_ex);
+
+static void ib_copy_qp_attr_to_user_data(struct ib_uverbs_qp_attr *dst,
+ struct ib_qp_attr *src)
+{
dst->qp_state = src->qp_state;
dst->cur_qp_state = src->cur_qp_state;
dst->path_mtu = src->path_mtu;
@@ -71,9 +85,6 @@ void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
dst->max_recv_sge = src->cap.max_recv_sge;
dst->max_inline_data = src->cap.max_inline_data;
- ib_copy_ah_attr_to_user(&dst->ah_attr, &src->ah_attr);
- ib_copy_ah_attr_to_user(&dst->alt_ah_attr, &src->alt_ah_attr);
-
dst->pkey_index = src->pkey_index;
dst->alt_pkey_index = src->alt_pkey_index;
dst->en_sqd_async_notify = src->en_sqd_async_notify;
@@ -89,8 +100,26 @@ void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
dst->alt_timeout = src->alt_timeout;
memset(dst->reserved, 0, sizeof(dst->reserved));
}
+
+void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
+ struct ib_qp_attr *src)
+{
+ ib_copy_qp_attr_to_user_data(dst, src);
+ ib_copy_ah_attr_to_user(&dst->ah_attr, &src->ah_attr);
+ ib_copy_ah_attr_to_user(&dst->alt_ah_attr, &src->alt_ah_attr);
+}
EXPORT_SYMBOL(ib_copy_qp_attr_to_user);
+void ib_copy_qp_attr_to_user_ex(struct ib_uverbs_qp_attr_ex *dst,
+ struct ib_qp_attr *src)
+{
+ ib_copy_qp_attr_to_user_data((struct ib_uverbs_qp_attr *)
+ UVERB_EX_TO_UVERB(dst), src);
+ ib_copy_ah_attr_to_user_ex(&dst->ah_attr, &src->ah_attr);
+ ib_copy_ah_attr_to_user_ex(&dst->alt_ah_attr, &src->alt_ah_attr);
+}
+EXPORT_SYMBOL(ib_copy_qp_attr_to_user_ex);
+
void ib_copy_path_rec_to_user(struct ib_user_path_rec *dst,
struct ib_sa_path_rec *src)
{
@@ -117,11 +146,27 @@ void ib_copy_path_rec_to_user(struct ib_user_path_rec *dst,
}
EXPORT_SYMBOL(ib_copy_path_rec_to_user);
-void ib_copy_path_rec_from_user(struct ib_sa_path_rec *dst,
- struct ib_user_path_rec *src)
+void ib_copy_path_rec_to_user_ex(struct ib_user_path_rec_ex *dst,
+ struct ib_sa_path_rec *src)
+{
+ ib_copy_path_rec_to_user((struct ib_user_path_rec *)
+ UVERB_EX_TO_UVERB(dst), src);
+
+ dst->comp_mask = IB_USER_PATH_REC_ATTR_DMAC |
+ IB_USER_PATH_REC_ATTR_SMAC |
+ IB_USER_PATH_REC_ATTR_VID;
+
+ memcpy(dst->dmac, src->dmac, sizeof(dst->dmac));
+ memcpy(dst->smac, src->smac, sizeof(dst->smac));
+ dst->vlan = src->vlan;
+}
+EXPORT_SYMBOL(ib_copy_path_rec_to_user_ex);
+
+void ib_copy_path_rec_from_user_assign(struct ib_sa_path_rec *dst,
+ struct ib_user_path_rec *src)
{
- memcpy(dst->dgid.raw, src->dgid, sizeof dst->dgid);
- memcpy(dst->sgid.raw, src->sgid, sizeof dst->sgid);
+ memcpy(dst->dgid.raw, src->dgid, sizeof(dst->dgid));
+ memcpy(dst->sgid.raw, src->sgid, sizeof(dst->sgid));
dst->dlid = src->dlid;
dst->slid = src->slid;
@@ -141,4 +186,35 @@ void ib_copy_path_rec_from_user(struct ib_sa_path_rec *dst,
dst->preference = src->preference;
dst->packet_life_time_selector = src->packet_life_time_selector;
}
+
+void ib_copy_path_rec_from_user(struct ib_sa_path_rec *dst,
+ struct ib_user_path_rec *src)
+{
+ memset(dst->dmac, 0, sizeof(dst->dmac));
+ memset(dst->smac, 0, sizeof(dst->smac));
+ dst->vlan = 0xFFFF;
+
+ ib_copy_path_rec_from_user_assign(dst, src);
+}
EXPORT_SYMBOL(ib_copy_path_rec_from_user);
+
+void ib_copy_path_rec_from_user_ex(struct ib_sa_path_rec *dst,
+ struct ib_user_path_rec_ex *src)
+{
+ if (src->comp_mask & IB_USER_PATH_REC_ATTR_DMAC)
+ memcpy(dst->dmac, src->dmac, sizeof(dst->dmac));
+ else
+ memset(dst->dmac, 0, sizeof(dst->dmac));
+
+ if (src->comp_mask & IB_USER_PATH_REC_ATTR_SMAC)
+ memcpy(dst->smac, src->smac, sizeof(dst->smac));
+ else
+ memset(dst->smac, 0, sizeof(dst->smac));
+
+ if (src->comp_mask & IB_USER_PATH_REC_ATTR_VID)
+ dst->vlan = src->vlan;
+ else
+ dst->vlan = 0xFFFF;
+
+ ib_copy_path_rec_from_user_assign(dst, (struct ib_user_path_rec *)
+ UVERB_EX_TO_UVERB(src));
+}
+EXPORT_SYMBOL(ib_copy_path_rec_from_user_ex);
diff --git a/include/rdma/ib_marshall.h b/include/rdma/ib_marshall.h
index db03720..11ab3a8 100644
--- a/include/rdma/ib_marshall.h
+++ b/include/rdma/ib_marshall.h
@@ -41,13 +41,25 @@
void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
struct ib_qp_attr *src);
+void ib_copy_qp_attr_to_user_ex(struct ib_uverbs_qp_attr_ex *dst,
+ struct ib_qp_attr *src);
+
void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
struct ib_ah_attr *src);
+void ib_copy_ah_attr_to_user_ex(struct ib_uverbs_ah_attr_ex *dst,
+ struct ib_ah_attr *src);
+
void ib_copy_path_rec_to_user(struct ib_user_path_rec *dst,
struct ib_sa_path_rec *src);
+void ib_copy_path_rec_to_user_ex(struct ib_user_path_rec_ex *dst,
+ struct ib_sa_path_rec *src);
+
void ib_copy_path_rec_from_user(struct ib_sa_path_rec *dst,
struct ib_user_path_rec *src);
+void ib_copy_path_rec_from_user_ex(struct ib_sa_path_rec *dst,
+ struct ib_user_path_rec_ex *src);
+
#endif /* IB_USER_MARSHALL_H */
diff --git a/include/uapi/rdma/ib_user_sa.h b/include/uapi/rdma/ib_user_sa.h
index cfc7c9b..367d66a 100644
--- a/include/uapi/rdma/ib_user_sa.h
+++ b/include/uapi/rdma/ib_user_sa.h
@@ -48,7 +48,13 @@ enum {
struct ib_path_rec_data {
__u32 flags;
__u32 reserved;
- __u32 path_rec[16];
+ __u32 path_rec[20];
+};
+
+enum ibv_kern_path_rec_attr_mask {
+ IB_USER_PATH_REC_ATTR_DMAC = 1ULL << 0,
+ IB_USER_PATH_REC_ATTR_SMAC = 1ULL << 1,
+ IB_USER_PATH_REC_ATTR_VID = 1ULL << 2
};
struct ib_user_path_rec {
@@ -73,4 +79,30 @@ struct ib_user_path_rec {
__u8 preference;
};
+struct ib_user_path_rec_ex {
+ __u32 comp_mask;
+ __u8 dgid[16];
+ __u8 sgid[16];
+ __be16 dlid;
+ __be16 slid;
+ __u32 raw_traffic;
+ __be32 flow_label;
+ __u32 reversible;
+ __u32 mtu;
+ __be16 pkey;
+ __u8 hop_limit;
+ __u8 traffic_class;
+ __u8 numb_path;
+ __u8 sl;
+ __u8 mtu_selector;
+ __u8 rate_selector;
+ __u8 rate;
+ __u8 packet_life_time_selector;
+ __u8 packet_life_time;
+ __u8 preference;
+ __u8 smac[6];
+ __u8 dmac[6];
+ __be16 vlan;
+};
+
#endif /* IB_USER_SA_H */
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 61535aa..954a790 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -86,7 +86,9 @@ enum {
IB_USER_VERBS_CMD_OPEN_XRCD,
IB_USER_VERBS_CMD_CLOSE_XRCD,
IB_USER_VERBS_CMD_CREATE_XSRQ,
- IB_USER_VERBS_CMD_OPEN_QP
+ IB_USER_VERBS_CMD_OPEN_QP,
+ IB_USER_VERBS_CMD_MODIFY_QP_EX = IB_USER_VERBS_CMD_THRESHOLD,
+ IB_USER_VERBS_CMD_CREATE_AH_EX,
};
/*
@@ -392,6 +394,25 @@ struct ib_uverbs_ah_attr {
__u8 reserved;
};
+enum ib_uverbs_ah_attr_mask {
+ IB_UVERBS_AH_ATTR_DMAC = 1 << 0,
+ IB_UVERBS_AH_ATTR_VID = 1 << 1
+};
+
+struct ib_uverbs_ah_attr_ex {
+ __u32 comp_mask;
+ struct ib_uverbs_global_route grh;
+ __u16 dlid;
+ __u8 sl;
+ __u8 src_path_bits;
+ __u8 static_rate;
+ __u8 is_global;
+ __u8 port_num;
+ __u8 reserved;
+ __u8 dmac[6];
+ __u16 vlan;
+};
+
struct ib_uverbs_qp_attr {
__u32 qp_attr_mask;
__u32 qp_state;
@@ -430,6 +451,45 @@ struct ib_uverbs_qp_attr {
__u8 reserved[5];
};
+struct ib_uverbs_qp_attr_ex {
+ __u32 comp_mask;
+ __u32 qp_attr_mask;
+ __u32 qp_state;
+ __u32 cur_qp_state;
+ __u32 path_mtu;
+ __u32 path_mig_state;
+ __u32 qkey;
+ __u32 rq_psn;
+ __u32 sq_psn;
+ __u32 dest_qp_num;
+ __u32 qp_access_flags;
+
+ struct ib_uverbs_ah_attr_ex ah_attr;
+ struct ib_uverbs_ah_attr_ex alt_ah_attr;
+
+ /* ib_qp_cap */
+ __u32 max_send_wr;
+ __u32 max_recv_wr;
+ __u32 max_send_sge;
+ __u32 max_recv_sge;
+ __u32 max_inline_data;
+
+ __u16 pkey_index;
+ __u16 alt_pkey_index;
+ __u8 en_sqd_async_notify;
+ __u8 sq_draining;
+ __u8 max_rd_atomic;
+ __u8 max_dest_rd_atomic;
+ __u8 min_rnr_timer;
+ __u8 port_num;
+ __u8 timeout;
+ __u8 retry_cnt;
+ __u8 rnr_retry;
+ __u8 alt_port_num;
+ __u8 alt_timeout;
+ __u8 reserved[5];
+};
+
struct ib_uverbs_create_qp {
__u64 response;
__u64 user_handle;
@@ -531,6 +591,17 @@ struct ib_uverbs_query_qp_resp {
__u64 driver_data[0];
};
+enum ib_uverbs_qp_dest_ex_comp_mask {
+ IBV_QP_DEST_EX_DMAC = (1ULL << 0),
+ IBV_QP_DEST_EX_VID = (1ULL << 1)
+};
+
+struct ib_uverbs_qp_dest_ex {
+ __u32 comp_mask;
+ __u8 dmac[6];
+ __u16 vid;
+};
+
struct ib_uverbs_modify_qp {
struct ib_uverbs_qp_dest dest;
struct ib_uverbs_qp_dest alt_dest;
@@ -561,6 +632,44 @@ struct ib_uverbs_modify_qp {
__u64 driver_data[0];
};
+enum ib_uverbs_modify_qp_ex_comp_mask {
+ IB_UVERBS_MODIFY_QP_EX_DEST_EX_FLAGS = (1ULL << 0),
+ IB_UVERBS_MODIFY_QP_EX_ALT_DEST_EX_FLAGS = (1ULL << 1)
+};
+
+struct ib_uverbs_modify_qp_ex {
+ __u32 comp_mask;
+ struct ib_uverbs_qp_dest dest;
+ struct ib_uverbs_qp_dest alt_dest;
+ __u32 qp_handle;
+ __u32 attr_mask;
+ __u32 qkey;
+ __u32 rq_psn;
+ __u32 sq_psn;
+ __u32 dest_qp_num;
+ __u32 qp_access_flags;
+ __u16 pkey_index;
+ __u16 alt_pkey_index;
+ __u8 qp_state;
+ __u8 cur_qp_state;
+ __u8 path_mtu;
+ __u8 path_mig_state;
+ __u8 en_sqd_async_notify;
+ __u8 max_rd_atomic;
+ __u8 max_dest_rd_atomic;
+ __u8 min_rnr_timer;
+ __u8 port_num;
+ __u8 timeout;
+ __u8 retry_cnt;
+ __u8 rnr_retry;
+ __u8 alt_port_num;
+ __u8 alt_timeout;
+ __u8 reserved[2];
+ struct ib_uverbs_qp_dest_ex dest_ex;
+ struct ib_uverbs_qp_dest_ex alt_dest_ex;
+ __u64 driver_data[0];
+};
+
struct ib_uverbs_modify_qp_resp {
};
@@ -670,6 +779,15 @@ struct ib_uverbs_create_ah {
struct ib_uverbs_ah_attr attr;
};
+struct ib_uverbs_create_ah_ex {
+ __u32 comp_mask;
+ __u64 response;
+ __u64 user_handle;
+ __u32 pd_handle;
+ __u32 reserved;
+ struct ib_uverbs_ah_attr_ex attr;
+};
+
struct ib_uverbs_create_ah_resp {
__u32 ah_handle;
};
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 1ee9239..8dceb35 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -61,7 +61,9 @@ enum {
RDMA_USER_CM_CMD_NOTIFY,
RDMA_USER_CM_CMD_JOIN_MCAST,
RDMA_USER_CM_CMD_LEAVE_MCAST,
- RDMA_USER_CM_CMD_MIGRATE_ID
+ RDMA_USER_CM_CMD_MIGRATE_ID,
+ RDMA_USER_CM_CMD_QUERY_ROUTE_EX,
+ RDMA_USER_CM_CMD_INIT_QP_ATTR_EX
};
/*
@@ -119,6 +121,13 @@ struct rdma_ucm_query_route {
__u32 reserved;
};
+struct rdma_ucm_query_route_ex {
+ __u32 comp_mask;
+ __u64 response;
+ __u32 id;
+ __u32 reserved;
+};
+
struct rdma_ucm_query_route_resp {
__u64 node_guid;
struct ib_user_path_rec ib_route[2];
@@ -129,6 +138,16 @@ struct rdma_ucm_query_route_resp {
__u8 reserved[3];
};
+struct rdma_ucm_query_route_resp_ex {
+ __u64 node_guid;
+ struct ib_user_path_rec_ex ib_route[2];
+ struct sockaddr_in6 src_addr;
+ struct sockaddr_in6 dst_addr;
+ __u32 num_paths;
+ __u8 port_num;
+ __u8 reserved[3];
+};
+
struct rdma_ucm_conn_param {
__u32 qp_num;
__u32 reserved;
--
1.7.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 0/4] IP based RoCE GID Addressing
[not found] ` <1371135704-5712-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (3 preceding siblings ...)
2013-06-13 15:01 ` [PATCH for-next 4/4] IB/core: Add RoCE IP based addressing extensions towards user space Or Gerlitz
@ 2013-06-13 17:00 ` Jason Gunthorpe
[not found] ` <20130613170011.GA21570-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
4 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2013-06-13 17:00 UTC (permalink / raw)
To: Or Gerlitz
Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
monis-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w
On Thu, Jun 13, 2013 at 06:01:40PM +0300, Or Gerlitz wrote:
> Currently, the IB stack (core + drivers) handle RoCE (IBoE) gids as
> they encode related Ethernet net-device interface MAC address and
> possibly VLAN id.
>
> This series changes RoCE GIDs to encode IP addresses (IPv4 + IPv6)
> of the that Ethernet interface, under the following reasoning:
Can you talk a bit about compatibility, please?
What happens when nodes with this patch are on the same network as
nodes without it?
Does this patch remove the encoding of the VLAN from the GID?
How is the destination MAC derived now?
There is a RoCE standard, it doesn't say much, but how the MAC and GRH
GID are related/derived really should be specified...
Not sure about copying the IP/IPv6 address from the interface into the
HW, there has always been pressure to keep verbs separate from the net
stack.. At the very least patch #2 should have its change log updated
to actually reflect what is in the patch.
Jason
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 4/4] IB/core: Add RoCE IP based addressing extensions towards user space
[not found] ` <1371135704-5712-5-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-13 17:09 ` Jason Gunthorpe
[not found] ` <20130613170939.GB21570-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2013-06-13 17:09 UTC (permalink / raw)
To: Or Gerlitz
Cc: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
monis-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w
On Thu, Jun 13, 2013 at 06:01:44PM +0300, Or Gerlitz wrote:
> From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>
> Add support for RoCE (IBoE) IP based addressing extensions towards
> user space.
>
> Extend INIT_QP_ATTR and QUERY_ROUTE ucma commands.
>
> Extend MODIFY_QP and CREATE_AH uverbs commands.
This is a really big patch Or, there is lots going on here, hard to
review :(
The rdma cm stuff should probably be split out of this, and Sean
should look at it of course.
In fact, since the user ABI is so important, every ABI change should
be a distinct patch, with a good change log, stating the intended
goals of the change and ABI visible changes it makes.
The changelog above is terrible for a huge patch that makes changes to
the userspace API.
> diff --git a/include/uapi/rdma/ib_user_sa.h b/include/uapi/rdma/ib_user_sa.h
> index cfc7c9b..367d66a 100644
> +++ b/include/uapi/rdma/ib_user_sa.h
> @@ -48,7 +48,13 @@ enum {
> struct ib_path_rec_data {
> __u32 flags;
> __u32 reserved;
> - __u32 path_rec[16];
> + __u32 path_rec[20];
> +};
> +
> +enum ibv_kern_path_rec_attr_mask {
> + IB_USER_PATH_REC_ATTR_DMAC = 1ULL << 0,
> + IB_USER_PATH_REC_ATTR_SMAC = 1ULL << 1,
> + IB_USER_PATH_REC_ATTR_VID = 1ULL << 2
> };
So, how is userspace supposed to know what these values are? The
current system where the MAC address is in the GID seemed
understandable, assuming you discover the MAC out of band somehow...
> +struct ib_uverbs_modify_qp_ex {
> + __u32 comp_mask;
> + struct ib_uverbs_qp_dest dest;
> + struct ib_uverbs_qp_dest alt_dest;
[...]
> + struct ib_uverbs_qp_dest_ex dest_ex;
> + struct ib_uverbs_qp_dest_ex alt_dest_ex;
Yuk.. The 'ex' structures don't have to be byte-compatible, they just
have to have a known transform; dest should be the full extended dest,
not split into two..
> +struct rdma_ucm_query_route_resp_ex {
> + __u64 node_guid;
> + struct ib_user_path_rec_ex ib_route[2];
> + struct sockaddr_in6 src_addr;
> + struct sockaddr_in6 dst_addr;
> + __u32 num_paths;
> + __u8 port_num;
> + __u8 reserved[3];
> +};
Should these be sockaddr_storage? How does this intersect with Sean's
AF_IB work?
Jason
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 0/4] IP based RoCE GID Addressing
[not found] ` <20130613170011.GA21570-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-06-14 20:35 ` Or Gerlitz
[not found] ` <CALsNU1PdjV2N4c3j9gQ0BRrd4GhOypDMMjmHEgF1ffCYOXeC1g@mail.gmail.com>
0 siblings, 1 reply; 10+ messages in thread
From: Or Gerlitz @ 2013-06-14 20:35 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
matanb-VPRAkNaXOzVWk0Htik3J/w
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> Can you talk abit about compatibility please? What happens when nodes
> with this patch are on the same network as nodes without it?
The CM on the passive side would send a reject with the reason being
"invalid gid" so this will not go unnoticed.
> Does this patch remove the encoding of the VLAN from the GID?
YES, and I explained in argument #1 why the vlan being there doesn't
work in many environments. In other words, it's something that needs to
be fixed, and this series addresses that.
> How is the destination MAC derived now?
As it was before: using address resolution, e.g. ARPs sent by the RDMA-CM.
> There is a RoCE standard, it doesn't say much, but how the MAC and GRH
> GID are related/derived really should be specified...
>
> Not sure about copying the IP/IPv6 address from the interface into the
> HW, there has always been pressure to keep verbs separate from the net
> stack.. At the very least patch #2 should have its change log updated
> to actually reflect what is in the patch.
Sure, I'll see what needs to be better explained in the change-log.
Note that the inbox RoCE implementation is tightly coupled to
net-devices, e.g. the GID table population is based on netevents of
related netdevices.
Or.
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 4/4] IB/core: Add RoCE IP based addressing extensions towards user space
[not found] ` <20130613170939.GB21570-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-06-14 20:42 ` Or Gerlitz
0 siblings, 0 replies; 10+ messages in thread
From: Or Gerlitz @ 2013-06-14 20:42 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, monis-VPRAkNaXOzVWk0Htik3J/w,
matanb-VPRAkNaXOzVWk0Htik3J/w
On Thu, Jun 13, 2013 at 8:09 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Thu, Jun 13, 2013 at 06:01:44PM +0300, Or Gerlitz wrote:
>> From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>
>> Add support for RoCE (IBoE) IP based addressing extensions towards
>> user space.
>>
>> Extend INIT_QP_ATTR and QUERY_ROUTE ucma commands.
>>
>> Extend MODIFY_QP and CREATE_AH uverbs commands.
>
> This is a really big patch Or, there is lots going on here, hard to
> review :(
>
> The rdma cm stuff should probably be split out of this, and Sean
> should look at it of course.
Sure, will do that: one patch for uverbs and one patch for rdma_ucm.
> In fact, since the user ABI is so important, every ABI change should
> be a distinct patch, with a good change log, stating the intended
> goals of the change and ABI visible changes it makes.
Point taken, will do that; thanks for bringing this up.
>
> The changelog above is terrible for a huge patch that makes changes to
> the userspace API.
>
>> diff --git a/include/uapi/rdma/ib_user_sa.h b/include/uapi/rdma/ib_user_sa.h
>> index cfc7c9b..367d66a 100644
>> +++ b/include/uapi/rdma/ib_user_sa.h
>> @@ -48,7 +48,13 @@ enum {
>> struct ib_path_rec_data {
>> __u32 flags;
>> __u32 reserved;
>> - __u32 path_rec[16];
>> + __u32 path_rec[20];
>> +};
>> +
>> +enum ibv_kern_path_rec_attr_mask {
>> + IB_USER_PATH_REC_ATTR_DMAC = 1ULL << 0,
>> + IB_USER_PATH_REC_ATTR_SMAC = 1ULL << 1,
>> + IB_USER_PATH_REC_ATTR_VID = 1ULL << 2
>> };
>
> So, how is userspace supposed to know what these values are?
It's part of the verbs extensions scheme.
> The current system where the MAC address is in the GID seemed
> understandable, assuming you discover the MAC out of band some how...
A MAC is an Ethernet layer 2 address; I don't see why putting a MAC in
the L3 header (GRH) is more understandable than putting an L3 address (IP) there.
>
>> +struct ib_uverbs_modify_qp_ex {
>> + __u32 comp_mask;
>> + struct ib_uverbs_qp_dest dest;
>> + struct ib_uverbs_qp_dest alt_dest;
> [...]
>> + struct ib_uverbs_qp_dest_ex dest_ex;
>> + struct ib_uverbs_qp_dest_ex alt_dest_ex;
>
> Yuk.. The 'ex' structures don't have to be byte compatible, they just
> have to have a known transform, dest should be the full extended dest,
> not split into two..
>
>> +struct rdma_ucm_query_route_resp_ex {
>> + __u64 node_guid;
>> + struct ib_user_path_rec_ex ib_route[2];
>> + struct sockaddr_in6 src_addr;
>> + struct sockaddr_in6 dst_addr;
>> + __u32 num_paths;
>> + __u8 port_num;
>> + __u8 reserved[3];
>> +};
>
> Should these be sockaddr_storage? How does this intersect with Sean's AF_IB work?
sockaddr_in6 is OK for extending rdma_ucm_query_route_resp, as it's OK
for the non-extended version of that command. I don't see any
intersection with the AF_IB work.
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 0/4] IP based RoCE GID Addressing
[not found] ` <CALsNU1PdjV2N4c3j9gQ0BRrd4GhOypDMMjmHEgF1ffCYOXeC1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-07-16 3:30 ` Or Gerlitz
0 siblings, 0 replies; 10+ messages in thread
From: Or Gerlitz @ 2013-07-16 3:30 UTC (permalink / raw)
To: Devesh Sharma
Cc: Jason Gunthorpe, Or Gerlitz,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
monis-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w
Devesh Sharma <devesh28-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
[...]
> What will happen to those devices which do not support IP based RoCE? How
> will connection management happen with those if native RoCE support is
> removed? There may be other devices/implementations which do not support
> IP based RoCE.
RoCE GIDs are programmed by the low-level IB driver into the device GID
table, so whatever format is used for GIDs, the device has to support
being programmed by the driver.
> How will connection management happen on them? Will those devices always
> see a Connection Reject event from the RDMA-CM?
When a CM connection request is received by a node and the GID inside
it doesn't match any GID in this node's GID table, the IB CM sends a
reject message with the "invalid gid" reject reason.
Or.
--
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-07-16 3:30 UTC | newest]
Thread overview: 10+ messages
2013-06-13 15:01 [PATCH for-next 0/4] IP based RoCE GID Addressing Or Gerlitz
[not found] ` <1371135704-5712-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-13 15:01 ` [PATCH for-next 1/4] IB/core: RoCE IP based GID addressing Or Gerlitz
2013-06-13 15:01 ` [PATCH for-next 2/4] IB/mlx4: " Or Gerlitz
2013-06-13 15:01 ` [PATCH for-next 3/4] IB/core: Infra-structure to support verbs extensions through uverbs Or Gerlitz
2013-06-13 15:01 ` [PATCH for-next 4/4] IB/core: Add RoCE IP based addressing extensions towards user space Or Gerlitz
[not found] ` <1371135704-5712-5-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-13 17:09 ` Jason Gunthorpe
[not found] ` <20130613170939.GB21570-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-14 20:42 ` Or Gerlitz
2013-06-13 17:00 ` [PATCH for-next 0/4] IP based RoCE GID Addressing Jason Gunthorpe
[not found] ` <20130613170011.GA21570-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-06-14 20:35 ` Or Gerlitz
[not found] ` <CALsNU1PdjV2N4c3j9gQ0BRrd4GhOypDMMjmHEgF1ffCYOXeC1g@mail.gmail.com>
[not found] ` <CALsNU1PdjV2N4c3j9gQ0BRrd4GhOypDMMjmHEgF1ffCYOXeC1g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-07-16 3:30 ` Or Gerlitz