* [RFC PATCH 0/5] Add Core Capability Bits to use in Management helpers
@ 2015-05-04 6:14 ira.weiny-ral2JQCrhuEAvxtiuMwx3w
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-05-04 6:14 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny
From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The following 5 patches use new Core Capability bits to signal core
capabilities rather than inferring these capabilities based on the protocols.
The first 3 are the beginning of converting Michael's patches over to a bit
mask. Before converting all the functions I wanted to get consensus with a
smaller patch series.
In addition I have included the additional OPA flags which will be needed for
the OPA series to show how new support can be communicated. If we are in
agreement then I can complete the series with an official submission.
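
To make the layout concrete, here is a small userspace sketch of how one 64-bit flag word can carry management, address-format and protocol information at the same time. The flag values and field ranges are copied from patch 1/5; the `EX_*` masks and `ex_*` helpers are hypothetical names used only for this illustration and are not part of the series.

```c
#include <stdint.h>

/* Flag values copied from patch 1/5 of this series. */
#define RDMA_CORE_CAP_IB_MAD     0x0000000000000001ULL
#define RDMA_CORE_CAP_IW_CM      0x0000000000000008ULL
#define RDMA_CORE_CAP_PROT_IB    0x0001000000000000ULL
#define RDMA_CORE_CAP_PROT_IWARP 0x0004000000000000ULL

/* Field ranges from the comments in patch 1/5; macro names are hypothetical. */
#define EX_MGMT_MASK 0x00000000FFFFFFFFULL
#define EX_ADDR_MASK 0x0000FFFF00000000ULL
#define EX_PROT_MASK 0xFFFF000000000000ULL

/* Hypothetical helper: 1 if the flag word carries the capability, else 0. */
static int ex_has_cap(uint64_t flags, uint64_t cap)
{
	return !!(flags & cap);
}

/* Hypothetical helpers: isolate one field of the flag word. */
static uint64_t ex_mgmt_bits(uint64_t flags)
{
	return flags & EX_MGMT_MASK;
}

static uint64_t ex_prot_bits(uint64_t flags)
{
	return flags & EX_PROT_MASK;
}
```

With this layout a port can advertise a protocol and any subset of management features independently, which is what lets new flags (such as the OPA ones) be added without touching the protocol checks.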
Ira Weiny (5):
IB/core: Add Core Capability flags to ib_device
IB/core: Replace query_protocol callback with Core Capability flags
check
IB/core: Convert cap_ib_mad to core_cap_flags bit mask
IB/core: Add rdma_dev_max_mad_size call
IB/core: Add cap_opa_mad helper using RDMA_CORE_CAP_OPA_MAD flag
drivers/infiniband/core/device.c | 42 +++++++++++++++-
drivers/infiniband/core/mad.c | 4 ++
drivers/infiniband/hw/amso1100/c2_provider.c | 8 +---
drivers/infiniband/hw/amso1100/c2_rnic.c | 1 +
drivers/infiniband/hw/cxgb3/iwch_provider.c | 9 +---
drivers/infiniband/hw/cxgb4/provider.c | 9 +---
drivers/infiniband/hw/ehca/ehca_hca.c | 11 ++--
drivers/infiniband/hw/ehca/ehca_main.c | 1 -
drivers/infiniband/hw/ipath/ipath_verbs.c | 9 +---
drivers/infiniband/hw/mlx4/main.c | 14 ++----
drivers/infiniband/hw/mlx5/main.c | 10 +---
drivers/infiniband/hw/mthca/mthca_provider.c | 10 +---
drivers/infiniband/hw/nes/nes_verbs.c | 9 +---
drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 -
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 8 +--
drivers/infiniband/hw/qib/qib_verbs.c | 9 +---
drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 -
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 9 +--
include/rdma/ib_mad.h | 1 +
include/rdma/ib_verbs.h | 72 +++++++++++++++++++++++---
20 files changed, 142 insertions(+), 96 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-05-04 6:14 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny
From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Add Core capability flags to each port attribute and read those into ib_device
upon registration for each port.
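
The per-port read can be modelled outside the kernel as follows. This is only a sketch: `struct ex_reg_device`, `ex_query_fn`, `ex_query_sample` and `ex_read_core_cap_flags` are hypothetical stand-ins for `struct ib_device`, `ib_query_port()` and the new `read_core_cap_flags()`, and `calloc`/`free` replace `kzalloc`/`kfree`.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ib_device fields the patch relies on. */
struct ex_reg_device {
	uint8_t   phys_port_cnt;  /* ports are numbered 1..phys_port_cnt */
	int       is_switch;      /* models node_type == RDMA_NODE_IB_SWITCH */
	uint64_t *core_cap_flags; /* indexed directly by port number */
};

/* Hypothetical stand-in for ib_query_port(): returns 0 and fills *flags. */
typedef int (*ex_query_fn)(const struct ex_reg_device *dev,
			   unsigned int port, uint64_t *flags);

/* Sample callback: every port reports PROT_IB | IB_MAD (values from the patch). */
static int ex_query_sample(const struct ex_reg_device *dev,
			   unsigned int port, uint64_t *flags)
{
	(void)dev;
	(void)port;
	*flags = 0x0001000000000001ULL;
	return 0;
}

static int ex_read_core_cap_flags(struct ex_reg_device *dev, ex_query_fn query)
{
	unsigned int num_ports = dev->phys_port_cnt;
	unsigned int port;
	int ret;

	/* One extra slot so port N lives at index N; slot 0 is used by switches. */
	dev->core_cap_flags = calloc(num_ports + 1, sizeof(*dev->core_cap_flags));
	if (!dev->core_cap_flags)
		return -ENOMEM;

	for (port = 0; port <= num_ports; ++port) {
		if (port == 0 && !dev->is_switch)
			continue; /* port 0 only exists on switches */
		ret = query(dev, port, &dev->core_cap_flags[port]);
		if (ret)
			goto err;
	}
	return 0;
err:
	free(dev->core_cap_flags);
	dev->core_cap_flags = NULL;
	return ret;
}
```

The sketch uses an `unsigned int` loop counter so that the `<=` bound cannot wrap when the port count is at the top of the `u8` range; that is a choice made for the illustration, not something taken from the patch.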
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/device.c | 41 ++++++++++++++++++++++++++++++++++++++
include/rdma/ib_verbs.h | 22 ++++++++++++++++++++
2 files changed, 63 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index b360350..6a37255 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -262,6 +262,37 @@ out:
return ret;
}
+static int read_core_cap_flags(struct ib_device *device)
+{
+ struct ib_port_attr tprops;
+ int num_ports, ret = -ENOMEM;
+ u8 port_index;
+
+ num_ports = device->phys_port_cnt;
+
+ device->core_cap_flags = kzalloc(sizeof(*device->core_cap_flags)
+ * (num_ports+1),
+ GFP_KERNEL);
+ if (!device->core_cap_flags)
+ return -ENOMEM;
+
+ for (port_index = 0; port_index <= num_ports; ++port_index) {
+ if ((port_index == 0 && device->node_type != RDMA_NODE_IB_SWITCH))
+ continue;
+
+ ret = ib_query_port(device, port_index, &tprops);
+ if (ret)
+ goto err;
+
+ device->core_cap_flags[port_index] = tprops.core_cap_flags;
+ }
+
+ return 0;
+err:
+ kfree(device->core_cap_flags);
+ return ret;
+}
+
/**
* ib_register_device - Register an IB device with IB core
* @device:Device to register
@@ -302,12 +333,21 @@ int ib_register_device(struct ib_device *device,
goto out;
}
+ ret = read_core_cap_flags(device);
+ if (ret) {
+ dev_err(&device->dev, "Couldn't create Core Capability flags\n");
+ kfree(device->gid_tbl_len);
+ kfree(device->pkey_tbl_len);
+ goto out;
+ }
+
ret = ib_device_register_sysfs(device, port_callback);
if (ret) {
printk(KERN_WARNING "Couldn't register device %s with driver model\n",
device->name);
kfree(device->gid_tbl_len);
kfree(device->pkey_tbl_len);
+ kfree(device->core_cap_flags);
goto out;
}
@@ -351,6 +391,7 @@ void ib_unregister_device(struct ib_device *device)
kfree(device->gid_tbl_len);
kfree(device->pkey_tbl_len);
+ kfree(device->core_cap_flags);
mutex_unlock(&device_mutex);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index c724114..4de2758 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -353,11 +353,32 @@ union rdma_protocol_stats {
struct iw_protocol_stats iw;
};
+/* Define bits for the various functionality this port needs to be supported by
+ * the core.
+ */
+/* Management 0x00000000FFFFFFFF */
+#define RDMA_CORE_CAP_IB_MAD 0x0000000000000001ULL
+#define RDMA_CORE_CAP_IB_SMI 0x0000000000000002ULL
+#define RDMA_CORE_CAP_IB_CM 0x0000000000000004ULL
+#define RDMA_CORE_CAP_IW_CM 0x0000000000000008ULL
+#define RDMA_CORE_CAP_IB_SA 0x0000000000000010ULL
+
+/* Address format 0x0000FFFF00000000 */
+#define RDMA_CORE_CAP_AF_IB 0x0000000100000000ULL
+#define RDMA_CORE_CAP_ETH_AH 0x0000000200000000ULL
+
+/* Protocol 0xFFFF000000000000 */
+#define RDMA_CORE_CAP_PROT_IB 0x0001000000000000ULL
+#define RDMA_CORE_CAP_PROT_IBOE 0x0002000000000000ULL
+#define RDMA_CORE_CAP_PROT_IWARP 0x0004000000000000ULL
+#define RDMA_CORE_CAP_PROT_USNIC_UDP 0x0008000000000000ULL
+
struct ib_port_attr {
enum ib_port_state state;
enum ib_mtu max_mtu;
enum ib_mtu active_mtu;
int gid_tbl_len;
+ u64 core_cap_flags;
u32 port_cap_flags;
u32 max_msg_sz;
u32 bad_pkey_cntr;
@@ -1684,6 +1705,7 @@ struct ib_device {
u32 local_dma_lkey;
u8 node_type;
u8 phys_port_cnt;
+ u64 *core_cap_flags; /* Per port core capability flags */
};
struct ib_client {
--
1.7.1
* [RFC PATCH 2/5] IB/core: Replace query_protocol callback with Core Capability flags check
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-05-04 6:14 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny
From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Replace the query_protocol callback with a check against the Core Capability
flags. This is more efficient than a function call.
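
As a rough userspace illustration of the flag check that replaces the callback, the sketch below tests a per-port flag word that was filled in once at registration. The flag values are copied from patch 1/5; `struct ex_proto_device` and the `ex_*` helpers are hypothetical names, and the `assert` bounds checks are an addition for the sketch that the kernel helpers do not have.

```c
#include <assert.h>
#include <stdint.h>

/* Protocol flag values copied from patch 1/5. */
#define RDMA_CORE_CAP_PROT_IB    0x0001000000000000ULL
#define RDMA_CORE_CAP_PROT_IBOE  0x0002000000000000ULL
#define RDMA_CORE_CAP_PROT_IWARP 0x0004000000000000ULL

/* Hypothetical stand-in for struct ib_device. */
struct ex_proto_device {
	uint8_t         phys_port_cnt;
	const uint64_t *core_cap_flags; /* one word per port, index = port number */
};

/* Models rdma_protocol_ib(): an array load and a mask, no indirect call. */
static inline int ex_protocol_ib(const struct ex_proto_device *dev,
				 uint8_t port_num)
{
	assert(port_num <= dev->phys_port_cnt);
	return !!(dev->core_cap_flags[port_num] & RDMA_CORE_CAP_PROT_IB);
}

/* Models rdma_ib_or_iboe(): both protocol bits tested with one mask. */
static inline int ex_ib_or_iboe(const struct ex_proto_device *dev,
				uint8_t port_num)
{
	assert(port_num <= dev->phys_port_cnt);
	return !!(dev->core_cap_flags[port_num] &
		  (RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_PROT_IBOE));
}
```

Combining the two protocol bits into a single mask is equivalent to the two separate tests in the patch; it is shown here only because it reads more compactly.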
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/device.c | 1 -
drivers/infiniband/hw/amso1100/c2_provider.c | 8 +-------
drivers/infiniband/hw/cxgb3/iwch_provider.c | 8 +-------
drivers/infiniband/hw/cxgb4/provider.c | 8 +-------
drivers/infiniband/hw/ehca/ehca_hca.c | 8 ++------
drivers/infiniband/hw/ehca/ehca_main.c | 1 -
drivers/infiniband/hw/ipath/ipath_verbs.c | 8 +-------
drivers/infiniband/hw/mlx4/main.c | 13 +++----------
drivers/infiniband/hw/mlx5/main.c | 9 ++-------
drivers/infiniband/hw/mthca/mthca_provider.c | 8 +-------
drivers/infiniband/hw/nes/nes_verbs.c | 8 +-------
drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 -
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 7 +------
drivers/infiniband/hw/qib/qib_verbs.c | 8 +-------
drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 -
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 7 +------
include/rdma/ib_verbs.h | 13 ++++++-------
17 files changed, 22 insertions(+), 95 deletions(-)
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 6a37255..fd8961a 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -76,7 +76,6 @@ static int ib_device_check_mandatory(struct ib_device *device)
} mandatory_table[] = {
IB_MANDATORY_FUNC(query_device),
IB_MANDATORY_FUNC(query_port),
- IB_MANDATORY_FUNC(query_protocol),
IB_MANDATORY_FUNC(query_pkey),
IB_MANDATORY_FUNC(query_gid),
IB_MANDATORY_FUNC(alloc_pd),
diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c
index 6fe329a..b9300db 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -95,16 +95,11 @@ static int c2_query_port(struct ib_device *ibdev,
props->qkey_viol_cntr = 0;
props->active_width = 1;
props->active_speed = IB_SPEED_SDR;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IWARP;
return 0;
}
-static enum rdma_protocol_type
-c2_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IWARP;
-}
-
static int c2_query_pkey(struct ib_device *ibdev,
u8 port, u16 index, u16 * pkey)
{
@@ -807,7 +802,6 @@ int c2_register_device(struct c2_dev *dev)
dev->ibdev.dma_device = &dev->pcidev->dev;
dev->ibdev.query_device = c2_query_device;
dev->ibdev.query_port = c2_query_port;
- dev->ibdev.query_protocol = c2_query_protocol;
dev->ibdev.query_pkey = c2_query_pkey;
dev->ibdev.query_gid = c2_query_gid;
dev->ibdev.alloc_ucontext = c2_alloc_ucontext;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 298d1ca..e270846 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1228,16 +1228,11 @@ static int iwch_query_port(struct ib_device *ibdev,
props->active_width = 2;
props->active_speed = IB_SPEED_DDR;
props->max_msg_sz = -1;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IWARP;
return 0;
}
-static enum rdma_protocol_type
-iwch_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IWARP;
-}
-
static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
char *buf)
{
@@ -1391,7 +1386,6 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.dma_device = &(dev->rdev.rnic_info.pdev->dev);
dev->ibdev.query_device = iwch_query_device;
dev->ibdev.query_port = iwch_query_port;
- dev->ibdev.query_protocol = iwch_query_protocol;
dev->ibdev.query_pkey = iwch_query_pkey;
dev->ibdev.query_gid = iwch_query_gid;
dev->ibdev.alloc_ucontext = iwch_alloc_ucontext;
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index f52ee63..ff63344 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -386,16 +386,11 @@ static int c4iw_query_port(struct ib_device *ibdev, u8 port,
props->active_width = 2;
props->active_speed = IB_SPEED_DDR;
props->max_msg_sz = -1;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IWARP;
return 0;
}
-static enum rdma_protocol_type
-c4iw_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IWARP;
-}
-
static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
char *buf)
{
@@ -512,7 +507,6 @@ int c4iw_register_device(struct c4iw_dev *dev)
dev->ibdev.dma_device = &(dev->rdev.lldi.pdev->dev);
dev->ibdev.query_device = c4iw_query_device;
dev->ibdev.query_port = c4iw_query_port;
- dev->ibdev.query_protocol = c4iw_query_protocol;
dev->ibdev.query_pkey = c4iw_query_pkey;
dev->ibdev.query_gid = c4iw_query_gid;
dev->ibdev.alloc_ucontext = c4iw_alloc_ucontext;
diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index 1f4dc9c..75f5353 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -236,18 +236,14 @@ int ehca_query_port(struct ib_device *ibdev,
props->active_speed = IB_SPEED_SDR;
}
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+
query_port1:
ehca_free_fw_ctrlblock(rblock);
return ret;
}
-enum rdma_protocol_type
-ehca_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IB;
-}
-
int ehca_query_sma_attr(struct ehca_shca *shca,
u8 port, struct ehca_sma_attr *attr)
{
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 321545b..cd8d290 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -467,7 +467,6 @@ static int ehca_init_device(struct ehca_shca *shca)
shca->ib_device.dma_device = &shca->ofdev->dev;
shca->ib_device.query_device = ehca_query_device;
shca->ib_device.query_port = ehca_query_port;
- shca->ib_device.query_protocol = ehca_query_protocol;
shca->ib_device.query_gid = ehca_query_gid;
shca->ib_device.query_pkey = ehca_query_pkey;
/* shca->in_device.modify_device = ehca_modify_device */
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
index 34b94c3..94a03eb 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
@@ -1634,16 +1634,11 @@ static int ipath_query_port(struct ib_device *ibdev,
}
props->active_mtu = mtu;
props->subnet_timeout = dev->subnet_timeout;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
return 0;
}
-static enum rdma_protocol_type
-ipath_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IB;
-}
-
static int ipath_modify_device(struct ib_device *device,
int device_modify_mask,
struct ib_device_modify *device_modify)
@@ -2146,7 +2141,6 @@ int ipath_register_ib_device(struct ipath_devdata *dd)
dev->query_device = ipath_query_device;
dev->modify_device = ipath_modify_device;
dev->query_port = ipath_query_port;
- dev->query_protocol = ipath_query_protocol;
dev->modify_port = ipath_modify_port;
dev->query_pkey = ipath_query_pkey;
dev->query_gid = ipath_query_gid;
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 26678d2..fa4e45c 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -330,6 +330,8 @@ static int ib_link_query_port(struct ib_device *ibdev, u8 port,
if (props->state == IB_PORT_DOWN)
props->active_speed = IB_SPEED_SDR;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+
out:
kfree(in_mad);
kfree(out_mad);
@@ -390,6 +392,7 @@ static int eth_link_query_port(struct ib_device *ibdev, u8 port,
props->state = (netif_running(ndev) && netif_carrier_ok(ndev)) ?
IB_PORT_ACTIVE : IB_PORT_DOWN;
props->phys_state = state_to_phys_state(props->state);
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IBOE;
out_unlock:
spin_unlock_bh(&iboe->lock);
if (is_bonded)
@@ -420,15 +423,6 @@ static int mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
return __mlx4_ib_query_port(ibdev, port, props, 0);
}
-static enum rdma_protocol_type
-mlx4_ib_query_protocol(struct ib_device *device, u8 port_num)
-{
- struct mlx4_dev *dev = to_mdev(device)->dev;
-
- return dev->caps.port_mask[port_num] == MLX4_PORT_TYPE_IB ?
- RDMA_PROTOCOL_IB : RDMA_PROTOCOL_IBOE;
-}
-
int __mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
union ib_gid *gid, int netw_view)
{
@@ -2211,7 +2205,6 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ibdev->ib_dev.query_device = mlx4_ib_query_device;
ibdev->ib_dev.query_port = mlx4_ib_query_port;
- ibdev->ib_dev.query_protocol = mlx4_ib_query_protocol;
ibdev->ib_dev.get_link_layer = mlx4_ib_port_link_layer;
ibdev->ib_dev.query_gid = mlx4_ib_query_gid;
ibdev->ib_dev.query_pkey = mlx4_ib_query_pkey;
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 8dec380..e3cdc17 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -255,6 +255,8 @@ int mlx5_ib_query_port(struct ib_device *ibdev, u8 port,
}
}
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+
out:
kfree(in_mad);
kfree(out_mad);
@@ -262,12 +264,6 @@ out:
return err;
}
-static enum rdma_protocol_type
-mlx5_ib_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IB;
-}
-
static int mlx5_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
union ib_gid *gid)
{
@@ -1250,7 +1246,6 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
dev->ib_dev.query_device = mlx5_ib_query_device;
dev->ib_dev.query_port = mlx5_ib_query_port;
- dev->ib_dev.query_protocol = mlx5_ib_query_protocol;
dev->ib_dev.query_gid = mlx5_ib_query_gid;
dev->ib_dev.query_pkey = mlx5_ib_query_pkey;
dev->ib_dev.modify_device = mlx5_ib_modify_device;
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index ad1cca3..e49aece 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -172,6 +172,7 @@ static int mthca_query_port(struct ib_device *ibdev,
props->subnet_timeout = out_mad->data[51] & 0x1f;
props->max_vl_num = out_mad->data[37] >> 4;
props->init_type_reply = out_mad->data[41] >> 4;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
out:
kfree(in_mad);
@@ -179,12 +180,6 @@ static int mthca_query_port(struct ib_device *ibdev,
return err;
}
-static enum rdma_protocol_type
-mthca_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IB;
-}
-
static int mthca_modify_device(struct ib_device *ibdev,
int mask,
struct ib_device_modify *props)
@@ -1287,7 +1282,6 @@ int mthca_register_device(struct mthca_dev *dev)
dev->ib_dev.dma_device = &dev->pdev->dev;
dev->ib_dev.query_device = mthca_query_device;
dev->ib_dev.query_port = mthca_query_port;
- dev->ib_dev.query_protocol = mthca_query_protocol;
dev->ib_dev.modify_device = mthca_modify_device;
dev->ib_dev.modify_port = mthca_modify_port;
dev->ib_dev.query_pkey = mthca_query_pkey;
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index 027f6d1..187eda6 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -602,16 +602,11 @@ static int nes_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr
props->active_width = IB_WIDTH_4X;
props->active_speed = IB_SPEED_SDR;
props->max_msg_sz = 0x80000000;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IWARP;
return 0;
}
-static enum rdma_protocol_type
-nes_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IWARP;
-}
-
/**
* nes_query_pkey
*/
@@ -3884,7 +3879,6 @@ struct nes_ib_device *nes_init_ofa_device(struct net_device *netdev)
nesibdev->ibdev.dev.parent = &nesdev->pcidev->dev;
nesibdev->ibdev.query_device = nes_query_device;
nesibdev->ibdev.query_port = nes_query_port;
- nesibdev->ibdev.query_protocol = nes_query_protocol;
nesibdev->ibdev.query_pkey = nes_query_pkey;
nesibdev->ibdev.query_gid = nes_query_gid;
nesibdev->ibdev.alloc_ucontext = nes_alloc_ucontext;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index 85d99e9..7a2b59a 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -244,7 +244,6 @@ static int ocrdma_register_device(struct ocrdma_dev *dev)
/* mandatory verbs. */
dev->ibdev.query_device = ocrdma_query_device;
dev->ibdev.query_port = ocrdma_query_port;
- dev->ibdev.query_protocol = ocrdma_query_protocol;
dev->ibdev.modify_port = ocrdma_modify_port;
dev->ibdev.query_gid = ocrdma_query_gid;
dev->ibdev.get_link_layer = ocrdma_link_layer;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 3e98360..4b8f8e6 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -184,15 +184,10 @@ int ocrdma_query_port(struct ib_device *ibdev,
&props->active_width);
props->max_msg_sz = 0x80000000;
props->max_vl_num = 4;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IBOE;
return 0;
}
-enum rdma_protocol_type
-ocrdma_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IBOE;
-}
-
int ocrdma_modify_port(struct ib_device *ibdev, u8 port, int mask,
struct ib_port_modify *props)
{
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 9fd4b28..6e52946 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1646,16 +1646,11 @@ static int qib_query_port(struct ib_device *ibdev, u8 port,
}
props->active_mtu = mtu;
props->subnet_timeout = ibp->subnet_timeout;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
return 0;
}
-static enum rdma_protocol_type
-qib_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_IB;
-}
-
static int qib_modify_device(struct ib_device *device,
int device_modify_mask,
struct ib_device_modify *device_modify)
@@ -2190,7 +2185,6 @@ int qib_register_ib_device(struct qib_devdata *dd)
ibdev->query_device = qib_query_device;
ibdev->modify_device = qib_modify_device;
ibdev->query_port = qib_query_port;
- ibdev->query_protocol = qib_query_protocol;
ibdev->modify_port = qib_modify_port;
ibdev->query_pkey = qib_query_pkey;
ibdev->query_gid = qib_query_gid;
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
index bd9f364..0d0f986 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
@@ -360,7 +360,6 @@ static void *usnic_ib_device_add(struct pci_dev *dev)
us_ibdev->ib_dev.query_device = usnic_ib_query_device;
us_ibdev->ib_dev.query_port = usnic_ib_query_port;
- us_ibdev->ib_dev.query_protocol = usnic_ib_query_protocol;
us_ibdev->ib_dev.query_pkey = usnic_ib_query_pkey;
us_ibdev->ib_dev.query_gid = usnic_ib_query_gid;
us_ibdev->ib_dev.get_link_layer = usnic_ib_port_link_layer;
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 732b5c5..83c61ef 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -344,16 +344,11 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
props->max_msg_sz = us_ibdev->ufdev->mtu;
props->max_vl_num = 1;
mutex_unlock(&us_ibdev->usdev_lock);
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_USNIC_UDP;
return 0;
}
-enum rdma_protocol_type
-usnic_ib_query_protocol(struct ib_device *device, u8 port_num)
-{
- return RDMA_PROTOCOL_USNIC_UDP;
-}
-
int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 4de2758..dcaee4f 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1529,8 +1529,6 @@ struct ib_device {
int (*query_port)(struct ib_device *device,
u8 port_num,
struct ib_port_attr *port_attr);
- enum rdma_protocol_type (*query_protocol)(struct ib_device *device,
- u8 port_num);
enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
u8 port_num);
int (*query_gid)(struct ib_device *device,
@@ -1776,24 +1774,25 @@ enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device,
static inline int rdma_protocol_ib(struct ib_device *device, u8 port_num)
{
- return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IB;
+ return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_PROT_IB);
}
static inline int rdma_protocol_iboe(struct ib_device *device, u8 port_num)
{
- return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IBOE;
+ return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_PROT_IBOE);
}
static inline int rdma_protocol_iwarp(struct ib_device *device, u8 port_num)
{
- return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IWARP;
+ return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_PROT_IWARP);
}
static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
{
- enum rdma_protocol_type pt = device->query_protocol(device, port_num);
+ u64 flags = device->core_cap_flags[port_num];
- return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
+ return (!!(flags & RDMA_CORE_CAP_PROT_IB) ||
+ !!(flags & RDMA_CORE_CAP_PROT_IBOE));
}
/**
--
1.7.1
* [RFC PATCH 3/5] IB/core: Convert cap_ib_mad to core_cap_flags bit mask
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-05-04 6:14 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny
From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Use the new Core Capability bits instead of inferring this support from the
protocol.
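
The difference between inferring MAD support and reading a dedicated bit can be sketched in userspace as below. Flag values are copied from patch 1/5; `ex_mad_inferred` and `ex_mad_explicit` are hypothetical names that model `cap_ib_mad()` before and after this patch.

```c
#include <stdint.h>

/* Flag values copied from patch 1/5. */
#define RDMA_CORE_CAP_IB_MAD     0x0000000000000001ULL
#define RDMA_CORE_CAP_PROT_IB    0x0001000000000000ULL
#define RDMA_CORE_CAP_PROT_IBOE  0x0002000000000000ULL
#define RDMA_CORE_CAP_PROT_IWARP 0x0004000000000000ULL

/* Before: MAD support is inferred from the protocol (IB or IBoE). */
static int ex_mad_inferred(uint64_t flags)
{
	return !!(flags & (RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_PROT_IBOE));
}

/* After: MAD support is its own bit, set explicitly by the driver. */
static int ex_mad_explicit(uint64_t flags)
{
	return !!(flags & RDMA_CORE_CAP_IB_MAD);
}
```

The two functions agree for every flag word that the drivers touched by this patch now report. They disagree for a word containing only `RDMA_CORE_CAP_PROT_IBOE`, which is what a driver reports if it set that flag in patch 2/5 and is not listed in this patch's diffstat (ocrdma, for example); the posting does not say whether that change in behaviour is intended.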
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/ehca/ehca_hca.c | 2 +-
drivers/infiniband/hw/ipath/ipath_verbs.c | 2 +-
drivers/infiniband/hw/mlx4/main.c | 4 ++--
drivers/infiniband/hw/mlx5/main.c | 2 +-
drivers/infiniband/hw/mthca/mthca_provider.c | 2 +-
drivers/infiniband/hw/qib/qib_verbs.c | 2 +-
include/rdma/ib_verbs.h | 2 +-
7 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index 75f5353..a06eadd 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -236,7 +236,7 @@ int ehca_query_port(struct ib_device *ibdev,
props->active_speed = IB_SPEED_SDR;
}
- props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
query_port1:
ehca_free_fw_ctrlblock(rblock);
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
index 94a03eb..bb66aa7 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
@@ -1634,7 +1634,7 @@ static int ipath_query_port(struct ib_device *ibdev,
}
props->active_mtu = mtu;
props->subnet_timeout = dev->subnet_timeout;
- props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
return 0;
}
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index fa4e45c..9db5fdc 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -330,7 +330,7 @@ static int ib_link_query_port(struct ib_device *ibdev, u8 port,
if (props->state == IB_PORT_DOWN)
props->active_speed = IB_SPEED_SDR;
- props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
out:
kfree(in_mad);
@@ -392,7 +392,7 @@ static int eth_link_query_port(struct ib_device *ibdev, u8 port,
props->state = (netif_running(ndev) && netif_carrier_ok(ndev)) ?
IB_PORT_ACTIVE : IB_PORT_DOWN;
props->phys_state = state_to_phys_state(props->state);
- props->core_cap_flags = RDMA_CORE_CAP_PROT_IBOE;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IBOE | RDMA_CORE_CAP_IB_MAD;
out_unlock:
spin_unlock_bh(&iboe->lock);
if (is_bonded)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index e3cdc17..4b8ef01 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -255,7 +255,7 @@ int mlx5_ib_query_port(struct ib_device *ibdev, u8 port,
}
}
- props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
out:
kfree(in_mad);
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index e49aece..60fcc02 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -172,7 +172,7 @@ static int mthca_query_port(struct ib_device *ibdev,
props->subnet_timeout = out_mad->data[51] & 0x1f;
props->max_vl_num = out_mad->data[37] >> 4;
props->init_type_reply = out_mad->data[41] >> 4;
- props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
out:
kfree(in_mad);
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 6e52946..6975528 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1646,7 +1646,7 @@ static int qib_query_port(struct ib_device *ibdev, u8 port,
}
props->active_mtu = mtu;
props->subnet_timeout = ibp->subnet_timeout;
- props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
+ props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
return 0;
}
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index dcaee4f..6c2b0e5 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1807,7 +1807,7 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
*/
static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
{
- return rdma_ib_or_iboe(device, port_num);
+ return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_IB_MAD);
}
/**
--
1.7.1
* [RFC PATCH 4/5] IB/core: Add rdma_dev_max_mad_size call
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-05-04 6:14 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny
From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Change all IB drivers to report their max MAD size through the
rdma_dev_max_mad_size helper function.
Set all current devices to IB_MGMT_MAD_SIZE and add a check to verify that
all devices support at least IB_MGMT_MAD_SIZE.
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
Changes from the original OPA series V4:
Change MAD size error message to a WARN_ON
Remove reference to cached_dev_attr
Add rdma_dev_max_mad_size
drivers/infiniband/core/mad.c | 4 ++++
drivers/infiniband/hw/amso1100/c2_rnic.c | 1 +
drivers/infiniband/hw/cxgb3/iwch_provider.c | 1 +
drivers/infiniband/hw/cxgb4/provider.c | 1 +
drivers/infiniband/hw/ehca/ehca_hca.c | 3 +++
drivers/infiniband/hw/ipath/ipath_verbs.c | 1 +
drivers/infiniband/hw/mlx4/main.c | 1 +
drivers/infiniband/hw/mlx5/main.c | 1 +
drivers/infiniband/hw/mthca/mthca_provider.c | 2 ++
drivers/infiniband/hw/nes/nes_verbs.c | 1 +
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 1 +
drivers/infiniband/hw/qib/qib_verbs.c | 1 +
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 2 ++
include/rdma/ib_mad.h | 1 +
include/rdma/ib_verbs.h | 19 +++++++++++++++++++
15 files changed, 40 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 0749d7b..09578a6 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2923,6 +2923,10 @@ static int ib_mad_port_open(struct ib_device *device,
unsigned long flags;
char name[sizeof "ib_mad123"];
int has_smi;
+ size_t max_mad_size = rdma_dev_max_mad_size(device, port_num);
+
+ if (WARN_ON(max_mad_size < IB_MGMT_MAD_SIZE))
+ return -EFAULT;
/* Create new device info */
port_priv = kzalloc(sizeof *port_priv, GFP_KERNEL);
diff --git a/drivers/infiniband/hw/amso1100/c2_rnic.c b/drivers/infiniband/hw/amso1100/c2_rnic.c
index d2a6d96..63322c0 100644
--- a/drivers/infiniband/hw/amso1100/c2_rnic.c
+++ b/drivers/infiniband/hw/amso1100/c2_rnic.c
@@ -197,6 +197,7 @@ static int c2_rnic_query(struct c2_dev *c2dev, struct ib_device_attr *props)
props->max_srq_sge = 0;
props->max_pkeys = 0;
props->local_ca_ack_delay = 0;
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
bail2:
vq_repbuf_free(c2dev, reply);
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index e270846..714afdc 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1174,6 +1174,7 @@ static int iwch_query_device(struct ib_device *ibdev,
props->max_pd = dev->attr.max_pds;
props->local_ca_ack_delay = 0;
props->max_fast_reg_page_list_len = T3_MAX_FASTREG_DEPTH;
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
return 0;
}
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index ff63344..8261b11 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -332,6 +332,7 @@ static int c4iw_query_device(struct ib_device *ibdev,
props->max_pd = T4_MAX_NUM_PD;
props->local_ca_ack_delay = 0;
props->max_fast_reg_page_list_len = t4_max_fr_depth(use_dsgl);
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
return 0;
}
diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index a06eadd..8ed3472 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -40,6 +40,7 @@
*/
#include <linux/gfp.h>
+#include <rdma/ib_mad.h>
#include "ehca_tools.h"
#include "ehca_iverbs.h"
@@ -133,6 +134,8 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props)
if (rblock->hca_cap_indicators & cap_mapping[i + 1])
props->device_cap_flags |= cap_mapping[i];
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
+
query_device1:
ehca_free_fw_ctrlblock(rblock);
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
index bb66aa7..85b8ad1 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
@@ -1538,6 +1538,7 @@ static int ipath_query_device(struct ib_device *ibdev,
props->max_mcast_qp_attach = ib_ipath_max_mcast_qp_attached;
props->max_total_mcast_qp_attach = props->max_mcast_qp_attach *
props->max_mcast_grp;
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
return 0;
}
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 9db5fdc..881dcd7 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -229,6 +229,7 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
props->max_total_mcast_qp_attach = props->max_mcast_qp_attach *
props->max_mcast_grp;
props->max_map_per_fmr = dev->dev->caps.max_fmr_maps;
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
out:
kfree(in_mad);
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 4b8ef01..68f2d2c 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -154,6 +154,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
props->max_total_mcast_qp_attach = props->max_mcast_qp_attach *
props->max_mcast_grp;
props->max_map_per_fmr = INT_MAX; /* no limit in ConnectIB */
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
if (dev->mdev->caps.gen.flags & MLX5_DEV_CAP_FLAG_ON_DMND_PG)
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 60fcc02..811e3dd 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -123,6 +123,8 @@ static int mthca_query_device(struct ib_device *ibdev,
props->max_map_per_fmr =
(1 << (32 - ilog2(mdev->limits.num_mpts))) - 1;
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
+
err = 0;
out:
kfree(in_mad);
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index 187eda6..d16fdd5 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -555,6 +555,7 @@ static int nes_query_device(struct ib_device *ibdev, struct ib_device_attr *prop
props->max_qp_init_rd_atom = props->max_qp_rd_atom;
props->atomic_cap = IB_ATOMIC_NONE;
props->max_map_per_fmr = 1;
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
return 0;
}
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 4b8f8e6..8d18053 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -103,6 +103,7 @@ int ocrdma_query_device(struct ib_device *ibdev, struct ib_device_attr *attr)
attr->local_ca_ack_delay = dev->attr.local_ca_ack_delay;
attr->max_fast_reg_page_list_len = dev->attr.max_pages_per_frmr;
attr->max_pkeys = 1;
+ attr->max_mad_size = IB_MGMT_MAD_SIZE;
return 0;
}
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 6975528..b95d14a 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1592,6 +1592,7 @@ static int qib_query_device(struct ib_device *ibdev,
props->max_mcast_qp_attach = ib_qib_max_mcast_qp_attached;
props->max_total_mcast_qp_attach = props->max_mcast_qp_attach *
props->max_mcast_grp;
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
return 0;
}
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 83c61ef..be04372 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -22,6 +22,7 @@
#include <rdma/ib_user_verbs.h>
#include <rdma/ib_addr.h>
+#include <rdma/ib_mad.h>
#include "usnic_abi.h"
#include "usnic_ib.h"
@@ -296,6 +297,7 @@ int usnic_ib_query_device(struct ib_device *ibdev,
props->max_mcast_qp_attach = 0;
props->max_total_mcast_qp_attach = 0;
props->max_map_per_fmr = 0;
+ props->max_mad_size = IB_MGMT_MAD_SIZE;
/* Owned by Userspace
* max_qp_wr, max_sge, max_sge_rd, max_cqe */
mutex_unlock(&us_ibdev->usdev_lock);
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index 9c89939..5823016 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -135,6 +135,7 @@ enum {
IB_MGMT_SA_DATA = 200,
IB_MGMT_DEVICE_HDR = 64,
IB_MGMT_DEVICE_DATA = 192,
+ IB_MGMT_MAD_SIZE = IB_MGMT_MAD_HDR + IB_MGMT_MAD_DATA,
};
struct ib_mad_hdr {
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6c2b0e5..01bdf12 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -217,6 +217,7 @@ struct ib_device_attr {
int sig_prot_cap;
int sig_guard_cap;
struct ib_odp_caps odp_caps;
+ u32 max_mad_size;
};
enum ib_mtu {
@@ -1930,6 +1931,24 @@ static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
return !rdma_protocol_iwarp(device, port_num);
}
+/**
+ * rdma_dev_max_mad_size - Return the max MAD size required by this RDMA Port.
+ *
+ * @device: Device
+ * @port_num: Port number
+ *
+ * Return the max MAD size required by the Port. May return 0 if the port does
+ * not support MADs
+ */
+static inline size_t rdma_dev_max_mad_size(struct ib_device *device,
+ u8 port_num)
+{
+ struct ib_device_attr attr;
+
+ device->query_device(device, &attr);
+ return attr.max_mad_size;
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);
--
1.7.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [RFC PATCH 5/5] IB/core: Add cap_opa_mad helper using RDMA_CORE_CAP_OPA_MAD flag
[not found] ` <1430720099-32512-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
` (3 preceding siblings ...)
2015-05-04 6:14 ` [RFC PATCH 4/5] IB/core: Add rdma_dev_max_mad_size call ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2015-05-04 6:14 ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
4 siblings, 0 replies; 50+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2015-05-04 6:14 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny
From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
OPA MADs share a common header with IBTA MADs but with a different base version
and an extended length. These MADs increase the performance of management
traffic on OPA devices.
Sharing a common header with IBTA MADs allows us to share most of the MAD
processing code when dealing with OPA MADs in addition to supporting some IBTA
MADs on OPA devices.
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
Changes from the OPA series v4:
Use new cap_opa_mad and RDMA_CORE_CAP_OPA_MAD flag
include/rdma/ib_verbs.h | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 01bdf12..31f1ff9 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -363,6 +363,7 @@ union rdma_protocol_stats {
#define RDMA_CORE_CAP_IB_CM 0x0000000000000004ULL
#define RDMA_CORE_CAP_IW_CM 0x0000000000000008ULL
#define RDMA_CORE_CAP_IB_SA 0x0000000000000010ULL
+#define RDMA_CORE_CAP_OPA_MAD 0x0000000000000020ULL
/* Address format 0x0000FFFF00000000 */
#define RDMA_CORE_CAP_AF_IB 0x0000000100000000ULL
@@ -1812,6 +1813,21 @@ static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
}
/**
+ * cap_opa_mad - Check if the port of device supports OPA defined
+ * Management Datagrams.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device does not support OPA
+ * Management Datagrams.
+ */
+static inline int cap_opa_mad(struct ib_device *device, u8 port_num)
+{
+ return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_OPA_MAD);
+}
+
+/**
* cap_ib_smi - Check if the port of device has the capability Infiniband
* Subnet Management Interface.
*
--
1.7.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1430720099-32512-2-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-05-04 14:41 ` Doug Ledford
[not found] ` <1430750492.2407.9.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-04 16:42 ` Hefty, Sean
2015-05-04 18:36 ` Jason Gunthorpe
2 siblings, 1 reply; 50+ messages in thread
From: Doug Ledford @ 2015-05-04 14:41 UTC (permalink / raw)
To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
[-- Attachment #1: Type: text/plain, Size: 4864 bytes --]
On Mon, 2015-05-04 at 02:14 -0400, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> Add Core capability flags to each port attribute and read those into ib_device
> upon registration for each port.
>
> Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
> drivers/infiniband/core/device.c | 41 ++++++++++++++++++++++++++++++++++++++
> include/rdma/ib_verbs.h | 22 ++++++++++++++++++++
> 2 files changed, 63 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index b360350..6a37255 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -262,6 +262,37 @@ out:
> return ret;
> }
>
> +static int read_core_cap_flags(struct ib_device *device)
> +{
> + struct ib_port_attr tprops;
> + int num_ports, ret = -ENOMEM;
> + u8 port_index;
> +
> + num_ports = device->phys_port_cnt;
> +
> + device->core_cap_flags = kzalloc(sizeof(*device->core_cap_flags)
> + * (num_ports+1),
> + GFP_KERNEL);
> + if (!device->core_cap_flags)
> + return -ENOMEM;
> +
> + for (port_index = 0; port_index <= num_ports; ++port_index) {
> + if ((port_index == 0 && device->node_type != RDMA_NODE_IB_SWITCH))
> + continue;
> +
> + ret = ib_query_port(device, port_index, &tprops);
> + if (ret)
> + goto err;
> +
> + device->core_cap_flags[port_index] = tprops.core_cap_flags;
> + }
> +
> + return 0;
> +err:
> + kfree(device->core_cap_flags);
> + return ret;
> +}
> +
> /**
> * ib_register_device - Register an IB device with IB core
> * @device:Device to register
> @@ -302,12 +333,21 @@ int ib_register_device(struct ib_device *device,
> goto out;
> }
>
> + ret = read_core_cap_flags(device);
> + if (ret) {
> + dev_err(&device->dev, "Couldn't create Core Capability flags\n");
> + kfree(device->gid_tbl_len);
> + kfree(device->pkey_tbl_len);
> + goto out;
> + }
> +
> ret = ib_device_register_sysfs(device, port_callback);
> if (ret) {
> printk(KERN_WARNING "Couldn't register device %s with driver model\n",
> device->name);
> kfree(device->gid_tbl_len);
> kfree(device->pkey_tbl_len);
> + kfree(device->core_cap_flags);
> goto out;
> }
>
> @@ -351,6 +391,7 @@ void ib_unregister_device(struct ib_device *device)
>
> kfree(device->gid_tbl_len);
> kfree(device->pkey_tbl_len);
> + kfree(device->core_cap_flags);
>
> mutex_unlock(&device_mutex);
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index c724114..4de2758 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -353,11 +353,32 @@ union rdma_protocol_stats {
> struct iw_protocol_stats iw;
> };
>
> +/* Define bits for the various functionality this port needs to be supported by
> + * the core.
> + */
> +/* Management 0x00000000FFFFFFFF */
> +#define RDMA_CORE_CAP_IB_MAD 0x0000000000000001ULL
> +#define RDMA_CORE_CAP_IB_SMI 0x0000000000000002ULL
> +#define RDMA_CORE_CAP_IB_CM 0x0000000000000004ULL
> +#define RDMA_CORE_CAP_IW_CM 0x0000000000000008ULL
> +#define RDMA_CORE_CAP_IB_SA 0x0000000000000010ULL
> +
> +/* Address format 0x0000FFFF00000000 */
> +#define RDMA_CORE_CAP_AF_IB 0x0000000100000000ULL
> +#define RDMA_CORE_CAP_ETH_AH 0x0000000200000000ULL
> +
> +/* Protocol 0xFFFF000000000000 */
> +#define RDMA_CORE_CAP_PROT_IB 0x0001000000000000ULL
> +#define RDMA_CORE_CAP_PROT_IBOE 0x0002000000000000ULL
> +#define RDMA_CORE_CAP_PROT_IWARP 0x0004000000000000ULL
> +#define RDMA_CORE_CAP_PROT_USNIC_UDP 0x0008000000000000ULL
In accordance with what we've been talking about, drop IBOE for ROCE.
Drop the UDP off of USNIC, then define a bit for CAP_PROT_UDP_ENCAP.
USNIC will be just USNIC, USNIC_UDP will be USNIC | UDP_ENCAP, ROCE v1
will be ROCE, and ROCEv2 will be ROCE | UDP_ENCAP.
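A minimal sketch of the composition suggested here, written as userspace C. The names CAP_PROT_ROCE, CAP_PROT_USNIC and CAP_PROT_UDP_ENCAP and all bit values are placeholders chosen for illustration; they are not taken from any posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments; only the composition matters here. */
#define CAP_PROT_IB        0x0001u
#define CAP_PROT_ROCE      0x0002u
#define CAP_PROT_IWARP     0x0004u
#define CAP_PROT_USNIC     0x0008u
#define CAP_PROT_UDP_ENCAP 0x0010u

/* Composite protocols are a base protocol plus the encapsulation bit. */
#define CAP_PROT_ROCE_V2   (CAP_PROT_ROCE | CAP_PROT_UDP_ENCAP)
#define CAP_PROT_USNIC_UDP (CAP_PROT_USNIC | CAP_PROT_UDP_ENCAP)

/* The modifier bit must not collide with any base protocol bit. */
static_assert((CAP_PROT_UDP_ENCAP &
	       (CAP_PROT_IB | CAP_PROT_ROCE | CAP_PROT_IWARP | CAP_PROT_USNIC)) == 0,
	      "UDP_ENCAP must be a distinct bit");

/* True for RoCE v1 and v2 alike: both carry the base ROCE bit. */
static inline int port_is_roce(uint32_t flags)
{
	return !!(flags & CAP_PROT_ROCE);
}

/* True only when the base protocol is carried inside UDP. */
static inline int port_is_udp_encap(uint32_t flags)
{
	return !!(flags & CAP_PROT_UDP_ENCAP);
}
```

With this layout a caller that only cares about "some kind of RoCE" tests one bit, while a caller that needs to distinguish v1 from v2 additionally tests the encapsulation bit.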
> +
> struct ib_port_attr {
> enum ib_port_state state;
> enum ib_mtu max_mtu;
> enum ib_mtu active_mtu;
> int gid_tbl_len;
> + u64 core_cap_flags;
I think u32 should be enough here, and will help keep our footprint
smaller.
> u32 port_cap_flags;
> u32 max_msg_sz;
> u32 bad_pkey_cntr;
> @@ -1684,6 +1705,7 @@ struct ib_device {
> u32 local_dma_lkey;
> u8 node_type;
> u8 phys_port_cnt;
> + u64 *core_cap_flags; /* Per port core capability flags */
Ditto.
> };
>
> struct ib_client {
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1430750492.2407.9.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-05-04 16:40 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA17C-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Hefty, Sean @ 2015-05-04 16:40 UTC (permalink / raw)
To: Doug Ledford, Weiny, Ira
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > +/* Protocol 0xFFFF000000000000 */
> > +#define RDMA_CORE_CAP_PROT_IB 0x0001000000000000ULL
> > +#define RDMA_CORE_CAP_PROT_IBOE 0x0002000000000000ULL
> > +#define RDMA_CORE_CAP_PROT_IWARP 0x0004000000000000ULL
> > +#define RDMA_CORE_CAP_PROT_USNIC_UDP 0x0008000000000000ULL
>
> In accordance with what we've been talking about, drop IBOE for ROCE.
>
> Drop the UDP off of USNIC, then define a bit for CAP_PROT_UDP_ENCAP.
> USNIC will be just USNIC, USNIC_UDP will be USNIC | UDP_ENCAP, ROCE v1
> will be ROCE, and ROCEv2 will be ROCE | UDP_ENCAP.
USNIC_UDP is just UDP. I don't understand why we would want 'USNIC | UDP_ENCAP', or what UDP_ENCAP is intended to convey. Nothing is being encapsulated.
RoCEv2 is IB transport over UDP.
I'm not sure what the protocol field is intended to imply. If we want to expose the link, network, transport, and RDMA protocols in use, shouldn't these be separate fields or bits? And even then, I'm not sure what use this has for the ULPs. iWarp does not require Ethernet or TCP. RoCEv2 would work fine over any link. And the core layer should not assume that a device is limited to supporting only one protocol, especially at the network and transport levels. I vote for deprecating the protocol goofiness.
- Sean
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1430720099-32512-2-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-05-04 14:41 ` Doug Ledford
@ 2015-05-04 16:42 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA192-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 18:36 ` Jason Gunthorpe
2 siblings, 1 reply; 50+ messages in thread
From: Hefty, Sean @ 2015-05-04 16:42 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Weiny, Ira
> @@ -302,12 +333,21 @@ int ib_register_device(struct ib_device *device,
> goto out;
> }
>
> + ret = read_core_cap_flags(device);
> + if (ret) {
> + dev_err(&device->dev, "Couldn't create Core Capability flags\n");
> + kfree(device->gid_tbl_len);
> + kfree(device->pkey_tbl_len);
> + goto out;
> + }
> +
> ret = ib_device_register_sysfs(device, port_callback);
> if (ret) {
> printk(KERN_WARNING "Couldn't register device %s with driver model\n",
> device->name);
> kfree(device->gid_tbl_len);
> kfree(device->pkey_tbl_len);
> + kfree(device->core_cap_flags);
Use a common exit location to avoid duplicating the kfree's on errors.
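One way to read this suggestion is the usual goto-unwind pattern, sketched below as userspace C. The struct, the function name fake_register and the fail_at parameter are hypothetical stand-ins for illustration only; this is not the code that was eventually merged.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-device tables set up at registration. */
struct fake_device {
	int *gid_tbl_len;
	int *pkey_tbl_len;
	unsigned long long *core_cap_flags;
};

/*
 * Allocate three tables, then run a final step. Every failure unwinds
 * through the labels below, so each free() is written exactly once.
 * fail_at (1..4) forces the Nth step to fail; 0 means no forced failure.
 */
static int fake_register(struct fake_device *dev, int nports, int fail_at)
{
	size_t n;
	int ret = -ENOMEM;

	assert(nports > 0);
	n = (size_t)nports + 1;

	dev->gid_tbl_len = fail_at == 1 ? NULL : calloc(n, sizeof(*dev->gid_tbl_len));
	if (!dev->gid_tbl_len)
		goto out;

	dev->pkey_tbl_len = fail_at == 2 ? NULL : calloc(n, sizeof(*dev->pkey_tbl_len));
	if (!dev->pkey_tbl_len)
		goto err_free_gid;

	dev->core_cap_flags = fail_at == 3 ? NULL : calloc(n, sizeof(*dev->core_cap_flags));
	if (!dev->core_cap_flags)
		goto err_free_pkey;

	if (fail_at == 4) {	/* stands in for a failing sysfs registration */
		ret = -EINVAL;
		goto err_free_caps;
	}

	return 0;

err_free_caps:
	free(dev->core_cap_flags);
	dev->core_cap_flags = NULL;
err_free_pkey:
	free(dev->pkey_tbl_len);
	dev->pkey_tbl_len = NULL;
err_free_gid:
	free(dev->gid_tbl_len);
	dev->gid_tbl_len = NULL;
out:
	return ret;
}
```

Adding another per-port table later then costs one new label rather than one more kfree() in every error branch.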
> goto out;
> }
>
> @@ -351,6 +391,7 @@ void ib_unregister_device(struct ib_device *device)
>
> kfree(device->gid_tbl_len);
> kfree(device->pkey_tbl_len);
> + kfree(device->core_cap_flags);
>
> mutex_unlock(&device_mutex);
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index c724114..4de2758 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -353,11 +353,32 @@ union rdma_protocol_stats {
> struct iw_protocol_stats iw;
> };
>
> +/* Define bits for the various functionality this port needs to be supported by
> + * the core.
> + */
> +/* Management 0x00000000FFFFFFFF */
> +#define RDMA_CORE_CAP_IB_MAD 0x0000000000000001ULL
> +#define RDMA_CORE_CAP_IB_SMI 0x0000000000000002ULL
> +#define RDMA_CORE_CAP_IB_CM 0x0000000000000004ULL
> +#define RDMA_CORE_CAP_IW_CM 0x0000000000000008ULL
> +#define RDMA_CORE_CAP_IB_SA 0x0000000000000010ULL
> +
> +/* Address format 0x0000FFFF00000000 */
> +#define RDMA_CORE_CAP_AF_IB 0x0000000100000000ULL
> +#define RDMA_CORE_CAP_ETH_AH 0x0000000200000000ULL
> +
> +/* Protocol 0xFFFF000000000000 */
> +#define RDMA_CORE_CAP_PROT_IB 0x0001000000000000ULL
> +#define RDMA_CORE_CAP_PROT_IBOE 0x0002000000000000ULL
> +#define RDMA_CORE_CAP_PROT_IWARP 0x0004000000000000ULL
> +#define RDMA_CORE_CAP_PROT_USNIC_UDP 0x0008000000000000ULL
> +
> struct ib_port_attr {
> enum ib_port_state state;
> enum ib_mtu max_mtu;
> enum ib_mtu active_mtu;
> int gid_tbl_len;
> + u64 core_cap_flags;
> u32 port_cap_flags;
> u32 max_msg_sz;
> u32 bad_pkey_cntr;
> @@ -1684,6 +1705,7 @@ struct ib_device {
> u32 local_dma_lkey;
> u8 node_type;
> u8 phys_port_cnt;
> + u64 *core_cap_flags; /* Per port core capability flags */
Why are the core_cap_flags duplicated in the struct and in port attributes?
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA192-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-04 16:48 ` Doug Ledford
[not found] ` <1430758097.2407.59.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Doug Ledford @ 2015-05-04 16:48 UTC (permalink / raw)
To: Hefty, Sean
Cc: Weiny, Ira, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 1350 bytes --]
On Mon, 2015-05-04 at 16:42 +0000, Hefty, Sean wrote:
> > struct ib_port_attr {
> > enum ib_port_state state;
> > enum ib_mtu max_mtu;
> > enum ib_mtu active_mtu;
> > int gid_tbl_len;
> > + u64 core_cap_flags;
> > u32 port_cap_flags;
> > u32 max_msg_sz;
> > u32 bad_pkey_cntr;
> > @@ -1684,6 +1705,7 @@ struct ib_device {
> > u32 local_dma_lkey;
> > u8 node_type;
> > u8 phys_port_cnt;
> > + u64 *core_cap_flags; /* Per port core capability flags */
>
> Why are the core_cap_flags duplicated in the struct and in port attributes?
Because the per port ib_port_attr struct is not visible to the core
code, it must call ib_query_port each time it wants to see it. In order
to make the helpers avoid a call, you have to stash the cap_flags into
the ib_device struct. In the past, the only thing there was the
node_type, and it was per device and not per node. In order to make all
of this work and not require calls into the driver all the time, that
has to change to a per-port array. Eventually we could get rid of the
node_type element once everything is in place.
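The caching described above can be sketched in userspace C as follows. All names here (cap_device, fake_query_port_flags, fake_cache_flags, fake_cap_ib_mad, CAP_IB_MAD) are hypothetical, and the driver query is simulated by a counter; the sketch only illustrates querying each port once and answering later checks from the stored array. It starts at port 1, i.e. it assumes a non-switch device.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CAP_IB_MAD 0x1ull	/* hypothetical bit value */

struct cap_device {
	uint8_t phys_port_cnt;
	uint64_t *core_cap_flags;	/* indexed by port number, 0..phys_port_cnt */
	unsigned int driver_calls;	/* counts simulated calls into the driver */
};

/* Stand-in for a driver query such as ib_query_port(). */
static uint64_t fake_query_port_flags(struct cap_device *dev, unsigned int port_num)
{
	(void)port_num;
	dev->driver_calls++;
	return CAP_IB_MAD;
}

/* Query every port once at registration time and stash the result. */
static int fake_cache_flags(struct cap_device *dev)
{
	unsigned int port;

	dev->core_cap_flags = calloc((size_t)dev->phys_port_cnt + 1,
				     sizeof(*dev->core_cap_flags));
	if (!dev->core_cap_flags)
		return -1;

	for (port = 1; port <= dev->phys_port_cnt; ++port)
		dev->core_cap_flags[port] = fake_query_port_flags(dev, port);

	return 0;
}

/* Helper in the style of cap_ib_mad(): a plain array lookup, no driver call. */
static inline int fake_cap_ib_mad(const struct cap_device *dev, uint8_t port_num)
{
	assert(port_num <= dev->phys_port_cnt);
	return !!(dev->core_cap_flags[port_num] & CAP_IB_MAD);
}
```

After fake_cache_flags() has run, any number of fake_cap_ib_mad() calls leave driver_calls unchanged, which is the point of keeping a per-port copy in the device structure.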
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 3/5] IB/core: Convert cap_ib_mad to core_cap_flags bit mask
[not found] ` <1430720099-32512-4-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2015-05-04 16:49 ` Hefty, Sean
2015-05-04 18:46 ` Jason Gunthorpe
1 sibling, 0 replies; 50+ messages in thread
From: Hefty, Sean @ 2015-05-04 16:49 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Weiny, Ira
> Subject: [RFC PATCH 3/5] IB/core: Convert cap_ib_mad to core_cap_flags bit mask
>
> From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> Use the new Core Capability bits instead of inferring this support from the
> protocol.
It would help if the patch description defined what this bit means. E.g., why are more specific definitions (SMI, CM, GSI, PM, etc.) less desirable than just 'MAD'?
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1430758097.2407.59.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-05-04 16:53 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA1D9-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Hefty, Sean @ 2015-05-04 16:53 UTC (permalink / raw)
To: Doug Ledford
Cc: Weiny, Ira, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > struct ib_port_attr {
> > > enum ib_port_state state;
> > > enum ib_mtu max_mtu;
> > > enum ib_mtu active_mtu;
> > > int gid_tbl_len;
> > > + u64 core_cap_flags;
> > > u32 port_cap_flags;
> > > u32 max_msg_sz;
> > > u32 bad_pkey_cntr;
> > > @@ -1684,6 +1705,7 @@ struct ib_device {
> > > u32 local_dma_lkey;
> > > u8 node_type;
> > > u8 phys_port_cnt;
> > > + u64 *core_cap_flags; /* Per port core capability flags */
> >
> > Why are the core_cap_flags duplicated in the struct and in port attributes?
>
> Because the per port ib_port_attr struct is not visible to the core
> code, it must call ib_query_port each time it wants to see it. In order
> to make the helpers avoid a call, you have to stash the cap_flags into
> the ib_device struct. In the past, the only thing there was the
> node_type, and it was per device and not per node. In order to make all
> of this work and not require calls into the driver all the time, that
> has to change to a per-port array. Eventually we could get rid of the
> node_type element once everything is in place.
I understand why they are in struct ib_device, but why are they duplicated in struct ib_port_attr?
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA1D9-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-04 16:56 ` Doug Ledford
[not found] ` <1430758566.2407.62.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Doug Ledford @ 2015-05-04 16:56 UTC (permalink / raw)
To: Hefty, Sean
Cc: Weiny, Ira, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 1721 bytes --]
On Mon, 2015-05-04 at 16:53 +0000, Hefty, Sean wrote:
> > > > struct ib_port_attr {
> > > > enum ib_port_state state;
> > > > enum ib_mtu max_mtu;
> > > > enum ib_mtu active_mtu;
> > > > int gid_tbl_len;
> > > > + u64 core_cap_flags;
> > > > u32 port_cap_flags;
> > > > u32 max_msg_sz;
> > > > u32 bad_pkey_cntr;
> > > > @@ -1684,6 +1705,7 @@ struct ib_device {
> > > > u32 local_dma_lkey;
> > > > u8 node_type;
> > > > u8 phys_port_cnt;
> > > > + u64 *core_cap_flags; /* Per port core capability flags */
> > >
> > > Why are the core_cap_flags duplicated in the struct and in port attributes?
> >
> > Because the per port ib_port_attr struct is not visible to the core
> > code, it must call ib_query_port each time it wants to see it. In order
> > to make the helpers avoid a call, you have to stash the cap_flags into
> > the ib_device struct. In the past, the only thing there was the
> > node_type, and it was per device and not per node. In order to make all
> > of this work and not require calls into the driver all the time, that
> > has to change to a per-port array. Eventually we could get rid of the
> > node_type element once everything is in place.
>
> I understand why they are in struct ib_device, but why are they duplicated in struct ib_port_attr?
Because ib_query_port returns a struct, so this must be an element of
that struct or it can't be returned (without adding a new, single use
driver upcall).
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1430758566.2407.62.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-05-04 17:25 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA217-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Hefty, Sean @ 2015-05-04 17:25 UTC (permalink / raw)
To: Doug Ledford
Cc: Weiny, Ira, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Because ib_query_port returns a struct, so this must be an element of
> that struct or it can't be returned (without adding a new, single use
> driver upcall).
Nm - This could be clearer if the functionality were folded into the 'read_port_table_lengths' function.
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA217-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-04 17:31 ` Weiny, Ira
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E11069818-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Weiny, Ira @ 2015-05-04 17:31 UTC (permalink / raw)
To: Hefty, Sean, Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>
> > Because ib_query_port returns a struct, so this must be an element of
> > that struct or it can't be returned (without adding a new, single use
> > driver upcall).
>
> Nm - This could be clearer if the functionality were folded into the
> 'read_port_table_lengths' function.
I thought of doing that but then the function name no longer matches the functionality.
I could change it to:
read_per_port_info(...)
And add the bits there.
Ira
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E11069818-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-04 17:34 ` Hefty, Sean
0 siblings, 0 replies; 50+ messages in thread
From: Hefty, Sean @ 2015-05-04 17:34 UTC (permalink / raw)
To: Weiny, Ira, Doug Ledford
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > Nm - This could be clearer if the functionality were folded into the
> > 'read_port_table_lengths' function.
>
> I thought of doing that but then the function name no longer matches the
> functionality.
>
> I could change it to:
>
> read_per_port_info(...)
That makes sense.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA17C-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-04 17:38 ` Doug Ledford
[not found] ` <1430761111.2407.85.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-05 19:27 ` Liran Liss
1 sibling, 1 reply; 50+ messages in thread
From: Doug Ledford @ 2015-05-04 17:38 UTC (permalink / raw)
To: Hefty, Sean
Cc: Weiny, Ira, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 2721 bytes --]
On Mon, 2015-05-04 at 16:40 +0000, Hefty, Sean wrote:
> > > +/* Protocol 0xFFFF000000000000 */
> > > +#define RDMA_CORE_CAP_PROT_IB 0x0001000000000000ULL
> > > +#define RDMA_CORE_CAP_PROT_IBOE 0x0002000000000000ULL
> > > +#define RDMA_CORE_CAP_PROT_IWARP 0x0004000000000000ULL
> > > +#define RDMA_CORE_CAP_PROT_USNIC_UDP 0x0008000000000000ULL
> >
> > In accordance with what we've been talking about, drop IBOE for ROCE.
> >
> > Drop the UDP off of USNIC, then define a bit for CAP_PROT_UDP_ENCAP.
> > USNIC will be just USNIC, USNIC_UDP will be USNIC | UDP_ENCAP, ROCE v1
> > will be ROCE, and ROCEv2 will be ROCE | UDP_ENCAP.
>
> USNIC_UDP is just UDP. I don't understand why we would want 'USNIC | UDP_ENCAP', or what UDP_ENCAP is intended to convey. Nothing is being encapsulated.
I thought USNIC_UDP had an embedded USNIC protocol header inside the UDP
header. That would make it a UDP_ENCAP protocol.
> RoCEv2 is IB transport over UDP.
Right, ROCE (or IB, whichever you prefer) encapsulated in UDP.
> I'm not sure what the protocol field is intended to imply.
There is still information in those bits that we can't get elsewhere.
For instance, even though this patch replaces the CAP_* stuff with bits,
if you took away the CAP_PROT_* entries, then there would be no entry to
identify USNIC at all.
Right now, you could infer iWARP from CAP_IW_CM.
You could infer InfiniBand from any of the CAP_IB_* (but later will need
a way to differentiate between IB and OPA)
You could infer ROCE from CAP_ETH_AH (but later will need a way to
differentiate between ROCE and ROCEv2)
The only way to differentiate USNIC at the moment, is that the CAPS
would be all 0. That's not the sort of positive identification I would
prefer.
So you *could* reduce this to just one bit for USNIC.
And if you then add a UDP_ENCAP bit, then that single bit can do double
duty in telling apart USNIC and USNIC_UDP and ROCE and ROCEv2.
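
For illustration only, a minimal C sketch of the single-bit idea described
above. The protocol bit values are the ones from the patch under review,
with IBOE renamed to ROCE and the _UDP suffix dropped from USNIC; the
UDP_ENCAP value and both helper names are assumptions made up for this
sketch. It shows the proposal as stated, not what usNIC actually puts on
the wire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Protocol bits occupy the top 16 bits, as in the patch under review. */
#define RDMA_CORE_CAP_PROT_IB        0x0001000000000000ULL
#define RDMA_CORE_CAP_PROT_ROCE      0x0002000000000000ULL /* was IBOE */
#define RDMA_CORE_CAP_PROT_IWARP     0x0004000000000000ULL
#define RDMA_CORE_CAP_PROT_USNIC     0x0008000000000000ULL /* "_UDP" dropped */
#define RDMA_CORE_CAP_PROT_UDP_ENCAP 0x0010000000000000ULL /* assumed value */

/* The modifier bit must not collide with any protocol bit it qualifies. */
static_assert((RDMA_CORE_CAP_PROT_UDP_ENCAP &
	       (RDMA_CORE_CAP_PROT_ROCE | RDMA_CORE_CAP_PROT_USNIC)) == 0,
	      "UDP_ENCAP overlaps a protocol bit");

/* Hypothetical helper: true for RoCE in either version. */
static inline bool port_is_roce(uint64_t flags)
{
	return (flags & RDMA_CORE_CAP_PROT_ROCE) != 0;
}

/* Hypothetical helper: true only when both ROCE and UDP_ENCAP are set. */
static inline bool port_is_roce_v2(uint64_t flags)
{
	const uint64_t v2 = RDMA_CORE_CAP_PROT_ROCE |
			    RDMA_CORE_CAP_PROT_UDP_ENCAP;

	return (flags & v2) == v2;
}
```

The v2 test compares against the full mask rather than testing for any
overlap, so a USNIC | UDP_ENCAP port is not mistaken for RoCEv2.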
> If we want
> to expose the link, network, transport, and RDMA protocols in use,
> shouldn't these be separate fields or bits? And even then, I'm not
> sure what use this has for the ULPs. iWarp does not require Ethernet
> or TCP. RoCEv2 would work fine over any link.
> And the core layer
> should not assume that a device is limited to supporting only one
> protocol, especially at the network and transport levels.
Given that this is a per port thing, there is no assumption about a
device only supporting a single protocol.
> I vote for
> deprecating the protocol goofiness.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1430761111.2407.85.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-05-04 18:26 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA2F1-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Hefty, Sean @ 2015-05-04 18:26 UTC (permalink / raw)
To: Doug Ledford
Cc: Weiny, Ira, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3415 bytes --]
> I thought USNIC_UDP had an embedded USNIC protocol header inside the UDP
> header. That would make it a UDP_ENCAP protocol.
Someone from Cisco can correct me, but USNIC supports 2 protocols. Just plain UDP, and a proprietary protocol that runs over Ethernet, but uses the same EtherType as RoCE. I thought these could both be active on the same port at the same time.
> > RoCEv2 is IB transport over UDP.
>
> Right, ROCE (or IB, whichever you prefer) encapsulated in UDP.
>
> > I'm not sure what the protocol field is intended to imply.
>
> There is still information in those bits that we can't get elsewhere.
> For instance, even though this patch replaces the CAP_* stuff with bits,
> if you took away the CAP_PROT_* entries, then there would be no entry to
> identify USNIC at all.
>
> Right now, you could infer iWARP from CAP_IW_CM.
> You could infer InfiniBand from any of the CAP_IB_* (but later will need
> a way to differentiate between IB and OPA)
> You could infer ROCE from CAP_ETH_AH (but later will need a way to
> differentiate between ROCE and ROCEv2)
> The only way to differentiate USNIC at the moment, is that the CAPS
> would be all 0. That's not the sort of positive identification I would
> prefer.
>
> So you *could* reduce this to just one bit for USNIC.
>
> And if you then add a UDP_ENCAP bit, then that single bit can do double
> duty in telling apart USNIC and USNIC_UDP and ROCE and ROCEv2.
My question is who needs these bits and why? The primary reason it was exposed was to do the job that the new cap flags are accomplishing.
I still believe that RoCEv2 is conceptually the same as iWarp. An RDMA protocol has been layered over some other transport. In the case of iWarp, it's TCP. In the case of RoCEv2, it's UDP. Do we define a TCP_ENCAP bit that corresponds with UDP_ENCAP? And why should an app care? We don't specify whether the port is running IPv4 or IPv6. Why is the transport level called out, but not the network layer? Architecturally, neither iWarp nor RoCEv2 (despite its name) cares what the link layer is.
> > And the core layer
> > should not assume that a device is limited to supporting only one
> > protocol, especially at the network and transport levels.
>
> Given that this is a per port thing, there is no assumption about a
> device only supporting a single protocol.
Device, no, but we are assuming this per port. I don't think this is true for USNIC. For that matter, it's entirely possible for a RoCEv2 device to expose UDP directly to user space, same as USNIC. (I'd actually be surprised if no devices have this capability, if for debugging capabilities, even if for nothing else.) What are we going to do if there's a device that supports both iWarp and RoCEv2? That's easily doable today through software.
Basically, I question how protocol is defined and what it means to expose it as a device attribute. Should it instead be negotiated (i.e. more like sockets)? If an app needs a specific "RDMA protocol" (IB or iWarp) or "application protocol" (IB or iWarp or UDP or MADs?), they can request it. Otherwise, the app gets assigned some protocol. And the protocol should really be associated with a QP, rather than a port.
- Sean
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1430720099-32512-2-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-05-04 14:41 ` Doug Ledford
2015-05-04 16:42 ` Hefty, Sean
@ 2015-05-04 18:36 ` Jason Gunthorpe
[not found] ` <20150504183657.GA20586-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2 siblings, 1 reply; 50+ messages in thread
From: Jason Gunthorpe @ 2015-05-04 18:36 UTC (permalink / raw)
To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Mon, May 04, 2015 at 02:14:55AM -0400, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> Add Core capability flags to each port attribute and read those into ib_device
> upon registration for each port.
+1 on adding it to read_port_table_lengths
But this whole thing is starting to get goofy, 3rd add gets the work
to fix it up I guess.
Pull pkey_tbl_len, gid_tbl_len and your new thing into a single struct
and allocate an array of them.
Actually, why not just allocate an array of ib_port_attrs and fill
that? Then you can use it for the mad size too.
Not in your patch, but why does read_port_table_lengths use a kmalloc
for what should be a stack allocation? Yuk.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 3/5] IB/core: Convert cap_ib_mad to core_cap_flags bit mask
[not found] ` <1430720099-32512-4-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-05-04 16:49 ` Hefty, Sean
@ 2015-05-04 18:46 ` Jason Gunthorpe
[not found] ` <20150504184610.GB20586-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
1 sibling, 1 reply; 50+ messages in thread
From: Jason Gunthorpe @ 2015-05-04 18:46 UTC (permalink / raw)
To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael Wang
On Mon, May 04, 2015 at 02:14:57AM -0400, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> Use the new Core Capability bits instead of inferring this support from the
> protocol.
Does this really need to be a separate patch? At least for the
core_cap_flags parts it makes no sense to change those lines twice.
> - props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
> + props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
Hurm,
Maybe add some macros to help this out, document the standard that
the port implements:
#define RDMA_CORE_PORT_IB_IBA_v1_2 (RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD)
#define RDMA_CORE_PORT_ROCEE_IBA_v1_2_A15 ..
#define RDMA_CORE_PORT_ROCEE_IBA_v1_3_A16 ..
> static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
> {
> - return rdma_ib_or_iboe(device, port_num);
> + return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_IB_MAD);
'bool' is OK in the kernel, just use that instead of !! - in fact all of
these cap returns should return bool. Michael should fix that in his
series too.
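
A minimal userspace sketch of the two suggestions above (a composite
per-standard macro, and a bool return in place of "!!"). The struct is a
cut-down stand-in for the real ib_device, and MAX_PORTS plus the value of
RDMA_CORE_CAP_IB_MAD are assumptions for illustration; only the PROT_IB
value comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t u8;	/* stand-ins for the kernel typedefs */
typedef uint64_t u64;

#define RDMA_CORE_CAP_IB_MAD	0x0000000000000001ULL /* assumed value */
#define RDMA_CORE_CAP_PROT_IB	0x0001000000000000ULL /* from the patch */

/* One name per standard a port implements, so drivers set a single macro. */
#define RDMA_CORE_PORT_IB_IBA_v1_2 \
	(RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD)

#define MAX_PORTS 2	/* hypothetical, keeps the stub self-contained */

/* Cut-down stand-in for struct ib_device; slot 0 is unused here. */
struct ib_device {
	u64 core_cap_flags[MAX_PORTS + 1];
};

/* Returning bool makes the "!!" normalisation unnecessary. */
static inline bool cap_ib_mad(const struct ib_device *device, u8 port_num)
{
	assert(port_num <= MAX_PORTS);
	return (device->core_cap_flags[port_num] & RDMA_CORE_CAP_IB_MAD) != 0;
}
```

With this shape a driver would assign RDMA_CORE_PORT_IB_IBA_v1_2 once, and
later additions to the bit layout only touch the macro definition.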
Jason
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA2F1-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-04 19:40 ` Doug Ledford
[not found] ` <1430768425.2407.143.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-04 20:21 ` Dave Goodell (dgoodell)
2015-05-05 19:51 ` Liran Liss
2 siblings, 1 reply; 50+ messages in thread
From: Doug Ledford @ 2015-05-04 19:40 UTC (permalink / raw)
To: Hefty, Sean
Cc: Weiny, Ira, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Attachment #1: Type: text/plain, Size: 2551 bytes --]
On Mon, 2015-05-04 at 18:26 +0000, Hefty, Sean wrote:
> > Given that this is a per port thing, there is no assumption about a
> > device only supporting a single protocol.
>
> Device, no, but we are assuming this per port.
At a high level, there are certain things that will be per port because
they are tied to the link layer in question (you won't have a port with
IB_SA that doesn't have IB_SA on all its queue pairs on that port).
> I don't think this is
> true for USNIC.
I can't speak for USNIC, I haven't read the driver for it closely
enough.
> For that matter, it's entirely possible for a RoCEv2
> device to expose UDP directly to user space, same as USNIC.
I'm not sure I'm following what you are saying here. USNIC is unique in
that it presents a NIC to the application that it has exclusive control
over.
> (I'd
> actually be surprised if no devices have this capability, if for
> debugging capabilities, even if for nothing else.) What are we going
> to do if there's a device that supports both iWarp and RoCEv2? That's
> easily doable today through software.
It's doable, but nobody is actually doing it.
> Basically, I question how protocol is defined and what it means to
> expose it as a device attribute. Should it instead be negotiated (i.e.
> more like sockets)? If an app needs a specific "RDMA protocol" (IB or
> iWarp) or "application protocol" (IB or iWarp or UDP or MADs?), they
> can request it. Otherwise, the app gets assigned some protocol. And
> the protocol should really be associated with a QP, rather than a
> port.
If you're going down to that level (which we will when the RoCEv2
patches are integrated), then it doesn't quite map to QPs. Since a UD
QP may need to talk to both RoCEv1 and RoCEv2 hosts, it's at the AH
level for RoCE.
To a certain extent you are right. RoCE, RoCEv2, iWARP, and USNIC could
all reside on a single port by virtue of them all sharing an Ethernet
link layer. IB link layer (and associated attributes) and OPA link
layer are the outliers in this method of thinking in that their ports
will only support their own transport. So if we were to actually be
forward thinking about this, then the link layer is the important bit
you need to capture on a per port basis, not the transport technology.
We would have to modify lots of the stack to support per QP or per AH
transport settings though.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA2F1-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 19:40 ` Doug Ledford
@ 2015-05-04 20:21 ` Dave Goodell (dgoodell)
2015-05-05 19:51 ` Liran Liss
2 siblings, 0 replies; 50+ messages in thread
From: Dave Goodell (dgoodell) @ 2015-05-04 20:21 UTC (permalink / raw)
To: Hefty, Sean
Cc: Doug Ledford, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On May 4, 2015, at 1:26 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> I thought USNIC_UDP had an embedded USNIC protocol header inside the UDP
>> header. That would make it a UDP_ENCAP protocol.
>
> Someone from Cisco can correct me, but USNIC supports 2 protocols. Just plain UDP, and a proprietary protocol that runs over Ethernet, but uses the same EtherType as RoCE. I thought these could both be active on the same port at the same time.
Sean's statements above are correct. "USNIC_UDP" in today's code really is plain-old-UDP (over IP over Ethernet).
>>> And the core layer
>>> should not assume that a device is limited to supporting only one
>>> protocol, especially at the network and transport levels.
>>
>> Given that this is a per port thing, there is no assumption about a
>> device only supporting a single protocol.
>
> Device, no, but we are assuming this per port. I don't think this is true for USNIC.
Correct, usNIC devices and ports can speak either format concurrently, IIRC even for different QPs in the same context/PD/etc.
Incidentally, usNIC devices only ever have a single port right now, but there's no reason that has to remain true.
> For that matter, it's entirely possible for a RoCEv2 device to expose UDP directly to user space, same as USNIC. (I'd actually be surprised if no devices have this capability, if for debugging capabilities, even if for nothing else.) What are we going to do if there's a device that supports both iWarp and RoCEv2? That's easily doable today through software.
+1
-Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1430768425.2407.143.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-05-04 21:07 ` Jason Gunthorpe
[not found] ` <20150504210741.GA20839-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Jason Gunthorpe @ 2015-05-04 21:07 UTC (permalink / raw)
To: Doug Ledford
Cc: Hefty, Sean, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Mon, May 04, 2015 at 03:40:25PM -0400, Doug Ledford wrote:
> So if we were to actually be forward thinking about this, then the
> link layer is the important bit you need to capture on a per port
> basis, not the transport technology. We would have to modify lots
> of the stack to support per QP or per AH transport settings though.
We are going to have to do this, at least on the kernel side. The idea
that the port can imply the one and only address family of the QP is
certainly antiquated now. I think the RoCEv2 patches will require
this work.
As far as this patch series goes, I actually don't think the bits
matter one bit - the intent is to optimize the cap tests, so a one
cap test-one-bit method is fine. Don't add extra we don't need right
now.
As an optimization, be aware of some of the restrictions: it is
actually pretty expensive on some architectures to load large 32 or
64 bit constants, so smaller numbers are better and a single bit test
is better still.
If this is combined with my thought to use a #define for all the
common driver cases then we can trivially rework the bit-field to
future needs. No reason to over architect something here.
Jason
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20150504183657.GA20586-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-05-04 22:32 ` ira.weiny
[not found] ` <20150504223234.GB10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: ira.weiny @ 2015-05-04 22:32 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Mon, May 04, 2015 at 12:36:57PM -0600, Jason Gunthorpe wrote:
> On Mon, May 04, 2015 at 02:14:55AM -0400, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> > From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >
> > Add Core capability flags to each port attribute and read those into ib_device
> > upon registration for each port.
>
> +1 on adding it to read_port_table_lengths
>
> But this whole thing is starting to get goofy,
>
<quote>
> 3rd add gets the work
> to fix it up I guess.
</quote>
I don't understand this comment?
>
> Pull pkey_tbl_len, gid_tbl_len and your new thing into a single struct
> and allocate an array of them.
>
> Actually, why not just allocate an array of ib_port_attrs and fill
> that? Then you can use it for the mad size too.
That was debated before and I was hoping to leave that for another day.
It does make some sense to roll it in here.
>
> Not in your patch, but why does read_port_table_lengths use a kmalloc
> for what should be a stack allocation? Yuk.
I don't know. I almost submitted a separate patch for that. I will clean it
up when I combine the functions.
I also don't like the way the arrays in read_port_table_lengths are handled.
This requires a start_port call when comparing bits. It is easier to just make
the array 1 based with index 0 valid only for switches.
This is part of the reason there is a separate function and array.
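
To make the indexing point concrete, here is a small C sketch contrasting
the two layouts. All three function names are hypothetical and the tables
are plain arrays rather than the real ib_device members; it only shows why
a 1-based array avoids the start_port subtraction on every lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate num_ports + 1 zeroed slots so a port number indexes directly. */
static inline uint64_t *alloc_port_flags(unsigned int num_ports)
{
	return calloc((size_t)num_ports + 1, sizeof(uint64_t));
}

/* 0-based layout: every lookup subtracts the first valid port number. */
static inline uint64_t flags_zero_based(const uint64_t *tbl,
					unsigned int start_port,
					unsigned int port_num)
{
	assert(port_num >= start_port);
	return tbl[port_num - start_port];
}

/* 1-based layout: no offset; slot 0 is meaningful only on a switch. */
static inline uint64_t flags_one_based(const uint64_t *tbl,
				       unsigned int port_num)
{
	return tbl[port_num];
}
```

The cost of the 1-based form is one unused slot per non-switch device.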
Ira
>
> Jason
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 3/5] IB/core: Convert cap_ib_mad to core_cap_flags bit mask
[not found] ` <20150504184610.GB20586-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-05-04 22:43 ` ira.weiny
[not found] ` <20150504224342.GD10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2015-05-05 8:26 ` Michael Wang
1 sibling, 1 reply; 50+ messages in thread
From: ira.weiny @ 2015-05-04 22:43 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael Wang
On Mon, May 04, 2015 at 12:46:10PM -0600, Jason Gunthorpe wrote:
> On Mon, May 04, 2015 at 02:14:57AM -0400, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
>
> > Use the new Core Capability bits instead of inferring this support from the
> > protocol.
>
> Does this really need to be a separate patch? At least for the
> core_cap_flags parts it makes no sense to change those lines twice
Only to show the progression and for me to test as I went. I think a squash is
good. I just was taking my testing slowly to make sure I did not do something
stupid... ;-)
>
> > - props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
> > + props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
>
> Hurm,
>
> Maybe add some macros to help this out, document the standard that
> the port implements:
>
> #define RDMA_CORE_PORT_IB_IBA_v1_2 (RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD)
> #define RDMA_CORE_PORT_ROCEE_IBA_v1_2_A15 ..
> #define RDMA_CORE_PORT_ROCEE_IBA_v1_3_A16 ..
Good idea.
>
> > static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
> > {
> > - return rdma_ib_or_iboe(device, port_num);
> > + return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_IB_MAD);
>
>
> 'bool' is OK in the kernel, just use that instead of !! - in fact all of
> these cap returns should return bool. Michael should fix that in his
> series too.
Michael?
Ira
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20150504223234.GB10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2015-05-04 23:16 ` ira.weiny
[not found] ` <20150504231622.GE10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: ira.weiny @ 2015-05-04 23:16 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
> > Pull pkey_tbl_len, gid_tbl_len and your new thing into a single struct
> > and allocate an array of them.
> >
> > Actually, why not just allocate an array of ib_port_attrs and fill
> > that? Then you can use it for the mad size too.
>
> That was debated before and I was hoping to leave that for another day.
>
> It does make some sense to roll it in here.
I remember more details now...
This came up in my original OPA patch series when Sean objected to me calling
the port attributes "cached_port_attr".
There are a number of values in the ib_port_attr structure which are not fixed
(state, active_mtu, lid, sm_lid, etc...)
Putting another copy of this data in the device will be confusing to know which
of these one can count on for the correct data.
I did not write the table length code so I am assuming that data is immutable.
As is the max MAD size, and these new capability bits. But storing the entire
ib_port_attr structure is probably going to mislead someone.
Do we still think this is a good idea? I think putting in a single struct with
the proper immutable data is the proper way to go.
Ira
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20150504231622.GE10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2015-05-04 23:52 ` Jason Gunthorpe
0 siblings, 0 replies; 50+ messages in thread
From: Jason Gunthorpe @ 2015-05-04 23:52 UTC (permalink / raw)
To: ira.weiny
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Mon, May 04, 2015 at 07:16:23PM -0400, ira.weiny wrote:
> > > Pull pkey_tbl_len, gid_tbl_len and your new thing into a single struct
> > > and allocate an array of them.
> > >
> > > Actually, why not just allocate an array of ib_port_attrs and fill
> > > that? Then you can use it for the mad size too.
> >
> > That was debated before and I was hoping to leave that for another day.
> >
> > It does make some sense to roll it in here.
>
> I remember more details now...
>
> This came up in my original OPA patch series when Sean objected to me calling
> the port attributes "cached_port_attr".
Yeah, not a great name for storing immutable values. port_properties
or something.
> There are a number of values in the ib_port_attr structure which are not fixed
> (state, active_mtu, lid, sm_lid, etc...)
Yes, there are a lot of those..
So, a new struct is better, and this has just gotten so messy.
- Update ib_alloc_device to accept a num_ports argument and
create the per-port array at that point
- Have the drivers fill in their per port values before calling
ib_register. Delete read_port_table_lengths
- Get ib_dealloc_device to free the list instead of unregister,
feels like keeping that memory around for the duration of the kref
is smarter..
- Drop gid_tbl_len and pkey_tbl_len for the new scheme
- Mark all the old immutable attrs in ib_port_attrs as deprecated and
if any are easy to remove then do so..
- Don't add the caps or the max mad size immutables to port_attrs
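
A rough userspace sketch of the scheme in the list above, assuming a
struct named port_properties as floated earlier in this mail. Everything
prefixed sketch_ is hypothetical and stands in for ib_alloc_device /
ib_dealloc_device; the field set is just the immutable values named in
this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Per-port values that never change after registration. */
struct port_properties {
	uint64_t core_cap_flags;
	uint32_t max_mad_size;
	int pkey_tbl_len;
	int gid_tbl_len;
};

/* Cut-down stand-in for struct ib_device. */
struct sketch_device {
	unsigned int num_ports;
	struct port_properties *port;	/* 1-based; slot 0 for switches */
};

/* Allocate the device and its per-port array together, before register. */
static inline struct sketch_device *sketch_alloc_device(unsigned int num_ports)
{
	struct sketch_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->port = calloc((size_t)num_ports + 1, sizeof(*dev->port));
	if (!dev->port) {
		free(dev);
		return NULL;
	}
	dev->num_ports = num_ports;
	return dev;
}

/* Read-only accessor; the driver fills the slots before registering. */
static inline const struct port_properties *
sketch_port(const struct sketch_device *dev, unsigned int port_num)
{
	assert(port_num <= dev->num_ports);
	return &dev->port[port_num];
}

/* Free the array with the device, not at unregister time. */
static inline void sketch_dealloc_device(struct sketch_device *dev)
{
	if (!dev)
		return;
	free(dev->port);
	free(dev);
}
```

Tying the array to alloc/dealloc means its lifetime matches the device's,
which is the point of the third item in the list.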
> <quote>
> > 3rd add gets the work
> > to fix it up I guess.
> </quote>
> I don't understand this comment?
gid_tbl_len, pkey_tbl_len were the first two to use a 'shortcut' here,
your caps and/or max mad size are the third to take the 'shortcut'.
Rule of threes: The third person to extend the same badly designed
widget gets to fix it properly.
Jason
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 3/5] IB/core: Convert cap_ib_mad to core_cap_flags bit mask
[not found] ` <20150504224342.GD10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2015-05-05 8:00 ` Michael Wang
0 siblings, 0 replies; 50+ messages in thread
From: Michael Wang @ 2015-05-05 8:00 UTC (permalink / raw)
To: ira.weiny, Jason Gunthorpe
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 05/05/2015 12:43 AM, ira.weiny wrote:
[snip]
>
>>
>>> static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
>>> {
>>> - return rdma_ib_or_iboe(device, port_num);
>>> + return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_IB_MAD);
>>
>>
>> 'bool' is OK in the kernel, just use that instead of !! - in fact all of
>> these cap returns should return bool. Michael should fix that in his
>> series too.
>
> Michael?
Ok, I will use bool in next version :-)
Regards,
Michael Wang
>
> Ira
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 3/5] IB/core: Convert cap_ib_mad to core_cap_flags bit mask
[not found] ` <20150504184610.GB20586-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-05-04 22:43 ` ira.weiny
@ 2015-05-05 8:26 ` Michael Wang
1 sibling, 0 replies; 50+ messages in thread
From: Michael Wang @ 2015-05-05 8:26 UTC (permalink / raw)
To: Jason Gunthorpe, ira.weiny-ral2JQCrhuEAvxtiuMwx3w
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 05/04/2015 08:46 PM, Jason Gunthorpe wrote:
> On Mon, May 04, 2015 at 02:14:57AM -0400, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
>
>> Use the new Core Capability bits instead of inferring this support from the
>> protocol.
>
> Does this really need to be a separate patch? At least for the
> core_cap_flags parts it makes no sense to change those lines twice
I think I missed this patch series, and I can't find all 5 patches in
the archive either...
If there is not much argument about this proposal, then we can include
the changes and make them one patch set.
Regards,
Michael Wang
>
>> - props->core_cap_flags = RDMA_CORE_CAP_PROT_IB;
>> + props->core_cap_flags = RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD;
>
> Hurm,
>
> Maybe add some macros to help this out, document the standard that
> the port implements:
>
> #define RDMA_CORE_PORT_IB_IBA_v1_2 (RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD)
> #define RDMA_CORE_PORT_ROCEE_IBA_v1_2_A15 ..
> #define RDMA_CORE_PORT_ROCEE_IBA_v1_3_A16 ..
>
>> static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
>> {
>> - return rdma_ib_or_iboe(device, port_num);
>> + return !!(device->core_cap_flags[port_num] & RDMA_CORE_CAP_IB_MAD);
>
>
> 'bool' is OK in the kernel, just use that instead of !! - in fact all of
> these cap returns should return bool. Michael should fix that in his
> series too.
>
> Jason
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA17C-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 17:38 ` Doug Ledford
@ 2015-05-05 19:27 ` Liran Liss
[not found] ` <HE1PR05MB1418E58A6EB92D92B78C76A0B1D10-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
1 sibling, 1 reply; 50+ messages in thread
From: Liran Liss @ 2015-05-05 19:27 UTC (permalink / raw)
To: Hefty, Sean, Doug Ledford, Weiny, Ira,
linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1690 bytes --]
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Hefty, Sean
> >
> > In accordance with what we've been talking about, drop IBOE for ROCE.
> >
> > Drop the UDP off of USNIC, then define a bit for CAP_PROT_UDP_ENCAP.
> > USNIC will be just USNIC, USNIC_UDP will be USNIC | UDP_ENCAP, ROCE v1
> > will be ROCE, and ROCEv2 will be ROCE | UDP_ENCAP.
>
> USNIC_UDP is just UDP. I don't understand why we would want 'USNIC |
> UDP_ENCAP', or what UDP_ENCAP is intended to convey. Nothing is being
> encapsulated.
>
I agree that the UDP_ENCAP notion is confusing.
We should stick to ROCE_V1 and ROCE_V2.
> RoCEv2 is IB transport over UDP.
>
> I'm not sure what the protocol field is intended to imply. If we want to
> expose the link, network, transport, and RDMA protocols in use, shouldn't
> these be separate fields or bits? And even then, I'm not sure what use this
> has for the ULPs. iWarp does not require Ethernet or TCP. RoCEv2 would
> work fine over any link. And the core layer should not assume that a device is
> limited to supporting only one protocol, especially at the network and
> transport levels. I vote for deprecating the protocol goofiness.
>
> - Sean
The protocol notion might not have any value for ULPs, but it is useful for core
management code. Also, when we extend these notions to user-space, admins will
actually want to know what wire protocols a certain device can support, even
just for the sake of interoperability.
--Liran
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA2F1-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 19:40 ` Doug Ledford
2015-05-04 20:21 ` Dave Goodell (dgoodell)
@ 2015-05-05 19:51 ` Liran Liss
[not found] ` <HE1PR05MB1418DF0669B6E6CABE1D9F1EB1D10-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2 siblings, 1 reply; 50+ messages in thread
From: Liran Liss @ 2015-05-05 19:51 UTC (permalink / raw)
To: Hefty, Sean, Doug Ledford,
linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: Weiny, Ira, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Hefty, Sean
> > I thought USNIC_UDP had an embedded USNIC protocol header inside the
> > UDP header. That would make it a UDP_ENCAP protocol.
>
> Someone from Cisco can correct me, but USNIC supports 2 protocols. Just
> plain UDP, and a proprietary protocol that runs over Ethernet, but uses the
> same EtherType as RoCE. I thought these could both be active on the same
> port at the same time.
>
'protocol' refers to what your device generates when you do a post_send() on
some QP.
In the RoCE case, it is IBTA transport headers + payload over UDP encapsulation
over IP. In the USNIC case, you might want the protocol to refer to additional
information that distinguishes this wire protocol rather than just "I am sending
UDP packets"...
In other words, I think that 'protocol' should uniquely distinguish interoperable
peers at the Verbs API level. We are *not* trying to describe a certain header,
but rather a stack of protocols.
> > > RoCEv2 is IB transport over UDP.
> >
> > Right, ROCE (or IB, whichever you prefer) encapsulated in UDP.
> >
> > > I'm not sure what the protocol field is intended to imply.
> >
> > There is still information in those bits that we can't get elsewhere.
> > For instance, even though this patch replaces the CAP_* stuff with
> > bits, if you took away the CAP_PROT_* entries, then there would be no
> > entry to identify USNIC at all.
> >
> > Right now, you could infer iWARP from CAP_IW_CM.
> > You could infer InfiniBand from any of the CAP_IB_* (but later will
> > need a way to differentiate between IB and OPA) You could infer ROCE
> > from CAP_ETH_AH (but later will need a way to differentiate between
> > ROCE and ROCEv2) The only way to differentiate USNIC at the moment, is
> > that the CAPS would be all 0. That's not the sort of positive
> > identification I would prefer.
> >
> > So you *could* reduce this to just one bit for USNIC.
> >
> > And if you then add a UDP_ENCAP bit, then that single bit can do
> > double duty in telling apart USNIC and USNIC_UDP and ROCE and ROCEv2.
>
> My question is who needs these bits and why? The primary reason it was
> exposed was to do the job that the new cap flags are accomplishing.
>
> I still believe that RoCEv2 is conceptually the same as iWarp. An RDMA
> protocol has been layered over some other transport. In the case of iWarp,
> it's TCP. In the case of RoCEv2, it's UDP. Do we define a TCP_ENCAP bit that
> corresponds with UDP_ENCAP? And why should an app care? We don't
> specify whether the port is running IPv4 or IPv6. Why is the transport level
> called out, but not the network layer? Architecturally, neither iWarp or
> RoCEv2 (despite its name) cares what the link layer is.
We don't call out the transport layer.
That's why we call it 'protocol' and not 'transport' !
>
> > > And the core layer
> > > should not assume that a device is limited to supporting only one
> > > protocol, especially at the network and transport levels.
> >
> > Given that this is a per port thing, there is no assumption about a
> > device only supporting a single protocol.
>
> Device, no, but we are assuming this per port. I don't think this is true for
> USNIC. For that matter, it's entirely possible for a RoCEv2 device to expose
> UDP directly to user space, same as USNIC. (I'd actually be surprised if no
> devices have this capability, if for debugging capabilities, even if for nothing
> else.) What are we going to do if there's a device that supports both iWarp
> and RoCEv2? That's easily doable today through software.
If and when there are such devices, they can advertise multiple protocols.
>
> Basically, I question how protocol is defined and what it means to expose it
> as a device attribute. Should it instead be negotiated (i.e. more like sockets)?
> If an app needs a specific "RDMA protocol" (IB or iWarp) or "application
> protocol" (IB or iWarp or UDP or MADs?), they can request it. Otherwise, the
> app gets assigned some protocol. And the protocol should really be
> associated with a QP, rather than a port.
>
> - Sean
The port capabilities do just that: advertise capabilities.
The actual protocol selection for each transport entity happens later.
For ROCE RC QPs, for example, this occurs while modifying the QP to refer
to the desired GID index.
--Liran
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20150504210741.GA20839-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-05-05 19:59 ` Liran Liss
0 siblings, 0 replies; 50+ messages in thread
From: Liran Liss @ 2015-05-05 19:59 UTC (permalink / raw)
To: Jason Gunthorpe, Doug Ledford
Cc: Hefty, Sean, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Jason Gunthorpe
> > So if we were to actually be forward thinking about this, then the
> > link layer is the important bit you need to capture on a per port
> > basis, not the transport technology. We would have to modify lots of
> > the stack to support per QP or per AH transport settings though.
>
> We are going to have to do this, at least in the kernel side, the idea that the
> port can imply the one and only address family of the QP is certainly
> antiquated now. I think the RoCEv2 patches will require this work.
>
I second that the link layer should not be used for this purpose.
RDMA devices can and will advertise multiple rdma 'protocols', such as
ROCE devices that support both ROCE_V1 and ROCE_V2.
--Liran
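One way a port could advertise several protocols at once is a bit mask with one bit per wire protocol. This is only a sketch: the enum names and bit values are hypothetical, chosen for illustration, and are not taken from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port protocol bits; names and values are illustrative. */
enum sketch_port_proto {
	SKETCH_PROTO_IB      = 1u << 0,
	SKETCH_PROTO_ROCE_V1 = 1u << 1,
	SKETCH_PROTO_ROCE_V2 = 1u << 2,
	SKETCH_PROTO_IWARP   = 1u << 3,
};

/* True only if every protocol bit in "wanted" is advertised by the port. */
static inline bool port_supports(uint32_t port_protos, uint32_t wanted)
{
	assert(wanted != 0); /* asking for no protocol is treated as a caller bug */
	return (port_protos & wanted) == wanted;
}
```

A RoCE port supporting both versions would then carry SKETCH_PROTO_ROCE_V1 | SKETCH_PROTO_ROCE_V2, and a caller could test for either version or for both.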
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <HE1PR05MB1418DF0669B6E6CABE1D9F1EB1D10-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2015-05-05 20:22 ` Dave Goodell (dgoodell)
[not found] ` <20BA79B2-9DA5-49B3-8455-BD4021CB882C-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Dave Goodell (dgoodell) @ 2015-05-05 20:22 UTC (permalink / raw)
To: Liran Liss
Cc: Hefty, Sean, Doug Ledford, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On May 5, 2015, at 2:51 PM, Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
>> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Hefty, Sean
>
>>> I thought USNIC_UDP had an embedded USNIC protocol header inside the
>>> UDP header. That would make it a UDP_ENCAP protocol.
>>
>> Someone from Cisco can correct me, but USNIC supports 2 protocols. Just
>> plain UDP, and a proprietary protocol that runs over Ethernet, but uses the
>> same EtherType as RoCE. I thought these could both be active on the same
>> port at the same time.
>>
>
> 'protocol' refers to what your device generates when you do a post_send() on
> some QP.
> In the RoCE case, it is IBTA transport headers + payload over UDP encapsulation
> over IP. In the USNIC case, you might want the protocol to refer to additional
> information that distinguishes this wire protocol rather than just "I am sending
> UDP packets"...
In the case that usNIC is operating in UDP mode (which is the overwhelming majority of the cases), there is absolutely no additional protocol that ends up on the wire or headers in the user buffers besides UDP/IP/Ethernet. They are 100% plain UDP packets, they just happen to be sent via OS-bypass queues instead of traveling through the kernel networking stack.
[^^^^^ there continues to be confusion about this for some reason, but I don't know why]
> In other words, I think that 'protocol' should uniquely distinguish interoperable
> peers at the Verbs API level. We are *not* trying to describe a certain header,
> but rather a stack of protocols.
Any non-UDP "protocol" that might currently be in use over usNIC is an entirely application-level protocol outside of the view of the Verbs API level.
-Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20BA79B2-9DA5-49B3-8455-BD4021CB882C-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2015-05-05 20:29 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCAEFD-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-09 3:32 ` Doug Ledford
2015-05-11 6:42 ` ira.weiny
2 siblings, 1 reply; 50+ messages in thread
From: Hefty, Sean @ 2015-05-05 20:29 UTC (permalink / raw)
To: Dave Goodell (dgoodell), Liran Liss
Cc: Doug Ledford, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > In other words, I think that 'protocol' should uniquely distinguish
> interoperable
> > peers at the Verbs API level. We are *not* trying to describe a certain
> header,
> > but rather a stack of protocols.
>
> Any non-UDP "protocol" that might currently be in use over usNIC is an
> entirely application-level protocol outside of the view of the Verbs API
> level.
I agree with this for usNIC. For other technologies, the definition isn't even this simple. UD QPs and RC QPs cannot interoperate, yet are treated as the same 'protocol'. Just defining protocol as a 'stack of protocols' shows how broken the approach is. And we haven't even tried to incorporate other 'protocols' needed to make this work, such as the CM or management protocols.
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCAEFD-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-06 14:25 ` Liran Liss
0 siblings, 0 replies; 50+ messages in thread
From: Liran Liss @ 2015-05-06 14:25 UTC (permalink / raw)
To: Hefty, Sean, Dave Goodell (dgoodell)
Cc: Doug Ledford, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> From: Hefty, Sean [mailto:sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org]
> Sent: Tuesday, May 05, 2015 11:29 PM
> > > In other words, I think that 'protocol' should uniquely distinguish
> > interoperable
> > > peers at the Verbs API level. We are *not* trying to describe a
> > > certain
> > header,
> > > but rather a stack of protocols.
> >
> > Any non-UDP "protocol" that might currently be in use over usNIC is an
> > entirely application-level protocol outside of the view of the Verbs
> > API level.
>
Fine. Then we can add a 'udp/ip' protocol for usNIC.
Similarly, 'raw Ethernet' can designate the ability to inject all headers starting from L2.
> I agree with this for usNIC. For other technologies, the definition isn't even
> this simple. UD QPs and RC QPs cannot interoperate, yet are treated as the
> same 'protocol'. Just defining protocol as a 'stack of protocols' shows how
> broken the approach is. And we haven't even tried to incorporate other
> 'protocols' needed to make this work, such as the CM or management
> protocols.
The interoperability between different transport objects depends on the technology.
I don't think that we should even attempt to describe these dependencies in any
abstract way. RTFM...
However, users should have a way of knowing what transports and what wire
protocol stacks a certain rdma device supports.
No one using IB or RoCE would think that UD and RC interoperate.
(If someone tries it, he would be disappointed 'gracefully'...)
But people that use RoCE would like to know which devices they can use to send
RoCE packets (or select a device that supports a certain RoCE version).
--Liran
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <HE1PR05MB1418E58A6EB92D92B78C76A0B1D10-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2015-05-08 18:56 ` Doug Ledford
[not found] ` <1431111412.2407.463.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Doug Ledford @ 2015-05-08 18:56 UTC (permalink / raw)
To: Liran Liss
Cc: Hefty, Sean, Weiny, Ira,
linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, 2015-05-05 at 19:27 +0000, Liran Liss wrote:
> > From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
>
> > owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Hefty, Sean
> > >
> > > In accordance with what we've been talking about, drop IBOE for ROCE.
> > >
> > > Drop the UDP off of USNIC, then define a bit for CAP_PROT_UDP_ENCAP.
> > > USNIC will be just USNIC, USNIC_UDP will be USNIC | UDP_ENCAP, ROCE v1
> > > will be ROCE, and ROCEv2 will be ROCE | UDP_ENCAP.
> >
> > USNIC_UDP is just UDP. I don't understand why we would want 'USNIC |
> > UDP_ENCAP', or what UDP_ENCAP is intended to convey. Nothing is being
> > encapsulated.
> >
>
> I agree that the UDP_ENCAP notion is confusing.
> We should stick to ROCE_V1 and ROCE_V2.
>
> > RoCEv2 is IB transport over UDP.
> >
> > I'm not sure what the protocol field is intended to imply. If we want to
> > expose the link, network, transport, and RDMA protocols in use, shouldn't
> > these be separate fields or bits? And even then, I'm not sure what use this
> > has for the ULPs. iWarp does not require Ethernet or TCP. RoCEv2 would
> > work fine over any link. And the core layer should not assume that a device is
> > limited to supporting only one protocol, especially at the network and
> > transport levels. I vote for deprecating the protocol goofiness.
> >
> > - Sean
>
> The protocol notion might not have any value for ULPs, but it is useful for core
> management code. Also, when we extend these notions to user-space, admins will
> actually want to know what wire protocols a certain device can support, even
> just for the sake of interoperability.
So I've been thinking a bit more about the overall architecture of the
underlying patch set from Michael. The structure of that patch set is
to a certain extent dictating this follow on patch set and so any
problems in it end up being transmitted here as well.
Sean's comments about transports I think were spot on, and they got me
doing a little thought exercise in my head.
Theoretically, if we were to take the SoftiWARP and SoftRoCE drivers, we
could merge them down to one driver that supported both iWARP and RoCE
(and we could add USNIC too I suspect). Because all of these things
work on an Ethernet link layer, there is no reason they couldn't all be
presented on the same device. So if you had a single verbs device that
could support all of these things, where would you need to capture and
store the relevant information?
This is what I have so far:
On a per device basis: not much to be honest, node_type IB_CA/RNIC for
back compatibility and that's pretty much it
On a per port basis: link layer (Ethernet, IB, or OPA). Notice that I
didn't include the transport here. Two of the above link layers
strictly imply their transport. The third link layer could have
multiple transports. Which brings me to my next level...
On a per queue pair basis: if (link_layer == Ethernet) iWARP/RoCE/USNIC
if (qp_type == RoCE && qp != UD) RoCEv1/RoCEv2
On a per ah basis: if (qp_type == RoCE && qp == UD) RoCEv1/RoCEv2
Michael's patch set was intended to give us a framework within which we
can start cleaning things up. Once we get to cleaning things up, I
think the above is where we should target storing the relevant
information.
Thoughts?
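The layering described above can be sketched roughly as follows. Every name here is hypothetical, and the "OPA transport" value is an assumption, since the text only says that the IB and OPA link layers strictly imply their transport without naming it.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; none come from the kernel headers. */
enum sketch_link_layer { SKETCH_LL_ETHERNET, SKETCH_LL_IB, SKETCH_LL_OPA };
enum sketch_transport  { SKETCH_TR_IB, SKETCH_TR_OPA, SKETCH_TR_IWARP,
			 SKETCH_TR_ROCE, SKETCH_TR_USNIC };
enum sketch_qp_type    { SKETCH_QPT_RC, SKETCH_QPT_UC, SKETCH_QPT_UD };
enum sketch_scope      { SKETCH_SCOPE_NONE, SKETCH_SCOPE_QP, SKETCH_SCOPE_AH };

/* Per port: only an Ethernet link leaves the transport to be chosen per QP. */
static inline bool transport_chosen_per_qp(enum sketch_link_layer ll)
{
	return ll == SKETCH_LL_ETHERNET;
}

/* IB and OPA links imply their transport (the OPA value is an assumption). */
static inline enum sketch_transport implied_transport(enum sketch_link_layer ll)
{
	assert(ll != SKETCH_LL_ETHERNET); /* Ethernet implies no single transport */
	return ll == SKETCH_LL_IB ? SKETCH_TR_IB : SKETCH_TR_OPA;
}

/* Where the RoCEv1/RoCEv2 choice would be stored: on the QP, or on the AH for UD. */
static inline enum sketch_scope roce_version_scope(enum sketch_transport tr,
						   enum sketch_qp_type qpt)
{
	if (tr != SKETCH_TR_ROCE)
		return SKETCH_SCOPE_NONE;
	return qpt == SKETCH_QPT_UD ? SKETCH_SCOPE_AH : SKETCH_SCOPE_QP;
}
```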
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1431111412.2407.463.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-05-08 20:06 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCD719-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Hefty, Sean @ 2015-05-08 20:06 UTC (permalink / raw)
To: Doug Ledford, Liran Liss
Cc: Weiny, Ira,
linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> This is what I have so far:
>
> On a per device basis: not much to be honest, node_type IB_CA/RNIC for
> back compatibility and that's pretty much it
>
> On a per port basis: link layer (Ethernet, IB, or OPA). Notice that I
> didn't include the transport here. Two of the above link layers
> strictly imply their transport. The third link layer could have
> multiple transports. Which brings me to my next level...
>
> On a per queue pair basis: if (link_layer == Ethernet) iWARP/RoCE/USNIC
> if (qp_type == RoCE && qp != UD) RoCEv1/RoCEv2
>
> On a per ah basis: if (qp_type == RoCE && qp == UD) RoCEv1/RoCEv2
>
> Michael's patch set was intended to give us a framework within which we
> can start cleaning things up. Once we get to cleaning things up, I
> think the above is where we should target storing the relevant
> information.
>
> Thoughts?
I'm not following your intent here.
The qp_type values are currently defined as RC, UC, UD, XRC, plus some weirdness like SMI, GSI. Are you suggesting that we store the relevant information as part of the qp_type, or that we keep the qp_type as-is?
Once Michael's patches are integrated, do apps need anything else beyond the qp type as currently defined?
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCD719-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-08 20:30 ` Doug Ledford
[not found] ` <1431117051.2407.468.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Doug Ledford @ 2015-05-08 20:30 UTC (permalink / raw)
To: Hefty, Sean
Cc: Liran Liss, Weiny, Ira,
linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Fri, 2015-05-08 at 20:06 +0000, Hefty, Sean wrote:
> > This is what I have so far:
> >
> > On a per device basis: not much to be honest, node_type IB_CA/RNIC for
> > back compatibility and that's pretty much it
> >
> > On a per port basis: link layer (Ethernet, IB, or OPA). Notice that I
> > didn't include the transport here. Two of the above link layers
> > strictly imply their transport. The third link layer could have
> > multiple transports. Which brings me to my next level...
> >
> > On a per queue pair basis: if (link_layer == Ethernet) iWARP/RoCE/USNIC
> > if (qp_type == RoCE && qp != UD) RoCEv1/RoCEv2
> >
> > On a per ah basis: if (qp_type == RoCE && qp == UD) RoCEv1/RoCEv2
> >
> > Michael's patch set was intended to give us a framework within which we
> > can start cleaning things up. Once we get to cleaning things up, I
> > think the above is where we should target storing the relevant
> > information.
> >
> > Thoughts?
>
> I'm not following your intent here.
>
> The qp_type values are currently defined as RC, UC, UD, XRC, plus some
> weirdness like SMI, GSI. Are you suggesting that we store the relevant
> information as part of the qp_type, or that we keep the qp_type as-is?
Well, you note I wrote qp != UD, whereas that's really the qp_type, so
the above was pseudocode at best. I wasn't necessarily suggesting where
in the qp data struct to store it, just that even though there isn't
hardware that does this (yet), there's no reason hardware couldn't be
designed to support both iWARP and RoCE and USNIC over the same Ethernet
link layer, and merging the SoftRoCE and SoftiWARP drivers would make a
proof of concept, and so we would *have* to store that information on a
per QP basis.
> Once Michael's patches are integrated, do apps need anything else
> beyond the qp type as currently defined?
If you had a device that supported iWARP and RoCE on the same physical
link layer, then yes, the app would need a means of saying which
transport to use in addition to the type of QP to establish.
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1431117051.2407.468.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-05-08 20:56 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCD78C-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Hefty, Sean @ 2015-05-08 20:56 UTC (permalink / raw)
To: Doug Ledford
Cc: Liran Liss, Weiny, Ira,
linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Well, you note I wrote qp != UD, whereas that's really the qp_type, so
> the above was pseudocode at best. I wasn't necessarily suggesting where
> in the qp data struct to store it, just that even though there isn't
> hardware that does this (yet), there's no reason hardware couldn't be
> designed to support both iWARP and RoCE and USNIC over the same Ethernet
> link layer, and merging the SoftRoCE and SoftiWARP drivers would make a
> proof of concept, and so we would *have* to store that information on a
> per QP basis.
>
> > Once Michael's patches are integrated, do apps need anything else
> > beyond the qp type as currently defined?
>
> If you had a device that supported iWARP and RoCE on the same physical
> link layer, then yes, the app would need a means of saying which
> transport to use in addition to the type of QP to establish.
Ah - got it now. And I agree, there should be some way to specify this at the QP level.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCD78C-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-08 21:48 ` Jason Gunthorpe
[not found] ` <20150508214855.GA3917-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Jason Gunthorpe @ 2015-05-08 21:48 UTC (permalink / raw)
To: Hefty, Sean
Cc: Doug Ledford, Liran Liss, Weiny, Ira,
linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Fri, May 08, 2015 at 08:56:16PM +0000, Hefty, Sean wrote:
> > If you had a device that supported iWARP and RoCE on the same physical
> > link layer, then yes, the app would need a means of saying which
> > transport to use in addition to the type of QP to establish.
>
> Ah - got it now. And I agree, there should be some way to specify
> this at the QP level.
Yes, the only way out is to specify on a per QP basis the addressing
and protocol/transport/whatever thing. Socket uses the AF,SOCK,PROTO
tuple to specify this information. We can probably productively use a
similar breakdown:
AF_IB,SOCK_RC,PROTO_IBA // InfiniBand
AF_OPA,SOCK_RC,PROTO_IBA // Future 32 bit LID OPA 'InfiniBand'
AF_ETH,SOCK_RC,PROTO_IBA // RoCEv1
AF_INET,SOCK_RC,PROTO_IBA // InfiniBand or RoCEv2, depending on the Link Layer
AF_ETH,SOCK_RC,PROTO_USNIC
AF_INET,SOCK_RC,PROTO_USNIC
AF_INET,SOCK_RC,PROTO_IWARP
The expectation would be that any SOCK,PROTO combination can
'interoperate' with any AF combination - which is true with the above
examples, ie with the right magic NAT hardware RoCEv2/v1/IB/(OPA?) can
all interwork at the BTH layer.
Which is also to say, it unambiguously implies which standard
defines the subtle verbs variations that the app has to cope with.
Incompatible AH's can not be used with the wrong type of QP, the CM
layers would auto create QPs with the correct type to reflect how the
CM process was run, etc.
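As a rough illustration of the tuple idea, a QP address could be described by three independent fields, with interoperability decided by SOCK and PROTO alone. The type and enumerator names are hypothetical (prefixed to avoid clashing with the real socket AF_* constants), and the rule encoded here is only the expectation stated above, which itself assumes suitable translation hardware between address families.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; prefixed so they do not clash with <sys/socket.h>. */
enum rdma_af_sketch    { RAF_IB, RAF_OPA, RAF_ETH, RAF_INET };
enum rdma_sock_sketch  { RSOCK_RC, RSOCK_UC, RSOCK_UD };
enum rdma_proto_sketch { RPROTO_IBA, RPROTO_USNIC, RPROTO_IWARP };

struct rdma_qp_tuple_sketch {
	enum rdma_af_sketch    af;    /* addressing family */
	enum rdma_sock_sketch  sock;  /* QP service type */
	enum rdma_proto_sketch proto; /* standard that defines the verbs behavior */
};

/*
 * Stated expectation: the same SOCK,PROTO pair interoperates across any AF
 * (given translation between address families), so AF is ignored here.
 */
static inline bool tuples_interoperate(const struct rdma_qp_tuple_sketch *a,
				       const struct rdma_qp_tuple_sketch *b)
{
	assert(a != NULL && b != NULL);
	return a->sock == b->sock && a->proto == b->proto;
}
```

Under this sketch, an AF_IB RC QP and a RoCEv1 (AF_ETH) RC QP compare as interoperable, while an iWARP RC QP and a PROTO_IBA RC QP do not.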
But Michael's patches are still an improvement, ie a RoCE + iWarp port
would still have the cap_mad as true and would still have to create a
QP1 interface, which I think is captured OK. The CM side probably
explodes if a port is both iWarp and RoCE, but that is a lot of work to
fix anyhow..
I think we have to tackle the addressing problem as part of the RoCEv2
stuff, just burying all this in a qp index is really horrible.
Realistically, today, if a RoCE/iWarp driver appeared then it would
have to present to the system as two RDMA devices.
Jason
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20BA79B2-9DA5-49B3-8455-BD4021CB882C-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2015-05-05 20:29 ` Hefty, Sean
@ 2015-05-09 3:32 ` Doug Ledford
[not found] ` <1431142328.2407.488.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-11 6:42 ` ira.weiny
2 siblings, 1 reply; 50+ messages in thread
From: Doug Ledford @ 2015-05-09 3:32 UTC (permalink / raw)
To: Dave Goodell (dgoodell)
Cc: Liran Liss, Hefty, Sean, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, 2015-05-05 at 20:22 +0000, Dave Goodell (dgoodell) wrote:
> On May 5, 2015, at 2:51 PM, Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>
> >> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> >> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Hefty, Sean
> >
> >>> I thought USNIC_UDP had an embedded USNIC protocol header inside the
> >>> UDP header. That would make it a UDP_ENCAP protocol.
> >>
> >> Someone from Cisco can correct me, but USNIC supports 2 protocols. Just
> >> plain UDP, and a proprietary protocol that runs over Ethernet, but uses the
> >> same EtherType as RoCE. I thought these could both be active on the same
> >> port at the same time.
> >>
> >
> > 'protocol' refers to what your device generates when you do a post_send() on
> > some QP.
> > In the RoCE case, it is IBTA transport headers + payload over UDP encapsulation
> > over IP. In the USNIC case, you might want the protocol to refer to additional
> > information that distinguishes this wire protocol rather than just "I am sending
> > UDP packets"...
>
> In the case that usNIC is operating in UDP mode (which is the
> overwhelming majority of the cases), there is absolutely no additional
> protocol that ends up on the wire or headers in the user buffers
> besides UDP/IP/Ethernet. They are 100% plain UDP packets, they just
> happen to be sent via OS-bypass queues instead of traveling through the
> kernel networking stack.
>
> [^^^^^ there continues to be confusion about this for some reason, but
> I don't know why]
I really need to just sit down and read your driver front to back. The
confusion probably comes from people (such as myself) that first think
about this and go "Well, if you have multiple queue pairs, and UDP, then
you need a header to tell what QP a packet goes to" (which assumes a
RoCE like usage of UDP where all packets go to the same UDP address)
where your actual usage is probably more iWARP like in that the IP/UDP
SRC/DST combo maps to a specific QP, right?
--
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
GPG KeyID: 0E572FDD
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20BA79B2-9DA5-49B3-8455-BD4021CB882C-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2015-05-05 20:29 ` Hefty, Sean
2015-05-09 3:32 ` Doug Ledford
@ 2015-05-11 6:42 ` ira.weiny
[not found] ` <20150511064232.GA3042-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2 siblings, 1 reply; 50+ messages in thread
From: ira.weiny @ 2015-05-11 6:42 UTC (permalink / raw)
To: Dave Goodell (dgoodell)
Cc: Liran Liss, Hefty, Sean, Doug Ledford,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, May 05, 2015 at 08:22:21PM +0000, Dave Goodell (dgoodell) wrote:
> On May 5, 2015, at 2:51 PM, Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>
> >> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> >> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Hefty, Sean
> >
> >>> I thought USNIC_UDP had an embedded USNIC protocol header inside the
> >>> UDP header. That would make it a UDP_ENCAP protocol.
> >>
> >> Someone from Cisco can correct me, but USNIC supports 2 protocols. Just
> >> plain UDP, and a proprietary protocol that runs over Ethernet, but uses the
> >> same EtherType as RoCE. I thought these could both be active on the same
> >> port at the same time.
> >>
> >
> > 'protocol' refers to what your device generates when you do a post_send() on
> > some QP.
> > In the RoCE case, it is IBTA transport headers + payload over UDP encapsulation
> > > over IP. In the USNIC case, you might want the protocol to refer to additional
> > > information that distinguishes this wire protocol rather than just "I am sending
> > UDP packets"...
>
> In the case that usNIC is operating in UDP mode (which is the overwhelming majority of the cases), there is absolutely no additional protocol that ends up on the wire or headers in the user buffers besides UDP/IP/Ethernet. They are 100% plain UDP packets, they just happen to be sent via OS-bypass queues instead of traveling through the kernel networking stack.
>
> [^^^^^ there continues to be confusion about this for some reason, but I don't know why]
So what is this patch for?
commit 248567f79304b953ea492fb92ade097b62ed09b2
Author: Upinder Malhi <umalhi-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Date: Thu Jan 9 14:48:19 2014 -0800
IB/core: Add RDMA_TRANSPORT_USNIC_UDP
Add RDMA_TRANSPORT_USNIC_UDP which will be used by usNIC.
Signed-off-by: Upinder Malhi <umalhi-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
This is probably where a lot of the confusion is coming from.
Ira
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <1431142328.2407.488.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2015-05-11 22:19 ` Dave Goodell (dgoodell)
[not found] ` <CCC397F5-31B6-4A35-94E0-7EAAE6C4803F-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Dave Goodell (dgoodell) @ 2015-05-11 22:19 UTC (permalink / raw)
To: Doug Ledford
Cc: Liran Liss, Hefty, Sean, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On May 8, 2015, at 8:32 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Tue, 2015-05-05 at 20:22 +0000, Dave Goodell (dgoodell) wrote:
>>
>> In the case that usNIC is operating in UDP mode (which is the
>> overwhelming majority of the cases), there is absolutely no additional
>> protocol that ends up on the wire or headers in the user buffers
>> besides UDP/IP/Ethernet. They are 100% plain UDP packets, they just
>> happen to be sent via OS-bypass queues instead of traveling through the
>> kernel networking stack.
>>
>> [^^^^^ there continues to be confusion about this for some reason, but
>> I don't know why]
>
> I really need to just sit down and read your driver front to back.
Feel free to ping me off-list if you want any explanation about what's going on in the usnic_verbs module. I'm happy to help.
> The
> confusion probably comes from people (such as myself) that first think
> about this and go "Well, if you have multiple queue pairs, and UDP, then
> you need a header to tell what QP a packet goes to" (which assumes a
> RoCE like usage of UDP where all packets go to the same UDP address)
> where your actual usage is probably more iWARP like in that the IP/UDP
> SRC/DST combo maps to a specific QP, right?
I don't know much about the details of iWARP, but yes, a UDP/IP src/dst port combo maps to a specific QP when using USNIC_UDP QPs. We do _not_ use a single well-known port.
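To make the contrast with a single well-known port concrete, here is a minimal sketch of demultiplexing by flow rather than by an embedded transport header. It is not taken from the usnic driver: the struct layouts, the linear table, and the names flow_key, flow_entry and flow_to_qp are all hypothetical, and matching on the full IPv4 src/dst address and port tuple is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key; the real usnic filter layout is not shown in this thread. */
struct flow_key {
	uint32_t src_ip;	/* IPv4 source address, host byte order */
	uint32_t dst_ip;	/* IPv4 destination address, host byte order */
	uint16_t src_port;	/* UDP source port, host byte order */
	uint16_t dst_port;	/* UDP destination port, host byte order */
};

/* One table row: a flow and the (hypothetical) QP number that owns it. */
struct flow_entry {
	struct flow_key key;
	uint32_t qp_num;
};

static int flow_key_eq(const struct flow_key *a, const struct flow_key *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/*
 * Look up the QP for a received packet using only its IP/UDP addressing.
 * Returns 0 and stores the QP number on a match, -1 if no QP owns the flow.
 * Nothing inside the UDP payload is consulted.
 */
static int flow_to_qp(const struct flow_entry *tbl, size_t n,
		      const struct flow_key *key, uint32_t *qp_num)
{
	size_t i;

	assert(tbl && key && qp_num);
	for (i = 0; i < n; i++) {
		if (flow_key_eq(&tbl[i].key, key)) {
			*qp_num = tbl[i].qp_num;
			return 0;
		}
	}
	return -1;
}
```

The point of the sketch is only that the QP is selected from addressing that is already in a plain UDP/IP packet, so no extra header is needed; real hardware would presumably use a filter table rather than a linear scan.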
-Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20150511064232.GA3042-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
@ 2015-05-11 22:26 ` Dave Goodell (dgoodell)
[not found] ` <D72B89F6-B333-4DC2-9BA7-CB45EBC31843-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Dave Goodell (dgoodell) @ 2015-05-11 22:26 UTC (permalink / raw)
To: ira.weiny
Cc: Liran Liss, Hefty, Sean, Doug Ledford,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On May 10, 2015, at 11:42 PM, ira.weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> On Tue, May 05, 2015 at 08:22:21PM +0000, Dave Goodell (dgoodell) wrote:
>>
>> In the case that usNIC is operating in UDP mode (which is the overwhelming majority of the cases), there is absolutely no additional protocol that ends up on the wire or headers in the user buffers besides UDP/IP/Ethernet. They are 100% plain UDP packets, they just happen to be sent via OS-bypass queues instead of traveling through the kernel networking stack.
>>
>> [^^^^^ there continues to be confusion about this for some reason, but I don't know why]
>
> So what is this patch for?
Does my earlier email clarify the situation any? http://marc.info/?l=linux-rdma&m=142972178630720&w=2
> commit 248567f79304b953ea492fb92ade097b62ed09b2
> Author: Upinder Malhi <umalhi-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
> Date: Thu Jan 9 14:48:19 2014 -0800
>
> IB/core: Add RDMA_TRANSPORT_USNIC_UDP
>
> Add RDMA_TRANSPORT_USNIC_UDP which will be used by usNIC.
>
> Signed-off-by: Upinder Malhi <umalhi-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
>
> This is probably where a lot of the confusion is coming from.
Arguably RDMA_TRANSPORT_USNIC_UDP could/should have simply been named RDMA_TRANSPORT_UDP.
-Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <CCC397F5-31B6-4A35-94E0-7EAAE6C4803F-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2015-05-11 22:39 ` Jason Gunthorpe
[not found] ` <20150511223930.GA15628-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Jason Gunthorpe @ 2015-05-11 22:39 UTC (permalink / raw)
To: Dave Goodell (dgoodell)
Cc: Doug Ledford, Liran Liss, Hefty, Sean, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Mon, May 11, 2015 at 10:19:29PM +0000, Dave Goodell (dgoodell) wrote:
> > I really need to just sit down and read your driver front to back.
>
> Feel free to ping me off-list if you want any explanation about
> what's going on in the usnic_verbs module. I'm happy to help.
So, I look at this:
int usnic_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
		       struct ib_send_wr **bad_wr)
{
	usnic_dbg("\n");
	return -EINVAL;
}
And conclude that usnic doesn't actually provide any RDMA services to
in-kernel users, correct?
I'd then guess the entire point of this is to hack some non-RDMA
device to export its interface via uverbs? Presumably because someone
nack'd using your own char device or whatever?
Jason
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <D72B89F6-B333-4DC2-9BA7-CB45EBC31843-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2015-05-12 2:13 ` Weiny, Ira
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E1107C20E-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Weiny, Ira @ 2015-05-12 2:13 UTC (permalink / raw)
To: Dave Goodell (dgoodell)
Cc: Liran Liss, Hefty, Sean, Doug Ledford,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>
> On May 10, 2015, at 11:42 PM, ira.weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
> > On Tue, May 05, 2015 at 08:22:21PM +0000, Dave Goodell (dgoodell) wrote:
> >>
> >> In the case that usNIC is operating in UDP mode (which is the overwhelming
> >> majority of the cases), there is absolutely no additional protocol that ends up on
> >> the wire or headers in the user buffers besides UDP/IP/Ethernet. They are
> >> 100% plain UDP packets, they just happen to be sent via OS-bypass queues
> >> instead of traveling through the kernel networking stack.
> >>
> >> [^^^^^ there continues to be confusion about this for some reason, but I
> >> don't know why]
> >
> > So what is this patch for?
>
> Does my earlier email clarify the situation any?
> http://marc.info/?l=linux-rdma&m=142972178630720&w=2
Somewhat. Is there any reason applications need to distinguish between "The legacy RDMA_TRANSPORT_USNIC type" and "The current RDMA_TRANSPORT_USNIC_UDP type"?
Or does the former no longer exist?
>
> > commit 248567f79304b953ea492fb92ade097b62ed09b2
> > Author: Upinder Malhi <umalhi-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
> > Date: Thu Jan 9 14:48:19 2014 -0800
> >
> > IB/core: Add RDMA_TRANSPORT_USNIC_UDP
> >
> > Add RDMA_TRANSPORT_USNIC_UDP which will be used by usNIC.
> >
> > Signed-off-by: Upinder Malhi <umalhi-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
> > Signed-off-by: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
> >
> > This is probably where a lot of the confusion is coming from.
>
> Arguably RDMA_TRANSPORT_USNIC_UDP could/should have simply been
> named RDMA_TRANSPORT_UDP.
I guess I'm wondering if there needs to be an RDMA_TRANSPORT_USNIC to represent "The legacy RDMA_TRANSPORT_USNIC type" you mention in the link above.
Ira
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20150511223930.GA15628-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-05-12 17:41 ` Dave Goodell (dgoodell)
[not found] ` <DC760F33-F704-4763-AEA3-6DE9016D0687-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 50+ messages in thread
From: Dave Goodell (dgoodell) @ 2015-05-12 17:41 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Doug Ledford, Liran Liss, Hefty, Sean, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On May 11, 2015, at 3:39 PM, Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> So, I look at this:
>
> int usnic_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
> 		       struct ib_send_wr **bad_wr)
> {
> 	usnic_dbg("\n");
> 	return -EINVAL;
> }
>
> And conclude that usnic doesn't actually provide any RDMA services to
> in-kernel users, correct?
Correct.
> I'd then guess the entire point of this is to hack some non-RDMA
> device to export its interface via uverbs?
If by "non-RDMA", you mean "does not currently support hardware offload of direct data placement", then yes. There are a couple of things to keep in mind:
1) usNIC is UD-only at this time. Are there any kernel clients that make use of UD without also requiring RC? (this is a genuine question)
2) When using UDP, usNIC deposits 42-byte UDP/IP/Ethernet headers into the start of the recv buffer, not a 40-byte GRH. Unfortunately, the current verbs stack has no way to indicate the device's/port's/QP's header size, let alone format, to any client. So clients must be aware they are using a usNIC device under the hood.
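As a worked version of the header arithmetic in point 2, the following sketch shows how a client that already knows which device it is on might skip the received header. The 42 bytes are assumed to break down as untagged Ethernet (14) + IPv4 without options (20) + UDP (8); that breakdown, and every name in the block (SKETCH_*, ud_hdr_kind, ud_hdr_len, ud_payload), is an assumption for illustration and not part of the verbs API or the usnic driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not taken from any kernel header. */
enum {
	SKETCH_ETH_LEN  = 14,	/* dst MAC (6) + src MAC (6) + EtherType (2), no VLAN tag */
	SKETCH_IPV4_LEN = 20,	/* IPv4 header without options */
	SKETCH_UDP_LEN  = 8,	/* src port, dst port, length, checksum */
	SKETCH_GRH_LEN  = 40,	/* InfiniBand Global Route Header */
};

/* Which header the device deposits at the start of a UD receive buffer. */
enum ud_hdr_kind {
	UD_HDR_GRH,		/* 40-byte GRH, as on InfiniBand-style UD QPs */
	UD_HDR_UDP_IP_ETH,	/* 42-byte UDP/IP/Ethernet header, as described above */
};

static size_t ud_hdr_len(enum ud_hdr_kind kind)
{
	if (kind == UD_HDR_UDP_IP_ETH)
		return SKETCH_ETH_LEN + SKETCH_IPV4_LEN + SKETCH_UDP_LEN;
	return SKETCH_GRH_LEN;
}

/*
 * Return a pointer to the payload inside a received buffer and store its
 * length, or NULL if the buffer is too short to hold the expected header.
 */
static const uint8_t *ud_payload(const uint8_t *buf, size_t len,
				 enum ud_hdr_kind kind, size_t *payload_len)
{
	size_t hdr = ud_hdr_len(kind);

	assert(buf && payload_len);
	if (len < hdr)
		return NULL;
	*payload_len = len - hdr;
	return buf + hdr;
}
```

Because the verbs stack cannot report the header size, the caller has to choose the ud_hdr_kind value itself, which is exactly the "clients must be aware" problem described above.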
> Presumably because someone
> nack'd using your own char device or whatever?
I wasn't participating on the kernel side of the usNIC project at the time that some of the early decisions were made, but to my knowledge the uverbs stack was selected as the most appropriate way to expose this technology to users after evaluation of several approaches. I don't think that any other approach was previously submitted and NACK-ed.
-Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E1107C20E-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-05-12 17:56 ` Dave Goodell (dgoodell)
0 siblings, 0 replies; 50+ messages in thread
From: Dave Goodell (dgoodell) @ 2015-05-12 17:56 UTC (permalink / raw)
To: Weiny, Ira
Cc: Liran Liss, Hefty, Sean, Doug Ledford,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On May 11, 2015, at 7:13 PM, Weiny, Ira <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> Does my earlier email clarify the situation any?
>> http://marc.info/?l=linux-rdma&m=142972178630720&w=2
>
> Somewhat. Is there any reason applications need to distinguish between "The legacy RDMA_TRANSPORT_USNIC type" and "The current RDMA_TRANSPORT_USNIC_UDP type"?
Addressing is different between the two (Ethernet_MAC+QP# vs. IP+UDP_port, respectively).
Also, the received UD header format and size are different between them (40-byte synthetic GRH vs. 42-byte UDP/IP/Ethernet header).
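The addressing difference can be pictured as a tagged union, sketched below. This is only an illustration of why an application has to know which type it is talking to: the type and field names (sketch_usnic_type, sketch_usnic_addr, sketch_usnic_addr_eq) are hypothetical, IPv4 is assumed for the UDP case, and none of it reflects actual usnic or verbs structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tags standing in for the two transport types discussed above. */
enum sketch_usnic_type {
	SKETCH_USNIC_LEGACY,	/* addressed by Ethernet MAC + QP number */
	SKETCH_USNIC_UDP,	/* addressed by IP address + UDP port */
};

/* Hypothetical peer address; the two arms share no fields. */
struct sketch_usnic_addr {
	enum sketch_usnic_type type;
	union {
		struct {
			uint8_t mac[6];
			uint32_t qp_num;
		} legacy;
		struct {
			uint32_t ipv4;		/* IPv4 assumed for this sketch */
			uint16_t udp_port;
		} udp;
	} u;
};

/* Two addresses can only be equal if they are of the same transport type. */
static int sketch_usnic_addr_eq(const struct sketch_usnic_addr *a,
				const struct sketch_usnic_addr *b)
{
	assert(a && b);
	if (a->type != b->type)
		return 0;
	if (a->type == SKETCH_USNIC_LEGACY)
		return memcmp(a->u.legacy.mac, b->u.legacy.mac, 6) == 0 &&
		       a->u.legacy.qp_num == b->u.legacy.qp_num;
	return a->u.udp.ipv4 == b->u.udp.ipv4 &&
	       a->u.udp.udp_port == b->u.udp.udp_port;
}
```

A legacy address and a UDP address never compare equal even if the raw numbers happen to match, which is the sense in which the two types are not interchangeable from an application's point of view.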
> Or does the former no longer exist?
The former is no longer being deployed and should be rare to find in the wild at this point, though I do not know precisely how many customers may be using it still. It is definitely legacy at this point.
>> Arguably RDMA_TRANSPORT_USNIC_UDP could/should have simply been
>> named RDMA_TRANSPORT_UDP.
>
> I guess I'm wondering if there needs to be an RDMA_TRANSPORT_USNIC to represent "The legacy RDMA_TRANSPORT_USNIC type" you mention in the link above.
It probably needs to exist for a little while still.
-Dave
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <DC760F33-F704-4763-AEA3-6DE9016D0687-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2015-05-12 18:08 ` Jason Gunthorpe
0 siblings, 0 replies; 50+ messages in thread
From: Jason Gunthorpe @ 2015-05-12 18:08 UTC (permalink / raw)
To: Dave Goodell (dgoodell)
Cc: Doug Ledford, Liran Liss, Hefty, Sean, Weiny, Ira,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, May 12, 2015 at 05:41:14PM +0000, Dave Goodell (dgoodell) wrote:
> > I'd then guess the entire point of this is to hack some non-RDMA
> > device to export its interface via uverbs?
>
> If by "non-RDMA", you mean "does not currently support hardware
> offload of direct data placement", then yes. There are a couple of
> things to keep in mind:
I actually mean 'does not provide RDMA services to the kernel'.
What actually happens if you, for instance, load the iser module and
try and use it with usnic?
What parts of the RDMA core code does usnic actually use beyond
command marshaling through uverbs?
Would things be easier for you if usnic lived within DPDK?
Jason
^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device
[not found] ` <20150508214855.GA3917-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-05-13 20:41 ` Liran Liss
0 siblings, 0 replies; 50+ messages in thread
From: Liran Liss @ 2015-05-13 20:41 UTC (permalink / raw)
To: Jason Gunthorpe, Hefty, Sean,
linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: Doug Ledford, Weiny, Ira
> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
> > > If you had a device that supported iWARP and RoCE on the same
> > > physical link layer, then yes, the app would need a means of saying
> > > which transport to use in addition to the type of QP to establish.
> >
> > Ah - got it now. And I agree, there should be some way to specify
> > this at the QP level.
Supporting iWARP and RoCE over the same RDMA device makes no sense.
First of all, it is confusing to the user. It's like supporting SRP and iSCSI
using the same SCSI device instance...
Second, different technologies affect much more than the QP.
A technology determines the format and semantics of the whole interface.
For example, AHs, memory windows, atomics, CQE formats are different as well.
Just thinking of how to abstract all these in a single device is mind-boggling.
Finally, there is no sane way to describe device capabilities. There is no
single set of capabilities that applies to all of these technologies consistently.
>
> Yes, the only way out is to specify on a per QP basis the addressing and
> protocol/transport/whatever thing. Socket uses the AF,SOCK,PROTO tuple to
> specify this information. We can probably productively use a similar
> breakdown:
We might consider such abstractions at the CMA layer, not at the device level.
Essentially, this is what the CMA was intended for.
>
> AF_IB,SOCK_RC,PROTO_IBA // InfiniBand
> AF_OPA,SOCK_RC,PROTO_IBA // Future 32 bit LID OPB 'InfiniBand'
This is not PROTO_IBA.
> AF_ETH,SOCK_RC,PROTO_IBA // RoCEv1
> AF_INET,SOCK_RC,PROTO_IBA // InfiniBand or RoCEv2, depending on the Link Layer
> AF_ETH,SOCK_RC,PROTO_USNIC
> AF_INET,SOCK_RC,PROTO_USNIC
> AF_INET,SOCK_RC,PROTO_IWARP
>
[snip]
>
>
> Realistically, today, if a RoCE/iWarp driver appeared then it would have to
> present to the system as two RDMA devices.
>
Exactly!
--Liran
^ permalink raw reply [flat|nested] 50+ messages in thread
end of thread, other threads:[~2015-05-13 20:41 UTC | newest]
Thread overview: 50+ messages
2015-05-04 6:14 [RFC PATCH 0/5] Add Core Capability Bits to use in Management helpers ira.weiny-ral2JQCrhuEAvxtiuMwx3w
[not found] ` <1430720099-32512-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-05-04 6:14 ` [RFC PATCH 1/5] IB/core: Add Core Capability flags to ib_device ira.weiny-ral2JQCrhuEAvxtiuMwx3w
[not found] ` <1430720099-32512-2-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-05-04 14:41 ` Doug Ledford
[not found] ` <1430750492.2407.9.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-04 16:40 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA17C-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 17:38 ` Doug Ledford
[not found] ` <1430761111.2407.85.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-04 18:26 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA2F1-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 19:40 ` Doug Ledford
[not found] ` <1430768425.2407.143.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-04 21:07 ` Jason Gunthorpe
[not found] ` <20150504210741.GA20839-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-05-05 19:59 ` Liran Liss
2015-05-04 20:21 ` Dave Goodell (dgoodell)
2015-05-05 19:51 ` Liran Liss
[not found] ` <HE1PR05MB1418DF0669B6E6CABE1D9F1EB1D10-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2015-05-05 20:22 ` Dave Goodell (dgoodell)
[not found] ` <20BA79B2-9DA5-49B3-8455-BD4021CB882C-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2015-05-05 20:29 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCAEFD-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-06 14:25 ` Liran Liss
2015-05-09 3:32 ` Doug Ledford
[not found] ` <1431142328.2407.488.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-11 22:19 ` Dave Goodell (dgoodell)
[not found] ` <CCC397F5-31B6-4A35-94E0-7EAAE6C4803F-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2015-05-11 22:39 ` Jason Gunthorpe
[not found] ` <20150511223930.GA15628-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-05-12 17:41 ` Dave Goodell (dgoodell)
[not found] ` <DC760F33-F704-4763-AEA3-6DE9016D0687-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2015-05-12 18:08 ` Jason Gunthorpe
2015-05-11 6:42 ` ira.weiny
[not found] ` <20150511064232.GA3042-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2015-05-11 22:26 ` Dave Goodell (dgoodell)
[not found] ` <D72B89F6-B333-4DC2-9BA7-CB45EBC31843-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2015-05-12 2:13 ` Weiny, Ira
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E1107C20E-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-12 17:56 ` Dave Goodell (dgoodell)
2015-05-05 19:27 ` Liran Liss
[not found] ` <HE1PR05MB1418E58A6EB92D92B78C76A0B1D10-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2015-05-08 18:56 ` Doug Ledford
[not found] ` <1431111412.2407.463.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-08 20:06 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCD719-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-08 20:30 ` Doug Ledford
[not found] ` <1431117051.2407.468.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-08 20:56 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCD78C-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-08 21:48 ` Jason Gunthorpe
[not found] ` <20150508214855.GA3917-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-05-13 20:41 ` Liran Liss
2015-05-04 16:42 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA192-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 16:48 ` Doug Ledford
[not found] ` <1430758097.2407.59.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-04 16:53 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA1D9-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 16:56 ` Doug Ledford
[not found] ` <1430758566.2407.62.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-05-04 17:25 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373A8FCA217-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 17:31 ` Weiny, Ira
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E11069818-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-05-04 17:34 ` Hefty, Sean
2015-05-04 18:36 ` Jason Gunthorpe
[not found] ` <20150504183657.GA20586-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-05-04 22:32 ` ira.weiny
[not found] ` <20150504223234.GB10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2015-05-04 23:16 ` ira.weiny
[not found] ` <20150504231622.GE10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2015-05-04 23:52 ` Jason Gunthorpe
2015-05-04 6:14 ` [RFC PATCH 2/5] IB/core: Replace query_protocol callback with Core Capability flags check ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2015-05-04 6:14 ` [RFC PATCH 3/5] IB/core: Convert cap_ib_mad to core_cap_flags bit mask ira.weiny-ral2JQCrhuEAvxtiuMwx3w
[not found] ` <1430720099-32512-4-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2015-05-04 16:49 ` Hefty, Sean
2015-05-04 18:46 ` Jason Gunthorpe
[not found] ` <20150504184610.GB20586-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-05-04 22:43 ` ira.weiny
[not found] ` <20150504224342.GD10115-W4f6Xiosr+yv7QzWx2u06xL4W9x8LtSr@public.gmane.org>
2015-05-05 8:00 ` Michael Wang
2015-05-05 8:26 ` Michael Wang
2015-05-04 6:14 ` [RFC PATCH 4/5] IB/core: Add rdma_dev_max_mad_size call ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2015-05-04 6:14 ` [RFC PATCH 5/5] IB/core: Add cap_opa_mad helper using RDMA_CORE_CAP_OPA_MAD flag ira.weiny-ral2JQCrhuEAvxtiuMwx3w