* [PATCH for-next 00/27] IB/hfi1, rdmavt, core, etc: patches for next 08/04/2017
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC
To: dledford
Cc: Mike Marciniszyn, Bartlomiej Dudek, Jakub Byczkowski, linux-rdma,
Ira Weiny, Alex Estrin, stable, Michael J. Ruhl, Don Hiatt,
Sebastian Sanchez, Dasaratharaman Chandramouli
Hi Doug,
Here is the next set of patches for our drivers for 4.14. Along with the usual
fixes and such, this includes our extended LID patches which were posted a
while back [1] [2]. Please drop those patch sets, as they needed to be rebased.
Along with those core changes, we are now submitting the driver changes that
are needed to make use of extended LIDs. 21 of these 27 patches are related
to that effort. The other 6 are the usual fixes.
Also note that we marked the first patch for stable. That may be something to
consider for the 4.13-rc.
Patches can also be found in my GitHub repo at:
https://github.com/ddalessa/kernel/tree/for-4.14
[1] http://marc.info/?l=linux-rdma&m=149694352021933&w=2
[2] http://marc.info/?l=linux-rdma&m=149694355721946&w=2
---
Alex Estrin (1):
IB/hfi1: Revert egress pkey check enforcement
Bartlomiej Dudek (1):
IB/hfi1: Use host_link_state to read state when DC is shut down
Byczkowski, Jakub (1):
IB/hfi1: Remove lstate from hfi1_pportdata
Dasaratharaman Chandramouli (10):
IB/core: Convert ah_attr from OPA to IB when copying to user
IB/srpt: Increase lid and sm_lid to 32 bits
IB/IPoIB: Increase local_lid to 32 bits
IB/mad: Change slid in RMPP recv from 16 to 32 bits
IB/core: Change port_attr.lid size from 16 to 32 bits
IB/core: Change port_attr.sm_lid from 16 to 32 bits
IB/CM: Create appropriate path records when handling CM request
IB/CM: Set appropriate slid and dlid when handling CM request
IB/rdmavt,hfi1,qib: Enhance rdmavt and hfi1 to use 32 bit lids
IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs
Don Hiatt (11):
IB/core: Change wc.slid from 16 to 32 bits
IB/CM: Add OPA Path record support to CM
IB/rdmavt,hfi1,qib: Modify check_ah() to account for extended LIDs
IB/hfi1: Add support to receive 16B bypass packets
IB/hfi1: Add support to send 16B bypass packets
IB/hfi1: Add support to process 16B header errors
IB/hfi1: Determine 9B/16B L2 header type based on Address handle
IB/hfi1: Add 16B UD support
IB/hfi1: Add 16B trace support
IB/hfi1: Add 16B RC/UC support
IB/hfi1: Enhance PIO/SDMA send for 16B
Michael J. Ruhl (2):
IB/hfi1: Protect context array set/clear with spinlock
IB/hfi1: User context locking is inconsistent
Sebastian Sanchez (1):
IB/hfi1: Remove pmtu from the QP structure
drivers/infiniband/core/cm.c | 159 +++++++++--
drivers/infiniband/core/core_priv.h | 1
drivers/infiniband/core/mad_rmpp.c | 2
drivers/infiniband/core/sa_query.c | 21 +
drivers/infiniband/core/ucm.c | 2
drivers/infiniband/core/ucma.c | 10 -
drivers/infiniband/core/user_mad.c | 2
drivers/infiniband/core/uverbs_cmd.c | 22 +
drivers/infiniband/core/uverbs_marshall.c | 48 +++
drivers/infiniband/hw/hfi1/aspm.h | 35 ++
drivers/infiniband/hw/hfi1/chip.c | 110 +++++--
drivers/infiniband/hw/hfi1/chip.h | 3
drivers/infiniband/hw/hfi1/common.h | 10 -
drivers/infiniband/hw/hfi1/debugfs.c | 32 +-
drivers/infiniband/hw/hfi1/driver.c | 291 +++++++++++++++-----
drivers/infiniband/hw/hfi1/file_ops.c | 340 +++++++++++++----------
drivers/infiniband/hw/hfi1/hfi.h | 422 ++++++++++++++++++++++++++---
drivers/infiniband/hw/hfi1/init.c | 306 ++++++++++++++-------
drivers/infiniband/hw/hfi1/mad.c | 108 ++++++-
drivers/infiniband/hw/hfi1/qp.c | 28 ++
drivers/infiniband/hw/hfi1/rc.c | 389 +++++++++++++++++---------
drivers/infiniband/hw/hfi1/ruc.c | 237 ++++++++++++----
drivers/infiniband/hw/hfi1/trace.c | 153 ++++++++++
drivers/infiniband/hw/hfi1/trace_ibhdrs.h | 364 +++++++++++++++++--------
drivers/infiniband/hw/hfi1/trace_rx.h | 12 -
drivers/infiniband/hw/hfi1/uc.c | 42 ++-
drivers/infiniband/hw/hfi1/ud.c | 427 ++++++++++++++++++++++-------
drivers/infiniband/hw/hfi1/user_sdma.c | 7
drivers/infiniband/hw/hfi1/verbs.c | 280 +++++++++++++------
drivers/infiniband/hw/hfi1/verbs.h | 45 ++-
drivers/infiniband/hw/hfi1/vnic.h | 15 -
drivers/infiniband/hw/hfi1/vnic_main.c | 32 --
drivers/infiniband/hw/mlx4/alias_GUID.c | 2
drivers/infiniband/hw/mlx4/mad.c | 8 -
drivers/infiniband/hw/mlx5/mad.c | 2
drivers/infiniband/hw/mthca/mthca_cmd.c | 4
drivers/infiniband/hw/mthca/mthca_mad.c | 4
drivers/infiniband/hw/qib/qib_mad.c | 4
drivers/infiniband/hw/qib/qib_verbs.c | 9 +
drivers/infiniband/sw/rdmavt/ah.c | 10 -
drivers/infiniband/sw/rdmavt/cq.c | 2
drivers/infiniband/sw/rdmavt/qp.c | 32 ++
drivers/infiniband/ulp/ipoib/ipoib.h | 2
drivers/infiniband/ulp/srpt/ib_srpt.h | 4
include/rdma/ib_marshall.h | 6
include/rdma/ib_verbs.h | 33 ++
include/rdma/opa_addr.h | 42 +++
include/rdma/opa_vnic.h | 3
include/rdma/rdma_vt.h | 2
include/rdma/rdmavt_qp.h | 1
50 files changed, 2986 insertions(+), 1139 deletions(-)
* [PATCH for-next 01/27] IB/hfi1: Revert egress pkey check enforcement
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC
To: dledford; +Cc: linux-rdma, Mike Marciniszyn, Alex Estrin, stable
From: Alex Estrin <alex.estrin@intel.com>
The current egress pkey check code has some serious flaws. Disarm the
HFI1_PART_ENFORCE_OUT flag pending an appropriate fix.
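For context, a minimal sketch (not part of this diff; the helper name and
lookup loop are illustrative only) of what the disarmed flag gates on the
send path:

  /* Sketch only: with HFI1_PART_ENFORCE_OUT clear, egress checking passes. */
  static int egress_pkey_check_sketch(struct hfi1_pportdata *ppd, u16 pkey)
  {
          unsigned int i;

          if (!(ppd->part_enforce & HFI1_PART_ENFORCE_OUT))
                  return 0;               /* enforcement disabled */

          /* otherwise the pkey must match one of the port's pkeys */
          for (i = 0; i < ARRAY_SIZE(ppd->pkeys); i++)
                  if (ppd->pkeys[i] == pkey)
                          return 0;

          return -EINVAL;
  }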
Fixes: 53526500f301 ("IB/hfi1: Permanently enable P_Key checking in HFI")
Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
drivers/infiniband/hw/hfi1/init.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index be027c9..da7cd5b 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -519,7 +519,6 @@ void hfi1_init_pportdata(struct pci_dev *pdev, struct hfi1_pportdata *ppd,
ppd->pkeys[default_pkey_idx] = DEFAULT_P_KEY;
ppd->part_enforce |= HFI1_PART_ENFORCE_IN;
- ppd->part_enforce |= HFI1_PART_ENFORCE_OUT;
if (loopback) {
hfi1_early_err(&pdev->dev,
* [PATCH for-next 02/27] IB/hfi1: Remove pmtu from the QP structure
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn,
Sebastian Sanchez
From: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The pmtu field doesn't have to be stored in the QP structure, as it can
easily be calculated when needed.
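For illustration, a minimal sketch of the byte-MTU to enum ib_mtu conversion
that the driver's mtu_to_path_mtu() callback performs on demand (the helper
name and exact mapping below are assumptions, not taken from this patch):

  static enum ib_mtu sketch_mtu_to_path_mtu(u32 pmtu_bytes)
  {
          /* map the cached byte MTU back to the IB enum at query time */
          switch (pmtu_bytes) {
          case 256:  return IB_MTU_256;
          case 512:  return IB_MTU_512;
          case 1024: return IB_MTU_1024;
          case 2048: return IB_MTU_2048;
          default:   return IB_MTU_4096;
          }
  }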
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/sw/rdmavt/qp.c | 3 +--
include/rdma/rdmavt_qp.h | 1 -
2 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index 1878a97..eb0c3d6 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -1243,7 +1243,6 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
if (attr_mask & IB_QP_PATH_MTU) {
qp->pmtu = rdi->driver_f.mtu_from_qp(rdi, qp, pmtu);
- qp->path_mtu = rdi->driver_f.mtu_to_path_mtu(qp->pmtu);
qp->log_pmtu = ilog2(qp->pmtu);
}
@@ -1366,7 +1365,7 @@ int rvt_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
attr->qp_state = qp->state;
attr->cur_qp_state = attr->qp_state;
- attr->path_mtu = qp->path_mtu;
+ attr->path_mtu = rdi->driver_f.mtu_to_path_mtu(qp->pmtu);
attr->path_mig_state = qp->s_mig_state;
attr->qkey = qp->qkey;
attr->rq_psn = qp->r_psn & rdi->dparms.psn_mask;
diff --git a/include/rdma/rdmavt_qp.h b/include/rdma/rdmavt_qp.h
index 07e2fff..8fbafb0 100644
--- a/include/rdma/rdmavt_qp.h
+++ b/include/rdma/rdmavt_qp.h
@@ -277,7 +277,6 @@ struct rvt_qp {
unsigned long timeout_jiffies; /* computed from timeout */
- enum ib_mtu path_mtu;
int srate_mbps; /* s_srate (below) converted to Mbit/s */
pid_t pid; /* pid for user mode QPs */
u32 remote_qpn;
--
* [PATCH for-next 03/27] IB/hfi1: Remove lstate from hfi1_pportdata
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny, Jakub Byczkowski
From: Byczkowski, Jakub <jakub.byczkowski-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Do not track the logical state separately from host_link_state. Deduce
the logical state from host_link_state when required. Transitions in
set_link_state and goto_offline already make sure host_link_state
properly reflects the hardware's logical state.
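A simplified sketch of that deduction, approximating the driver_lstate()
reworked by the diff below (the HLS_UP_* case mapping is assumed):

  u32 driver_lstate_sketch(struct hfi1_pportdata *ppd)
  {
          if (ppd->host_link_state & HLS_DOWN)
                  return IB_PORT_DOWN;

          /* deduce the IB logical state from the host link state */
          switch (ppd->host_link_state & HLS_UP) {
          case HLS_UP_INIT:
                  return IB_PORT_INIT;
          case HLS_UP_ARMED:
                  return IB_PORT_ARMED;
          case HLS_UP_ACTIVE:
                  return IB_PORT_ACTIVE;
          default:
                  return -1;      /* error; converted to a u32 by callers */
          }
  }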
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/chip.c | 52 +++++++++++++++++++------------------
drivers/infiniband/hw/hfi1/chip.h | 3 +-
drivers/infiniband/hw/hfi1/hfi.h | 17 ------------
drivers/infiniband/hw/hfi1/mad.c | 2 +
4 files changed, 28 insertions(+), 46 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 101fbbb..8c47653 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -1065,6 +1065,7 @@ static int do_8051_command(struct hfi1_devdata *dd, u32 type, u64 in_data,
static int read_idle_sma(struct hfi1_devdata *dd, u64 *data);
static int thermal_init(struct hfi1_devdata *dd);
+static void update_statusp(struct hfi1_pportdata *ppd, u32 state);
static int wait_logical_linkstate(struct hfi1_pportdata *ppd, u32 state,
int msecs);
static int wait_physical_linkstate(struct hfi1_pportdata *ppd, u32 state,
@@ -10261,8 +10262,10 @@ static void force_logical_link_state_down(struct hfi1_pportdata *ppd)
write_csr(dd, DC_LCB_CFG_ALLOW_LINK_UP, 0);
write_csr(dd, DC_LCB_CFG_IGNORE_LOST_RCLK, 0);
- /* call again to adjust ppd->statusp, if needed */
- get_logical_state(ppd);
+ /* adjust ppd->statusp, if needed */
+ update_statusp(ppd, IB_PORT_DOWN);
+
+ dd_dev_info(ppd->dd, "logical state forced to LINK_DOWN\n");
}
/*
@@ -10458,11 +10461,11 @@ u32 driver_physical_state(struct hfi1_pportdata *ppd)
}
/*
- * driver_logical_state - convert the driver's notion of a port's
+ * driver_lstate - convert the driver's notion of a port's
* state (an HLS_*) into a logical state (a IB_PORT_*). Return -1
* (converted to a u32) to indicate error.
*/
-u32 driver_logical_state(struct hfi1_pportdata *ppd)
+u32 driver_lstate(struct hfi1_pportdata *ppd)
{
if (ppd->host_link_state && (ppd->host_link_state & HLS_DOWN))
return IB_PORT_DOWN;
@@ -12651,20 +12654,8 @@ u32 chip_to_opa_pstate(struct hfi1_devdata *dd, u32 chip_pstate)
return "unknown";
}
-/*
- * Read the hardware link state and set the driver's cached value of it.
- * Return the (new) current value.
- */
-u32 get_logical_state(struct hfi1_pportdata *ppd)
+static void update_statusp(struct hfi1_pportdata *ppd, u32 state)
{
- u32 new_state;
-
- new_state = chip_to_opa_lstate(ppd->dd, read_logical_state(ppd->dd));
- if (new_state != ppd->lstate) {
- dd_dev_info(ppd->dd, "logical state changed to %s (0x%x)\n",
- opa_lstate_name(new_state), new_state);
- ppd->lstate = new_state;
- }
/*
* Set port status flags in the page mapped into userspace
* memory. Do it here to ensure a reliable state - this is
@@ -12674,7 +12665,7 @@ u32 get_logical_state(struct hfi1_pportdata *ppd)
* function.
*/
if (ppd->statusp) {
- switch (ppd->lstate) {
+ switch (state) {
case IB_PORT_DOWN:
case IB_PORT_INIT:
*ppd->statusp &= ~(HFI1_STATUS_IB_CONF |
@@ -12688,10 +12679,9 @@ u32 get_logical_state(struct hfi1_pportdata *ppd)
break;
}
}
- return ppd->lstate;
}
-/**
+/*
* wait_logical_linkstate - wait for an IB link state change to occur
* @ppd: port device
* @state: the state to wait for
@@ -12705,18 +12695,29 @@ static int wait_logical_linkstate(struct hfi1_pportdata *ppd, u32 state,
int msecs)
{
unsigned long timeout;
+ u32 new_state;
timeout = jiffies + msecs_to_jiffies(msecs);
while (1) {
- if (get_logical_state(ppd) == state)
- return 0;
- if (time_after(jiffies, timeout))
+ new_state = chip_to_opa_lstate(ppd->dd,
+ read_logical_state(ppd->dd));
+ if (new_state == state)
break;
+ if (time_after(jiffies, timeout)) {
+ dd_dev_err(ppd->dd,
+ "timeout waiting for link state 0x%x\n",
+ state);
+ return -ETIMEDOUT;
+ }
msleep(20);
}
- dd_dev_err(ppd->dd, "timeout waiting for link state 0x%x\n", state);
- return -ETIMEDOUT;
+ update_statusp(ppd, state);
+ dd_dev_info(ppd->dd,
+ "logical state changed to %s (0x%x)\n",
+ opa_lstate_name(state),
+ state);
+ return 0;
}
/*
@@ -14855,7 +14856,6 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
* Set the initial values to reasonable default, will be set
* for real when link is up.
*/
- ppd->lstate = IB_PORT_DOWN;
ppd->overrun_threshold = 0x4;
ppd->phy_error_threshold = 0xf;
ppd->port_crc_mode_enabled = link_crc_mask;
diff --git a/drivers/infiniband/hw/hfi1/chip.h b/drivers/infiniband/hw/hfi1/chip.h
index 6a0c691..f708302 100644
--- a/drivers/infiniband/hw/hfi1/chip.h
+++ b/drivers/infiniband/hw/hfi1/chip.h
@@ -748,12 +748,11 @@ void update_usrhead(struct hfi1_ctxtdata *rcd, u32 hd, u32 updegr, u32 egrhd,
int is_bx(struct hfi1_devdata *dd);
u32 read_physical_state(struct hfi1_devdata *dd);
u32 chip_to_opa_pstate(struct hfi1_devdata *dd, u32 chip_pstate);
-u32 get_logical_state(struct hfi1_pportdata *ppd);
void cache_physical_state(struct hfi1_pportdata *ppd);
const char *opa_lstate_name(u32 lstate);
const char *opa_pstate_name(u32 pstate);
u32 driver_physical_state(struct hfi1_pportdata *ppd);
-u32 driver_logical_state(struct hfi1_pportdata *ppd);
+u32 driver_lstate(struct hfi1_pportdata *ppd);
int acquire_lcb_access(struct hfi1_devdata *dd, int sleep_ok);
int release_lcb_access(struct hfi1_devdata *dd, int sleep_ok);
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index fb5f839..e66e8f9 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -591,8 +591,6 @@ struct hfi1_pportdata {
struct mutex hls_lock;
u32 host_link_state;
- u32 lstate; /* logical link state */
-
/* these are the "32 bit" regs */
u32 ibmtu; /* The MTU programmed for this unit */
@@ -1296,21 +1294,6 @@ void hfi1_init_pportdata(struct pci_dev *pdev, struct hfi1_pportdata *ppd,
int hfi1_reset_device(int);
-/* return the driver's idea of the logical OPA port state */
-static inline u32 driver_lstate(struct hfi1_pportdata *ppd)
-{
- /*
- * The driver does some processing from the time the logical
- * link state is at INIT to the time the SM can be notified
- * as such. Return IB_PORT_DOWN until the software state
- * is ready.
- */
- if (ppd->lstate == IB_PORT_INIT && !(ppd->host_link_state & HLS_UP))
- return IB_PORT_DOWN;
- else
- return ppd->lstate;
-}
-
/* return the driver's idea of the physical OPA port state */
static inline u32 driver_pstate(struct hfi1_pportdata *ppd)
{
diff --git a/drivers/infiniband/hw/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
index 0a3e2df..885d9fd 100644
--- a/drivers/infiniband/hw/hfi1/mad.c
+++ b/drivers/infiniband/hw/hfi1/mad.c
@@ -1117,7 +1117,7 @@ static int port_states_transition_allowed(struct hfi1_pportdata *ppd,
u32 logical_new, u32 physical_new)
{
u32 physical_old = driver_physical_state(ppd);
- u32 logical_old = driver_logical_state(ppd);
+ u32 logical_old = driver_lstate(ppd);
int ret, logical_allowed, physical_allowed;
ret = logical_transition_allowed(logical_old, logical_new);
--
* [PATCH for-next 04/27] IB/hfi1: Use host_link_state to read state when DC is shut down
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bartlomiej Dudek,
Jakub Byczkowski
From: Bartlomiej Dudek <bartlomiej.dudek-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
When the DC is shut down (e.g. by disconnecting the cable), the
driver should use host_link_state to get the port's current
physical state. The physical state is normally read from the DC's
CSRs, and when the DC is shut down those registers are not updated
as the state changes.
Reviewed-by: Jakub Byczkowski <jakub.byczkowski-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Bartlomiej Dudek <bartlomiej.dudek-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/hfi.h | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index e66e8f9..728ed45 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1298,6 +1298,13 @@ void hfi1_init_pportdata(struct pci_dev *pdev, struct hfi1_pportdata *ppd,
static inline u32 driver_pstate(struct hfi1_pportdata *ppd)
{
/*
+ * When DC is shut down and state is changed, its CSRs are not
+ * impacted, therefore host_link_state should be used to get
+ * current physical state.
+ */
+ if (ppd->dd->dc_shutdown)
+ return driver_physical_state(ppd);
+ /*
* The driver does some processing from the time the physical
* link state is at LINKUP to the time the SM can be notified
* as such. Return IB_PORTPHYSSTATE_TRAINING until the software
--
* [PATCH for-next 05/27] IB/hfi1: Protect context array set/clear with spinlock
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl,
Mike Marciniszyn, Sebastian Sanchez
From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The rcd array can be accessed from user context or from interrupt
context. Protecting it with a mutex isn't viable because a mutex
cannot be taken from an IRQ.
Protect the allocation and freeing of rcd array elements with a
spinlock instead.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/chip.c | 10 +-
drivers/infiniband/hw/hfi1/file_ops.c | 171 ++++++++++++---------------
drivers/infiniband/hw/hfi1/hfi.h | 8 +
drivers/infiniband/hw/hfi1/init.c | 199 ++++++++++++++++++++++----------
drivers/infiniband/hw/hfi1/vnic_main.c | 22 +---
5 files changed, 229 insertions(+), 181 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 8c47653..249b56a 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -15054,10 +15054,16 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
if (ret)
goto bail_cleanup;
- ret = hfi1_create_ctxts(dd);
+ ret = hfi1_create_kctxts(dd);
if (ret)
goto bail_cleanup;
+ /*
+ * Initialize aspm, to be done after gen3 transition and setting up
+ * contexts and before enabling interrupts
+ */
+ aspm_init(dd);
+
dd->rcvhdrsize = DEFAULT_RCVHDRSIZE;
/*
* rcd[0] is guaranteed to be valid by this point. Also, all
@@ -15076,7 +15082,7 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
goto bail_cleanup;
}
- /* use contexts created by hfi1_create_ctxts */
+ /* use contexts created by hfi1_create_kctxts */
ret = set_up_interrupts(dd);
if (ret)
goto bail_cleanup;
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index a0c13fa..7361366 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -79,8 +79,8 @@
static u64 kvirt_to_phys(void *addr);
static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo);
-static int init_subctxts(struct hfi1_ctxtdata *uctxt,
- const struct hfi1_user_info *uinfo);
+static void init_subctxts(struct hfi1_ctxtdata *uctxt,
+ const struct hfi1_user_info *uinfo);
static int init_user_ctxt(struct hfi1_filedata *fd,
struct hfi1_ctxtdata *uctxt);
static void user_init(struct hfi1_ctxtdata *uctxt);
@@ -758,7 +758,6 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
goto done;
hfi1_cdbg(PROC, "freeing ctxt %u:%u", uctxt->ctxt, fdata->subctxt);
- mutex_lock(&hfi1_mutex);
flush_wc();
/* drain user sdma queue */
@@ -778,6 +777,7 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
HFI1_MAX_SHARED_CTXTS) + fdata->subctxt;
*ev = 0;
+ mutex_lock(&hfi1_mutex);
__clear_bit(fdata->subctxt, uctxt->in_use_ctxts);
fdata->uctxt = NULL;
hfi1_rcd_put(uctxt); /* fdata reference */
@@ -844,6 +844,38 @@ static u64 kvirt_to_phys(void *addr)
return paddr;
}
+static int complete_subctxt(struct hfi1_filedata *fd)
+{
+ int ret;
+
+ /*
+ * sub-context info can only be set up after the base context
+ * has been completed.
+ */
+ ret = wait_event_interruptible(
+ fd->uctxt->wait,
+ !test_bit(HFI1_CTXT_BASE_UNINIT, &fd->uctxt->event_flags));
+
+ if (test_bit(HFI1_CTXT_BASE_FAILED, &fd->uctxt->event_flags))
+ ret = -ENOMEM;
+
+ /* The only thing a sub context needs is the user_xxx stuff */
+ if (!ret) {
+ fd->rec_cpu_num = hfi1_get_proc_affinity(fd->uctxt->numa_id);
+ ret = init_user_ctxt(fd, fd->uctxt);
+ }
+
+ if (ret) {
+ hfi1_rcd_put(fd->uctxt);
+ fd->uctxt = NULL;
+ mutex_lock(&hfi1_mutex);
+ __clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
+ mutex_unlock(&hfi1_mutex);
+ }
+
+ return ret;
+}
+
static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
{
int ret;
@@ -854,24 +886,25 @@ static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
if (swmajor != HFI1_USER_SWMAJOR)
return -ENODEV;
+ if (uinfo->subctxt_cnt > HFI1_MAX_SHARED_CTXTS)
+ return -EINVAL;
+
swminor = uinfo->userversion & 0xffff;
+ /*
+ * Acquire the mutex to protect against multiple creations of what
+ * could be a shared base context.
+ */
mutex_lock(&hfi1_mutex);
/*
- * Get a sub context if necessary.
+ * Get a sub context if available (fd->uctxt will be set).
* ret < 0 error, 0 no context, 1 sub-context found
*/
- ret = 0;
- if (uinfo->subctxt_cnt) {
- ret = find_sub_ctxt(fd, uinfo);
- if (ret > 0)
- fd->rec_cpu_num =
- hfi1_get_proc_affinity(fd->uctxt->numa_id);
- }
+ ret = find_sub_ctxt(fd, uinfo);
/*
- * Allocate a base context if context sharing is not required or we
- * couldn't find a sub context.
+ * Allocate a base context if context sharing is not required or a
+ * sub context wasn't found.
*/
if (!ret)
ret = allocate_ctxt(fd, fd->dd, uinfo, &uctxt);
@@ -879,31 +912,10 @@ static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
mutex_unlock(&hfi1_mutex);
/* Depending on the context type, do the appropriate init */
- if (ret > 0) {
- /*
- * sub-context info can only be set up after the base
- * context has been completed.
- */
- ret = wait_event_interruptible(fd->uctxt->wait, !test_bit(
- HFI1_CTXT_BASE_UNINIT,
- &fd->uctxt->event_flags));
- if (test_bit(HFI1_CTXT_BASE_FAILED, &fd->uctxt->event_flags))
- ret = -ENOMEM;
-
- /* The only thing a sub context needs is the user_xxx stuff */
- if (!ret)
- ret = init_user_ctxt(fd, fd->uctxt);
-
- if (ret)
- clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
-
- } else if (!ret) {
+ switch (ret) {
+ case 0:
ret = setup_base_ctxt(fd, uctxt);
if (uctxt->subctxt_cnt) {
- /* If there is an error, set the failed bit. */
- if (ret)
- set_bit(HFI1_CTXT_BASE_FAILED,
- &uctxt->event_flags);
/*
* Base context is done, notify anybody using a
* sub-context that is waiting for this completion
@@ -911,14 +923,12 @@ static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
clear_bit(HFI1_CTXT_BASE_UNINIT, &uctxt->event_flags);
wake_up(&uctxt->wait);
}
- if (ret)
- deallocate_ctxt(uctxt);
- }
-
- /* If an error occurred, clear the reference */
- if (ret && fd->uctxt) {
- hfi1_rcd_put(fd->uctxt);
- fd->uctxt = NULL;
+ break;
+ case 1:
+ ret = complete_subctxt(fd);
+ break;
+ default:
+ break;
}
return ret;
@@ -926,7 +936,7 @@ static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
/*
* The hfi1_mutex must be held when this function is called. It is
- * necessary to ensure serialized access to the bitmask in_use_ctxts.
+ * necessary to ensure serialized creation of shared contexts.
*/
static int find_sub_ctxt(struct hfi1_filedata *fd,
const struct hfi1_user_info *uinfo)
@@ -935,6 +945,9 @@ static int find_sub_ctxt(struct hfi1_filedata *fd,
struct hfi1_devdata *dd = fd->dd;
u16 subctxt;
+ if (!uinfo->subctxt_cnt)
+ return 0;
+
for (i = dd->first_dyn_alloc_ctxt; i < dd->num_rcv_contexts; i++) {
struct hfi1_ctxtdata *uctxt = dd->rcd[i];
@@ -983,7 +996,6 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
struct hfi1_ctxtdata **cd)
{
struct hfi1_ctxtdata *uctxt;
- u16 ctxt;
int ret, numa;
if (dd->flags & HFI1_FROZEN) {
@@ -997,22 +1009,9 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
return -EIO;
}
- /*
- * This check is sort of redundant to the next EBUSY error. It would
- * also indicate an inconsistancy in the driver if this value was
- * zero, but there were still contexts available.
- */
if (!dd->freectxts)
return -EBUSY;
- for (ctxt = dd->first_dyn_alloc_ctxt;
- ctxt < dd->num_rcv_contexts; ctxt++)
- if (!dd->rcd[ctxt])
- break;
-
- if (ctxt == dd->num_rcv_contexts)
- return -EBUSY;
-
/*
* If we don't have a NUMA node requested, preference is towards
* device NUMA node.
@@ -1022,11 +1021,10 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
numa = cpu_to_node(fd->rec_cpu_num);
else
numa = numa_node_id();
- uctxt = hfi1_create_ctxtdata(dd->pport, ctxt, numa);
- if (!uctxt) {
- dd_dev_err(dd,
- "Unable to allocate ctxtdata memory, failing open\n");
- return -ENOMEM;
+ ret = hfi1_create_ctxtdata(dd->pport, numa, &uctxt);
+ if (ret < 0) {
+ dd_dev_err(dd, "user ctxtdata allocation failed\n");
+ return ret;
}
hfi1_cdbg(PROC, "[%u:%u] pid %u assigned to CPU %d (NUMA %u)",
uctxt->ctxt, fd->subctxt, current->pid, fd->rec_cpu_num,
@@ -1035,8 +1033,7 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
/*
* Allocate and enable a PIO send context.
*/
- uctxt->sc = sc_alloc(dd, SC_USER, uctxt->rcvhdrqentsize,
- uctxt->dd->node);
+ uctxt->sc = sc_alloc(dd, SC_USER, uctxt->rcvhdrqentsize, dd->node);
if (!uctxt->sc) {
ret = -ENOMEM;
goto ctxdata_free;
@@ -1048,20 +1045,13 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
goto ctxdata_free;
/*
- * Setup sub context resources if the user-level has requested
+ * Setup sub context information if the user-level has requested
* sub contexts.
* This has to be done here so the rest of the sub-contexts find the
- * proper master.
+ * proper base context.
*/
- if (uinfo->subctxt_cnt) {
- ret = init_subctxts(uctxt, uinfo);
- /*
- * On error, we don't need to disable and de-allocate the
- * send context because it will be done during file close
- */
- if (ret)
- goto ctxdata_free;
- }
+ if (uinfo->subctxt_cnt)
+ init_subctxts(uctxt, uinfo);
uctxt->userversion = uinfo->userversion;
uctxt->flags = hfi1_cap_mask; /* save current flag state */
init_waitqueue_head(&uctxt->wait);
@@ -1081,9 +1071,7 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
return 0;
ctxdata_free:
- *cd = NULL;
- dd->rcd[ctxt] = NULL;
- hfi1_rcd_put(uctxt);
+ hfi1_free_ctxt(dd, uctxt);
return ret;
}
@@ -1093,28 +1081,17 @@ static void deallocate_ctxt(struct hfi1_ctxtdata *uctxt)
hfi1_stats.sps_ctxts--;
if (++uctxt->dd->freectxts == uctxt->dd->num_user_contexts)
aspm_enable_all(uctxt->dd);
-
- /* _rcd_put() should be done after releasing mutex */
- uctxt->dd->rcd[uctxt->ctxt] = NULL;
mutex_unlock(&hfi1_mutex);
- hfi1_rcd_put(uctxt); /* dd reference */
+
+ hfi1_free_ctxt(uctxt->dd, uctxt);
}
-static int init_subctxts(struct hfi1_ctxtdata *uctxt,
- const struct hfi1_user_info *uinfo)
+static void init_subctxts(struct hfi1_ctxtdata *uctxt,
+ const struct hfi1_user_info *uinfo)
{
- u16 num_subctxts;
-
- num_subctxts = uinfo->subctxt_cnt;
- if (num_subctxts > HFI1_MAX_SHARED_CTXTS)
- return -EINVAL;
-
uctxt->subctxt_cnt = uinfo->subctxt_cnt;
uctxt->subctxt_id = uinfo->subctxt_id;
- uctxt->redirect_seq_cnt = 1;
set_bit(HFI1_CTXT_BASE_UNINIT, &uctxt->event_flags);
-
- return 0;
}
static int setup_subctxt(struct hfi1_ctxtdata *uctxt)
@@ -1302,8 +1279,8 @@ static int setup_base_ctxt(struct hfi1_filedata *fd,
return 0;
setup_failed:
- /* Call _free_ctxtdata, not _rcd_put(). We still need the context. */
- hfi1_free_ctxtdata(dd, uctxt);
+ set_bit(HFI1_CTXT_BASE_FAILED, &uctxt->event_flags);
+ deallocate_ctxt(uctxt);
return ret;
}
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 728ed45..bb003ff 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -273,7 +273,6 @@ struct hfi1_ctxtdata {
u16 poll_type;
/* receive packet sequence counter */
u8 seq_cnt;
- u8 redirect_seq_cnt;
/* ctxt rcvhdrq head offset */
u32 head;
/* QPs waiting for context processing */
@@ -1263,9 +1262,10 @@ struct hfi1_filedata {
int hfi1_create_rcvhdrq(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd);
int hfi1_setup_eagerbufs(struct hfi1_ctxtdata *rcd);
-int hfi1_create_ctxts(struct hfi1_devdata *dd);
-struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u16 ctxt,
- int numa);
+int hfi1_create_kctxts(struct hfi1_devdata *dd);
+int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
+ struct hfi1_ctxtdata **rcd);
+void hfi1_free_ctxt(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd);
void hfi1_init_pportdata(struct pci_dev *pdev, struct hfi1_pportdata *ppd,
struct hfi1_devdata *dd, u8 hw_pidx, u8 port);
void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd);
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index da7cd5b..23f0bbc 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -126,71 +126,67 @@
u32 hfi1_cpulist_count;
unsigned long *hfi1_cpulist;
-/*
- * Common code for creating the receive context array.
- */
-int hfi1_create_ctxts(struct hfi1_devdata *dd)
+static int hfi1_create_kctxt(struct hfi1_devdata *dd,
+ struct hfi1_pportdata *ppd)
{
- u16 i;
+ struct hfi1_ctxtdata *rcd;
int ret;
/* Control context has to be always 0 */
BUILD_BUG_ON(HFI1_CTRL_CTXT != 0);
+ ret = hfi1_create_ctxtdata(ppd, dd->node, &rcd);
+ if (ret < 0) {
+ dd_dev_err(dd, "Kernel receive context allocation failed\n");
+ return ret;
+ }
+
+ /*
+ * Set up the kernel context flags here and now because they use
+ * default values for all receive side memories. User contexts will
+ * be handled as they are created.
+ */
+ rcd->flags = HFI1_CAP_KGET(MULTI_PKT_EGR) |
+ HFI1_CAP_KGET(NODROP_RHQ_FULL) |
+ HFI1_CAP_KGET(NODROP_EGR_FULL) |
+ HFI1_CAP_KGET(DMA_RTAIL);
+
+ /* Control context must use DMA_RTAIL */
+ if (rcd->ctxt == HFI1_CTRL_CTXT)
+ rcd->flags |= HFI1_CAP_DMA_RTAIL;
+ rcd->seq_cnt = 1;
+
+ rcd->sc = sc_alloc(dd, SC_ACK, rcd->rcvhdrqentsize, dd->node);
+ if (!rcd->sc) {
+ dd_dev_err(dd, "Kernel send context allocation failed\n");
+ return -ENOMEM;
+ }
+ hfi1_init_ctxt(rcd->sc);
+
+ return 0;
+}
+
+/*
+ * Create the receive context array and one or more kernel contexts
+ */
+int hfi1_create_kctxts(struct hfi1_devdata *dd)
+{
+ u16 i;
+ int ret;
+
dd->rcd = kzalloc_node(dd->num_rcv_contexts * sizeof(*dd->rcd),
GFP_KERNEL, dd->node);
if (!dd->rcd)
- goto nomem;
+ return -ENOMEM;
- /* create one or more kernel contexts */
for (i = 0; i < dd->first_dyn_alloc_ctxt; ++i) {
- struct hfi1_pportdata *ppd;
- struct hfi1_ctxtdata *rcd;
-
- ppd = dd->pport + (i % dd->num_pports);
-
- /* dd->rcd[i] gets assigned inside the callee */
- rcd = hfi1_create_ctxtdata(ppd, i, dd->node);
- if (!rcd) {
- dd_dev_err(dd,
- "Unable to allocate kernel receive context, failing\n");
- goto nomem;
- }
- /*
- * Set up the kernel context flags here and now because they
- * use default values for all receive side memories. User
- * contexts will be handled as they are created.
- */
- rcd->flags = HFI1_CAP_KGET(MULTI_PKT_EGR) |
- HFI1_CAP_KGET(NODROP_RHQ_FULL) |
- HFI1_CAP_KGET(NODROP_EGR_FULL) |
- HFI1_CAP_KGET(DMA_RTAIL);
-
- /* Control context must use DMA_RTAIL */
- if (rcd->ctxt == HFI1_CTRL_CTXT)
- rcd->flags |= HFI1_CAP_DMA_RTAIL;
- rcd->seq_cnt = 1;
-
- rcd->sc = sc_alloc(dd, SC_ACK, rcd->rcvhdrqentsize, dd->node);
- if (!rcd->sc) {
- dd_dev_err(dd,
- "Unable to allocate kernel send context, failing\n");
- goto nomem;
- }
-
- hfi1_init_ctxt(rcd->sc);
+ ret = hfi1_create_kctxt(dd, dd->pport);
+ if (ret)
+ goto bail;
}
- /*
- * Initialize aspm, to be done after gen3 transition and setting up
- * contexts and before enabling interrupts
- */
- aspm_init(dd);
-
return 0;
-nomem:
- ret = -ENOMEM;
-
+bail:
for (i = 0; dd->rcd && i < dd->first_dyn_alloc_ctxt; ++i)
hfi1_rcd_put(dd->rcd[i]);
@@ -208,6 +204,11 @@ static void hfi1_rcd_init(struct hfi1_ctxtdata *rcd)
kref_init(&rcd->kref);
}
+/**
+ * hfi1_rcd_free - When reference is zero clean up.
+ * @kref: pointer to an initialized rcd data structure
+ *
+ */
static void hfi1_rcd_free(struct kref *kref)
{
struct hfi1_ctxtdata *rcd =
@@ -217,6 +218,12 @@ static void hfi1_rcd_free(struct kref *kref)
kfree(rcd);
}
+/**
+ * hfi1_rcd_put - decrement reference for rcd
+ * @rcd: pointer to an initialized rcd data structure
+ *
+ * Use this to put a reference after the init.
+ */
int hfi1_rcd_put(struct hfi1_ctxtdata *rcd)
{
if (rcd)
@@ -225,16 +232,58 @@ int hfi1_rcd_put(struct hfi1_ctxtdata *rcd)
return 0;
}
+/**
+ * hfi1_rcd_get - increment reference for rcd
+ * @rcd: pointer to an initialized rcd data structure
+ *
+ * Use this to get a reference after the init.
+ */
void hfi1_rcd_get(struct hfi1_ctxtdata *rcd)
{
kref_get(&rcd->kref);
}
+/**
+ * allocate_rcd_index - allocate an rcd index from the rcd array
+ * @dd: pointer to a valid devdata structure
+ * @rcd: rcd data structure to assign
+ * @index: pointer to index that is allocated
+ *
+ * Find an empty index in the rcd array, and assign the given rcd to it.
+ * If the array is full, we are EBUSY.
+ *
+ */
+static u16 allocate_rcd_index(struct hfi1_devdata *dd,
+ struct hfi1_ctxtdata *rcd, u16 *index)
+{
+ unsigned long flags;
+ u16 ctxt;
+
+ spin_lock_irqsave(&dd->uctxt_lock, flags);
+ for (ctxt = 0; ctxt < dd->num_rcv_contexts; ctxt++)
+ if (!dd->rcd[ctxt])
+ break;
+
+ if (ctxt < dd->num_rcv_contexts) {
+ rcd->ctxt = ctxt;
+ dd->rcd[ctxt] = rcd;
+ hfi1_rcd_init(rcd);
+ }
+ spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+
+ if (ctxt >= dd->num_rcv_contexts)
+ return -EBUSY;
+
+ *index = ctxt;
+
+ return 0;
+}
+
/*
* Common code for user and kernel context setup.
*/
-struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u16 ctxt,
- int numa)
+int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
+ struct hfi1_ctxtdata **context)
{
struct hfi1_devdata *dd = ppd->dd;
struct hfi1_ctxtdata *rcd;
@@ -248,9 +297,18 @@ struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u16 ctxt,
rcd = kzalloc_node(sizeof(*rcd), GFP_KERNEL, numa);
if (rcd) {
u32 rcvtids, max_entries;
+ u16 ctxt;
+ int ret;
hfi1_cdbg(PROC, "setting up context %u\n", ctxt);
+ ret = allocate_rcd_index(dd, rcd, &ctxt);
+ if (ret) {
+ *context = NULL;
+ kfree(rcd);
+ return ret;
+ }
+
INIT_LIST_HEAD(&rcd->qp_wait_list);
hfi1_exp_tid_group_init(&rcd->tid_group_list);
hfi1_exp_tid_group_init(&rcd->tid_used_list);
@@ -258,8 +316,6 @@ struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u16 ctxt,
rcd->ppd = ppd;
rcd->dd = dd;
__set_bit(0, rcd->in_use_ctxts);
- rcd->ctxt = ctxt;
- dd->rcd[ctxt] = rcd;
rcd->numa_id = numa;
rcd->rcv_array_groups = dd->rcv_entries.ngroups;
@@ -363,15 +419,34 @@ struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u16 ctxt,
goto bail;
}
- hfi1_rcd_init(rcd);
+ *context = rcd;
+ return 0;
}
- return rcd;
+
bail:
- dd->rcd[ctxt] = NULL;
- kfree(rcd->egrbufs.rcvtids);
- kfree(rcd->egrbufs.buffers);
- kfree(rcd);
- return NULL;
+ *context = NULL;
+ hfi1_free_ctxt(dd, rcd);
+ return -ENOMEM;
+}
+
+/**
+ * hfi1_free_ctxt
+ * @dd: Pointer to a valid device
+ * @rcd: pointer to an initialized rcd data structure
+ *
+ * This is the "free" to match the _create_ctxtdata (alloc) function.
+ * This is the final "put" for the kref.
+ */
+void hfi1_free_ctxt(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
+{
+ unsigned long flags;
+
+ if (rcd) {
+ spin_lock_irqsave(&dd->uctxt_lock, flags);
+ dd->rcd[rcd->ctxt] = NULL;
+ spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+ hfi1_rcd_put(rcd);
+ }
}
/*
diff --git a/drivers/infiniband/hw/hfi1/vnic_main.c b/drivers/infiniband/hw/hfi1/vnic_main.c
index 89b0564..c91456c 100644
--- a/drivers/infiniband/hw/hfi1/vnic_main.c
+++ b/drivers/infiniband/hw/hfi1/vnic_main.c
@@ -106,22 +106,13 @@ static int allocate_vnic_ctxt(struct hfi1_devdata *dd,
struct hfi1_ctxtdata **vnic_ctxt)
{
struct hfi1_ctxtdata *uctxt;
- u16 ctxt;
int ret;
if (dd->flags & HFI1_FROZEN)
return -EIO;
- for (ctxt = dd->first_dyn_alloc_ctxt;
- ctxt < dd->num_rcv_contexts; ctxt++)
- if (!dd->rcd[ctxt])
- break;
-
- if (ctxt == dd->num_rcv_contexts)
- return -EBUSY;
-
- uctxt = hfi1_create_ctxtdata(dd->pport, ctxt, dd->node);
- if (!uctxt) {
+ ret = hfi1_create_ctxtdata(dd->pport, dd->node, &uctxt);
+ if (ret < 0) {
dd_dev_err(dd, "Unable to create ctxtdata, failing open\n");
return -ENOMEM;
}
@@ -156,11 +147,10 @@ static int allocate_vnic_ctxt(struct hfi1_devdata *dd,
return ret;
bail:
/*
- * hfi1_rcd_put() will call hfi1_free_ctxtdata(), which will
+ * hfi1_free_ctxt() will call hfi1_free_ctxtdata(), which will
* release send_context structure if uctxt->sc is not null
*/
- dd->rcd[uctxt->ctxt] = NULL;
- hfi1_rcd_put(uctxt);
+ hfi1_free_ctxt(dd, uctxt);
dd_dev_dbg(dd, "vnic allocation failed. rc %d\n", ret);
return ret;
}
@@ -201,14 +191,14 @@ static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
dd->send_contexts[uctxt->sc->sw_index].type = SC_USER;
spin_unlock_irqrestore(&dd->uctxt_lock, flags);
- dd->rcd[uctxt->ctxt] = NULL;
uctxt->event_flags = 0;
hfi1_clear_tids(uctxt);
hfi1_clear_ctxt_pkey(dd, uctxt);
hfi1_stats.sps_ctxts--;
- hfi1_rcd_put(uctxt);
+
+ hfi1_free_ctxt(dd, uctxt);
}
void hfi1_vnic_setup(struct hfi1_devdata *dd)
--
* [PATCH for-next 06/27] IB/hfi1: User context locking is inconsistent
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl,
Mike Marciniszyn, Sebastian Sanchez
From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
A mixture of a mutex and spinlocks is used to protect receive context
(rcd/uctxt) information, and it is not used consistently.
Use the mutex to protect device receive context information only.
Use the spinlock to protect sub context information only.
Protect access to items in the rcd array with a spinlock and a
reference count.
Remove the spinlock around the dd->rcd array cleanup; since interrupts
are disabled and cleaned up before this point, that lock is not needed.
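A sketch of the reference-counted lookup this model relies on (assumed
implementation; the truncated diff below only shows its callers). Callers
receive the rcd with a reference held and drop it with hfi1_rcd_put():

  struct hfi1_ctxtdata *hfi1_rcd_get_by_index_sketch(struct hfi1_devdata *dd,
                                                     u16 ctxt)
  {
          struct hfi1_ctxtdata *rcd = NULL;
          unsigned long flags;

          spin_lock_irqsave(&dd->uctxt_lock, flags);
          if (dd->rcd[ctxt]) {
                  rcd = dd->rcd[ctxt];
                  hfi1_rcd_get(rcd);      /* hold a reference for the caller */
          }
          spin_unlock_irqrestore(&dd->uctxt_lock, flags);

          return rcd;
  }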
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/aspm.h | 35 ++++--
drivers/infiniband/hw/hfi1/chip.c | 30 ++++-
drivers/infiniband/hw/hfi1/debugfs.c | 32 ++++--
drivers/infiniband/hw/hfi1/driver.c | 71 ++++++++----
drivers/infiniband/hw/hfi1/file_ops.c | 185 +++++++++++++++++++++-----------
drivers/infiniband/hw/hfi1/hfi.h | 6 +
drivers/infiniband/hw/hfi1/init.c | 134 +++++++++++++++--------
drivers/infiniband/hw/hfi1/trace_rx.h | 12 +-
drivers/infiniband/hw/hfi1/vnic_main.c | 12 --
9 files changed, 326 insertions(+), 191 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/aspm.h b/drivers/infiniband/hw/hfi1/aspm.h
index 3f9a071..522b40e 100644
--- a/drivers/infiniband/hw/hfi1/aspm.h
+++ b/drivers/infiniband/hw/hfi1/aspm.h
@@ -240,11 +240,14 @@ static inline void aspm_disable_all(struct hfi1_devdata *dd)
u16 i;
for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
- rcd = dd->rcd[i];
- del_timer_sync(&rcd->aspm_timer);
- spin_lock_irqsave(&rcd->aspm_lock, flags);
- rcd->aspm_intr_enable = false;
- spin_unlock_irqrestore(&rcd->aspm_lock, flags);
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (rcd) {
+ del_timer_sync(&rcd->aspm_timer);
+ spin_lock_irqsave(&rcd->aspm_lock, flags);
+ rcd->aspm_intr_enable = false;
+ spin_unlock_irqrestore(&rcd->aspm_lock, flags);
+ hfi1_rcd_put(rcd);
+ }
}
aspm_disable(dd);
@@ -264,11 +267,14 @@ static inline void aspm_enable_all(struct hfi1_devdata *dd)
return;
for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
- rcd = dd->rcd[i];
- spin_lock_irqsave(&rcd->aspm_lock, flags);
- rcd->aspm_intr_enable = true;
- rcd->aspm_enabled = true;
- spin_unlock_irqrestore(&rcd->aspm_lock, flags);
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (rcd) {
+ spin_lock_irqsave(&rcd->aspm_lock, flags);
+ rcd->aspm_intr_enable = true;
+ rcd->aspm_enabled = true;
+ spin_unlock_irqrestore(&rcd->aspm_lock, flags);
+ hfi1_rcd_put(rcd);
+ }
}
}
@@ -284,13 +290,18 @@ static inline void aspm_ctx_init(struct hfi1_ctxtdata *rcd)
static inline void aspm_init(struct hfi1_devdata *dd)
{
+ struct hfi1_ctxtdata *rcd;
u16 i;
spin_lock_init(&dd->aspm_lock);
dd->aspm_supported = aspm_hw_l1_supported(dd);
- for (i = 0; i < dd->first_dyn_alloc_ctxt; i++)
- aspm_ctx_init(dd->rcd[i]);
+ for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (rcd)
+ aspm_ctx_init(rcd);
+ hfi1_rcd_put(rcd);
+ }
/* Start with ASPM disabled */
aspm_hw_set_l1_ent_latency(dd);
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 249b56a..305c568 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -6785,13 +6785,17 @@ static void wait_for_freeze_status(struct hfi1_devdata *dd, int freeze)
static void rxe_freeze(struct hfi1_devdata *dd)
{
int i;
+ struct hfi1_ctxtdata *rcd;
/* disable port */
clear_rcvctrl(dd, RCV_CTRL_RCV_PORT_ENABLE_SMASK);
/* disable all receive contexts */
- for (i = 0; i < dd->num_rcv_contexts; i++)
- hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_DIS, dd->rcd[i]);
+ for (i = 0; i < dd->num_rcv_contexts; i++) {
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_DIS, rcd);
+ hfi1_rcd_put(rcd);
+ }
}
/*
@@ -6804,20 +6808,23 @@ static void rxe_kernel_unfreeze(struct hfi1_devdata *dd)
{
u32 rcvmask;
u16 i;
+ struct hfi1_ctxtdata *rcd;
/* enable all kernel contexts */
for (i = 0; i < dd->num_rcv_contexts; i++) {
- struct hfi1_ctxtdata *rcd = dd->rcd[i];
+ rcd = hfi1_rcd_get_by_index(dd, i);
/* Ensure all non-user contexts(including vnic) are enabled */
- if (!rcd || !rcd->sc || (rcd->sc->type == SC_USER))
+ if (!rcd || !rcd->sc || (rcd->sc->type == SC_USER)) {
+ hfi1_rcd_put(rcd);
continue;
-
+ }
rcvmask = HFI1_RCVCTRL_CTXT_ENB;
/* HFI1_RCVCTRL_TAILUPD_[ENB|DIS] needs to be set explicitly */
rcvmask |= HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL) ?
HFI1_RCVCTRL_TAILUPD_ENB : HFI1_RCVCTRL_TAILUPD_DIS;
hfi1_rcvctrl(dd, rcvmask, rcd);
+ hfi1_rcd_put(rcd);
}
/* enable port */
@@ -8104,7 +8111,7 @@ static void is_rcv_avail_int(struct hfi1_devdata *dd, unsigned int source)
char *err_detail;
if (likely(source < dd->num_rcv_contexts)) {
- rcd = dd->rcd[source];
+ rcd = hfi1_rcd_get_by_index(dd, source);
if (rcd) {
/* Check for non-user contexts, including vnic */
if ((source < dd->first_dyn_alloc_ctxt) ||
@@ -8112,6 +8119,8 @@ static void is_rcv_avail_int(struct hfi1_devdata *dd, unsigned int source)
rcd->do_interrupt(rcd, 0);
else
handle_user_interrupt(rcd);
+
+ hfi1_rcd_put(rcd);
return; /* OK */
}
/* received an interrupt, but no rcd */
@@ -8133,12 +8142,14 @@ static void is_rcv_urgent_int(struct hfi1_devdata *dd, unsigned int source)
char *err_detail;
if (likely(source < dd->num_rcv_contexts)) {
- rcd = dd->rcd[source];
+ rcd = hfi1_rcd_get_by_index(dd, source);
if (rcd) {
/* only pay attention to user urgent interrupts */
if ((source >= dd->first_dyn_alloc_ctxt) &&
(!rcd->sc || (rcd->sc->type == SC_USER)))
handle_user_interrupt(rcd);
+
+ hfi1_rcd_put(rcd);
return; /* OK */
}
/* received an interrupt, but no rcd */
@@ -8343,7 +8354,7 @@ static irqreturn_t receive_context_interrupt(int irq, void *data)
int disposition;
int present;
- trace_hfi1_receive_interrupt(dd, rcd->ctxt);
+ trace_hfi1_receive_interrupt(dd, rcd);
this_cpu_inc(*dd->int_counter);
aspm_ctx_disable(rcd);
@@ -13030,7 +13041,7 @@ static int request_msix_irqs(struct hfi1_devdata *dd)
me->type = IRQ_SDMA;
} else if (first_rx <= i && i < last_rx) {
idx = i - first_rx;
- rcd = dd->rcd[idx];
+ rcd = hfi1_rcd_get_by_index(dd, idx);
if (rcd) {
/*
* Set the interrupt register and mask for this
@@ -13049,6 +13060,7 @@ static int request_msix_irqs(struct hfi1_devdata *dd)
remap_intr(dd, IS_RCVAVAIL_START + idx, i);
me->type = IRQ_RCVCTXT;
rcd->msix_intr = i;
+ hfi1_rcd_put(rcd);
}
} else {
/* not in our expected range - complain, then
diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index e9fa3c2..550119c 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -173,12 +173,15 @@ static int _opcode_stats_seq_show(struct seq_file *s, void *v)
u64 n_packets = 0, n_bytes = 0;
struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
struct hfi1_devdata *dd = dd_from_dev(ibd);
+ struct hfi1_ctxtdata *rcd;
for (j = 0; j < dd->first_dyn_alloc_ctxt; j++) {
- if (!dd->rcd[j])
- continue;
- n_packets += dd->rcd[j]->opstats->stats[i].n_packets;
- n_bytes += dd->rcd[j]->opstats->stats[i].n_bytes;
+ rcd = hfi1_rcd_get_by_index(dd, j);
+ if (rcd) {
+ n_packets += rcd->opstats->stats[i].n_packets;
+ n_bytes += rcd->opstats->stats[i].n_bytes;
+ }
+ hfi1_rcd_put(rcd);
}
if (!n_packets && !n_bytes)
return SEQ_SKIP;
@@ -231,6 +234,7 @@ static int _ctx_stats_seq_show(struct seq_file *s, void *v)
u64 n_packets = 0;
struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
struct hfi1_devdata *dd = dd_from_dev(ibd);
+ struct hfi1_ctxtdata *rcd;
if (v == SEQ_START_TOKEN) {
seq_puts(s, "Ctx:npkts\n");
@@ -240,11 +244,14 @@ static int _ctx_stats_seq_show(struct seq_file *s, void *v)
spos = v;
i = *spos;
- if (!dd->rcd[i])
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (!rcd)
return SEQ_SKIP;
- for (j = 0; j < ARRAY_SIZE(dd->rcd[i]->opstats->stats); j++)
- n_packets += dd->rcd[i]->opstats->stats[j].n_packets;
+ for (j = 0; j < ARRAY_SIZE(rcd->opstats->stats); j++)
+ n_packets += rcd->opstats->stats[j].n_packets;
+
+ hfi1_rcd_put(rcd);
if (!n_packets)
return SEQ_SKIP;
@@ -1098,12 +1105,15 @@ static int _fault_stats_seq_show(struct seq_file *s, void *v)
u64 n_packets = 0, n_bytes = 0;
struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
struct hfi1_devdata *dd = dd_from_dev(ibd);
+ struct hfi1_ctxtdata *rcd;
for (j = 0; j < dd->first_dyn_alloc_ctxt; j++) {
- if (!dd->rcd[j])
- continue;
- n_packets += dd->rcd[j]->opstats->stats[i].n_packets;
- n_bytes += dd->rcd[j]->opstats->stats[i].n_bytes;
+ rcd = hfi1_rcd_get_by_index(dd, j);
+ if (rcd) {
+ n_packets += rcd->opstats->stats[i].n_packets;
+ n_bytes += rcd->opstats->stats[i].n_bytes;
+ }
+ hfi1_rcd_put(rcd);
}
if (!n_packets && !n_bytes)
return SEQ_SKIP;
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 0b7ca0e..14f2a00 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -839,6 +839,7 @@ int handle_receive_interrupt_dma_rtail(struct hfi1_ctxtdata *rcd, int thread)
static inline void set_nodma_rtail(struct hfi1_devdata *dd, u16 ctxt)
{
+ struct hfi1_ctxtdata *rcd;
u16 i;
/*
@@ -847,18 +848,27 @@ static inline void set_nodma_rtail(struct hfi1_devdata *dd, u16 ctxt)
* interrupt handler for all statically allocated kernel contexts.
*/
if (ctxt >= dd->first_dyn_alloc_ctxt) {
- dd->rcd[ctxt]->do_interrupt =
- &handle_receive_interrupt_nodma_rtail;
+ rcd = hfi1_rcd_get_by_index(dd, ctxt);
+ if (rcd) {
+ rcd->do_interrupt =
+ &handle_receive_interrupt_nodma_rtail;
+ hfi1_rcd_put(rcd);
+ }
return;
}
- for (i = HFI1_CTRL_CTXT + 1; i < dd->first_dyn_alloc_ctxt; i++)
- dd->rcd[i]->do_interrupt =
- &handle_receive_interrupt_nodma_rtail;
+ for (i = HFI1_CTRL_CTXT + 1; i < dd->first_dyn_alloc_ctxt; i++) {
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (rcd)
+ rcd->do_interrupt =
+ &handle_receive_interrupt_nodma_rtail;
+ hfi1_rcd_put(rcd);
+ }
}
static inline void set_dma_rtail(struct hfi1_devdata *dd, u16 ctxt)
{
+ struct hfi1_ctxtdata *rcd;
u16 i;
/*
@@ -867,27 +877,39 @@ static inline void set_dma_rtail(struct hfi1_devdata *dd, u16 ctxt)
* interrupt handler for all statically allocated kernel contexts.
*/
if (ctxt >= dd->first_dyn_alloc_ctxt) {
- dd->rcd[ctxt]->do_interrupt =
- &handle_receive_interrupt_dma_rtail;
+ rcd = hfi1_rcd_get_by_index(dd, ctxt);
+ if (rcd) {
+ rcd->do_interrupt =
+ &handle_receive_interrupt_dma_rtail;
+ hfi1_rcd_put(rcd);
+ }
return;
}
- for (i = HFI1_CTRL_CTXT + 1; i < dd->first_dyn_alloc_ctxt; i++)
- dd->rcd[i]->do_interrupt =
- &handle_receive_interrupt_dma_rtail;
+ for (i = HFI1_CTRL_CTXT + 1; i < dd->first_dyn_alloc_ctxt; i++) {
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (rcd)
+ rcd->do_interrupt =
+ &handle_receive_interrupt_dma_rtail;
+ hfi1_rcd_put(rcd);
+ }
}
void set_all_slowpath(struct hfi1_devdata *dd)
{
+ struct hfi1_ctxtdata *rcd;
u16 i;
/* HFI1_CTRL_CTXT must always use the slow path interrupt handler */
for (i = HFI1_CTRL_CTXT + 1; i < dd->num_rcv_contexts; i++) {
- struct hfi1_ctxtdata *rcd = dd->rcd[i];
-
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (!rcd)
+ continue;
if ((i < dd->first_dyn_alloc_ctxt) ||
- (rcd && rcd->sc && (rcd->sc->type == SC_KERNEL)))
+ (rcd->sc && (rcd->sc->type == SC_KERNEL))) {
rcd->do_interrupt = &handle_receive_interrupt;
+ }
+ hfi1_rcd_put(rcd);
}
}
@@ -1068,6 +1090,7 @@ void receive_interrupt_work(struct work_struct *work)
struct hfi1_pportdata *ppd = container_of(work, struct hfi1_pportdata,
linkstate_active_work);
struct hfi1_devdata *dd = ppd->dd;
+ struct hfi1_ctxtdata *rcd;
u16 i;
/* Received non-SC15 packet implies neighbor_normal */
@@ -1078,8 +1101,12 @@ void receive_interrupt_work(struct work_struct *work)
* Interrupt all statically allocated kernel contexts that could
* have had an interrupt during auto activation.
*/
- for (i = HFI1_CTRL_CTXT; i < dd->first_dyn_alloc_ctxt; i++)
- force_recv_intr(dd->rcd[i]);
+ for (i = HFI1_CTRL_CTXT; i < dd->first_dyn_alloc_ctxt; i++) {
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (rcd)
+ force_recv_intr(rcd);
+ hfi1_rcd_put(rcd);
+ }
}
/*
@@ -1270,10 +1297,8 @@ void hfi1_start_led_override(struct hfi1_pportdata *ppd, unsigned int timeon,
int hfi1_reset_device(int unit)
{
int ret;
- u16 i;
struct hfi1_devdata *dd = hfi1_lookup(unit);
struct hfi1_pportdata *ppd;
- unsigned long flags;
int pidx;
if (!dd) {
@@ -1291,17 +1316,15 @@ int hfi1_reset_device(int unit)
goto bail;
}
- spin_lock_irqsave(&dd->uctxt_lock, flags);
+ /* If there are any user/vnic contexts, we cannot reset */
+ mutex_lock(&hfi1_mutex);
if (dd->rcd)
- for (i = dd->first_dyn_alloc_ctxt;
- i < dd->num_rcv_contexts; i++) {
- if (!dd->rcd[i])
- continue;
- spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+ if (hfi1_stats.sps_ctxts) {
+ mutex_unlock(&hfi1_mutex);
ret = -EBUSY;
goto bail;
}
- spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+ mutex_unlock(&hfi1_mutex);
for (pidx = 0; pidx < dd->num_pports; ++pidx) {
ppd = dd->pport + pidx;
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 7361366..ab8eb2b 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -757,7 +757,7 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
if (!uctxt)
goto done;
- hfi1_cdbg(PROC, "freeing ctxt %u:%u", uctxt->ctxt, fdata->subctxt);
+ hfi1_cdbg(PROC, "closing ctxt %u:%u", uctxt->ctxt, fdata->subctxt);
flush_wc();
/* drain user sdma queue */
@@ -770,6 +770,13 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
hfi1_user_exp_rcv_free(fdata);
/*
+ * fdata->uctxt is used in the above cleanup. It is not ready to be
+ * removed until here.
+ */
+ fdata->uctxt = NULL;
+ hfi1_rcd_put(uctxt);
+
+ /*
* Clear any left over, unhandled events so the next process that
* gets this context doesn't get confused.
*/
@@ -777,16 +784,14 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
HFI1_MAX_SHARED_CTXTS) + fdata->subctxt;
*ev = 0;
- mutex_lock(&hfi1_mutex);
+ spin_lock_irqsave(&dd->uctxt_lock, flags);
__clear_bit(fdata->subctxt, uctxt->in_use_ctxts);
- fdata->uctxt = NULL;
- hfi1_rcd_put(uctxt); /* fdata reference */
if (!bitmap_empty(uctxt->in_use_ctxts, HFI1_MAX_SHARED_CTXTS)) {
- mutex_unlock(&hfi1_mutex);
+ spin_unlock_irqrestore(&dd->uctxt_lock, flags);
goto done;
}
+ spin_unlock_irqrestore(&dd->uctxt_lock, flags);
- spin_lock_irqsave(&dd->uctxt_lock, flags);
/*
* Disable receive context and interrupt available, reset all
* RcvCtxtCtrl bits to default values.
@@ -808,13 +813,11 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
set_pio_integrity(uctxt->sc);
sc_disable(uctxt->sc);
}
- spin_unlock_irqrestore(&dd->uctxt_lock, flags);
hfi1_free_ctxt_rcv_groups(uctxt);
hfi1_clear_ctxt_pkey(dd, uctxt);
uctxt->event_flags = 0;
- mutex_unlock(&hfi1_mutex);
deallocate_ctxt(uctxt);
done:
@@ -844,9 +847,22 @@ static u64 kvirt_to_phys(void *addr)
return paddr;
}
+/**
+ * complete_subctxt
+ * @fd: valid filedata pointer
+ *
+ * Sub-context info can only be set up after the base context
+ * has been completed. This is indicated by the clearing of the
+ * HFI1_CTXT_BASE_UNINIT bit.
+ *
+ * Wait for the bit to be cleared, and then complete the subcontext
+ * initialization.
+ *
+ */
static int complete_subctxt(struct hfi1_filedata *fd)
{
int ret;
+ unsigned long flags;
/*
* sub-context info can only be set up after the base context
@@ -859,7 +875,7 @@ static int complete_subctxt(struct hfi1_filedata *fd)
if (test_bit(HFI1_CTXT_BASE_FAILED, &fd->uctxt->event_flags))
ret = -ENOMEM;
- /* The only thing a sub context needs is the user_xxx stuff */
+ /* Finish the sub-context init */
if (!ret) {
fd->rec_cpu_num = hfi1_get_proc_affinity(fd->uctxt->numa_id);
ret = init_user_ctxt(fd, fd->uctxt);
@@ -868,9 +884,9 @@ static int complete_subctxt(struct hfi1_filedata *fd)
if (ret) {
hfi1_rcd_put(fd->uctxt);
fd->uctxt = NULL;
- mutex_lock(&hfi1_mutex);
+ spin_lock_irqsave(&fd->dd->uctxt_lock, flags);
__clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
- mutex_unlock(&hfi1_mutex);
+ spin_unlock_irqrestore(&fd->dd->uctxt_lock, flags);
}
return ret;
@@ -911,14 +927,15 @@ static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
mutex_unlock(&hfi1_mutex);
- /* Depending on the context type, do the appropriate init */
+ /* Depending on the context type, finish the appropriate init */
switch (ret) {
case 0:
ret = setup_base_ctxt(fd, uctxt);
if (uctxt->subctxt_cnt) {
/*
- * Base context is done, notify anybody using a
- * sub-context that is waiting for this completion
+ * Base context is done (successfully or not), notify
+ * anybody using a sub-context that is waiting for
+ * this completion.
*/
clear_bit(HFI1_CTXT_BASE_UNINIT, &uctxt->event_flags);
wake_up(&uctxt->wait);
@@ -934,58 +951,97 @@ static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
return ret;
}
-/*
- * The hfi1_mutex must be held when this function is called. It is
- * necessary to ensure serialized creation of shared contexts.
+/**
+ * match_ctxt
+ * @fd: valid filedata pointer
+ * @uinfo: user info to compare base context with
+ * @uctxt: context to compare uinfo to.
+ *
+ * Compare the given context with the given information to see if it
+ * can be used for a sub context.
*/
-static int find_sub_ctxt(struct hfi1_filedata *fd,
- const struct hfi1_user_info *uinfo)
+static int match_ctxt(struct hfi1_filedata *fd,
+ const struct hfi1_user_info *uinfo,
+ struct hfi1_ctxtdata *uctxt)
{
- u16 i;
struct hfi1_devdata *dd = fd->dd;
+ unsigned long flags;
u16 subctxt;
- if (!uinfo->subctxt_cnt)
+ /* Skip dynamically allocated kernel contexts */
+ if (uctxt->sc && (uctxt->sc->type == SC_KERNEL))
return 0;
- for (i = dd->first_dyn_alloc_ctxt; i < dd->num_rcv_contexts; i++) {
- struct hfi1_ctxtdata *uctxt = dd->rcd[i];
+ /* Skip ctxt if it doesn't match the requested one */
+ if (memcmp(uctxt->uuid, uinfo->uuid, sizeof(uctxt->uuid)) ||
+ uctxt->jkey != generate_jkey(current_uid()) ||
+ uctxt->subctxt_id != uinfo->subctxt_id ||
+ uctxt->subctxt_cnt != uinfo->subctxt_cnt)
+ return 0;
- /* Skip ctxts which are not yet open */
- if (!uctxt ||
- bitmap_empty(uctxt->in_use_ctxts,
- HFI1_MAX_SHARED_CTXTS))
- continue;
+ /* Verify the sharing process matches the base */
+ if (uctxt->userversion != uinfo->userversion)
+ return -EINVAL;
- /* Skip dynamically allocted kernel contexts */
- if (uctxt->sc && (uctxt->sc->type == SC_KERNEL))
- continue;
+ /* Find an unused sub context */
+ spin_lock_irqsave(&dd->uctxt_lock, flags);
+ if (bitmap_empty(uctxt->in_use_ctxts, HFI1_MAX_SHARED_CTXTS)) {
+ /* context is being closed, do not use */
+ spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+ return 0;
+ }
- /* Skip ctxt if it doesn't match the requested one */
- if (memcmp(uctxt->uuid, uinfo->uuid,
- sizeof(uctxt->uuid)) ||
- uctxt->jkey != generate_jkey(current_uid()) ||
- uctxt->subctxt_id != uinfo->subctxt_id ||
- uctxt->subctxt_cnt != uinfo->subctxt_cnt)
- continue;
+ subctxt = find_first_zero_bit(uctxt->in_use_ctxts,
+ HFI1_MAX_SHARED_CTXTS);
+ if (subctxt >= uctxt->subctxt_cnt) {
+ spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+ return -EBUSY;
+ }
- /* Verify the sharing process matches the master */
- if (uctxt->userversion != uinfo->userversion)
- return -EINVAL;
+ fd->subctxt = subctxt;
+ __set_bit(fd->subctxt, uctxt->in_use_ctxts);
+ spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+
+ fd->uctxt = uctxt;
+ hfi1_rcd_get(uctxt);
- /* Find an unused context */
- subctxt = find_first_zero_bit(uctxt->in_use_ctxts,
- HFI1_MAX_SHARED_CTXTS);
- if (subctxt >= uctxt->subctxt_cnt)
- return -EBUSY;
+ return 1;
+}
- fd->uctxt = uctxt;
- fd->subctxt = subctxt;
+/**
+ * find_sub_ctxt
+ * @fd: valid filedata pointer
+ * @uinfo: matching info to use to find a possible context to share.
+ *
+ * The hfi1_mutex must be held when this function is called. It is
+ * necessary to ensure serialized creation of shared contexts.
+ *
+ * Return:
+ * 0 No sub-context found
+ * 1 Subcontext found and allocated
+ * errno EINVAL (incorrect parameters)
+ * EBUSY (all sub contexts in use)
+ */
+static int find_sub_ctxt(struct hfi1_filedata *fd,
+ const struct hfi1_user_info *uinfo)
+{
+ struct hfi1_ctxtdata *uctxt;
+ struct hfi1_devdata *dd = fd->dd;
+ u16 i;
+ int ret;
- hfi1_rcd_get(uctxt);
- __set_bit(fd->subctxt, uctxt->in_use_ctxts);
+ if (!uinfo->subctxt_cnt)
+ return 0;
- return 1;
+ for (i = dd->first_dyn_alloc_ctxt; i < dd->num_rcv_contexts; i++) {
+ uctxt = hfi1_rcd_get_by_index(dd, i);
+ if (uctxt) {
+ ret = match_ctxt(fd, uinfo, uctxt);
+ hfi1_rcd_put(uctxt);
+ /* a non-zero value ends the search */
+ if (ret)
+ return ret;
+ }
}
return 0;
@@ -993,7 +1049,7 @@ static int find_sub_ctxt(struct hfi1_filedata *fd,
static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
struct hfi1_user_info *uinfo,
- struct hfi1_ctxtdata **cd)
+ struct hfi1_ctxtdata **rcd)
{
struct hfi1_ctxtdata *uctxt;
int ret, numa;
@@ -1066,12 +1122,12 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
if (dd->freectxts-- == dd->num_user_contexts)
aspm_disable_all(dd);
- *cd = uctxt;
+ *rcd = uctxt;
return 0;
ctxdata_free:
- hfi1_free_ctxt(dd, uctxt);
+ hfi1_free_ctxt(uctxt);
return ret;
}
@@ -1083,7 +1139,7 @@ static void deallocate_ctxt(struct hfi1_ctxtdata *uctxt)
aspm_enable_all(uctxt->dd);
mutex_unlock(&hfi1_mutex);
- hfi1_free_ctxt(uctxt->dd, uctxt);
+ hfi1_free_ctxt(uctxt);
}
static void init_subctxts(struct hfi1_ctxtdata *uctxt,
@@ -1279,8 +1335,10 @@ static int setup_base_ctxt(struct hfi1_filedata *fd,
return 0;
setup_failed:
+ /* Set the failed bit so sub-context init can do the right thing */
set_bit(HFI1_CTXT_BASE_FAILED, &uctxt->event_flags);
deallocate_ctxt(uctxt);
+
return ret;
}
@@ -1417,18 +1475,13 @@ int hfi1_set_uevent_bits(struct hfi1_pportdata *ppd, const int evtbit)
struct hfi1_ctxtdata *uctxt;
struct hfi1_devdata *dd = ppd->dd;
u16 ctxt;
- int ret = 0;
- unsigned long flags;
- if (!dd->events) {
- ret = -EINVAL;
- goto done;
- }
+ if (!dd->events)
+ return -EINVAL;
- spin_lock_irqsave(&dd->uctxt_lock, flags);
for (ctxt = dd->first_dyn_alloc_ctxt; ctxt < dd->num_rcv_contexts;
ctxt++) {
- uctxt = dd->rcd[ctxt];
+ uctxt = hfi1_rcd_get_by_index(dd, ctxt);
if (uctxt) {
unsigned long *evs = dd->events +
(uctxt->ctxt - dd->first_dyn_alloc_ctxt) *
@@ -1441,11 +1494,11 @@ int hfi1_set_uevent_bits(struct hfi1_pportdata *ppd, const int evtbit)
set_bit(evtbit, evs);
for (i = 1; i < uctxt->subctxt_cnt; i++)
set_bit(evtbit, evs + i);
+ hfi1_rcd_put(uctxt);
}
}
- spin_unlock_irqrestore(&dd->uctxt_lock, flags);
-done:
- return ret;
+
+ return 0;
}
/**
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index bb003ff..fa9160f 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -938,8 +938,7 @@ struct hfi1_devdata {
u64 __iomem *egrtidbase;
spinlock_t sendctrl_lock; /* protect changes to SendCtrl */
spinlock_t rcvctrl_lock; /* protect changes to RcvCtrl */
- /* around rcd and (user ctxts) ctxt_cnt use (intr vs free) */
- spinlock_t uctxt_lock; /* rcd and user context changes */
+ spinlock_t uctxt_lock; /* protect rcd changes */
struct mutex dc8051_lock; /* exclusive access to 8051 */
struct workqueue_struct *update_cntr_wq;
struct work_struct update_cntr_work;
@@ -1265,12 +1264,13 @@ struct hfi1_filedata {
int hfi1_create_kctxts(struct hfi1_devdata *dd);
int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
struct hfi1_ctxtdata **rcd);
-void hfi1_free_ctxt(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd);
+void hfi1_free_ctxt(struct hfi1_ctxtdata *rcd);
void hfi1_init_pportdata(struct pci_dev *pdev, struct hfi1_pportdata *ppd,
struct hfi1_devdata *dd, u8 hw_pidx, u8 port);
void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd);
int hfi1_rcd_put(struct hfi1_ctxtdata *rcd);
void hfi1_rcd_get(struct hfi1_ctxtdata *rcd);
+struct hfi1_ctxtdata *hfi1_rcd_get_by_index(struct hfi1_devdata *dd, u16 ctxt);
int handle_receive_interrupt(struct hfi1_ctxtdata *rcd, int thread);
int handle_receive_interrupt_nodma_rtail(struct hfi1_ctxtdata *rcd, int thread);
int handle_receive_interrupt_dma_rtail(struct hfi1_ctxtdata *rcd, int thread);
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 23f0bbc..fba7700 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -188,7 +188,7 @@ int hfi1_create_kctxts(struct hfi1_devdata *dd)
return 0;
bail:
for (i = 0; dd->rcd && i < dd->first_dyn_alloc_ctxt; ++i)
- hfi1_rcd_put(dd->rcd[i]);
+ hfi1_free_ctxt(dd->rcd[i]);
/* All the contexts should be freed, free the array */
kfree(dd->rcd);
@@ -197,7 +197,7 @@ int hfi1_create_kctxts(struct hfi1_devdata *dd)
}
/*
- * Helper routines for the receive context reference count (rcd and uctxt)
+ * Helper routines for the receive context reference count (rcd and uctxt).
*/
static void hfi1_rcd_init(struct hfi1_ctxtdata *rcd)
{
@@ -211,10 +211,16 @@ static void hfi1_rcd_init(struct hfi1_ctxtdata *rcd)
*/
static void hfi1_rcd_free(struct kref *kref)
{
+ unsigned long flags;
struct hfi1_ctxtdata *rcd =
container_of(kref, struct hfi1_ctxtdata, kref);
hfi1_free_ctxtdata(rcd->dd, rcd);
+
+ spin_lock_irqsave(&rcd->dd->uctxt_lock, flags);
+ rcd->dd->rcd[rcd->ctxt] = NULL;
+ spin_unlock_irqrestore(&rcd->dd->uctxt_lock, flags);
+
kfree(rcd);
}
@@ -253,7 +259,7 @@ void hfi1_rcd_get(struct hfi1_ctxtdata *rcd)
* If the array is full, we are EBUSY.
*
*/
-static u16 allocate_rcd_index(struct hfi1_devdata *dd,
+static int allocate_rcd_index(struct hfi1_devdata *dd,
struct hfi1_ctxtdata *rcd, u16 *index)
{
unsigned long flags;
@@ -279,8 +285,36 @@ static u16 allocate_rcd_index(struct hfi1_devdata *dd,
return 0;
}
+/**
+ * hfi1_rcd_get_by_index
+ * @dd: pointer to a valid devdata structure
+ * @ctxt: the index of a possible rcd
+ *
+ * We need to protect access to the rcd array. If access is needed to
+ * one or more indices, take the protecting spinlock and then increment
+ * the kref.
+ *
+ * The caller is responsible for the matching _put().
+ *
+ */
+struct hfi1_ctxtdata *hfi1_rcd_get_by_index(struct hfi1_devdata *dd, u16 ctxt)
+{
+ unsigned long flags;
+ struct hfi1_ctxtdata *rcd = NULL;
+
+ spin_lock_irqsave(&dd->uctxt_lock, flags);
+ if (dd->rcd[ctxt]) {
+ rcd = dd->rcd[ctxt];
+ hfi1_rcd_get(rcd);
+ }
+ spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+
+ return rcd;
+}
+
/*
- * Common code for user and kernel context setup.
+ * Common code for user and kernel context create and setup.
+ * NOTE: the initial kref is taken here (hfi1_rcd_init()).
*/
int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
struct hfi1_ctxtdata **context)
@@ -300,8 +334,6 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
u16 ctxt;
int ret;
- hfi1_cdbg(PROC, "setting up context %u\n", ctxt);
-
ret = allocate_rcd_index(dd, rcd, &ctxt);
if (ret) {
*context = NULL;
@@ -321,6 +353,8 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
mutex_init(&rcd->exp_lock);
+ hfi1_cdbg(PROC, "setting up context %u\n", rcd->ctxt);
+
/*
* Calculate the context's RcvArray entry starting point.
* We do this here because we have to take into account all
@@ -425,28 +459,23 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
bail:
*context = NULL;
- hfi1_free_ctxt(dd, rcd);
+ hfi1_free_ctxt(rcd);
return -ENOMEM;
}
/**
* hfi1_free_ctxt
- * @dd: Pointer to a valid device
* @rcd: pointer to an initialized rcd data structure
*
- * This is the "free" to match the _create_ctxtdata (alloc) function.
- * This is the final "put" for the kref.
+ * This wrapper is the free function that matches hfi1_create_ctxtdata().
+ * When a context is done being used (kernel or user), this function is called
+ * for the "final" put to match the kref init from hf1i_create_ctxtdata().
+ * Other users of the context do a get/put sequence to make sure that the
+ * structure isn't removed while in use.
*/
-void hfi1_free_ctxt(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
+void hfi1_free_ctxt(struct hfi1_ctxtdata *rcd)
{
- unsigned long flags;
-
- if (rcd) {
- spin_lock_irqsave(&dd->uctxt_lock, flags);
- dd->rcd[rcd->ctxt] = NULL;
- spin_unlock_irqrestore(&dd->uctxt_lock, flags);
- hfi1_rcd_put(rcd);
- }
+ hfi1_rcd_put(rcd);
}
/*
@@ -669,16 +698,19 @@ static int loadtime_init(struct hfi1_devdata *dd)
static int init_after_reset(struct hfi1_devdata *dd)
{
int i;
-
+ struct hfi1_ctxtdata *rcd;
/*
* Ensure chip does no sends or receives, tail updates, or
* pioavail updates while we re-initialize. This is mostly
* for the driver data structures, not chip registers.
*/
- for (i = 0; i < dd->num_rcv_contexts; i++)
+ for (i = 0; i < dd->num_rcv_contexts; i++) {
+ rcd = hfi1_rcd_get_by_index(dd, i);
hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_DIS |
HFI1_RCVCTRL_INTRAVAIL_DIS |
- HFI1_RCVCTRL_TAILUPD_DIS, dd->rcd[i]);
+ HFI1_RCVCTRL_TAILUPD_DIS, rcd);
+ hfi1_rcd_put(rcd);
+ }
pio_send_control(dd, PSC_GLOBAL_DISABLE);
for (i = 0; i < dd->num_send_contexts; i++)
sc_disable(dd->send_contexts[i].sc);
@@ -688,6 +720,7 @@ static int init_after_reset(struct hfi1_devdata *dd)
static void enable_chip(struct hfi1_devdata *dd)
{
+ struct hfi1_ctxtdata *rcd;
u32 rcvmask;
u16 i;
@@ -699,17 +732,21 @@ static void enable_chip(struct hfi1_devdata *dd)
* Other ctxts done as user opens and initializes them.
*/
for (i = 0; i < dd->first_dyn_alloc_ctxt; ++i) {
+ rcd = hfi1_rcd_get_by_index(dd, i);
+ if (!rcd)
+ continue;
rcvmask = HFI1_RCVCTRL_CTXT_ENB | HFI1_RCVCTRL_INTRAVAIL_ENB;
- rcvmask |= HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, DMA_RTAIL) ?
+ rcvmask |= HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL) ?
HFI1_RCVCTRL_TAILUPD_ENB : HFI1_RCVCTRL_TAILUPD_DIS;
- if (!HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, MULTI_PKT_EGR))
+ if (!HFI1_CAP_KGET_MASK(rcd->flags, MULTI_PKT_EGR))
rcvmask |= HFI1_RCVCTRL_ONE_PKT_EGR_ENB;
- if (HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, NODROP_RHQ_FULL))
+ if (HFI1_CAP_KGET_MASK(rcd->flags, NODROP_RHQ_FULL))
rcvmask |= HFI1_RCVCTRL_NO_RHQ_DROP_ENB;
- if (HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, NODROP_EGR_FULL))
+ if (HFI1_CAP_KGET_MASK(rcd->flags, NODROP_EGR_FULL))
rcvmask |= HFI1_RCVCTRL_NO_EGR_DROP_ENB;
- hfi1_rcvctrl(dd, rcvmask, dd->rcd[i]);
- sc_enable(dd->rcd[i]->sc);
+ hfi1_rcvctrl(dd, rcvmask, rcd);
+ sc_enable(rcd->sc);
+ hfi1_rcd_put(rcd);
}
}
@@ -854,7 +891,7 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
* existing, and re-allocate.
* Need to re-create rest of ctxt 0 ctxtdata as well.
*/
- rcd = dd->rcd[i];
+ rcd = hfi1_rcd_get_by_index(dd, i);
if (!rcd)
continue;
@@ -868,6 +905,7 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
"failed to allocate kernel ctxt's rcvhdrq and/or egr bufs\n");
ret = lastfail;
}
+ hfi1_rcd_put(rcd);
}
/* Allocate enough memory for user event notification. */
@@ -987,6 +1025,7 @@ static void stop_timers(struct hfi1_devdata *dd)
static void shutdown_device(struct hfi1_devdata *dd)
{
struct hfi1_pportdata *ppd;
+ struct hfi1_ctxtdata *rcd;
unsigned pidx;
int i;
@@ -1005,12 +1044,15 @@ static void shutdown_device(struct hfi1_devdata *dd)
for (pidx = 0; pidx < dd->num_pports; ++pidx) {
ppd = dd->pport + pidx;
- for (i = 0; i < dd->num_rcv_contexts; i++)
+ for (i = 0; i < dd->num_rcv_contexts; i++) {
+ rcd = hfi1_rcd_get_by_index(dd, i);
hfi1_rcvctrl(dd, HFI1_RCVCTRL_TAILUPD_DIS |
HFI1_RCVCTRL_CTXT_DIS |
HFI1_RCVCTRL_INTRAVAIL_DIS |
HFI1_RCVCTRL_PKEY_DIS |
- HFI1_RCVCTRL_ONE_PKT_EGR_DIS, dd->rcd[i]);
+ HFI1_RCVCTRL_ONE_PKT_EGR_DIS, rcd);
+ hfi1_rcd_put(rcd);
+ }
/*
* Gracefully stop all sends allowing any in progress to
* trickle out first.
@@ -1450,8 +1492,6 @@ static void cleanup_device_data(struct hfi1_devdata *dd)
{
int ctxt;
int pidx;
- struct hfi1_ctxtdata **tmp;
- unsigned long flags;
/* users can't do anything more with chip */
for (pidx = 0; pidx < dd->num_pports; ++pidx) {
@@ -1476,18 +1516,6 @@ static void cleanup_device_data(struct hfi1_devdata *dd)
free_credit_return(dd);
- /*
- * Free any resources still in use (usually just kernel contexts)
- * at unload; we do for ctxtcnt, because that's what we allocate.
- * We acquire lock to be really paranoid that rcd isn't being
- * accessed from some interrupt-related code (that should not happen,
- * but best to be sure).
- */
- spin_lock_irqsave(&dd->uctxt_lock, flags);
- tmp = dd->rcd;
- dd->rcd = NULL;
- spin_unlock_irqrestore(&dd->uctxt_lock, flags);
-
if (dd->rcvhdrtail_dummy_kvaddr) {
dma_free_coherent(&dd->pcidev->dev, sizeof(u64),
(void *)dd->rcvhdrtail_dummy_kvaddr,
@@ -1495,16 +1523,22 @@ static void cleanup_device_data(struct hfi1_devdata *dd)
dd->rcvhdrtail_dummy_kvaddr = NULL;
}
- for (ctxt = 0; tmp && ctxt < dd->num_rcv_contexts; ctxt++) {
- struct hfi1_ctxtdata *rcd = tmp[ctxt];
+ /*
+ * Free any resources still in use (usually just kernel contexts)
+ * at unload; we do for ctxtcnt, because that's what we allocate.
+ */
+ for (ctxt = 0; dd->rcd && ctxt < dd->num_rcv_contexts; ctxt++) {
+ struct hfi1_ctxtdata *rcd = dd->rcd[ctxt];
- tmp[ctxt] = NULL; /* debugging paranoia */
if (rcd) {
hfi1_clear_tids(rcd);
- hfi1_rcd_put(rcd);
+ hfi1_free_ctxt(rcd);
}
}
- kfree(tmp);
+
+ kfree(dd->rcd);
+ dd->rcd = NULL;
+
free_pio_map(dd);
/* must follow rcv context free - need to remove rcv's hooks */
for (ctxt = 0; ctxt < dd->num_send_contexts; ctxt++)
diff --git a/drivers/infiniband/hw/hfi1/trace_rx.h b/drivers/infiniband/hw/hfi1/trace_rx.h
index bebf0a8..f9909d2 100644
--- a/drivers/infiniband/hw/hfi1/trace_rx.h
+++ b/drivers/infiniband/hw/hfi1/trace_rx.h
@@ -114,24 +114,24 @@
);
TRACE_EVENT(hfi1_receive_interrupt,
- TP_PROTO(struct hfi1_devdata *dd, u16 ctxt),
- TP_ARGS(dd, ctxt),
+ TP_PROTO(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd),
+ TP_ARGS(dd, rcd),
TP_STRUCT__entry(DD_DEV_ENTRY(dd)
__field(u32, ctxt)
__field(u8, slow_path)
__field(u8, dma_rtail)
),
TP_fast_assign(DD_DEV_ASSIGN(dd);
- __entry->ctxt = ctxt;
- if (dd->rcd[ctxt]->do_interrupt ==
+ __entry->ctxt = rcd->ctxt;
+ if (rcd->do_interrupt ==
&handle_receive_interrupt) {
__entry->slow_path = 1;
__entry->dma_rtail = 0xFF;
- } else if (dd->rcd[ctxt]->do_interrupt ==
+ } else if (rcd->do_interrupt ==
&handle_receive_interrupt_dma_rtail){
__entry->dma_rtail = 1;
__entry->slow_path = 0;
- } else if (dd->rcd[ctxt]->do_interrupt ==
+ } else if (rcd->do_interrupt ==
&handle_receive_interrupt_nodma_rtail) {
__entry->dma_rtail = 0;
__entry->slow_path = 0;
diff --git a/drivers/infiniband/hw/hfi1/vnic_main.c b/drivers/infiniband/hw/hfi1/vnic_main.c
index c91456c..2917a23 100644
--- a/drivers/infiniband/hw/hfi1/vnic_main.c
+++ b/drivers/infiniband/hw/hfi1/vnic_main.c
@@ -146,11 +146,7 @@ static int allocate_vnic_ctxt(struct hfi1_devdata *dd,
return ret;
bail:
- /*
- * hfi1_free_ctxt() will call hfi1_free_ctxtdata(), which will
- * release send_context structure if uctxt->sc is not null
- */
- hfi1_free_ctxt(dd, uctxt);
+ hfi1_free_ctxt(uctxt);
dd_dev_dbg(dd, "vnic allocation failed. rc %d\n", ret);
return ret;
}
@@ -158,15 +154,12 @@ static int allocate_vnic_ctxt(struct hfi1_devdata *dd,
static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
struct hfi1_ctxtdata *uctxt)
{
- unsigned long flags;
-
dd_dev_dbg(dd, "closing vnic context %d\n", uctxt->ctxt);
flush_wc();
if (dd->num_msix_entries)
hfi1_reset_vnic_msix_info(uctxt);
- spin_lock_irqsave(&dd->uctxt_lock, flags);
/*
* Disable receive context and interrupt available, reset all
* RcvCtxtCtrl bits to default values.
@@ -189,7 +182,6 @@ static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
sc_disable(uctxt->sc);
dd->send_contexts[uctxt->sc->sw_index].type = SC_USER;
- spin_unlock_irqrestore(&dd->uctxt_lock, flags);
uctxt->event_flags = 0;
@@ -198,7 +190,7 @@ static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
hfi1_stats.sps_ctxts--;
- hfi1_free_ctxt(dd, uctxt);
+ hfi1_free_ctxt(uctxt);
}
void hfi1_vnic_setup(struct hfi1_devdata *dd)
--
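The net effect of the hunks above is that driver code no longer dereferences
dd->rcd[i] directly; it takes a reference with hfi1_rcd_get_by_index() and
releases it with hfi1_rcd_put(). A minimal sketch of that usage pattern,
built only from the helpers added in this patch (example_walk_rcds() is a
made-up name; the snippet is illustrative and not compilable outside the
driver):

/*
 * Illustrative only: walk the receive contexts without caching a raw
 * pointer.  A successful lookup holds a kref, so the rcd cannot be
 * freed while in use; the matching hfi1_rcd_put() releases it.
 */
static void example_walk_rcds(struct hfi1_devdata *dd)
{
	struct hfi1_ctxtdata *rcd;
	u16 i;

	for (i = 0; i < dd->num_rcv_contexts; i++) {
		rcd = hfi1_rcd_get_by_index(dd, i);	/* NULL if the slot is empty */
		if (!rcd)
			continue;
		/* ... use rcd here ... */
		hfi1_rcd_put(rcd);			/* drop the reference */
	}
}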
* [PATCH for-next 07/27] IB/core: Convert ah_attr from OPA to IB when copying to user
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (4 preceding siblings ...)
2017-08-04 20:52 ` [PATCH for-next 06/27] IB/hf1: User context locking is inconsistent Dennis Dalessandro
@ 2017-08-04 20:52 ` Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 08/27] IB/srpt: Increase lid and sm_lid to 32 bits Dennis Dalessandro
` (19 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
OPA address handle attributes that have 32-bit LIDs have to be
converted to IB address handle attributes, with the LID field
programmed into the GID, before being copied to user space.
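The conversion amounts to encoding the 32-bit OPA LID into the GID
interface_id behind the OPA OUI, since such a LID cannot be represented
in the 16-bit IB dlid field. A standalone, host-byte-order model of that
encoding (the example LID is made up; the real OPA_MAKE_ID() and
opa_get_lid_from_gid() in include/rdma/opa_addr.h additionally apply
cpu_to_be64()/be64_to_cpu()):

#include <stdio.h>
#include <stdint.h>

/* Host-order model of OPA_MAKE_ID() / opa_get_lid_from_gid(). */
#define OPA_SPECIAL_OUI	0x00066AULL
#define OPA_MAKE_ID(x)	((OPA_SPECIAL_OUI << 40) | (uint64_t)(x))

int main(void)
{
	uint32_t dlid = 0x00012345;	/* made-up extended OPA LID */
	uint64_t interface_id = OPA_MAKE_ID(dlid);

	printf("interface_id  = 0x%016llx\n",
	       (unsigned long long)interface_id);
	printf("recovered lid = 0x%08llx\n",
	       (unsigned long long)(interface_id & 0xFFFFFFFF));
	return 0;
}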
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/ucm.c | 2 +
drivers/infiniband/core/ucma.c | 10 ++++--
drivers/infiniband/core/uverbs_marshall.c | 48 ++++++++++++++++++++++++++---
include/rdma/ib_marshall.h | 6 ++--
4 files changed, 54 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
index 112099c..f2a7f62 100644
--- a/drivers/infiniband/core/ucm.c
+++ b/drivers/infiniband/core/ucm.c
@@ -618,7 +618,7 @@ static ssize_t ib_ucm_init_qp_attr(struct ib_ucm_file *file,
if (result)
goto out;
- ib_copy_qp_attr_to_user(&resp, &qp_attr);
+ ib_copy_qp_attr_to_user(ctx->cm_id->device, &resp, &qp_attr);
if (copy_to_user((void __user *)(unsigned long)cmd.response,
&resp, sizeof(resp)))
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 276f0ef..eb85b54 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -248,14 +248,15 @@ static void ucma_copy_conn_event(struct rdma_ucm_conn_param *dst,
dst->qp_num = src->qp_num;
}
-static void ucma_copy_ud_event(struct rdma_ucm_ud_param *dst,
+static void ucma_copy_ud_event(struct ib_device *device,
+ struct rdma_ucm_ud_param *dst,
struct rdma_ud_param *src)
{
if (src->private_data_len)
memcpy(dst->private_data, src->private_data,
src->private_data_len);
dst->private_data_len = src->private_data_len;
- ib_copy_ah_attr_to_user(&dst->ah_attr, &src->ah_attr);
+ ib_copy_ah_attr_to_user(device, &dst->ah_attr, &src->ah_attr);
dst->qp_num = src->qp_num;
dst->qkey = src->qkey;
}
@@ -335,7 +336,8 @@ static int ucma_event_handler(struct rdma_cm_id *cm_id,
uevent->resp.event = event->event;
uevent->resp.status = event->status;
if (cm_id->qp_type == IB_QPT_UD)
- ucma_copy_ud_event(&uevent->resp.param.ud, &event->param.ud);
+ ucma_copy_ud_event(cm_id->device, &uevent->resp.param.ud,
+ &event->param.ud);
else
ucma_copy_conn_event(&uevent->resp.param.conn,
&event->param.conn);
@@ -1157,7 +1159,7 @@ static ssize_t ucma_init_qp_attr(struct ucma_file *file,
if (ret)
goto out;
- ib_copy_qp_attr_to_user(&resp, &qp_attr);
+ ib_copy_qp_attr_to_user(ctx->cm_id->device, &resp, &qp_attr);
if (copy_to_user((void __user *)(unsigned long)cmd.response,
&resp, sizeof(resp)))
ret = -EFAULT;
diff --git a/drivers/infiniband/core/uverbs_marshall.c b/drivers/infiniband/core/uverbs_marshall.c
index 94fd989..bd0acf3 100644
--- a/drivers/infiniband/core/uverbs_marshall.c
+++ b/drivers/infiniband/core/uverbs_marshall.c
@@ -33,10 +33,47 @@
#include <linux/export.h>
#include <rdma/ib_marshall.h>
-void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
- struct rdma_ah_attr *src)
+#define OPA_DEFAULT_GID_PREFIX cpu_to_be64(0xfe80000000000000ULL)
+static int rdma_ah_conv_opa_to_ib(struct ib_device *dev,
+ struct rdma_ah_attr *ib,
+ struct rdma_ah_attr *opa)
{
+ struct ib_port_attr port_attr;
+ int ret = 0;
+
+ /* Do a structure copy, then overwrite the fields below */
+ *ib = *opa;
+
+ ib->type = RDMA_AH_ATTR_TYPE_IB;
+ rdma_ah_set_grh(ib, NULL, 0, 0, 1, 0);
+
+ if (ib_query_port(dev, opa->port_num, &port_attr)) {
+ /* Set to default subnet to indicate error */
+ rdma_ah_set_subnet_prefix(ib, OPA_DEFAULT_GID_PREFIX);
+ ret = -EINVAL;
+ } else {
+ rdma_ah_set_subnet_prefix(ib,
+ cpu_to_be64(port_attr.subnet_prefix));
+ }
+ rdma_ah_set_interface_id(ib, OPA_MAKE_ID(rdma_ah_get_dlid(opa)));
+ return ret;
+}
+
+void ib_copy_ah_attr_to_user(struct ib_device *device,
+ struct ib_uverbs_ah_attr *dst,
+ struct rdma_ah_attr *ah_attr)
+{
+ struct rdma_ah_attr *src = ah_attr;
+ struct rdma_ah_attr conv_ah;
+
memset(&dst->grh.reserved, 0, sizeof(dst->grh.reserved));
+
+ if ((ah_attr->type == RDMA_AH_ATTR_TYPE_OPA) &&
+ (rdma_ah_get_dlid(ah_attr) >=
+ be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
+ (!rdma_ah_conv_opa_to_ib(device, &conv_ah, ah_attr)))
+ src = &conv_ah;
+
dst->dlid = rdma_ah_get_dlid(src);
dst->sl = rdma_ah_get_sl(src);
dst->src_path_bits = rdma_ah_get_path_bits(src);
@@ -57,7 +94,8 @@ void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
}
EXPORT_SYMBOL(ib_copy_ah_attr_to_user);
-void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
+void ib_copy_qp_attr_to_user(struct ib_device *device,
+ struct ib_uverbs_qp_attr *dst,
struct ib_qp_attr *src)
{
dst->qp_state = src->qp_state;
@@ -76,8 +114,8 @@ void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
dst->max_recv_sge = src->cap.max_recv_sge;
dst->max_inline_data = src->cap.max_inline_data;
- ib_copy_ah_attr_to_user(&dst->ah_attr, &src->ah_attr);
- ib_copy_ah_attr_to_user(&dst->alt_ah_attr, &src->alt_ah_attr);
+ ib_copy_ah_attr_to_user(device, &dst->ah_attr, &src->ah_attr);
+ ib_copy_ah_attr_to_user(device, &dst->alt_ah_attr, &src->alt_ah_attr);
dst->pkey_index = src->pkey_index;
dst->alt_pkey_index = src->alt_pkey_index;
diff --git a/include/rdma/ib_marshall.h b/include/rdma/ib_marshall.h
index 68cef3b..8ebf84a 100644
--- a/include/rdma/ib_marshall.h
+++ b/include/rdma/ib_marshall.h
@@ -38,10 +38,12 @@
#include <rdma/ib_user_verbs.h>
#include <rdma/ib_user_sa.h>
-void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
+void ib_copy_qp_attr_to_user(struct ib_device *device,
+ struct ib_uverbs_qp_attr *dst,
struct ib_qp_attr *src);
-void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
+void ib_copy_ah_attr_to_user(struct ib_device *device,
+ struct ib_uverbs_ah_attr *dst,
struct rdma_ah_attr *src);
void ib_copy_path_rec_to_user(struct ib_user_path_rec *dst,
--
* [PATCH for-next 08/27] IB/srpt: Increase lid and sm_lid to 32 bits
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (5 preceding siblings ...)
2017-08-04 20:52 ` [PATCH for-next 07/27] IB/core: Convert ah_attr from OPA to IB when copying to user Dennis Dalessandro
@ 2017-08-04 20:52 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 09/27] IB/IPoIB: Increase local_lid " Dennis Dalessandro
` (18 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:52 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
srpt contains lid and sm_lid fields that are 16 bits in length;
increase them to 32 bits.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/ulp/srpt/ib_srpt.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h
index cc11838..1b817e5 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.h
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
@@ -328,8 +328,8 @@ struct srpt_port {
u8 port_guid[24];
u8 port_gid[64];
u8 port;
- u16 sm_lid;
- u16 lid;
+ u32 sm_lid;
+ u32 lid;
union ib_gid gid;
struct work_struct work;
struct se_portal_group port_guid_tpg;
--
* [PATCH for-next 09/27] IB/IPoIB: Increase local_lid to 32 bits
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (6 preceding siblings ...)
2017-08-04 20:52 ` [PATCH for-next 08/27] IB/srpt: Increase lid and sm_lid to 32 bits Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 10/27] IB/mad: Change slid in RMPP recv from 16 " Dennis Dalessandro
` (17 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
IPoIB contains a local_lid field that is 16 bits in length;
increase it to 32 bits.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/ulp/ipoib/ipoib.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index ff50a7b..9e73810 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -366,7 +366,7 @@ struct ipoib_dev_priv {
u32 qkey;
union ib_gid local_gid;
- u16 local_lid;
+ u32 local_lid;
unsigned int admin_mtu;
unsigned int mcast_mtu;
--
* [PATCH for-next 10/27] IB/mad: Change slid in RMPP recv from 16 to 32 bits
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (7 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 09/27] IB/IPoIB: Increase local_lid " Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 11/27] IB/core: Change port_attr.lid size " Dennis Dalessandro
` (16 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
MAD RMPP contains a slid field that is 16 bits in length;
increase it to 32 bits.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/mad_rmpp.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 0d3cca0..e5cf09c 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -64,7 +64,7 @@ struct mad_rmpp_recv {
__be64 tid;
u32 src_qp;
- u16 slid;
+ u32 slid;
u8 mgmt_class;
u8 class_version;
u8 method;
--
* [PATCH for-next 11/27] IB/core: Change port_attr.lid size from 16 to 32 bits
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (8 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 10/27] IB/mad: Change slid in RMPP recv from 16 " Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid " Dennis Dalessandro
` (15 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The lid field in struct ib_port_attr is increased to 32 bits. This
enables core components to use larger LIDs if needed.
The user ABI is unchanged and returns 16-bit values when queried.
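To illustrate the mapping applied in ib_uverbs_query_port() below: the
kernel compares against be16_to_cpu(IB_MULTICAST_LID_BASE), i.e. 0xC000,
so any LID at or above that value cannot be expressed in the old 16-bit
ABI and is reported as 0. A standalone model with made-up LID values:

#include <stdio.h>
#include <stdint.h>

/* Host-order model of OPA_TO_IB_UCAST_LID(). */
#define IB_MULTICAST_LID_BASE_CPU	0xC000u
#define OPA_TO_IB_UCAST_LID(x) \
	(((x) >= IB_MULTICAST_LID_BASE_CPU) ? 0 : (x))

int main(void)
{
	const uint32_t lids[] = { 0x0001, 0xBFFF, 0xC000, 0x12345 };
	size_t i;

	for (i = 0; i < sizeof(lids) / sizeof(lids[0]); i++)
		printf("attr.lid 0x%x -> resp.lid 0x%x\n",
		       (unsigned)lids[i],
		       (unsigned)OPA_TO_IB_UCAST_LID(lids[i]));
	return 0;
}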
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/core_priv.h | 1 +
drivers/infiniband/core/uverbs_cmd.c | 5 ++++-
drivers/infiniband/hw/mlx4/alias_GUID.c | 2 +-
drivers/infiniband/hw/mlx4/mad.c | 2 +-
drivers/infiniband/hw/mthca/mthca_mad.c | 2 +-
include/rdma/ib_verbs.h | 2 +-
include/rdma/opa_addr.h | 3 ++-
7 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 11ae675..9d4f17f 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -39,6 +39,7 @@
#include <rdma/ib_verbs.h>
#include <rdma/ib_mad.h>
+#include <rdma/opa_addr.h>
#include "mad_priv.h"
struct pkey_index_qp_list {
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 60535c7..7ef74b0 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -275,8 +275,11 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
resp.bad_pkey_cntr = attr.bad_pkey_cntr;
resp.qkey_viol_cntr = attr.qkey_viol_cntr;
resp.pkey_tbl_len = attr.pkey_tbl_len;
- resp.lid = attr.lid;
resp.sm_lid = attr.sm_lid;
+ if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
+ resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
+ else
+ resp.lid = (u16)attr.lid;
resp.lmc = attr.lmc;
resp.max_vl_num = attr.max_vl_num;
resp.sm_sl = attr.sm_sl;
diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index ea24230..5a897b0 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -528,7 +528,7 @@ static int set_guid_rec(struct ib_device *ibdev,
memset(&guid_info_rec, 0, sizeof (struct ib_sa_guidinfo_rec));
- guid_info_rec.lid = cpu_to_be16(attr.lid);
+ guid_info_rec.lid = cpu_to_be16((u16)attr.lid);
guid_info_rec.block_num = index;
memcpy(guid_info_rec.guid_info_list, rec_det->all_recs,
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 21d31cb..00f0570 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -860,7 +860,7 @@ static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
in_mad->mad_hdr.method == IB_MGMT_METHOD_SET &&
in_mad->mad_hdr.attr_id == IB_SMP_ATTR_PORT_INFO &&
!ib_query_port(ibdev, port_num, &pattr))
- prev_lid = pattr.lid;
+ prev_lid = (u16)pattr.lid;
err = mlx4_MAD_IFC(to_mdev(ibdev),
(mad_flags & IB_MAD_IGNORE_MKEY ? MLX4_MAD_IFC_IGNORE_MKEY : 0) |
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 7df3db7..617531f 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -256,7 +256,7 @@ int mthca_process_mad(struct ib_device *ibdev,
in_mad->mad_hdr.method == IB_MGMT_METHOD_SET &&
in_mad->mad_hdr.attr_id == IB_SMP_ATTR_PORT_INFO &&
!ib_query_port(ibdev, port_num, &pattr))
- prev_lid = pattr.lid;
+ prev_lid = (u16)pattr.lid;
err = mthca_MAD_IFC(to_mdev(ibdev),
mad_flags & IB_MAD_IGNORE_MKEY,
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 1082b4c..4eccf89 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -549,8 +549,8 @@ struct ib_port_attr {
u32 bad_pkey_cntr;
u32 qkey_viol_cntr;
u16 pkey_tbl_len;
- u16 lid;
u16 sm_lid;
+ u32 lid;
u8 lmc;
u8 max_vl_num;
u8 sm_sl;
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index eace28f..46d0567 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -50,7 +50,8 @@
#define OPA_SPECIAL_OUI (0x00066AULL)
#define OPA_MAKE_ID(x) (cpu_to_be64(OPA_SPECIAL_OUI << 40 | (x)))
-
+#define OPA_TO_IB_UCAST_LID(x) (((x) >= be16_to_cpu(IB_MULTICAST_LID_BASE)) \
+ ? 0 : x)
/**
* ib_is_opa_gid: Returns true if the top 24 bits of the gid
* contains the OPA_STL_OUI identifier. This identifies that
--
* [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (9 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 11/27] IB/core: Change port_attr.lid size " Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
[not found] ` <20170804205320.17853.77236.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-08-04 20:53 ` [PATCH for-next 13/27] IB/core: Change wc.slid " Dennis Dalessandro
` (14 subsequent siblings)
25 siblings, 1 reply; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The sm_lid field in struct ib_port_attr is increased to 32 bits. This
enables core components to use larger LIDs if needed.
The user ABI is unchanged and returns 16-bit values when queried.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
include/rdma/ib_verbs.h | 2 +-
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 7ef74b0..38dce45 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
resp.bad_pkey_cntr = attr.bad_pkey_cntr;
resp.qkey_viol_cntr = attr.qkey_viol_cntr;
resp.pkey_tbl_len = attr.pkey_tbl_len;
- resp.sm_lid = attr.sm_lid;
- if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
+ if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
- else
+ resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
+ } else {
resp.lid = (u16)attr.lid;
+ resp.sm_lid = (u16)attr.sm_lid;
+ }
resp.lmc = attr.lmc;
resp.max_vl_num = attr.max_vl_num;
resp.sm_sl = attr.sm_sl;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 4eccf89..5f4f2d3 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -549,7 +549,7 @@ struct ib_port_attr {
u32 bad_pkey_cntr;
u32 qkey_viol_cntr;
u16 pkey_tbl_len;
- u16 sm_lid;
+ u32 sm_lid;
u32 lid;
u8 lmc;
u8 max_vl_num;
--
* [PATCH for-next 13/27] IB/core: Change wc.slid from 16 to 32 bits
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (10 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid " Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 14/27] IB/CM: Add OPA Path record support to CM Dennis Dalessandro
` (13 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
The slid field in struct ib_wc is increased to 32 bits.
This enables core components to use larger LIDs if needed.
The user ABI is unchanged and returns 16-bit values when queried.
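Since wc.slid is now 32 bits while the user ABI still carries 16, the
helpers added to ib_verbs.h below provide the 16-bit CPU and big-endian
views of it. A standalone model (the example SLID is made up and htons()
stands in for cpu_to_be16()):

#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Userspace model of the ib_slid_cpu16() / ib_slid_be16() helpers. */
static uint16_t ib_slid_cpu16(uint32_t slid)
{
	return (uint16_t)slid;
}

static uint16_t ib_slid_be16(uint32_t slid)
{
	return htons((uint16_t)slid);
}

int main(void)
{
	uint32_t slid = 0x1234;		/* made-up 32-bit wc.slid */

	printf("cpu16 = 0x%04x, be16 = 0x%04x\n",
	       (unsigned)ib_slid_cpu16(slid),
	       (unsigned)ib_slid_be16(slid));
	return 0;
}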
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/cm.c | 4 ++--
drivers/infiniband/core/user_mad.c | 2 +-
drivers/infiniband/core/uverbs_cmd.c | 10 +++++++---
drivers/infiniband/hw/hfi1/mad.c | 2 +-
drivers/infiniband/hw/mlx4/mad.c | 6 +++---
drivers/infiniband/hw/mlx5/mad.c | 2 +-
drivers/infiniband/hw/mthca/mthca_cmd.c | 4 ++--
drivers/infiniband/hw/mthca/mthca_mad.c | 2 +-
drivers/infiniband/sw/rdmavt/cq.c | 2 +-
include/rdma/ib_verbs.h | 14 +++++++++++++-
10 files changed, 32 insertions(+), 16 deletions(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 2b4d613..b39ee16 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1703,7 +1703,7 @@ static void cm_process_routed_req(struct cm_req_msg *req_msg, struct ib_wc *wc)
{
if (!cm_req_get_primary_subnet_local(req_msg)) {
if (req_msg->primary_local_lid == IB_LID_PERMISSIVE) {
- req_msg->primary_local_lid = cpu_to_be16(wc->slid);
+ req_msg->primary_local_lid = ib_slid_be16(wc->slid);
cm_req_set_primary_sl(req_msg, wc->sl);
}
@@ -1713,7 +1713,7 @@ static void cm_process_routed_req(struct cm_req_msg *req_msg, struct ib_wc *wc)
if (!cm_req_get_alt_subnet_local(req_msg)) {
if (req_msg->alt_local_lid == IB_LID_PERMISSIVE) {
- req_msg->alt_local_lid = cpu_to_be16(wc->slid);
+ req_msg->alt_local_lid = ib_slid_be16(wc->slid);
cm_req_set_alt_sl(req_msg, wc->sl);
}
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 36a6f5c..ff3c67a 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -229,7 +229,7 @@ static void recv_handler(struct ib_mad_agent *agent,
packet->mad.hdr.status = 0;
packet->mad.hdr.length = hdr_size(file) + mad_recv_wc->mad_len;
packet->mad.hdr.qpn = cpu_to_be32(mad_recv_wc->wc->src_qp);
- packet->mad.hdr.lid = cpu_to_be16(mad_recv_wc->wc->slid);
+ packet->mad.hdr.lid = ib_slid_be16(mad_recv_wc->wc->slid);
packet->mad.hdr.sl = mad_recv_wc->wc->sl;
packet->mad.hdr.path_bits = mad_recv_wc->wc->dlid_path_bits;
packet->mad.hdr.pkey_index = mad_recv_wc->wc->pkey_index;
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 38dce45..670176b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1190,7 +1190,8 @@ ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file,
return ret ? ret : in_len;
}
-static int copy_wc_to_user(void __user *dest, struct ib_wc *wc)
+static int copy_wc_to_user(struct ib_device *ib_dev, void __user *dest,
+ struct ib_wc *wc)
{
struct ib_uverbs_wc tmp;
@@ -1204,7 +1205,10 @@ static int copy_wc_to_user(void __user *dest, struct ib_wc *wc)
tmp.src_qp = wc->src_qp;
tmp.wc_flags = wc->wc_flags;
tmp.pkey_index = wc->pkey_index;
- tmp.slid = wc->slid;
+ if (rdma_cap_opa_ah(ib_dev, wc->port_num))
+ tmp.slid = OPA_TO_IB_UCAST_LID(wc->slid);
+ else
+ tmp.slid = ib_slid_cpu16(wc->slid);
tmp.sl = wc->sl;
tmp.dlid_path_bits = wc->dlid_path_bits;
tmp.port_num = wc->port_num;
@@ -1248,7 +1252,7 @@ ssize_t ib_uverbs_poll_cq(struct ib_uverbs_file *file,
if (!ret)
break;
- ret = copy_wc_to_user(data_ptr, &wc);
+ ret = copy_wc_to_user(ib_dev, data_ptr, &wc);
if (ret)
goto out_put;
diff --git a/drivers/infiniband/hw/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
index 885d9fd..1d54568 100644
--- a/drivers/infiniband/hw/hfi1/mad.c
+++ b/drivers/infiniband/hw/hfi1/mad.c
@@ -4216,7 +4216,7 @@ static int opa_local_smp_check(struct hfi1_ibport *ibp,
const struct ib_wc *in_wc)
{
struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
- u16 slid = in_wc->slid;
+ u16 slid = ib_slid_cpu16(in_wc->slid);
u16 pkey;
if (in_wc->pkey_index >= ARRAY_SIZE(ppd->pkeys))
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 00f0570..04fb44e 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -169,7 +169,7 @@ int mlx4_MAD_IFC(struct mlx4_ib_dev *dev, int mad_ifc_flags,
op_modifier |= 0x4;
- in_modifier |= in_wc->slid << 16;
+ in_modifier |= ib_slid_cpu16(in_wc->slid) << 16;
}
err = mlx4_cmd_box(dev->dev, inmailbox->dma, outmailbox->dma, in_modifier,
@@ -625,7 +625,7 @@ int mlx4_ib_send_to_slave(struct mlx4_ib_dev *dev, int slave, u8 port,
memcpy((char *)&tun_mad->hdr.slid_mac_47_32, &(wc->smac[4]), 2);
} else {
tun_mad->hdr.sl_vid = cpu_to_be16(((u16)(wc->sl)) << 12);
- tun_mad->hdr.slid_mac_47_32 = cpu_to_be16(wc->slid);
+ tun_mad->hdr.slid_mac_47_32 = ib_slid_be16(wc->slid);
}
ib_dma_sync_single_for_device(&dev->ib_dev,
@@ -826,7 +826,7 @@ static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
}
}
- slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+ slid = in_wc ? ib_slid_cpu16(in_wc->slid) : be16_to_cpu(IB_LID_PERMISSIVE);
if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP && slid == 0) {
forward_trap(to_mdev(ibdev), port_num, in_mad);
diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c
index 95db929..cd2264a 100644
--- a/drivers/infiniband/hw/mlx5/mad.c
+++ b/drivers/infiniband/hw/mlx5/mad.c
@@ -78,7 +78,7 @@ static int process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
u16 slid;
int err;
- slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+ slid = in_wc ? ib_slid_cpu16(in_wc->slid) : be16_to_cpu(IB_LID_PERMISSIVE);
if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP && slid == 0)
return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED;
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index 9d83a53..e19ae0b 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -1921,7 +1921,7 @@ int mthca_MAD_IFC(struct mthca_dev *dev, int ignore_mkey, int ignore_bkey,
(in_wc->wc_flags & IB_WC_GRH ? 0x80 : 0);
MTHCA_PUT(inbox, val, MAD_IFC_G_PATH_OFFSET);
- MTHCA_PUT(inbox, in_wc->slid, MAD_IFC_RLID_OFFSET);
+ MTHCA_PUT(inbox, ib_slid_cpu16(in_wc->slid), MAD_IFC_RLID_OFFSET);
MTHCA_PUT(inbox, in_wc->pkey_index, MAD_IFC_PKEY_OFFSET);
if (in_grh)
@@ -1929,7 +1929,7 @@ int mthca_MAD_IFC(struct mthca_dev *dev, int ignore_mkey, int ignore_bkey,
op_modifier |= 0x4;
- in_modifier |= in_wc->slid << 16;
+ in_modifier |= ib_slid_cpu16(in_wc->slid) << 16;
}
err = mthca_cmd_box(dev, inmailbox->dma, outmailbox->dma,
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 617531f..a9caada 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -205,7 +205,7 @@ int mthca_process_mad(struct ib_device *ibdev,
u16 *out_mad_pkey_index)
{
int err;
- u16 slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+ u16 slid = in_wc ? ib_slid_cpu16(in_wc->slid) : be16_to_cpu(IB_LID_PERMISSIVE);
u16 prev_lid = 0;
struct ib_port_attr pattr;
const struct ib_mad *in_mad = (const struct ib_mad *)in;
diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 0ae2ff8..0335a3d 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -107,7 +107,7 @@ void rvt_cq_enter(struct rvt_cq *cq, struct ib_wc *entry, bool solicited)
wc->uqueue[head].src_qp = entry->src_qp;
wc->uqueue[head].wc_flags = entry->wc_flags;
wc->uqueue[head].pkey_index = entry->pkey_index;
- wc->uqueue[head].slid = entry->slid;
+ wc->uqueue[head].slid = ib_slid_cpu16(entry->slid);
wc->uqueue[head].sl = entry->sl;
wc->uqueue[head].dlid_path_bits = entry->dlid_path_bits;
wc->uqueue[head].port_num = entry->port_num;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5f4f2d3..3cb31e4 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -951,7 +951,7 @@ struct ib_wc {
u32 src_qp;
int wc_flags;
u16 pkey_index;
- u16 slid;
+ u32 slid;
u8 sl;
u8 dlid_path_bits;
u8 port_num; /* valid only for DR SMPs on switches */
@@ -3717,4 +3717,16 @@ static inline enum rdma_ah_attr_type rdma_ah_find_type(struct ib_device *dev,
else
return RDMA_AH_ATTR_TYPE_IB;
}
+
+/* Return slid in 16bit CPU encoding */
+static inline u16 ib_slid_cpu16(u32 slid)
+{
+ return (u16)slid;
+}
+
+/* Return slid in 16bit BE encoding */
+static inline u16 ib_slid_be16(u32 slid)
+{
+ return cpu_to_be16((u16)slid);
+}
#endif /* IB_VERBS_H */
--
* [PATCH for-next 14/27] IB/CM: Add OPA Path record support to CM
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (11 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 13/27] IB/core: Change wc.slid " Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 15/27] IB/CM: Create appropriate path records when handling CM request Dennis Dalessandro
` (12 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Add OPA path record support to the Connection Manager.
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/cm.c | 50 +++++++++++++++++++++++++++++++++++-------
include/rdma/opa_addr.h | 18 +++++++++++++++
2 files changed, 60 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index b39ee16..885c429 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1175,6 +1175,11 @@ static void cm_format_req(struct cm_req_msg *req_msg,
{
struct sa_path_rec *pri_path = param->primary_path;
struct sa_path_rec *alt_path = param->alternate_path;
+ bool pri_ext = false;
+
+ if (pri_path->rec_type == SA_PATH_REC_TYPE_OPA)
+ pri_ext = opa_is_extended_lid(pri_path->opa.dlid,
+ pri_path->opa.slid);
cm_format_mad_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_REQ));
@@ -1202,18 +1207,24 @@ static void cm_format_req(struct cm_req_msg *req_msg,
cm_req_set_srq(req_msg, param->srq);
}
+ req_msg->primary_local_gid = pri_path->sgid;
+ req_msg->primary_remote_gid = pri_path->dgid;
+ if (pri_ext) {
+ req_msg->primary_local_gid.global.interface_id
+ = OPA_MAKE_ID(be32_to_cpu(pri_path->opa.slid));
+ req_msg->primary_remote_gid.global.interface_id
+ = OPA_MAKE_ID(be32_to_cpu(pri_path->opa.dlid));
+ }
if (pri_path->hop_limit <= 1) {
- req_msg->primary_local_lid =
+ req_msg->primary_local_lid = pri_ext ? 0 :
htons(ntohl(sa_path_get_slid(pri_path)));
- req_msg->primary_remote_lid =
+ req_msg->primary_remote_lid = pri_ext ? 0 :
htons(ntohl(sa_path_get_dlid(pri_path)));
} else {
/* Work-around until there's a way to obtain remote LID info */
req_msg->primary_local_lid = IB_LID_PERMISSIVE;
req_msg->primary_remote_lid = IB_LID_PERMISSIVE;
}
- req_msg->primary_local_gid = pri_path->sgid;
- req_msg->primary_remote_gid = pri_path->dgid;
cm_req_set_primary_flow_label(req_msg, pri_path->flow_label);
cm_req_set_primary_packet_rate(req_msg, pri_path->rate);
req_msg->primary_traffic_class = pri_path->traffic_class;
@@ -1225,17 +1236,29 @@ static void cm_format_req(struct cm_req_msg *req_msg,
pri_path->packet_life_time));
if (alt_path) {
+ bool alt_ext = false;
+
+ if (alt_path->rec_type == SA_PATH_REC_TYPE_OPA)
+ alt_ext = opa_is_extended_lid(alt_path->opa.dlid,
+ alt_path->opa.slid);
+
+ req_msg->alt_local_gid = alt_path->sgid;
+ req_msg->alt_remote_gid = alt_path->dgid;
+ if (alt_ext) {
+ req_msg->alt_local_gid.global.interface_id
+ = OPA_MAKE_ID(be32_to_cpu(alt_path->opa.slid));
+ req_msg->alt_remote_gid.global.interface_id
+ = OPA_MAKE_ID(be32_to_cpu(alt_path->opa.dlid));
+ }
if (alt_path->hop_limit <= 1) {
- req_msg->alt_local_lid =
+ req_msg->alt_local_lid = alt_ext ? 0 :
htons(ntohl(sa_path_get_slid(alt_path)));
- req_msg->alt_remote_lid =
+ req_msg->alt_remote_lid = alt_ext ? 0 :
htons(ntohl(sa_path_get_dlid(alt_path)));
} else {
req_msg->alt_local_lid = IB_LID_PERMISSIVE;
req_msg->alt_remote_lid = IB_LID_PERMISSIVE;
}
- req_msg->alt_local_gid = alt_path->sgid;
- req_msg->alt_remote_gid = alt_path->dgid;
cm_req_set_alt_flow_label(req_msg,
alt_path->flow_label);
cm_req_set_alt_packet_rate(req_msg, alt_path->rate);
@@ -2843,6 +2866,11 @@ static void cm_format_lap(struct cm_lap_msg *lap_msg,
const void *private_data,
u8 private_data_len)
{
+ bool alt_ext = false;
+
+ if (alternate_path->rec_type == SA_PATH_REC_TYPE_OPA)
+ alt_ext = opa_is_extended_lid(alternate_path->opa.dlid,
+ alternate_path->opa.slid);
cm_format_mad_hdr(&lap_msg->hdr, CM_LAP_ATTR_ID,
cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_LAP));
lap_msg->local_comm_id = cm_id_priv->id.local_id;
@@ -2856,6 +2884,12 @@ static void cm_format_lap(struct cm_lap_msg *lap_msg,
htons(ntohl(sa_path_get_dlid(alternate_path)));
lap_msg->alt_local_gid = alternate_path->sgid;
lap_msg->alt_remote_gid = alternate_path->dgid;
+ if (alt_ext) {
+ lap_msg->alt_local_gid.global.interface_id
+ = OPA_MAKE_ID(be32_to_cpu(alternate_path->opa.slid));
+ lap_msg->alt_remote_gid.global.interface_id
+ = OPA_MAKE_ID(be32_to_cpu(alternate_path->opa.dlid));
+ }
cm_lap_set_flow_label(lap_msg, alternate_path->flow_label);
cm_lap_set_traffic_class(lap_msg, alternate_path->traffic_class);
lap_msg->alt_hop_limit = alternate_path->hop_limit;
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index 46d0567..9b5e642 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -77,4 +77,22 @@ static inline u32 opa_get_lid_from_gid(union ib_gid *gid)
{
return be64_to_cpu(gid->global.interface_id) & 0xFFFFFFFF;
}
+
+/**
+ * opa_is_extended_lid: Returns true if dlid or slid are
+ * extended.
+ *
+ * @dlid: The DLID
+ * @slid: The SLID
+ */
+static inline bool opa_is_extended_lid(u32 dlid, u32 slid)
+{
+ if ((be32_to_cpu(dlid) >=
+ be16_to_cpu(IB_MULTICAST_LID_BASE)) ||
+ (be32_to_cpu(slid) >=
+ be16_to_cpu(IB_MULTICAST_LID_BASE)))
+ return true;
+ else
+ return false;
+}
#endif /* OPA_ADDR_H */
--
* [PATCH for-next 15/27] IB/CM: Create appropriate path records when handling CM request
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (12 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 14/27] IB/CM: Add OPA Path record support to CM Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 16/27] IB/CM: Set appropriate slid and dlid " Dennis Dalessandro
` (11 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
When handling an incoming connection request, ib_cm creates
either an IB or an OPA path record based on the GID field
in the request.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/cm.c | 38 +++++++++++++++++++++++++++++++-------
1 files changed, 31 insertions(+), 7 deletions(-)
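For illustration, a standalone sketch of the decision made by the new cm_path_set_rec_type() helper; gid_is_opa() and the cap_opa_ah flag are simplified stand-ins for ib_is_opa_gid() and rdma_cap_opa_ah():
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum sa_path_rec_type { SA_PATH_REC_TYPE_IB, SA_PATH_REC_TYPE_OPA };

/* Stand-in for ib_is_opa_gid(): true when the interface_id begins with the
 * OPA_SPECIAL_OUI (00:06:6A) prefix, i.e. the GID encodes an extended LID. */
static bool gid_is_opa(const uint8_t iid[8])
{
	return iid[0] == 0x00 && iid[1] == 0x06 && iid[2] == 0x6A;
}

/* An OPA path record is created only when the GID carries an extended LID
 * and the port advertises OPA address handle support. */
static enum sa_path_rec_type pick_rec_type(const uint8_t iid[8], bool cap_opa_ah)
{
	if (gid_is_opa(iid) && cap_opa_ah)
		return SA_PATH_REC_TYPE_OPA;
	return SA_PATH_REC_TYPE_IB;
}

int main(void)
{
	const uint8_t iid[8] = { 0x00, 0x06, 0x6A, 0x00, 0x00, 0x01, 0x23, 0x45 };

	printf("%s\n", pick_rec_type(iid, true) == SA_PATH_REC_TYPE_OPA ?
	       "OPA path record" : "IB path record");
	return 0;
}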
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 885c429..4d870a0 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1428,6 +1428,21 @@ static inline int cm_is_active_peer(__be64 local_ca_guid, __be64 remote_ca_guid,
(be32_to_cpu(local_qpn) > be32_to_cpu(remote_qpn))));
}
+static bool cm_req_has_alt_path(struct cm_req_msg *req_msg)
+{
+ return ((req_msg->alt_local_lid) ||
+ (ib_is_opa_gid(&req_msg->alt_local_gid)));
+}
+
+static void cm_path_set_rec_type(struct ib_device *ib_device, u8 port_num,
+ struct sa_path_rec *path, union ib_gid *gid)
+{
+ if (ib_is_opa_gid(gid) && rdma_cap_opa_ah(ib_device, port_num))
+ path->rec_type = SA_PATH_REC_TYPE_OPA;
+ else
+ path->rec_type = SA_PATH_REC_TYPE_IB;
+}
+
static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
struct sa_path_rec *primary_path,
struct sa_path_rec *alt_path)
@@ -1807,9 +1822,12 @@ static int cm_req_handler(struct cm_work *work)
dev_net(gid_attr.ndev));
dev_put(gid_attr.ndev);
} else {
- work->path[0].rec_type = SA_PATH_REC_TYPE_IB;
+ cm_path_set_rec_type(work->port->cm_dev->ib_device,
+ work->port->port_num,
+ &work->path[0],
+ &req_msg->primary_local_gid);
}
- if (req_msg->alt_local_lid)
+ if (cm_req_has_alt_path(req_msg))
work->path[1].rec_type = work->path[0].rec_type;
cm_format_paths_from_req(req_msg, &work->path[0],
&work->path[1]);
@@ -1834,16 +1852,19 @@ static int cm_req_handler(struct cm_work *work)
dev_net(gid_attr.ndev));
dev_put(gid_attr.ndev);
} else {
- work->path[0].rec_type = SA_PATH_REC_TYPE_IB;
+ cm_path_set_rec_type(work->port->cm_dev->ib_device,
+ work->port->port_num,
+ &work->path[0],
+ &req_msg->primary_local_gid);
}
- if (req_msg->alt_local_lid)
+ if (cm_req_has_alt_path(req_msg))
work->path[1].rec_type = work->path[0].rec_type;
ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID,
&work->path[0].sgid, sizeof work->path[0].sgid,
NULL, 0);
goto rejected;
}
- if (req_msg->alt_local_lid) {
+ if (cm_req_has_alt_path(req_msg)) {
ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av,
cm_id_priv);
if (ret) {
@@ -2962,8 +2983,6 @@ static void cm_format_path_from_lap(struct cm_id_private *cm_id_priv,
struct sa_path_rec *path,
struct cm_lap_msg *lap_msg)
{
- memset(path, 0, sizeof *path);
- path->rec_type = SA_PATH_REC_TYPE_IB;
path->dgid = lap_msg->alt_local_gid;
path->sgid = lap_msg->alt_remote_gid;
sa_path_set_dlid(path, htonl(ntohs(lap_msg->alt_local_lid)));
@@ -2999,6 +3018,11 @@ static int cm_lap_handler(struct cm_work *work)
return -EINVAL;
param = &work->cm_event.param.lap_rcvd;
+ memset(&work->path[0], 0, sizeof(work->path[1]));
+ cm_path_set_rec_type(work->port->cm_dev->ib_device,
+ work->port->port_num,
+ &work->path[0],
+ &lap_msg->alt_local_gid);
param->alternate_path = &work->path[0];
cm_format_path_from_lap(cm_id_priv, param->alternate_path, lap_msg);
work->cm_event.private_data = &lap_msg->private_data;
--
* [PATCH for-next 16/27] IB/CM: Set appropriate slid and dlid when handling CM request
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (13 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 15/27] IB/CM: Create appropriate path records when handling CM request Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 17/27] IB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs Dennis Dalessandro
` (10 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
If extended LIDs are in use, a connection request carries
OPA GIDs. Extract the LIDs from those OPA GIDs and populate the
slid/dlid fields in the path records that are created when handling
a connection request.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/cm.c | 67 +++++++++++++++++++++++++++++++++++-------
1 files changed, 56 insertions(+), 11 deletions(-)
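A small user-space sketch (illustrative names, not kernel API) of how the extended LID is recovered from an OPA GID, mirroring the opa_get_lid_from_gid() calls added below:
#include <stdint.h>
#include <stdio.h>
#include <endian.h>

/* Stand-in for opa_get_lid_from_gid(): the low 32 bits of the big-endian
 * interface_id hold the extended LID that cm_format_path_lid_from_req()
 * writes into the path record's slid/dlid. */
static uint32_t lid_from_opa_iid(uint64_t iid_be)
{
	return (uint32_t)(be64toh(iid_be) & 0xFFFFFFFF);
}

int main(void)
{
	/* interface_id built as OPA_SPECIAL_OUI << 40 | lid (see patch 14/27) */
	uint64_t iid_be = htobe64(0x00066AULL << 40 | 0x12345);

	printf("extended LID = 0x%x\n", lid_from_opa_iid(iid_be));
	return 0;
}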
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 4d870a0..d5ca101 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1443,16 +1443,48 @@ static void cm_path_set_rec_type(struct ib_device *ib_device, u8 port_num,
path->rec_type = SA_PATH_REC_TYPE_IB;
}
+static void cm_format_path_lid_from_req(struct cm_req_msg *req_msg,
+ struct sa_path_rec *primary_path,
+ struct sa_path_rec *alt_path)
+{
+ u32 lid;
+
+ if (primary_path->rec_type != SA_PATH_REC_TYPE_OPA) {
+ sa_path_set_dlid(primary_path,
+ htonl(ntohs(req_msg->primary_local_lid)));
+ sa_path_set_slid(primary_path,
+ htonl(ntohs(req_msg->primary_remote_lid)));
+ } else {
+ lid = opa_get_lid_from_gid(&req_msg->primary_local_gid);
+ sa_path_set_dlid(primary_path, cpu_to_be32(lid));
+
+ lid = opa_get_lid_from_gid(&req_msg->primary_remote_gid);
+ sa_path_set_slid(primary_path, cpu_to_be32(lid));
+ }
+
+ if (!cm_req_has_alt_path(req_msg))
+ return;
+
+ if (alt_path->rec_type != SA_PATH_REC_TYPE_OPA) {
+ sa_path_set_dlid(alt_path,
+ htonl(ntohs(req_msg->alt_local_lid)));
+ sa_path_set_slid(alt_path,
+ htonl(ntohs(req_msg->alt_remote_lid)));
+ } else {
+ lid = opa_get_lid_from_gid(&req_msg->alt_local_gid);
+ sa_path_set_dlid(alt_path, cpu_to_be32(lid));
+
+ lid = opa_get_lid_from_gid(&req_msg->alt_remote_gid);
+ sa_path_set_slid(alt_path, cpu_to_be32(lid));
+ }
+}
+
static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
struct sa_path_rec *primary_path,
struct sa_path_rec *alt_path)
{
primary_path->dgid = req_msg->primary_local_gid;
primary_path->sgid = req_msg->primary_remote_gid;
- sa_path_set_dlid(primary_path,
- htonl(ntohs(req_msg->primary_local_lid)));
- sa_path_set_slid(primary_path,
- htonl(ntohs(req_msg->primary_remote_lid)));
primary_path->flow_label = cm_req_get_primary_flow_label(req_msg);
primary_path->hop_limit = req_msg->primary_hop_limit;
primary_path->traffic_class = req_msg->primary_traffic_class;
@@ -1469,13 +1501,9 @@ static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
primary_path->packet_life_time -= (primary_path->packet_life_time > 0);
primary_path->service_id = req_msg->service_id;
- if (req_msg->alt_local_lid) {
+ if (cm_req_has_alt_path(req_msg)) {
alt_path->dgid = req_msg->alt_local_gid;
alt_path->sgid = req_msg->alt_remote_gid;
- sa_path_set_dlid(alt_path,
- htonl(ntohs(req_msg->alt_local_lid)));
- sa_path_set_slid(alt_path,
- htonl(ntohs(req_msg->alt_remote_lid)));
alt_path->flow_label = cm_req_get_alt_flow_label(req_msg);
alt_path->hop_limit = req_msg->alt_hop_limit;
alt_path->traffic_class = req_msg->alt_traffic_class;
@@ -1492,6 +1520,7 @@ static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
alt_path->packet_life_time -= (alt_path->packet_life_time > 0);
alt_path->service_id = req_msg->service_id;
}
+ cm_format_path_lid_from_req(req_msg, primary_path, alt_path);
}
static u16 cm_get_bth_pkey(struct cm_work *work)
@@ -2979,14 +3008,29 @@ int ib_send_cm_lap(struct ib_cm_id *cm_id,
}
EXPORT_SYMBOL(ib_send_cm_lap);
+static void cm_format_path_lid_from_lap(struct cm_lap_msg *lap_msg,
+ struct sa_path_rec *path)
+{
+ u32 lid;
+
+ if (path->rec_type != SA_PATH_REC_TYPE_OPA) {
+ sa_path_set_dlid(path, htonl(ntohs(lap_msg->alt_local_lid)));
+ sa_path_set_slid(path, htonl(ntohs(lap_msg->alt_remote_lid)));
+ } else {
+ lid = opa_get_lid_from_gid(&lap_msg->alt_local_gid);
+ sa_path_set_dlid(path, cpu_to_be32(lid));
+
+ lid = opa_get_lid_from_gid(&lap_msg->alt_remote_gid);
+ sa_path_set_slid(path, cpu_to_be32(lid));
+ }
+}
+
static void cm_format_path_from_lap(struct cm_id_private *cm_id_priv,
struct sa_path_rec *path,
struct cm_lap_msg *lap_msg)
{
path->dgid = lap_msg->alt_local_gid;
path->sgid = lap_msg->alt_remote_gid;
- sa_path_set_dlid(path, htonl(ntohs(lap_msg->alt_local_lid)));
- sa_path_set_slid(path, htonl(ntohs(lap_msg->alt_remote_lid)));
path->flow_label = cm_lap_get_flow_label(lap_msg);
path->hop_limit = lap_msg->alt_hop_limit;
path->traffic_class = cm_lap_get_traffic_class(lap_msg);
@@ -3000,6 +3044,7 @@ static void cm_format_path_from_lap(struct cm_id_private *cm_id_priv,
path->packet_life_time_selector = IB_SA_EQ;
path->packet_life_time = cm_lap_get_local_ack_timeout(lap_msg);
path->packet_life_time -= (path->packet_life_time > 0);
+ cm_format_path_lid_from_lap(lap_msg, path);
}
static int cm_lap_handler(struct cm_work *work)
--
* [PATCH for-next 17/27] IB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (14 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 16/27] IB/CM: Set appropriate slid and dlid " Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 18/27] IB/hfi1: Add support to receive 16B bypass packets Dennis Dalessandro
` (9 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
rvt_check_ah() delegates LID verification to the underlying
driver. The underlying driver uses different conditions to
check the dlid depending on whether the device supports
extended LIDs.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/common.h | 9 ---------
drivers/infiniband/hw/hfi1/mad.c | 5 +++--
drivers/infiniband/hw/hfi1/verbs.c | 5 +++++
drivers/infiniband/hw/qib/qib_verbs.c | 9 +++++++++
drivers/infiniband/sw/rdmavt/ah.c | 10 ----------
drivers/infiniband/sw/rdmavt/qp.c | 29 +++++++++++++++++++++++------
include/rdma/opa_addr.h | 18 ++++++++++++++++++
7 files changed, 58 insertions(+), 27 deletions(-)
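To make the new multicast-base check concrete, a user-space sketch; opa_mcast_base() is an illustrative stand-in for the opa_get_mcast_base() helper added to opa_addr.h below:
#include <stdint.h>
#include <stdio.h>

#define OPA_MCAST_NR 0x4	/* number of top LID bits reserved for multicast */

/* With 4 top bits set aside, every LID at or above 0xF0000000 falls in the
 * multicast/collective space. */
static uint32_t opa_mcast_base(uint32_t nr_top_bits)
{
	return 0xFFFFFFFFu << (32 - nr_top_bits);
}

int main(void)
{
	uint32_t dlid = 0x12345;

	printf("mcast base = 0x%08x\n", opa_mcast_base(OPA_MCAST_NR));
	/* On an OPA-AH capable port, rvt_modify_qp() now rejects AVs whose
	 * dlid is at or above this base instead of IB_MULTICAST_LID_BASE. */
	printf("dlid 0x%x is %s\n", dlid,
	       dlid >= opa_mcast_base(OPA_MCAST_NR) ? "multicast" : "unicast");
	return 0;
}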
diff --git a/drivers/infiniband/hw/hfi1/common.h b/drivers/infiniband/hw/hfi1/common.h
index ba9ab97..aa416ef 100644
--- a/drivers/infiniband/hw/hfi1/common.h
+++ b/drivers/infiniband/hw/hfi1/common.h
@@ -333,15 +333,6 @@ struct diag_pkt {
#define DEFAULT_P_KEY LIM_MGMT_P_KEY
-/**
- * 0xF8 - 4 bits of multicast range and 1 bit for collective range
- * Example: For 24 bit LID space,
- * Multicast range: 0xF00000 to 0xF7FFFF
- * Collective range: 0xF80000 to 0xFFFFFE
- */
-#define HFI1_MCAST_NR 0x4 /* Number of top bits set */
-#define HFI1_COLLECTIVE_NR 0x1 /* Number of bits after MCAST_NR */
-
#define HFI1_PSM_IOC_BASE_SEQ 0x0
static inline __u64 rhf_to_cpu(const __le32 *rbuf)
diff --git a/drivers/infiniband/hw/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
index 1d54568..dea3fa0 100644
--- a/drivers/infiniband/hw/hfi1/mad.c
+++ b/drivers/infiniband/hw/hfi1/mad.c
@@ -46,6 +46,7 @@
*/
#include <linux/net.h>
+#include <rdma/opa_addr.h>
#define OPA_NUM_PKEY_BLOCKS_PER_SMP (OPA_SMP_DR_DATA_SIZE \
/ (OPA_PARTITION_TABLE_BLK_SIZE * sizeof(u16)))
@@ -905,8 +906,8 @@ static int __subn_get_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
pi->buffer_units = cpu_to_be32(buffer_units);
pi->opa_cap_mask = cpu_to_be16(ibp->rvp.port_cap3_flags);
- pi->collectivemask_multicastmask = ((HFI1_COLLECTIVE_NR & 0x7)
- << 3 | (HFI1_MCAST_NR & 0x7));
+ pi->collectivemask_multicastmask = ((OPA_COLLECTIVE_NR & 0x7)
+ << 3 | (OPA_MCAST_NR & 0x7));
/* HFI supports a replay buffer 128 LTPs in size */
pi->replay_depth.buffer = 0x80;
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index dc51bf2..b5feaa8 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -53,6 +53,7 @@
#include <linux/rculist.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>
+#include <rdma/opa_addr.h>
#include "hfi.h"
#include "common.h"
@@ -1461,6 +1462,10 @@ static int hfi1_check_ah(struct ib_device *ibdev, struct rdma_ah_attr *ah_attr)
struct hfi1_devdata *dd;
u8 sc5;
+ if (hfi1_check_mcast(rdma_ah_get_dlid(ah_attr)) &&
+ !(rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH))
+ return -EINVAL;
+
/* test the mapping for validity */
ibp = to_iport(ibdev, rdma_ah_get_port_num(ah_attr));
ppd = ppd_from_ibp(ibp);
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index ac42dce..9d92aeb 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1341,6 +1341,15 @@ int qib_check_ah(struct ib_device *ibdev, struct rdma_ah_attr *ah_attr)
if (rdma_ah_get_sl(ah_attr) > 15)
return -EINVAL;
+ if (rdma_ah_get_dlid(ah_attr) == 0)
+ return -EINVAL;
+ if (rdma_ah_get_dlid(ah_attr) >=
+ be16_to_cpu(IB_MULTICAST_LID_BASE) &&
+ rdma_ah_get_dlid(ah_attr) !=
+ be16_to_cpu(IB_LID_PERMISSIVE) &&
+ !(rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH))
+ return -EINVAL;
+
return 0;
}
diff --git a/drivers/infiniband/sw/rdmavt/ah.c b/drivers/infiniband/sw/rdmavt/ah.c
index a96d4aa..ba3639a 100644
--- a/drivers/infiniband/sw/rdmavt/ah.c
+++ b/drivers/infiniband/sw/rdmavt/ah.c
@@ -66,8 +66,6 @@ int rvt_check_ah(struct ib_device *ibdev,
int port_num = rdma_ah_get_port_num(ah_attr);
struct ib_port_attr port_attr;
struct rvt_dev_info *rdi = ib_to_rvt(ibdev);
- enum rdma_link_layer link = rdma_port_get_link_layer(ibdev, port_num);
- u32 dlid = rdma_ah_get_dlid(ah_attr);
u8 ah_flags = rdma_ah_get_ah_flags(ah_attr);
u8 static_rate = rdma_ah_get_static_rate(ah_attr);
@@ -83,14 +81,6 @@ int rvt_check_ah(struct ib_device *ibdev,
if ((ah_flags & IB_AH_GRH) &&
rdma_ah_read_grh(ah_attr)->sgid_index >= port_attr.gid_tbl_len)
return -EINVAL;
- if (link != IB_LINK_LAYER_ETHERNET) {
- if (dlid == 0)
- return -EINVAL;
- if (dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE) &&
- dlid != be16_to_cpu(IB_LID_PERMISSIVE) &&
- !(ah_flags & IB_AH_GRH))
- return -EINVAL;
- }
if (rdi->driver_f.check_ah)
return rdi->driver_f.check_ah(ibdev, ah_attr);
return 0;
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index eb0c3d6..6f6525d 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -52,6 +52,7 @@
#include <linux/slab.h>
#include <rdma/ib_verbs.h>
#include <rdma/ib_hdrs.h>
+#include <rdma/opa_addr.h>
#include "qp.h"
#include "vt.h"
#include "trace.h"
@@ -1066,6 +1067,7 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
int mig = 0;
int pmtu = 0; /* for gcc warning only */
enum rdma_link_layer link;
+ int opa_ah;
link = rdma_port_get_link_layer(ibqp->device, qp->port_num);
@@ -1076,6 +1078,7 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
cur_state = attr_mask & IB_QP_CUR_STATE ?
attr->cur_qp_state : qp->state;
new_state = attr_mask & IB_QP_STATE ? attr->qp_state : cur_state;
+ opa_ah = rdma_cap_opa_ah(ibqp->device, qp->port_num);
if (!ib_modify_qp_is_ok(cur_state, new_state, ibqp->qp_type,
attr_mask, link))
@@ -1086,17 +1089,31 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
goto inval;
if (attr_mask & IB_QP_AV) {
- if (rdma_ah_get_dlid(&attr->ah_attr) >=
- be16_to_cpu(IB_MULTICAST_LID_BASE))
- goto inval;
+ if (opa_ah) {
+ if (rdma_ah_get_dlid(&attr->ah_attr) >=
+ opa_get_mcast_base(OPA_MCAST_NR))
+ goto inval;
+ } else {
+ if (rdma_ah_get_dlid(&attr->ah_attr) >=
+ be16_to_cpu(IB_MULTICAST_LID_BASE))
+ goto inval;
+ }
+
if (rvt_check_ah(qp->ibqp.device, &attr->ah_attr))
goto inval;
}
if (attr_mask & IB_QP_ALT_PATH) {
- if (rdma_ah_get_dlid(&attr->alt_ah_attr) >=
- be16_to_cpu(IB_MULTICAST_LID_BASE))
- goto inval;
+ if (opa_ah) {
+ if (rdma_ah_get_dlid(&attr->alt_ah_attr) >=
+ opa_get_mcast_base(OPA_MCAST_NR))
+ goto inval;
+ } else {
+ if (rdma_ah_get_dlid(&attr->alt_ah_attr) >=
+ be16_to_cpu(IB_MULTICAST_LID_BASE))
+ goto inval;
+ }
+
if (rvt_check_ah(qp->ibqp.device, &attr->alt_ah_attr))
goto inval;
if (attr->alt_pkey_index >= rvt_get_npkeys(rdi))
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index 9b5e642..8d3ad4e 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -48,11 +48,22 @@
#ifndef OPA_ADDR_H
#define OPA_ADDR_H
+#include <rdma/opa_smi.h>
+
#define OPA_SPECIAL_OUI (0x00066AULL)
#define OPA_MAKE_ID(x) (cpu_to_be64(OPA_SPECIAL_OUI << 40 | (x)))
#define OPA_TO_IB_UCAST_LID(x) (((x) >= be16_to_cpu(IB_MULTICAST_LID_BASE)) \
? 0 : x)
/**
+ * 0xF8 - 4 bits of multicast range and 1 bit for collective range
+ * Example: For 24 bit LID space,
+ * Multicast range: 0xF00000 to 0xF7FFFF
+ * Collective range: 0xF80000 to 0xFFFFFE
+ */
+#define OPA_MCAST_NR 0x4 /* Number of top bits set */
+#define OPA_COLLECTIVE_NR 0x1 /* Number of bits after MCAST_NR */
+
+/**
* ib_is_opa_gid: Returns true if the top 24 bits of the gid
* contains the OPA_STL_OUI identifier. This identifies that
* the provided gid is a special purpose GID meant to carry
@@ -95,4 +106,11 @@ static inline bool opa_is_extended_lid(u32 dlid, u32 slid)
else
return false;
}
+
+/* Get multicast lid base */
+static inline u32 opa_get_mcast_base(u32 nr_top_bits)
+{
+ return (be32_to_cpu(OPA_LID_PERMISSIVE) << (32 - nr_top_bits));
+}
+
#endif /* OPA_ADDR_H */
--
* [PATCH for-next 18/27] IB/hfi1: Add support to receive 16B bypass packets
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (15 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 17/27] IB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs Dennis Dalessandro
@ 2017-08-04 20:53 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 19/27] IB/hfi1: Add support to send " Dennis Dalessandro
` (8 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
We introduce a struct hfi1_16b_header to support 16B headers.
16B bypass packets are received by the driver and processed
similarly to 9B packets. Add basic support to handle 16B packets.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/chip.c | 6 +
drivers/infiniband/hw/hfi1/common.h | 1
drivers/infiniband/hw/hfi1/driver.c | 127 +++++++++++++++++++++++++++----
drivers/infiniband/hw/hfi1/hfi.h | 131 +++++++++++++++++++++++++++++++-
drivers/infiniband/hw/hfi1/rc.c | 2
drivers/infiniband/hw/hfi1/uc.c | 2
drivers/infiniband/hw/hfi1/ud.c | 4 -
drivers/infiniband/hw/hfi1/verbs.c | 17 ++--
drivers/infiniband/hw/hfi1/verbs.h | 13 +++
drivers/infiniband/hw/hfi1/vnic.h | 15 ----
drivers/infiniband/hw/hfi1/vnic_main.c | 4 -
include/rdma/opa_vnic.h | 3 -
12 files changed, 274 insertions(+), 51 deletions(-)
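A compact user-space sketch of how the 24-bit DLID is reassembled from the 16B LRH words; get_16b_dlid() is an illustrative stand-in for the hfi1_16B_get_dlid() helper added below:
#include <stdint.h>
#include <stdio.h>

/* 16B LRH field masks/shifts, as defined in hfi.h by this patch. */
#define OPA_16B_LID_MASK	0xFFFFFull
#define OPA_16B_DLID_MASK	0xF000ull
#define OPA_16B_DLID_SHIFT	20
#define OPA_16B_DLID_HIGH_SHIFT	12

/* The low 20 DLID bits live in lrh[1]; the high 4 bits live in lrh[2] and
 * are shifted up into bits 23:20 of the result. */
static uint32_t get_16b_dlid(const uint32_t lrh[4])
{
	return (uint32_t)((lrh[1] & OPA_16B_LID_MASK) |
			  (((lrh[2] & OPA_16B_DLID_MASK) >>
			    OPA_16B_DLID_HIGH_SHIFT) << OPA_16B_DLID_SHIFT));
}

int main(void)
{
	/* Example: DLID 0xABCDEF split across lrh[1] (low 20) and lrh[2]. */
	uint32_t lrh[4] = { 0, 0xBCDEF, 0xA000, 0 };

	printf("dlid = 0x%06x\n", get_16b_dlid(lrh));
	return 0;
}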
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 305c568..1023701 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -14468,6 +14468,7 @@ void hfi1_deinit_vnic_rsm(struct hfi1_devdata *dd)
static void init_rxe(struct hfi1_devdata *dd)
{
struct rsm_map_table *rmt;
+ u64 val;
/* enable all receive errors */
write_csr(dd, RCV_ERR_MASK, ~0ull);
@@ -14492,6 +14493,11 @@ static void init_rxe(struct hfi1_devdata *dd)
* (64 bytes). Max_Payload_Size is possibly modified upward in
* tune_pcie_caps() which is called after this routine.
*/
+
+ /* Have 16 bytes (4DW) of bypass header available in header queue */
+ val = read_csr(dd, RCV_BYPASS);
+ val |= (4ull << 16);
+ write_csr(dd, RCV_BYPASS, val);
}
static void init_other(struct hfi1_devdata *dd)
diff --git a/drivers/infiniband/hw/hfi1/common.h b/drivers/infiniband/hw/hfi1/common.h
index aa416ef..3e27794 100644
--- a/drivers/infiniband/hw/hfi1/common.h
+++ b/drivers/infiniband/hw/hfi1/common.h
@@ -327,6 +327,7 @@ struct diag_pkt {
/* misc. */
#define SC15_PACKET 0xF
#define SIZE_OF_CRC 1
+#define SIZE_OF_LT 1
#define LIM_MGMT_P_KEY 0x7FFF
#define FULL_MGMT_P_KEY 0xFFFF
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 14f2a00..5280d82 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -237,6 +237,13 @@ int hfi1_count_active_units(void)
return (struct ib_header *)hfi1_get_header(dd, rhf_addr);
}
+static inline struct hfi1_16b_header
+ *hfi1_get_16B_header(struct hfi1_devdata *dd,
+ __le32 *rhf_addr)
+{
+ return (struct hfi1_16b_header *)hfi1_get_header(dd, rhf_addr);
+}
+
/*
* Validate and encode the a given RcvArray Buffer size.
* The function will check whether the given size falls within
@@ -925,6 +932,11 @@ static inline int set_armed_to_active(struct hfi1_ctxtdata *rcd,
struct ib_header *hdr = hfi1_get_msgheader(packet->rcd->dd,
packet->rhf_addr);
sc = hfi1_9B_get_sc5(hdr, packet->rhf);
+ } else if (etype == RHF_RCV_TYPE_BYPASS) {
+ struct hfi1_16b_header *hdr = hfi1_get_16B_header(
+ packet->rcd->dd,
+ packet->rhf_addr);
+ sc = hfi1_16B_get_sc(hdr);
}
if (sc != SC15_PACKET) {
int hwstate = driver_lstate(rcd->ppd);
@@ -1386,9 +1398,14 @@ static int hfi1_setup_9B_packet(struct hfi1_packet *packet)
}
/* Query commonly used fields from packet header */
+ packet->payload = packet->ebuf;
packet->opcode = ib_bth_get_opcode(packet->ohdr);
packet->slid = ib_get_slid(hdr);
packet->dlid = ib_get_dlid(hdr);
+ if (unlikely((packet->dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
+ (packet->dlid != be16_to_cpu(IB_LID_PERMISSIVE))))
+ packet->dlid += opa_get_mcast_base(OPA_MCAST_NR) -
+ be16_to_cpu(IB_MULTICAST_LID_BASE);
packet->sl = ib_get_sl(hdr);
packet->sc = hfi1_9B_get_sc5(hdr, packet->rhf);
packet->pad = ib_bth_get_pad(packet->ohdr);
@@ -1402,6 +1419,73 @@ static int hfi1_setup_9B_packet(struct hfi1_packet *packet)
return -EINVAL;
}
+static int hfi1_setup_bypass_packet(struct hfi1_packet *packet)
+{
+ /*
+ * Bypass packets have a different header/payload split
+ * compared to an IB packet.
+ * Current split is set such that 16 bytes of the actual
+ * header is in the header buffer and the remaining is in
+ * the eager buffer. We chose 16 since hfi1 driver only
+ * supports 16B bypass packets and we will be able to
+ * receive the entire LRH with such a split.
+ */
+
+ struct hfi1_ctxtdata *rcd = packet->rcd;
+ struct hfi1_pportdata *ppd = rcd->ppd;
+ struct hfi1_ibport *ibp = &ppd->ibport_data;
+ u8 l4;
+ u8 grh_len;
+
+ packet->hdr = (struct hfi1_16b_header *)
+ hfi1_get_16B_header(packet->rcd->dd,
+ packet->rhf_addr);
+ packet->hlen = (u8 *)packet->rhf_addr - (u8 *)packet->hdr;
+
+ l4 = hfi1_16B_get_l4(packet->hdr);
+ if (l4 == OPA_16B_L4_IB_LOCAL) {
+ grh_len = 0;
+ packet->ohdr = packet->ebuf;
+ packet->grh = NULL;
+ } else if (l4 == OPA_16B_L4_IB_GLOBAL) {
+ u32 vtf;
+
+ grh_len = sizeof(struct ib_grh);
+ packet->ohdr = packet->ebuf + grh_len;
+ packet->grh = packet->ebuf;
+ if (packet->grh->next_hdr != IB_GRH_NEXT_HDR)
+ goto drop;
+ vtf = be32_to_cpu(packet->grh->version_tclass_flow);
+ if ((vtf >> IB_GRH_VERSION_SHIFT) != IB_GRH_VERSION)
+ goto drop;
+ } else {
+ goto drop;
+ }
+
+ /* Query commonly used fields from packet header */
+ packet->opcode = ib_bth_get_opcode(packet->ohdr);
+ packet->hlen = hdr_len_by_opcode[packet->opcode] + 8 + grh_len;
+ packet->payload = packet->ebuf + packet->hlen - (4 * sizeof(u32));
+ packet->slid = hfi1_16B_get_slid(packet->hdr);
+ packet->dlid = hfi1_16B_get_dlid(packet->hdr);
+ if (unlikely(hfi1_is_16B_mcast(packet->dlid)))
+ packet->dlid += opa_get_mcast_base(OPA_MCAST_NR) -
+ opa_get_lid(opa_get_mcast_base(OPA_MCAST_NR),
+ 16B);
+ packet->sc = hfi1_16B_get_sc(packet->hdr);
+ packet->sl = ibp->sc_to_sl[packet->sc];
+ packet->pad = hfi1_16B_bth_get_pad(packet->ohdr);
+ packet->extra_byte = SIZE_OF_LT;
+ packet->fecn = hfi1_16B_get_fecn(packet->hdr);
+ packet->becn = hfi1_16B_get_becn(packet->hdr);
+
+ return 0;
+drop:
+ hfi1_cdbg(PKT, "%s: packet dropped\n", __func__);
+ ibp->rvp.n_pkt_drops++;
+ return -EINVAL;
+}
+
void handle_eflags(struct hfi1_packet *packet)
{
struct hfi1_ctxtdata *rcd = packet->rcd;
@@ -1464,8 +1548,8 @@ static inline bool hfi1_is_vnic_packet(struct hfi1_packet *packet)
if (packet->rcd->is_vnic)
return true;
- if ((HFI1_GET_L2_TYPE(packet->ebuf) == OPA_VNIC_L2_TYPE) &&
- (HFI1_GET_L4_TYPE(packet->ebuf) == OPA_VNIC_L4_ETHR))
+ if ((hfi1_16B_get_l2(packet->ebuf) == OPA_16B_L2_TYPE) &&
+ (hfi1_16B_get_l4(packet->ebuf) == OPA_16B_L4_ETHR))
return true;
return false;
@@ -1475,25 +1559,38 @@ int process_receive_bypass(struct hfi1_packet *packet)
{
struct hfi1_devdata *dd = packet->rcd->dd;
- if (unlikely(rhf_err_flags(packet->rhf))) {
- handle_eflags(packet);
- } else if (hfi1_is_vnic_packet(packet)) {
+ if (hfi1_is_vnic_packet(packet)) {
hfi1_vnic_bypass_rcv(packet);
return RHF_RCV_CONTINUE;
}
- dd_dev_err(dd, "Unsupported bypass packet. Dropping\n");
- incr_cntr64(&dd->sw_rcv_bypass_packet_errors);
- if (!(dd->err_info_rcvport.status_and_code & OPA_EI_STATUS_SMASK)) {
- u64 *flits = packet->ebuf;
+ if (hfi1_setup_bypass_packet(packet))
+ return RHF_RCV_CONTINUE;
+
+ if (unlikely(rhf_err_flags(packet->rhf))) {
+ handle_eflags(packet);
+ return RHF_RCV_CONTINUE;
+ }
- if (flits && !(packet->rhf & RHF_LEN_ERR)) {
- dd->err_info_rcvport.packet_flit1 = flits[0];
- dd->err_info_rcvport.packet_flit2 =
- packet->tlen > sizeof(flits[0]) ? flits[1] : 0;
+ if (hfi1_16B_get_l2(packet->hdr) == 0x2) {
+ hfi1_16B_rcv(packet);
+ } else {
+ dd_dev_err(dd,
+ "Bypass packets other than 16B are not supported in normal operation. Dropping\n");
+ incr_cntr64(&dd->sw_rcv_bypass_packet_errors);
+ if (!(dd->err_info_rcvport.status_and_code &
+ OPA_EI_STATUS_SMASK)) {
+ u64 *flits = packet->ebuf;
+
+ if (flits && !(packet->rhf & RHF_LEN_ERR)) {
+ dd->err_info_rcvport.packet_flit1 = flits[0];
+ dd->err_info_rcvport.packet_flit2 =
+ packet->tlen > sizeof(flits[0]) ?
+ flits[1] : 0;
+ }
+ dd->err_info_rcvport.status_and_code |=
+ (OPA_EI_STATUS_SMASK | BAD_L2_ERR);
}
- dd->err_info_rcvport.status_and_code |=
- (OPA_EI_STATUS_SMASK | BAD_L2_ERR);
}
return RHF_RCV_CONTINUE;
}
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index fa9160f..dbbad76 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -66,6 +66,7 @@
#include <linux/i2c.h>
#include <linux/i2c-algo-bit.h>
#include <rdma/ib_hdrs.h>
+#include <rdma/opa_addr.h>
#include <linux/rhashtable.h>
#include <linux/netdevice.h>
#include <rdma/rdma_vt.h>
@@ -325,6 +326,7 @@ struct hfi1_ctxtdata {
struct hfi1_packet {
void *ebuf;
void *hdr;
+ void *payload;
struct hfi1_ctxtdata *rcd;
__le32 *rhf_addr;
struct rvt_qp *qp;
@@ -351,6 +353,83 @@ struct hfi1_packet {
bool fecn;
};
+/*
+ * OPA 16B Header
+ */
+#define OPA_16B_L4_MASK 0xFFull
+#define OPA_16B_SC_MASK 0x1F00000ull
+#define OPA_16B_SC_SHIFT 20
+#define OPA_16B_LID_MASK 0xFFFFFull
+#define OPA_16B_DLID_MASK 0xF000ull
+#define OPA_16B_DLID_SHIFT 20
+#define OPA_16B_DLID_HIGH_SHIFT 12
+#define OPA_16B_SLID_MASK 0xF00ull
+#define OPA_16B_SLID_SHIFT 20
+#define OPA_16B_SLID_HIGH_SHIFT 8
+#define OPA_16B_BECN_MASK 0x80000000ull
+#define OPA_16B_BECN_SHIFT 31
+#define OPA_16B_FECN_MASK 0x10000000ull
+#define OPA_16B_FECN_SHIFT 28
+#define OPA_16B_L2_MASK 0x60000000ull
+#define OPA_16B_L2_SHIFT 29
+
+/*
+ * OPA 16B L2/L4 Encodings
+ */
+#define OPA_16B_L2_TYPE 0x02
+#define OPA_16B_L4_IB_LOCAL 0x09
+#define OPA_16B_L4_IB_GLOBAL 0x0A
+#define OPA_16B_L4_ETHR OPA_VNIC_L4_ETHR
+
+static inline u8 hfi1_16B_get_l4(struct hfi1_16b_header *hdr)
+{
+ return (u8)(hdr->lrh[2] & OPA_16B_L4_MASK);
+}
+
+static inline u8 hfi1_16B_get_sc(struct hfi1_16b_header *hdr)
+{
+ return (u8)((hdr->lrh[1] & OPA_16B_SC_MASK) >> OPA_16B_SC_SHIFT);
+}
+
+static inline u32 hfi1_16B_get_dlid(struct hfi1_16b_header *hdr)
+{
+ return (u32)((hdr->lrh[1] & OPA_16B_LID_MASK) |
+ (((hdr->lrh[2] & OPA_16B_DLID_MASK) >>
+ OPA_16B_DLID_HIGH_SHIFT) << OPA_16B_DLID_SHIFT));
+}
+
+static inline u32 hfi1_16B_get_slid(struct hfi1_16b_header *hdr)
+{
+ return (u32)((hdr->lrh[0] & OPA_16B_LID_MASK) |
+ (((hdr->lrh[2] & OPA_16B_SLID_MASK) >>
+ OPA_16B_SLID_HIGH_SHIFT) << OPA_16B_SLID_SHIFT));
+}
+
+static inline u8 hfi1_16B_get_becn(struct hfi1_16b_header *hdr)
+{
+ return (u8)((hdr->lrh[0] & OPA_16B_BECN_MASK) >> OPA_16B_BECN_SHIFT);
+}
+
+static inline u8 hfi1_16B_get_fecn(struct hfi1_16b_header *hdr)
+{
+ return (u8)((hdr->lrh[1] & OPA_16B_FECN_MASK) >> OPA_16B_FECN_SHIFT);
+}
+
+static inline u8 hfi1_16B_get_l2(struct hfi1_16b_header *hdr)
+{
+ return (u8)((hdr->lrh[1] & OPA_16B_L2_MASK) >> OPA_16B_L2_SHIFT);
+}
+
+/*
+ * BTH
+ */
+#define OPA_16B_BTH_PAD_MASK 7
+static inline u8 hfi1_16B_bth_get_pad(struct ib_other_headers *ohdr)
+{
+ return (u8)((be32_to_cpu(ohdr->bth[0]) >> IB_BTH_PAD_SHIFT) &
+ OPA_16B_BTH_PAD_MASK);
+}
+
struct rvt_sge_state;
/*
@@ -2084,11 +2163,55 @@ static inline bool is_integrated(struct hfi1_devdata *dd)
/*
* hfi1_check_mcast- Check if the given lid is
- * in the IB multicast range.
+ * in the OPA multicast range.
+ *
+ * The LID might either reside in ah.dlid or might be
+ * in the GRH of the address handle as DGID if extended
+ * addresses are in use.
*/
-static inline bool hfi1_check_mcast(u16 lid)
+static inline bool hfi1_check_mcast(u32 lid)
+{
+ return ((lid >= opa_get_mcast_base(OPA_MCAST_NR)) &&
+ (lid != be32_to_cpu(OPA_LID_PERMISSIVE)));
+}
+
+#define opa_get_lid(lid, format) \
+ __opa_get_lid(lid, OPA_PORT_PACKET_FORMAT_##format)
+
+/* Convert a lid to a specific lid space */
+static inline u32 __opa_get_lid(u32 lid, u8 format)
+{
+ bool is_mcast = hfi1_check_mcast(lid);
+
+ switch (format) {
+ case OPA_PORT_PACKET_FORMAT_8B:
+ case OPA_PORT_PACKET_FORMAT_10B:
+ if (is_mcast)
+ return (lid - opa_get_mcast_base(OPA_MCAST_NR) +
+ 0xF0000);
+ return lid & 0xFFFFF;
+ case OPA_PORT_PACKET_FORMAT_16B:
+ if (is_mcast)
+ return (lid - opa_get_mcast_base(OPA_MCAST_NR) +
+ 0xF00000);
+ return lid & 0xFFFFFF;
+ case OPA_PORT_PACKET_FORMAT_9B:
+ if (is_mcast)
+ return (lid -
+ opa_get_mcast_base(OPA_MCAST_NR) +
+ be16_to_cpu(IB_MULTICAST_LID_BASE));
+ else
+ return lid & 0xFFFF;
+ default:
+ return lid;
+ }
+}
+
+/* Return true if the given lid is in the OPA 16B multicast range */
+static inline bool hfi1_is_16B_mcast(u32 lid)
{
- return ((lid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
- (lid != be16_to_cpu(IB_LID_PERMISSIVE)));
+ return ((lid >=
+ opa_get_lid(opa_get_mcast_base(OPA_MCAST_NR), 16B)) &&
+ (lid != opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B)));
}
#endif /* _HFI1_KERNEL_H */
diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index baa67bf..cf74a56 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -1916,7 +1916,7 @@ void process_becn(struct hfi1_pportdata *ppd, u8 sl, u16 rlid, u32 lqpn,
void hfi1_rc_rcv(struct hfi1_packet *packet)
{
struct hfi1_ctxtdata *rcd = packet->rcd;
- void *data = packet->ebuf;
+ void *data = packet->payload;
u32 tlen = packet->tlen;
struct rvt_qp *qp = packet->qp;
struct hfi1_ibport *ibp = rcd_to_iport(rcd);
diff --git a/drivers/infiniband/hw/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
index 76c2451..366f7b9 100644
--- a/drivers/infiniband/hw/hfi1/uc.c
+++ b/drivers/infiniband/hw/hfi1/uc.c
@@ -297,7 +297,7 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
void hfi1_uc_rcv(struct hfi1_packet *packet)
{
struct hfi1_ibport *ibp = rcd_to_iport(packet->rcd);
- void *data = packet->ebuf;
+ void *data = packet->payload;
u32 tlen = packet->tlen;
struct rvt_qp *qp = packet->qp;
struct ib_other_headers *ohdr = packet->ohdr;
diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
index 6bf7a1b..dcf8c14 100644
--- a/drivers/infiniband/hw/hfi1/ud.c
+++ b/drivers/infiniband/hw/hfi1/ud.c
@@ -667,11 +667,10 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
struct hfi1_ibport *ibp = rcd_to_iport(packet->rcd);
struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
struct ib_header *hdr = packet->hdr;
- void *data = packet->ebuf;
+ void *data = packet->payload;
u32 tlen = packet->tlen;
struct rvt_qp *qp = packet->qp;
u8 sc5 = hfi1_9B_get_sc5(hdr, packet->rhf);
- u32 bth1;
u8 sl_from_sc;
u8 extra_bytes = packet->pad;
u8 opcode = packet->opcode;
@@ -679,7 +678,6 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
u32 dlid = packet->dlid;
u32 slid = packet->slid;
- bth1 = be32_to_cpu(ohdr->bth[1]);
qkey = ib_get_qkey(ohdr);
src_qp = ib_get_sqpn(ohdr);
pkey = ib_bth_get_pkey(ohdr);
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index b5feaa8..eb0fda7 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -571,7 +571,7 @@ static inline void hfi1_handle_packet(struct hfi1_packet *packet,
goto drop;
mcast = rvt_mcast_find(&ibp->rvp,
&packet->grh->dgid,
- packet->dlid);
+ opa_get_lid(packet->dlid, 9B));
if (!mcast)
goto drop;
list_for_each_entry_rcu(p, &mcast->qp_list, list) {
@@ -627,14 +627,17 @@ static inline void hfi1_handle_packet(struct hfi1_packet *packet,
void hfi1_ib_rcv(struct hfi1_packet *packet)
{
struct hfi1_ctxtdata *rcd = packet->rcd;
- bool is_mcast = false;
- if (unlikely(hfi1_check_mcast(packet->dlid)))
- is_mcast = true;
+ trace_input_ibhdr(rcd->dd, packet, !!(rhf_dc_info(packet->rhf)));
+ hfi1_handle_packet(packet, hfi1_check_mcast(packet->dlid));
+}
+
+void hfi1_16B_rcv(struct hfi1_packet *packet)
+{
+ struct hfi1_ctxtdata *rcd = packet->rcd;
- trace_input_ibhdr(rcd->dd, packet,
- !!(packet->rhf & RHF_DC_INFO_SMASK));
- hfi1_handle_packet(packet, is_mcast);
+ trace_input_ibhdr(rcd->dd, packet, false);
+ hfi1_handle_packet(packet, hfi1_check_mcast(packet->dlid));
}
/*
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 34267c7..590aab2 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -104,6 +104,17 @@ enum {
HFI1_HAS_GRH = (1 << 0),
};
+struct hfi1_16b_header {
+ u32 lrh[4];
+ union {
+ struct {
+ struct ib_grh grh;
+ struct ib_other_headers oth;
+ } l;
+ struct ib_other_headers oth;
+ } u;
+} __packed;
+
struct hfi1_ahg_info {
u32 ahgdesc[2];
u16 tx_flags;
@@ -378,6 +389,8 @@ void hfi1_send_complete(struct rvt_qp *qp, struct rvt_swqe *wqe,
void hfi1_ib_rcv(struct hfi1_packet *packet);
+void hfi1_16B_rcv(struct hfi1_packet *packet);
+
unsigned hfi1_get_npkeys(struct hfi1_devdata *);
int hfi1_verbs_send_dma(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
diff --git a/drivers/infiniband/hw/hfi1/vnic.h b/drivers/infiniband/hw/hfi1/vnic.h
index eec7c14..5ae7815 100644
--- a/drivers/infiniband/hw/hfi1/vnic.h
+++ b/drivers/infiniband/hw/hfi1/vnic.h
@@ -54,21 +54,6 @@
#define HFI1_VNIC_MAX_TXQ 16
#define HFI1_VNIC_MAX_PAD 12
-/* L2 header definitions */
-#define HFI1_L2_TYPE_OFFSET 0x7
-#define HFI1_L2_TYPE_SHFT 0x5
-#define HFI1_L2_TYPE_MASK 0x3
-
-#define HFI1_GET_L2_TYPE(hdr) \
- ((*((u8 *)(hdr) + HFI1_L2_TYPE_OFFSET) >> HFI1_L2_TYPE_SHFT) & \
- HFI1_L2_TYPE_MASK)
-
-/* L4 type definitions */
-#define HFI1_L4_TYPE_OFFSET 8
-
-#define HFI1_GET_L4_TYPE(data) \
- (*((u8 *)(data) + HFI1_L4_TYPE_OFFSET))
-
/* L4 header definitions */
#define HFI1_VNIC_L4_HDR_OFFSET OPA_VNIC_L2_HDR_LEN
diff --git a/drivers/infiniband/hw/hfi1/vnic_main.c b/drivers/infiniband/hw/hfi1/vnic_main.c
index 2917a23..f419cbb 100644
--- a/drivers/infiniband/hw/hfi1/vnic_main.c
+++ b/drivers/infiniband/hw/hfi1/vnic_main.c
@@ -564,8 +564,8 @@ void hfi1_vnic_bypass_rcv(struct hfi1_packet *packet)
int l4_type, vesw_id = -1;
u8 q_idx;
- l4_type = HFI1_GET_L4_TYPE(packet->ebuf);
- if (likely(l4_type == OPA_VNIC_L4_ETHR)) {
+ l4_type = hfi1_16B_get_l4(packet->ebuf);
+ if (likely(l4_type == OPA_16B_L4_ETHR)) {
vesw_id = HFI1_VNIC_GET_VESWID(packet->ebuf);
vinfo = idr_find(&dd->vnic.vesw_idr, vesw_id);
diff --git a/include/rdma/opa_vnic.h b/include/rdma/opa_vnic.h
index 39d6890..0c07a70 100644
--- a/include/rdma/opa_vnic.h
+++ b/include/rdma/opa_vnic.h
@@ -54,9 +54,6 @@
#include <rdma/ib_verbs.h>
-/* VNIC uses 16B header format */
-#define OPA_VNIC_L2_TYPE 0x2
-
/* 16 header bytes + 2 reserved bytes */
#define OPA_VNIC_L2_HDR_LEN (16 + 2)
--
* [PATCH for-next 19/27] IB/hfi1: Add support to send 16B bypass packets
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (16 preceding siblings ...)
2017-08-04 20:53 ` [PATCH for-next 18/27] IB/hfi1: Add support to receive 16B bypass packets Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 20/27] IB/hfi1: Add support to process 16B header errors Dennis Dalessandro
` (7 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
We introduce struct hfi1_opa_header as a union
of IB (9B) and 16B headers.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/rc.c | 33 +++++++++++++++++------------
drivers/infiniband/hw/hfi1/ruc.c | 17 ++++++++-------
drivers/infiniband/hw/hfi1/trace_ibhdrs.h | 19 +++++++++--------
drivers/infiniband/hw/hfi1/uc.c | 4 ++--
drivers/infiniband/hw/hfi1/ud.c | 25 +++++++++++-----------
drivers/infiniband/hw/hfi1/verbs.c | 24 ++++++++++++++++++---
drivers/infiniband/hw/hfi1/verbs.h | 22 +++++++++----------
7 files changed, 84 insertions(+), 60 deletions(-)
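A simplified, self-contained sketch of the tagged union this patch introduces; the member layouts here are stand-ins, and only the hdr_type-based selection mirrors the driver logic (e.g. the reworked get_opcode()):
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for struct ib_header / struct hfi1_16b_header. */
struct hdr_9b  { uint16_t lrh[4]; uint32_t bth0; };
struct hdr_16b { uint32_t lrh[4]; uint32_t bth0; };

struct opa_header {
	union {
		struct hdr_9b  ibh;	/* 9B header */
		struct hdr_16b opah;	/* 16B header */
	};
	uint8_t hdr_type;		/* 0 = 9B, nonzero = 16B */
};

/* The send path looks at hdr_type before deciding which layout to parse. */
static uint32_t first_bth_word(const struct opa_header *h)
{
	return h->hdr_type ? h->opah.bth0 : h->ibh.bth0;
}

int main(void)
{
	struct opa_header h = { .hdr_type = 0 };

	h.ibh.bth0 = 0x64000000;	/* opcode in the top byte */
	printf("opcode = 0x%02x\n", first_bth_word(&h) >> 24);
	return 0;
}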
diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index cf74a56..e3dbf6d 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -273,9 +273,9 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
if (IS_ERR(ps->s_txreq))
goto bail_no_tx;
- ohdr = &ps->s_txreq->phdr.hdr.u.oth;
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
- ohdr = &ps->s_txreq->phdr.hdr.u.l.oth;
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
/* Sending responses has higher priority over sending requests. */
if ((qp->s_flags & RVT_S_RESP_PENDING) &&
@@ -724,7 +724,8 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
u32 vl, plen;
struct send_context *sc;
struct pio_buf *pbuf;
- struct ib_header hdr;
+ struct hfi1_opa_header opah;
+ struct ib_header *hdr;
struct ib_other_headers *ohdr;
unsigned long flags;
@@ -741,16 +742,19 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
goto queue_ack;
/* Construct the header */
+ opah.hdr_type = 0;
+ hdr = &opah.ibh;
+
/* header size in 32-bit words LRH+BTH+AETH = (8+12+4)/4 */
hwords = 6;
if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
- hwords += hfi1_make_grh(ibp, &hdr.u.l.grh,
+ hwords += hfi1_make_grh(ibp, &hdr->u.l.grh,
rdma_ah_read_grh(&qp->remote_ah_attr),
hwords, 0);
- ohdr = &hdr.u.l.oth;
+ ohdr = &hdr->u.l.oth;
lrh0 = HFI1_LRH_GRH;
} else {
- ohdr = &hdr.u.oth;
+ ohdr = &hdr->u.oth;
lrh0 = HFI1_LRH_BTH;
}
/* read pkey_index w/o lock (its atomic) */
@@ -768,11 +772,11 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
pbc_flags |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
lrh0 |= (sc5 & 0xf) << 12 | (rdma_ah_get_sl(&qp->remote_ah_attr)
& 0xf) << 4;
- hdr.lrh[0] = cpu_to_be16(lrh0);
- hdr.lrh[1] = cpu_to_be16(rdma_ah_get_dlid(&qp->remote_ah_attr));
- hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
- hdr.lrh[3] = cpu_to_be16(ppd->lid |
- rdma_ah_get_path_bits(&qp->remote_ah_attr));
+ hdr->lrh[0] = cpu_to_be16(lrh0);
+ hdr->lrh[1] = cpu_to_be16(rdma_ah_get_dlid(&qp->remote_ah_attr));
+ hdr->lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
+ hdr->lrh[3] = cpu_to_be16(ppd->lid |
+ rdma_ah_get_path_bits(&qp->remote_ah_attr));
ohdr->bth[0] = cpu_to_be32(bth0);
ohdr->bth[1] = cpu_to_be32(qp->remote_qpn);
ohdr->bth[1] |= cpu_to_be32((!!is_fecn) << IB_BECN_SHIFT);
@@ -799,10 +803,10 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
}
trace_ack_output_ibhdr(dd_from_ibdev(qp->ibqp.device),
- &hdr, ib_is_sc5(sc5));
+ &opah, ib_is_sc5(sc5));
/* write the pbc and data */
- ppd->dd->pio_inline_send(ppd->dd, pbuf, pbc, &hdr, hwords);
+ ppd->dd->pio_inline_send(ppd->dd, pbuf, pbc, hdr, hwords);
return;
@@ -985,9 +989,10 @@ static void reset_sending_psn(struct rvt_qp *qp, u32 psn)
/*
* This should be called with the QP s_lock held and interrupts disabled.
*/
-void hfi1_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr)
+void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah)
{
struct ib_other_headers *ohdr;
+ struct ib_header *hdr = &opah->ibh;
struct rvt_swqe *wqe;
u32 opcode;
u32 psn;
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index 4afa00f..e30c64f 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -668,7 +668,8 @@ u32 hfi1_make_grh(struct hfi1_ibport *ibp, struct ib_grh *hdr,
return sizeof(struct ib_grh) / sizeof(u32);
}
-#define BTH2_OFFSET (offsetof(struct hfi1_sdma_header, hdr.u.oth.bth[2]) / 4)
+#define BTH2_OFFSET (offsetof(struct hfi1_sdma_header, \
+ hdr.ibh.u.oth.bth[2]) / 4)
/**
* build_ahg - create ahg in s_ahg
@@ -743,8 +744,8 @@ void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
qp->s_hdrwords +=
hfi1_make_grh(ibp,
- &ps->s_txreq->phdr.hdr.u.l.grh,
- rdma_ah_read_grh(&qp->remote_ah_attr),
+ &ps->s_txreq->phdr.hdr.ibh.u.l.grh,
+ &qp->remote_ah_attr.grh,
qp->s_hdrwords, nwords);
lrh0 = HFI1_LRH_GRH;
middle = 0;
@@ -773,14 +774,14 @@ void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
build_ahg(qp, bth2);
else
qp->s_flags &= ~RVT_S_AHG_VALID;
- ps->s_txreq->phdr.hdr.lrh[0] = cpu_to_be16(lrh0);
- ps->s_txreq->phdr.hdr.lrh[1] =
+ ps->s_txreq->phdr.hdr.ibh.lrh[0] = cpu_to_be16(lrh0);
+ ps->s_txreq->phdr.hdr.ibh.lrh[1] =
cpu_to_be16(rdma_ah_get_dlid(&qp->remote_ah_attr));
- ps->s_txreq->phdr.hdr.lrh[2] =
+ ps->s_txreq->phdr.hdr.ibh.lrh[2] =
cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
- ps->s_txreq->phdr.hdr.lrh[3] =
+ ps->s_txreq->phdr.hdr.ibh.lrh[3] =
cpu_to_be16(ppd_from_ibp(ibp)->lid |
- rdma_ah_get_path_bits(&qp->remote_ah_attr));
+ rdma_ah_get_path_bits(&qp->remote_ah_attr));
bth0 |= hfi1_get_pkey(ibp, qp->s_pkey_index);
bth0 |= extra_bytes << 20;
ohdr->bth[0] = cpu_to_be32(bth0);
diff --git a/drivers/infiniband/hw/hfi1/trace_ibhdrs.h b/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
index 0f2d2da..7324025 100644
--- a/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
+++ b/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
@@ -213,9 +213,9 @@ void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
DECLARE_EVENT_CLASS(hfi1_output_ibhdr_template,
TP_PROTO(struct hfi1_devdata *dd,
- struct ib_header *hdr,
+ struct hfi1_opa_header *opah,
bool sc5),
- TP_ARGS(dd, hdr, sc5),
+ TP_ARGS(dd, opah, sc5),
TP_STRUCT__entry(
DD_DEV_ENTRY(dd)
__field(u8, lnh)
@@ -238,10 +238,11 @@ void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
__field(u32, psn)
/* extended headers */
__dynamic_array(u8, ehdrs,
- hfi1_trace_ib_hdr_len(hdr))
+ hfi1_trace_ib_hdr_len(&opah->ibh))
),
TP_fast_assign(
struct ib_other_headers *ohdr;
+ struct ib_header *hdr = &opah->ibh;
DD_DEV_ASSIGN(dd);
@@ -294,18 +295,18 @@ void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
DEFINE_EVENT(hfi1_output_ibhdr_template, pio_output_ibhdr,
TP_PROTO(struct hfi1_devdata *dd,
- struct ib_header *hdr, bool sc5),
- TP_ARGS(dd, hdr, sc5));
+ struct hfi1_opa_header *opah, bool sc5),
+ TP_ARGS(dd, opah, sc5));
DEFINE_EVENT(hfi1_output_ibhdr_template, ack_output_ibhdr,
TP_PROTO(struct hfi1_devdata *dd,
- struct ib_header *hdr, bool sc5),
- TP_ARGS(dd, hdr, sc5));
+ struct hfi1_opa_header *opah, bool sc5),
+ TP_ARGS(dd, opah, sc5));
DEFINE_EVENT(hfi1_output_ibhdr_template, sdma_output_ibhdr,
TP_PROTO(struct hfi1_devdata *dd,
- struct ib_header *hdr, bool sc5),
- TP_ARGS(dd, hdr, sc5));
+ struct hfi1_opa_header *opah, bool sc5),
+ TP_ARGS(dd, opah, sc5));
#endif /* __HFI1_TRACE_IBHDRS_H */
diff --git a/drivers/infiniband/hw/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
index 366f7b9..e0bb766 100644
--- a/drivers/infiniband/hw/hfi1/uc.c
+++ b/drivers/infiniband/hw/hfi1/uc.c
@@ -93,9 +93,9 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
goto done_free_tx;
}
- ohdr = &ps->s_txreq->phdr.hdr.u.oth;
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
- ohdr = &ps->s_txreq->phdr.hdr.u.l.oth;
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
/* Get the next send request. */
wqe = rvt_get_swqe_ptr(qp, qp->s_cur);
diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
index dcf8c14..2af993c 100644
--- a/drivers/infiniband/hw/hfi1/ud.c
+++ b/drivers/infiniband/hw/hfi1/ud.c
@@ -357,12 +357,13 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
/* Header size in 32-bit words. */
- qp->s_hdrwords += hfi1_make_grh(ibp,
- &ps->s_txreq->phdr.hdr.u.l.grh,
- rdma_ah_read_grh(ah_attr),
- qp->s_hdrwords, nwords);
+ qp->s_hdrwords +=
+ hfi1_make_grh(ibp,
+ &ps->s_txreq->phdr.hdr.ibh.u.l.grh,
+ rdma_ah_read_grh(ah_attr),
+ qp->s_hdrwords, nwords);
lrh0 = HFI1_LRH_GRH;
- ohdr = &ps->s_txreq->phdr.hdr.u.l.oth;
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
/*
* Don't worry about sending to locally attached multicast
* QPs. It is unspecified by the spec. what happens.
@@ -370,7 +371,7 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
} else {
/* Header size in 32-bit words. */
lrh0 = HFI1_LRH_BTH;
- ohdr = &ps->s_txreq->phdr.hdr.u.oth;
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
}
if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) {
qp->s_hdrwords++;
@@ -392,21 +393,21 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
ps->s_txreq->sde = priv->s_sde;
priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
ps->s_txreq->psc = priv->s_sendcontext;
- ps->s_txreq->phdr.hdr.lrh[0] = cpu_to_be16(lrh0);
- ps->s_txreq->phdr.hdr.lrh[1] =
+ ps->s_txreq->phdr.hdr.ibh.lrh[0] = cpu_to_be16(lrh0);
+ ps->s_txreq->phdr.hdr.ibh.lrh[1] =
cpu_to_be16(rdma_ah_get_dlid(ah_attr));
- ps->s_txreq->phdr.hdr.lrh[2] =
+ ps->s_txreq->phdr.hdr.ibh.lrh[2] =
cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
if (rdma_ah_get_dlid(ah_attr) == be16_to_cpu(IB_LID_PERMISSIVE)) {
- ps->s_txreq->phdr.hdr.lrh[3] = IB_LID_PERMISSIVE;
+ ps->s_txreq->phdr.hdr.ibh.lrh[3] = IB_LID_PERMISSIVE;
} else {
lid = ppd->lid;
if (lid) {
lid |= rdma_ah_get_path_bits(ah_attr) &
((1 << ppd->lmc) - 1);
- ps->s_txreq->phdr.hdr.lrh[3] = cpu_to_be16(lid);
+ ps->s_txreq->phdr.hdr.ibh.lrh[3] = cpu_to_be16(lid);
} else {
- ps->s_txreq->phdr.hdr.lrh[3] = IB_LID_PERMISSIVE;
+ ps->s_txreq->phdr.hdr.ibh.lrh[3] = IB_LID_PERMISSIVE;
}
}
if (wqe->wr.send_flags & IB_SEND_SOLICITED)
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index eb0fda7..750996a 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -506,6 +506,24 @@ void hfi1_copy_sge(
}
}
+static u8 get_opcode(struct hfi1_opa_header *hdr)
+{
+ struct ib_other_headers *ohdr;
+
+ if (hdr->hdr_type) {
+ if (hfi1_16B_get_l4(&hdr->opah) == OPA_16B_L4_IB_LOCAL)
+ ohdr = &hdr->opah.u.oth;
+ else
+ ohdr = &hdr->opah.u.l.oth;
+ } else {
+ if (ib_get_lnh(&hdr->ibh) == HFI1_LRH_BTH)
+ ohdr = &hdr->ibh.u.oth;
+ else
+ ohdr = &hdr->ibh.u.l.oth;
+ }
+ return ib_bth_get_opcode(ohdr);
+}
+
/*
* Make sure the QP is ready and able to accept the given opcode.
*/
@@ -686,7 +704,7 @@ static void verbs_sdma_complete(
if (tx->wqe) {
hfi1_send_complete(qp, tx->wqe, IB_WC_SUCCESS);
} else if (qp->ibqp.qp_type == IB_QPT_RC) {
- struct ib_header *hdr;
+ struct hfi1_opa_header *hdr;
hdr = &tx->phdr.hdr;
hfi1_rc_send_complete(qp, hdr);
@@ -1175,7 +1193,7 @@ static inline send_routine get_send_routine(struct rvt_qp *qp,
{
struct hfi1_devdata *dd = dd_from_ibdev(qp->ibqp.device);
struct hfi1_qp_priv *priv = qp->priv;
- struct ib_header *h = &tx->phdr.hdr;
+ struct hfi1_opa_header *h = &tx->phdr.hdr;
if (unlikely(!(dd->flags & HFI1_HAS_SEND_DMA)))
return dd->process_pio_send;
@@ -1221,7 +1239,7 @@ int hfi1_verbs_send(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
int ret;
u8 lnh;
- hdr = &ps->s_txreq->phdr.hdr;
+ hdr = &ps->s_txreq->phdr.hdr.ibh;
/* locate the pkey within the headers */
lnh = ib_get_lnh(hdr);
if (lnh == HFI1_LRH_GRH)
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 590aab2..2022410 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -115,6 +115,14 @@ struct hfi1_16b_header {
} u;
} __packed;
+struct hfi1_opa_header {
+ union {
+ struct ib_header ibh; /* 9B header */
+ struct hfi1_16b_header opah; /* 16B header */
+ };
+ u8 hdr_type; /* 9B or 16B */
+} __packed;
+
struct hfi1_ahg_info {
u32 ahgdesc[2];
u16 tx_flags;
@@ -124,7 +132,7 @@ struct hfi1_ahg_info {
struct hfi1_sdma_header {
__le64 pbc;
- struct ib_header hdr;
+ struct hfi1_opa_header hdr;
} __packed;
/*
@@ -326,7 +334,7 @@ void hfi1_rc_hdrerr(
struct ib_ah *hfi1_create_qp0_ah(struct hfi1_ibport *ibp, u16 dlid);
-void hfi1_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr);
+void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah);
void hfi1_ud_rcv(struct hfi1_packet *packet);
@@ -347,16 +355,6 @@ void hfi1_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
extern const u32 rc_only_opcode;
extern const u32 uc_only_opcode;
-static inline u8 get_opcode(struct ib_header *h)
-{
- u16 lnh = be16_to_cpu(h->lrh[0]) & 3;
-
- if (lnh == IB_LNH_IBA_LOCAL)
- return be32_to_cpu(h->u.oth.bth[0]) >> 24;
- else
- return be32_to_cpu(h->u.l.oth.bth[0]) >> 24;
-}
-
int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct hfi1_packet *packet);
u32 hfi1_make_grh(struct hfi1_ibport *ibp, struct ib_grh *hdr,
--
* [PATCH for-next 20/27] IB/hfi1: Add support to process 16B header errors
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (17 preceding siblings ...)
2017-08-04 20:54 ` [PATCH for-next 19/27] IB/hfi1: Add support to send " Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 21/27] IB/hfi1: Determine 9B/16B L2 header type based on Address handle Dennis Dalessandro
` (6 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Enhance rcv_hdrerr() to also handle errors seen while receiving
16B bypass packets.
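For reference, the new hfi1_bypass_ingress_pkt_check() reduces to a few
LID/SC sanity rules. A stand-alone sketch of those rules, with the 16B
permissive and multicast LID values assumed for MCAST_NR = 4 (illustrative
only, not driver code):

#include <stdbool.h>
#include <stdint.h>

#define LID_PERMISSIVE_16B 0xFFFFFFFFu /* assumed 16B permissive LID */
#define MCAST_BASE_16B     0xF0000000u /* assumed 16B multicast base */

static bool lid_is_16b_mcast(uint32_t lid)
{
        return lid >= MCAST_BASE_16B && lid != LID_PERMISSIVE_16B;
}

/* Return true when a 16B bypass packet passes the ingress checks. */
static bool bypass_ingress_ok(uint32_t slid, uint32_t dlid, uint8_t sc,
                              uint32_t port_lid)
{
        if (!slid || !dlid)                      /* slid and dlid cannot be 0 */
                return false;
        if (!lid_is_16b_mcast(dlid) && dlid != LID_PERMISSIVE_16B &&
            dlid != port_lid)                    /* unicast dlid must match the port */
                return false;
        if (lid_is_16b_mcast(dlid) && sc == 0xF) /* no multicast on SC15 */
                return false;
        if (dlid == LID_PERMISSIVE_16B && sc != 0xF) /* permissive DLID implies SC15 */
                return false;
        return true;
}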
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/driver.c | 58 ++++++++++++++++++++++++++++-------
drivers/infiniband/hw/hfi1/hfi.h | 13 +++++++-
drivers/infiniband/hw/hfi1/ruc.c | 31 ++++++++++++-------
drivers/infiniband/hw/hfi1/ud.c | 3 +-
drivers/infiniband/hw/hfi1/verbs.c | 39 +++++++++++++++++++-----
drivers/infiniband/hw/hfi1/verbs.h | 1 +
6 files changed, 112 insertions(+), 33 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 5280d82..ae6a90d 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -269,8 +269,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
{
struct ib_header *rhdr = packet->hdr;
u32 rte = rhf_rcv_type_err(packet->rhf);
- u8 lnh = ib_get_lnh(rhdr);
- bool has_grh = false;
+ u32 mlid_base;
struct hfi1_ibport *ibp = rcd_to_iport(rcd);
struct hfi1_devdata *dd = ppd->dd;
struct rvt_dev_info *rdi = &dd->verbs_dev.rdi;
@@ -278,14 +277,20 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
if (packet->rhf & (RHF_VCRC_ERR | RHF_ICRC_ERR))
return;
- if (lnh == HFI1_LRH_BTH) {
- packet->ohdr = &rhdr->u.oth;
- } else if (lnh == HFI1_LRH_GRH) {
- has_grh = true;
- packet->ohdr = &rhdr->u.l.oth;
- packet->grh = &rhdr->u.l.grh;
- } else {
+ if (packet->etype == RHF_RCV_TYPE_BYPASS) {
goto drop;
+ } else {
+ u8 lnh = ib_get_lnh(rhdr);
+
+ mlid_base = be16_to_cpu(IB_MULTICAST_LID_BASE);
+ if (lnh == HFI1_LRH_BTH) {
+ packet->ohdr = &rhdr->u.oth;
+ } else if (lnh == HFI1_LRH_GRH) {
+ packet->ohdr = &rhdr->u.l.oth;
+ packet->grh = &rhdr->u.l.grh;
+ } else {
+ goto drop;
+ }
}
if (packet->rhf & RHF_TID_ERR) {
@@ -293,14 +298,13 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
u32 tlen = rhf_pkt_len(packet->rhf); /* in bytes */
u32 dlid = ib_get_dlid(rhdr);
u32 qp_num;
- u32 mlid_base = be16_to_cpu(IB_MULTICAST_LID_BASE);
/* Sanity check packet */
if (tlen < 24)
goto drop;
/* Check for GRH */
- if (has_grh) {
+ if (packet->grh) {
u32 vtf;
struct ib_grh *grh = packet->grh;
@@ -1370,6 +1374,35 @@ static inline void hfi1_setup_ib_header(struct hfi1_packet *packet)
packet->hlen = (u8 *)packet->rhf_addr - (u8 *)packet->hdr;
}
+static int hfi1_bypass_ingress_pkt_check(struct hfi1_packet *packet)
+{
+ struct hfi1_pportdata *ppd = packet->rcd->ppd;
+
+ /* slid and dlid cannot be 0 */
+ if ((!packet->slid) || (!packet->dlid))
+ return -EINVAL;
+
+ /* Compare port lid with incoming packet dlid */
+ if ((!(hfi1_is_16B_mcast(packet->dlid))) &&
+ (packet->dlid !=
+ opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B))) {
+ if (packet->dlid != ppd->lid)
+ return -EINVAL;
+ }
+
+ /* No multicast packets with SC15 */
+ if ((hfi1_is_16B_mcast(packet->dlid)) && (packet->sc == 0xF))
+ return -EINVAL;
+
+ /* Packets with permissive DLID always on SC15 */
+ if ((packet->dlid == opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE),
+ 16B)) &&
+ (packet->sc != 0xF))
+ return -EINVAL;
+
+ return 0;
+}
+
static int hfi1_setup_9B_packet(struct hfi1_packet *packet)
{
struct hfi1_ibport *ibp = rcd_to_iport(packet->rcd);
@@ -1479,6 +1512,9 @@ static int hfi1_setup_bypass_packet(struct hfi1_packet *packet)
packet->fecn = hfi1_16B_get_fecn(packet->hdr);
packet->becn = hfi1_16B_get_becn(packet->hdr);
+ if (hfi1_bypass_ingress_pkt_check(packet))
+ goto drop;
+
return 0;
drop:
hfi1_cdbg(PKT, "%s: packet dropped\n", __func__);
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index dbbad76..ee19660 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -372,6 +372,10 @@ struct hfi1_packet {
#define OPA_16B_FECN_SHIFT 28
#define OPA_16B_L2_MASK 0x60000000ull
#define OPA_16B_L2_SHIFT 29
+#define OPA_16B_PKEY_MASK 0xFFFF0000ull
+#define OPA_16B_PKEY_SHIFT 16
+#define OPA_16B_LEN_MASK 0x7FF00000ull
+#define OPA_16B_LEN_SHIFT 20
/*
* OPA 16B L2/L4 Encodings
@@ -420,6 +424,11 @@ static inline u8 hfi1_16B_get_l2(struct hfi1_16b_header *hdr)
return (u8)((hdr->lrh[1] & OPA_16B_L2_MASK) >> OPA_16B_L2_SHIFT);
}
+static inline u16 hfi1_16B_get_pkey(struct hfi1_16b_header *hdr)
+{
+ return (u16)((hdr->lrh[2] & OPA_16B_PKEY_MASK) >> OPA_16B_PKEY_SHIFT);
+}
+
/*
* BTH
*/
@@ -1597,9 +1606,9 @@ static void ingress_pkey_table_fail(struct hfi1_pportdata *ppd, u16 pkey,
* by HW and rcv_pkey_check function should be called instead.
*/
static inline int ingress_pkey_check(struct hfi1_pportdata *ppd, u16 pkey,
- u8 sc5, u8 idx, u16 slid)
+ u8 sc5, u8 idx, u32 slid, bool force)
{
- if (!(ppd->part_enforce & HFI1_PART_ENFORCE_IN))
+ if (!(force) && !(ppd->part_enforce & HFI1_PART_ENFORCE_IN))
return 0;
/* If SC15, pkey[0:14] must be 0x7fff */
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index e30c64f..d252f8f 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -227,15 +227,23 @@ int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct hfi1_packet *packet)
u32 sl = packet->sl;
int migrated;
u32 bth0, bth1;
+ u16 pkey;
bth0 = be32_to_cpu(packet->ohdr->bth[0]);
bth1 = be32_to_cpu(packet->ohdr->bth[1]);
- migrated = bth0 & IB_BTH_MIG_REQ;
+ if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+ pkey = hfi1_16B_get_pkey(packet->hdr);
+ migrated = bth1 & OPA_BTH_MIG_REQ;
+ } else {
+ pkey = ib_bth_get_pkey(packet->ohdr);
+ migrated = bth0 & IB_BTH_MIG_REQ;
+ }
if (qp->s_mig_state == IB_MIG_ARMED && migrated) {
if (!packet->grh) {
- if (rdma_ah_get_ah_flags(&qp->alt_ah_attr) &
- IB_AH_GRH)
+ if ((rdma_ah_get_ah_flags(&qp->alt_ah_attr) &
+ IB_AH_GRH) &&
+ (packet->etype != RHF_RCV_TYPE_BYPASS))
return 1;
} else {
const struct ib_global_route *grh;
@@ -254,10 +262,10 @@ int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct hfi1_packet *packet)
grh->dgid.global.interface_id))
return 1;
}
- if (unlikely(rcv_pkey_check(ppd_from_ibp(ibp), (u16)bth0,
+ if (unlikely(rcv_pkey_check(ppd_from_ibp(ibp), pkey,
sc5, slid))) {
- hfi1_bad_pkey(ibp, (u16)bth0, sl,
- 0, qp->ibqp.qp_num, slid, dlid);
+ hfi1_bad_pkey(ibp, pkey, sl, 0, qp->ibqp.qp_num,
+ slid, dlid);
return 1;
}
/* Validate the SLID. See Ch. 9.6.1.5 and 17.2.8 */
@@ -270,8 +278,9 @@ int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct hfi1_packet *packet)
spin_unlock_irqrestore(&qp->s_lock, flags);
} else {
if (!packet->grh) {
- if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) &
- IB_AH_GRH)
+ if ((rdma_ah_get_ah_flags(&qp->remote_ah_attr) &
+ IB_AH_GRH) &&
+ (packet->etype != RHF_RCV_TYPE_BYPASS))
return 1;
} else {
const struct ib_global_route *grh;
@@ -290,10 +299,10 @@ int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct hfi1_packet *packet)
grh->dgid.global.interface_id))
return 1;
}
- if (unlikely(rcv_pkey_check(ppd_from_ibp(ibp), (u16)bth0,
+ if (unlikely(rcv_pkey_check(ppd_from_ibp(ibp), pkey,
sc5, slid))) {
- hfi1_bad_pkey(ibp, (u16)bth0, sl,
- 0, qp->ibqp.qp_num, slid, dlid);
+ hfi1_bad_pkey(ibp, pkey, sl, 0, qp->ibqp.qp_num,
+ slid, dlid);
return 1;
}
/* Validate the SLID. See Ch. 9.6.1.5 */
diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
index 2af993c..b708376 100644
--- a/drivers/infiniband/hw/hfi1/ud.c
+++ b/drivers/infiniband/hw/hfi1/ud.c
@@ -109,7 +109,8 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
slid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
((1 << ppd->lmc) - 1));
if (unlikely(ingress_pkey_check(ppd, pkey, sc5,
- qp->s_pkey_index, slid))) {
+ qp->s_pkey_index,
+ slid, false))) {
hfi1_bad_pkey(ibp, pkey,
rdma_ah_get_sl(ah_attr),
sqp->ibqp.qp_num, qp->ibqp.qp_num,
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 750996a..df4103d 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -568,6 +568,24 @@ static u64 hfi1_fault_tx(struct rvt_qp *qp, u8 opcode, u64 pbc)
return pbc;
}
+static int hfi1_do_pkey_check(struct hfi1_packet *packet)
+{
+ struct hfi1_ctxtdata *rcd = packet->rcd;
+ struct hfi1_pportdata *ppd = rcd->ppd;
+ struct hfi1_16b_header *hdr = packet->hdr;
+ u16 pkey;
+
+ /* Pkey check needed only for bypass packets */
+ if (packet->etype != RHF_RCV_TYPE_BYPASS)
+ return 0;
+
+ /* Perform pkey check */
+ pkey = hfi1_16B_get_pkey(hdr);
+ return ingress_pkey_check(ppd, pkey, packet->sc,
+ packet->qp->s_pkey_index,
+ packet->slid, true);
+}
+
static inline void hfi1_handle_packet(struct hfi1_packet *packet,
bool is_mcast)
{
@@ -594,6 +612,8 @@ static inline void hfi1_handle_packet(struct hfi1_packet *packet,
goto drop;
list_for_each_entry_rcu(p, &mcast->qp_list, list) {
packet->qp = p->qp;
+ if (hfi1_do_pkey_check(packet))
+ goto drop;
spin_lock_irqsave(&packet->qp->r_lock, flags);
packet_handler = qp_ok(packet);
if (likely(packet_handler))
@@ -613,15 +633,16 @@ static inline void hfi1_handle_packet(struct hfi1_packet *packet,
qp_num = ib_bth_get_qpn(packet->ohdr);
rcu_read_lock();
packet->qp = rvt_lookup_qpn(rdi, &ibp->rvp, qp_num);
- if (!packet->qp) {
- rcu_read_unlock();
- goto drop;
- }
+ if (!packet->qp)
+ goto unlock_drop;
+
+ if (hfi1_do_pkey_check(packet))
+ goto unlock_drop;
+
if (unlikely(hfi1_dbg_fault_opcode(packet->qp, packet->opcode,
- true))) {
- rcu_read_unlock();
- goto drop;
- }
+ true)))
+ goto unlock_drop;
+
spin_lock_irqsave(&packet->qp->r_lock, flags);
packet_handler = qp_ok(packet);
if (likely(packet_handler))
@@ -632,6 +653,8 @@ static inline void hfi1_handle_packet(struct hfi1_packet *packet,
rcu_read_unlock();
}
return;
+unlock_drop:
+ rcu_read_unlock();
drop:
ibp->rvp.n_pkt_drops++;
}
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 2022410..68577a0 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -95,6 +95,7 @@
#define HFI1_VENDOR_IPG cpu_to_be16(0xFFA0)
#define IB_DEFAULT_GID_PREFIX cpu_to_be64(0xfe80000000000000ULL)
+#define OPA_BTH_MIG_REQ BIT(31)
#define RC_OP(x) IB_OPCODE_RC_##x
#define UC_OP(x) IB_OPCODE_UC_##x
--
* [PATCH for-next 21/27] IB/hfi1: Determine 9B/16B L2 header type based on Address handle
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (18 preceding siblings ...)
2017-08-04 20:54 ` [PATCH for-next 20/27] IB/hfi1: Add support to process 16B header errors Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 22/27] IB/hfi1: Add 16B UD support Dennis Dalessandro
` (5 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
When address handle attributes are initialized, the LIDs are
transformed into the 32-bit LID space.
When constructing the header, the hfi1 driver looks at the LID
to determine whether to build a 9B or a 16B packet header.
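The 32-bit mapping applied by hfi1_make_opa_lid() can be summarized with a
small stand-alone sketch (the GID-encoded case is omitted; constants are
assumed for MCAST_NR = 4, matching the layout described in the code comment):

#include <stdint.h>

#define IB_MCAST_LID_BASE_9B 0xC000u     /* 9B multicast base */
#define IB_LID_PERMISSIVE_9B 0xFFFFu     /* 9B permissive LID */
#define OPA_MCAST_BASE_32B   0xF0000000u /* assumed, MCAST_NR = 4 */
#define OPA_LID_PERMISSIVE32 0xFFFFFFFFu /* 32-bit permissive LID */

/* Map a 9B (16-bit) LID into the 32-bit OPA LID space. */
static uint32_t opa_lid_from_9b(uint16_t lid)
{
        if (lid == IB_LID_PERMISSIVE_9B)
                return OPA_LID_PERMISSIVE32;
        if (lid >= IB_MCAST_LID_BASE_9B) /* rebase multicast LIDs */
                return (uint32_t)lid - IB_MCAST_LID_BASE_9B +
                       OPA_MCAST_BASE_32B;
        return lid;                      /* unicast LIDs are unchanged */
}

/* Examples: 0x0010 -> 0x00000010, 0xC001 -> 0xF0000001, 0xFFFF -> 0xFFFFFFFF */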
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/core/sa_query.c | 21 ++++++--
drivers/infiniband/core/uverbs_cmd.c | 3 +
drivers/infiniband/hw/hfi1/hfi.h | 92 ++++++++++++++++++++++++++++++++++
drivers/infiniband/hw/hfi1/qp.c | 28 ++++++++++
drivers/infiniband/hw/hfi1/verbs.c | 12 ++++
drivers/infiniband/hw/hfi1/verbs.h | 1
include/rdma/ib_verbs.h | 15 ++++++
include/rdma/opa_addr.h | 4 +
8 files changed, 168 insertions(+), 8 deletions(-)
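The L2 header selection added to hfi.h (hfi1_get_packet_type()) follows
directly from that layout; a stand-alone sketch with example values
(thresholds assumed for MCAST_NR = 4, illustrative only):

#include <stdint.h>

#define OPA_MCAST_BASE_32B 0xF0000000u /* assumed, MCAST_NR = 4 */
#define IB_9B_MCAST_BASE   0x0000C000u /* 9B multicast base */

enum hdr_type { PKT_9B = 0, PKT_16B = 1 };

/* Pick the L2 header type for a LID already in the 32-bit LID space. */
static enum hdr_type pick_hdr_type(uint32_t lid)
{
        if (lid >= OPA_MCAST_BASE_32B) /* extended mcast maps back to a 9B mcast LID */
                return PKT_9B;
        if (lid >= IB_9B_MCAST_BASE)   /* unicast LID does not fit in 9B */
                return PKT_16B;
        return PKT_9B;
}

/* Examples: 0x00000010 -> 9B, 0x0001C000 -> 16B, 0xF0000001 -> 9B */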
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 70fa4ca..61b0f45 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -50,6 +50,7 @@
#include <uapi/rdma/ib_user_sa.h>
#include <rdma/ib_marshall.h>
#include <rdma/ib_addr.h>
+#include <rdma/opa_addr.h>
#include "sa.h"
#include "core_priv.h"
@@ -1241,6 +1242,11 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
ah_attr->type = rdma_ah_find_type(device, port_num);
rdma_ah_set_dlid(ah_attr, be32_to_cpu(sa_path_get_dlid(rec)));
+
+ if ((ah_attr->type == RDMA_AH_ATTR_TYPE_OPA) &&
+ (rdma_ah_get_dlid(ah_attr) == be16_to_cpu(IB_LID_PERMISSIVE)))
+ rdma_ah_set_make_grd(ah_attr, true);
+
rdma_ah_set_sl(ah_attr, rec->sl);
rdma_ah_set_path_bits(ah_attr, be32_to_cpu(sa_path_get_slid(rec)) &
get_src_path_mask(device, port_num));
@@ -2290,12 +2296,15 @@ static void update_sm_ah(struct work_struct *work)
rdma_ah_set_sl(&ah_attr, port_attr.sm_sl);
rdma_ah_set_port_num(&ah_attr, port->port_num);
if (port_attr.grh_required) {
- rdma_ah_set_ah_flags(&ah_attr, IB_AH_GRH);
-
- rdma_ah_set_subnet_prefix(&ah_attr,
- cpu_to_be64(port_attr.subnet_prefix));
- rdma_ah_set_interface_id(&ah_attr,
- cpu_to_be64(IB_SA_WELL_KNOWN_GUID));
+ if (ah_attr.type == RDMA_AH_ATTR_TYPE_OPA) {
+ rdma_ah_set_make_grd(&ah_attr, true);
+ } else {
+ rdma_ah_set_ah_flags(&ah_attr, IB_AH_GRH);
+ rdma_ah_set_subnet_prefix(&ah_attr,
+ cpu_to_be64(port_attr.subnet_prefix));
+ rdma_ah_set_interface_id(&ah_attr,
+ cpu_to_be64(IB_SA_WELL_KNOWN_GUID));
+ }
}
new_ah->ah = rdma_create_ah(port->agent->qp->pd, &ah_attr);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 670176b..3ce4563 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -2000,6 +2000,7 @@ static int modify_qp(struct ib_uverbs_file *file,
rdma_ah_set_static_rate(&attr->ah_attr, cmd->base.dest.static_rate);
rdma_ah_set_port_num(&attr->ah_attr,
cmd->base.dest.port_num);
+ rdma_ah_set_make_grd(&attr->ah_attr, false);
attr->alt_ah_attr.type = rdma_ah_find_type(qp->device,
cmd->base.dest.port_num);
@@ -2023,6 +2024,7 @@ static int modify_qp(struct ib_uverbs_file *file,
cmd->base.alt_dest.static_rate);
rdma_ah_set_port_num(&attr->alt_ah_attr,
cmd->base.alt_dest.port_num);
+ rdma_ah_set_make_grd(&attr->alt_ah_attr, false);
ret = ib_modify_qp_with_udata(qp, attr,
modify_qp_mask(qp->qp_type,
@@ -2573,6 +2575,7 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
}
attr.type = rdma_ah_find_type(ib_dev, cmd.attr.port_num);
+ rdma_ah_set_make_grd(&attr, false);
rdma_ah_set_dlid(&attr, cmd.attr.dlid);
rdma_ah_set_sl(&attr, cmd.attr.sl);
rdma_ah_set_path_bits(&attr, cmd.attr.src_path_bits);
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index ee19660..cec9590 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -70,6 +70,7 @@
#include <linux/rhashtable.h>
#include <linux/netdevice.h>
#include <rdma/rdma_vt.h>
+#include <rdma/opa_addr.h>
#include "chip_registers.h"
#include "common.h"
@@ -353,6 +354,10 @@ struct hfi1_packet {
bool fecn;
};
+/* Packet types */
+#define HFI1_PKT_TYPE_9B 0
+#define HFI1_PKT_TYPE_16B 1
+
/*
* OPA 16B Header
*/
@@ -2170,6 +2175,31 @@ static inline bool is_integrated(struct hfi1_devdata *dd)
#define DD_DEV_ENTRY(dd) __string(dev, dev_name(&(dd)->pcidev->dev))
#define DD_DEV_ASSIGN(dd) __assign_str(dev, dev_name(&(dd)->pcidev->dev))
+static inline void hfi1_update_ah_attr(struct ib_device *ibdev,
+ struct rdma_ah_attr *attr)
+{
+ struct hfi1_pportdata *ppd;
+ struct hfi1_ibport *ibp;
+ u32 dlid = rdma_ah_get_dlid(attr);
+
+ /*
+ * Kernel clients may not have set up GRH information.
+ * Set it here.
+ */
+ ibp = to_iport(ibdev, rdma_ah_get_port_num(attr));
+ ppd = ppd_from_ibp(ibp);
+ if ((((dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) ||
+ (ppd->lid >= be16_to_cpu(IB_MULTICAST_LID_BASE))) &&
+ (dlid != be32_to_cpu(OPA_LID_PERMISSIVE)) &&
+ (dlid != be16_to_cpu(IB_LID_PERMISSIVE)) &&
+ (!(rdma_ah_get_ah_flags(attr) & IB_AH_GRH))) ||
+ (rdma_ah_get_make_grd(attr))) {
+ rdma_ah_set_ah_flags(attr, IB_AH_GRH);
+ rdma_ah_set_interface_id(attr, OPA_MAKE_ID(dlid));
+ rdma_ah_set_subnet_prefix(attr, ibp->rvp.gid_prefix);
+ }
+}
+
/*
* hfi1_check_mcast- Check if the given lid is
* in the OPA multicast range.
@@ -2223,4 +2253,66 @@ static inline bool hfi1_is_16B_mcast(u32 lid)
opa_get_lid(opa_get_mcast_base(OPA_MCAST_NR), 16B)) &&
(lid != opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B)));
}
+
+static inline void hfi1_make_opa_lid(struct rdma_ah_attr *attr)
+{
+ const struct ib_global_route *grh = rdma_ah_read_grh(attr);
+ u32 dlid = rdma_ah_get_dlid(attr);
+
+ /* Modify ah_attr.dlid to be in the 32 bit LID space.
+ * This is how the address will be laid out:
+ * Assuming MCAST_NR to be 4,
+ * 32 bit permissive LID = 0xFFFFFFFF
+ * Multicast LID range = 0xFFFFFFFE to 0xF0000000
+ * Unicast LID range = 0xEFFFFFFF to 1
+ * Invalid LID = 0
+ */
+ if (ib_is_opa_gid(&grh->dgid))
+ dlid = opa_get_lid_from_gid(&grh->dgid);
+ else if ((dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
+ (dlid != be16_to_cpu(IB_LID_PERMISSIVE)) &&
+ (dlid != be32_to_cpu(OPA_LID_PERMISSIVE)))
+ dlid = dlid - be16_to_cpu(IB_MULTICAST_LID_BASE) +
+ opa_get_mcast_base(OPA_MCAST_NR);
+ else if (dlid == be16_to_cpu(IB_LID_PERMISSIVE))
+ dlid = be32_to_cpu(OPA_LID_PERMISSIVE);
+
+ rdma_ah_set_dlid(attr, dlid);
+}
+
+static inline u8 hfi1_get_packet_type(u32 lid)
+{
+ /* 9B if lid > 0xF0000000 */
+ if (lid >= opa_get_mcast_base(OPA_MCAST_NR))
+ return HFI1_PKT_TYPE_9B;
+
+ /* 16B if lid > 0xC000 */
+ if (lid >= opa_get_lid(opa_get_mcast_base(OPA_MCAST_NR), 9B))
+ return HFI1_PKT_TYPE_16B;
+
+ return HFI1_PKT_TYPE_9B;
+}
+
+static inline bool hfi1_get_hdr_type(u32 lid, struct rdma_ah_attr *attr)
+{
+ /*
+ * If there was an incoming 16B packet with permissive
+ * LIDs, OPA GIDs would have been programmed when those
+ * packets were received. A 16B packet will have to
+ * be sent in response to that packet. Return a 16B
+ * header type if that's the case.
+ */
+ if (rdma_ah_get_dlid(attr) == be32_to_cpu(OPA_LID_PERMISSIVE))
+ return (ib_is_opa_gid(&rdma_ah_read_grh(attr)->dgid)) ?
+ HFI1_PKT_TYPE_16B : HFI1_PKT_TYPE_9B;
+
+ /*
+ * Return a 16B header type if either the destination
+ * or source lid is extended.
+ */
+ if (hfi1_get_packet_type(rdma_ah_get_dlid(attr)) == HFI1_PKT_TYPE_16B)
+ return HFI1_PKT_TYPE_16B;
+
+ return hfi1_get_packet_type(lid);
+}
#endif /* _HFI1_KERNEL_H */
diff --git a/drivers/infiniband/hw/hfi1/qp.c b/drivers/infiniband/hw/hfi1/qp.c
index b801d84..0fca6df 100644
--- a/drivers/infiniband/hw/hfi1/qp.c
+++ b/drivers/infiniband/hw/hfi1/qp.c
@@ -232,6 +232,31 @@ int hfi1_check_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
return 0;
}
+/*
+ * qp_set_16b - Set the hdr_type based on whether the slid or the
+ * dlid in the connection is extended. Only applicable for RC and UC
+ * QPs. UD QPs determine this on the fly from the ah in the wqe
+ */
+static inline void qp_set_16b(struct rvt_qp *qp)
+{
+ struct hfi1_pportdata *ppd;
+ struct hfi1_ibport *ibp;
+ struct hfi1_qp_priv *priv = qp->priv;
+
+ /* Update ah_attr to account for extended LIDs */
+ hfi1_update_ah_attr(qp->ibqp.device, &qp->remote_ah_attr);
+
+ /* Create 32 bit LIDs */
+ hfi1_make_opa_lid(&qp->remote_ah_attr);
+
+ if (!(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH))
+ return;
+
+ ibp = to_iport(qp->ibqp.device, qp->port_num);
+ ppd = ppd_from_ibp(ibp);
+ priv->hdr_type = hfi1_get_hdr_type(ppd->lid, &qp->remote_ah_attr);
+}
+
void hfi1_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
int attr_mask, struct ib_udata *udata)
{
@@ -242,6 +267,7 @@ void hfi1_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
priv->s_sc = ah_to_sc(ibqp->device, &qp->remote_ah_attr);
priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
+ qp_set_16b(qp);
}
if (attr_mask & IB_QP_PATH_MIG_STATE &&
@@ -251,6 +277,7 @@ void hfi1_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
priv->s_sc = ah_to_sc(ibqp->device, &qp->remote_ah_attr);
priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
+ qp_set_16b(qp);
}
}
@@ -751,6 +778,7 @@ void hfi1_migrate_qp(struct rvt_qp *qp)
qp->s_flags |= RVT_S_AHG_CLEAR;
priv->s_sc = ah_to_sc(qp->ibqp.device, &qp->remote_ah_attr);
priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
+ qp_set_16b(qp);
ev.device = qp->ibqp.device;
ev.element.qp = &qp->ibqp;
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index df4103d..62526f5 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1421,6 +1421,15 @@ static int query_port(struct rvt_dev_info *rdi, u8 port_num,
props->active_mtu = !valid_ib_mtu(ppd->ibmtu) ? props->max_mtu :
mtu_to_enum(ppd->ibmtu, IB_MTU_2048);
+ /*
+ * sm_lid of 0xFFFF needs special handling so that it can
+ * be differentiated from a permissive LID of 0xFFFF.
+ * We set the grh_required flag here so the SA can program
+ * the DGID in the address handle appropriately
+ */
+ if (props->sm_lid == be16_to_cpu(IB_LID_PERMISSIVE))
+ props->grh_required = true;
+
return 0;
}
@@ -1528,6 +1537,7 @@ static void hfi1_notify_new_ah(struct ib_device *ibdev,
struct hfi1_pportdata *ppd;
struct hfi1_devdata *dd;
u8 sc5;
+ struct rdma_ah_attr *attr = &ah->attr;
/*
* Do not trust reading anything from rvt_ah at this point as it is not
@@ -1537,6 +1547,8 @@ static void hfi1_notify_new_ah(struct ib_device *ibdev,
ibp = to_iport(ibdev, rdma_ah_get_port_num(ah_attr));
ppd = ppd_from_ibp(ibp);
sc5 = ibp->sl_to_sc[rdma_ah_get_sl(&ah->attr)];
+ hfi1_update_ah_attr(ibdev, attr);
+ hfi1_make_opa_lid(attr);
dd = dd_from_ppd(ppd);
ah->vl = sc_to_vlt(dd, sc5);
if (ah->vl < num_vls || ah->vl == 15)
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 68577a0..d3dd0c0 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -147,6 +147,7 @@ struct hfi1_qp_priv {
u8 s_sc; /* SC[0..4] for next packet */
struct iowait s_iowait;
struct rvt_qp *owner;
+ u8 hdr_type; /* 9B or 16B */
};
/*
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 3cb31e4..991783d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -862,6 +862,7 @@ struct roce_ah_attr {
struct opa_ah_attr {
u32 dlid;
u8 src_path_bits;
+ bool make_grd;
};
struct rdma_ah_attr {
@@ -3619,6 +3620,20 @@ static inline u8 rdma_ah_get_path_bits(const struct rdma_ah_attr *attr)
return 0;
}
+static inline void rdma_ah_set_make_grd(struct rdma_ah_attr *attr,
+ bool make_grd)
+{
+ if (attr->type == RDMA_AH_ATTR_TYPE_OPA)
+ attr->opa.make_grd = make_grd;
+}
+
+static inline bool rdma_ah_get_make_grd(const struct rdma_ah_attr *attr)
+{
+ if (attr->type == RDMA_AH_ATTR_TYPE_OPA)
+ return attr->opa.make_grd;
+ return false;
+}
+
static inline void rdma_ah_set_port_num(struct rdma_ah_attr *attr, u8 port_num)
{
attr->port_num = port_num;
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index 8d3ad4e..9ae126f 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -71,7 +71,7 @@
*
* @gid: The Global identifier
*/
-static inline bool ib_is_opa_gid(union ib_gid *gid)
+static inline bool ib_is_opa_gid(const union ib_gid *gid)
{
return ((be64_to_cpu(gid->global.interface_id) >> 40) ==
OPA_SPECIAL_OUI);
@@ -84,7 +84,7 @@ static inline bool ib_is_opa_gid(union ib_gid *gid)
*
* @gid: The Global identifier
*/
-static inline u32 opa_get_lid_from_gid(union ib_gid *gid)
+static inline u32 opa_get_lid_from_gid(const union ib_gid *gid)
{
return be64_to_cpu(gid->global.interface_id) & 0xFFFFFFFF;
}
--
* [PATCH for-next 22/27] IB/hfi1: Add 16B UD support
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (19 preceding siblings ...)
2017-08-04 20:54 ` [PATCH for-next 21/27] IB/hfi1: Determine 9B/16B L2 header type based on Address handle Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 23/27] IB/hfi1: Add 16B trace support Dennis Dalessandro
` (4 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Add 16B bypass packet support for UD traffic types.
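One piece worth calling out is the 16B padding arithmetic
(hfi1_get_16b_padding()): 16B packets carry an ICRC plus a tail (LT) byte
and must end on an 8-byte flit boundary. A stand-alone sketch of that
calculation, assuming a 4-byte CRC and a 1-byte LT:

#include <stdint.h>

#define CRC_BYTES 4 /* assumed: SIZE_OF_CRC is one dword */
#define LT_BYTES  1 /* assumed: SIZE_OF_LT is one byte */

/* Pad bytes so header + payload + CRC + LT ends on an 8-byte flit. */
static uint32_t pad_16b(uint32_t hdr_bytes, uint32_t payload_bytes)
{
        return -(hdr_bytes + payload_bytes + CRC_BYTES + LT_BYTES) & 0x7;
}

/*
 * Example: a 36-byte 16B LRH+BTH+DETH header and a 13-byte payload:
 * 36 + 13 + 4 + 1 = 54 bytes, pad_16b() returns 2, total 56 bytes = 7 flits.
 */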
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/driver.c | 35 ++-
drivers/infiniband/hw/hfi1/hfi.h | 117 +++++++++-
drivers/infiniband/hw/hfi1/mad.c | 8 -
drivers/infiniband/hw/hfi1/ruc.c | 4
drivers/infiniband/hw/hfi1/ud.c | 421 ++++++++++++++++++++++++++---------
drivers/infiniband/hw/hfi1/verbs.h | 2
include/rdma/opa_addr.h | 1
7 files changed, 457 insertions(+), 131 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index ae6a90d..fc7085d 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -437,23 +437,33 @@ void hfi1_process_ecn_slowpath(struct rvt_qp *qp, struct hfi1_packet *pkt,
bool do_cnp)
{
struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
- struct ib_header *hdr = pkt->hdr;
struct ib_other_headers *ohdr = pkt->ohdr;
struct ib_grh *grh = pkt->grh;
u32 rqpn = 0, bth1;
- u16 rlid, dlid = ib_get_dlid(hdr);
- u8 sc, svc_type;
+ u16 pkey, rlid, dlid = ib_get_dlid(pkt->hdr);
+ u8 hdr_type, sc, svc_type;
bool is_mcast = false;
+ if (pkt->etype == RHF_RCV_TYPE_BYPASS) {
+ is_mcast = hfi1_is_16B_mcast(dlid);
+ pkey = hfi1_16B_get_pkey(pkt->hdr);
+ sc = hfi1_16B_get_sc(pkt->hdr);
+ hdr_type = HFI1_PKT_TYPE_16B;
+ } else {
+ is_mcast = (dlid > be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
+ (dlid != be16_to_cpu(IB_LID_PERMISSIVE));
+ pkey = ib_bth_get_pkey(ohdr);
+ sc = hfi1_9B_get_sc5(pkt->hdr, pkt->rhf);
+ hdr_type = HFI1_PKT_TYPE_9B;
+ }
+
switch (qp->ibqp.qp_type) {
case IB_QPT_SMI:
case IB_QPT_GSI:
case IB_QPT_UD:
- rlid = ib_get_slid(hdr);
- rqpn = ib_get_sqpn(ohdr);
+ rlid = ib_get_slid(pkt->hdr);
+ rqpn = ib_get_sqpn(pkt->ohdr);
svc_type = IB_CC_SVCTYPE_UD;
- is_mcast = (dlid > be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
- (dlid != be16_to_cpu(IB_LID_PERMISSIVE));
break;
case IB_QPT_UC:
rlid = rdma_ah_get_dlid(&qp->remote_ah_attr);
@@ -469,14 +479,11 @@ void hfi1_process_ecn_slowpath(struct rvt_qp *qp, struct hfi1_packet *pkt,
return;
}
- sc = hfi1_9B_get_sc5(hdr, pkt->rhf);
-
bth1 = be32_to_cpu(ohdr->bth[1]);
- if (do_cnp && (bth1 & IB_FECN_SMASK)) {
- u16 pkey = ib_bth_get_pkey(ohdr);
-
- return_cnp(ibp, qp, rqpn, pkey, dlid, rlid, sc, grh);
- }
+ /* Call appropriate CNP handler */
+ if (do_cnp && (bth1 & IB_FECN_SMASK))
+ hfi1_handle_cnp_tbl[hdr_type](ibp, qp, rqpn, pkey,
+ dlid, rlid, sc, grh);
if (!is_mcast && (bth1 & IB_BECN_SMASK)) {
struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index cec9590..7e21192 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -831,6 +831,10 @@ struct hfi1_pportdata {
typedef int (*rhf_rcv_function_ptr)(struct hfi1_packet *packet);
typedef void (*opcode_handler)(struct hfi1_packet *packet);
+typedef void (*hfi1_make_req)(struct rvt_qp *qp,
+ struct hfi1_pkt_state *ps,
+ struct rvt_swqe *wqe);
+
/* return values for the RHF receive functions */
#define RHF_RCV_CONTINUE 0 /* keep going */
@@ -1373,6 +1377,13 @@ void hfi1_init_pportdata(struct pci_dev *pdev, struct hfi1_pportdata *ppd,
void hfi1_reset_vnic_msix_info(struct hfi1_ctxtdata *rcd);
extern const struct pci_device_id hfi1_pci_tbl[];
+void hfi1_make_ud_req_9B(struct rvt_qp *qp,
+ struct hfi1_pkt_state *ps,
+ struct rvt_swqe *wqe);
+
+void hfi1_make_ud_req_16B(struct rvt_qp *qp,
+ struct hfi1_pkt_state *ps,
+ struct rvt_swqe *wqe);
/* receive packet handler dispositions */
#define RCV_PKT_OK 0x0 /* keep going */
@@ -1507,6 +1518,18 @@ void process_becn(struct hfi1_pportdata *ppd, u8 sl, u16 rlid, u32 lqpn,
void return_cnp(struct hfi1_ibport *ibp, struct rvt_qp *qp, u32 remote_qpn,
u32 pkey, u32 slid, u32 dlid, u8 sc5,
const struct ib_grh *old_grh);
+void return_cnp_16B(struct hfi1_ibport *ibp, struct rvt_qp *qp,
+ u32 remote_qpn, u32 pkey, u32 slid, u32 dlid,
+ u8 sc5, const struct ib_grh *old_grh);
+typedef void (*hfi1_handle_cnp)(struct hfi1_ibport *ibp, struct rvt_qp *qp,
+ u32 remote_qpn, u32 pkey, u32 slid, u32 dlid,
+ u8 sc5, const struct ib_grh *old_grh);
+
+/* We support only two types - 9B and 16B for now */
+static const hfi1_handle_cnp hfi1_handle_cnp_tbl[2] = {
+ [HFI1_PKT_TYPE_9B] = &return_cnp,
+ [HFI1_PKT_TYPE_16B] = &return_cnp_16B
+};
#define PKEY_CHECK_INVALID -1
int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
u8 sc5, int8_t s_pkey_index);
@@ -1747,12 +1770,22 @@ static inline bool process_ecn(struct rvt_qp *qp, struct hfi1_packet *pkt,
bool do_cnp)
{
struct ib_other_headers *ohdr = pkt->ohdr;
- u32 bth1;
- bth1 = be32_to_cpu(ohdr->bth[1]);
- if (unlikely(bth1 & (IB_BECN_SMASK | IB_FECN_SMASK))) {
+ u32 bth1;
+ bool becn = false;
+ bool fecn = false;
+
+ if (pkt->etype == RHF_RCV_TYPE_BYPASS) {
+ fecn = hfi1_16B_get_fecn(pkt->hdr);
+ becn = hfi1_16B_get_becn(pkt->hdr);
+ } else {
+ bth1 = be32_to_cpu(ohdr->bth[1]);
+ fecn = bth1 & IB_FECN_SMASK;
+ becn = bth1 & IB_BECN_SMASK;
+ }
+ if (unlikely(fecn || becn)) {
hfi1_process_ecn_slowpath(qp, pkt, do_cnp);
- return !!(bth1 & IB_FECN_SMASK);
+ return fecn;
}
return false;
}
@@ -2315,4 +2348,80 @@ static inline bool hfi1_get_hdr_type(u32 lid, struct rdma_ah_attr *attr)
return hfi1_get_packet_type(lid);
}
+
+static inline void hfi1_make_ext_grh(struct hfi1_packet *packet,
+ struct ib_grh *grh, u32 slid,
+ u32 dlid)
+{
+ struct hfi1_ibport *ibp = &packet->rcd->ppd->ibport_data;
+ struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+
+ if (!ibp)
+ return;
+
+ grh->hop_limit = 1;
+ grh->sgid.global.subnet_prefix = ibp->rvp.gid_prefix;
+ if (slid == opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B))
+ grh->sgid.global.interface_id =
+ OPA_MAKE_ID(be32_to_cpu(OPA_LID_PERMISSIVE));
+ else
+ grh->sgid.global.interface_id = OPA_MAKE_ID(slid);
+
+ /*
+ * Upper layers (like mad) may compare the dgid in the
+ * wc that is obtained here with the sgid_index in
+ * the wr. Since sgid_index in wr is always 0 for
+ * extended lids, set the dgid here to the default
+ * IB gid.
+ */
+ grh->dgid.global.subnet_prefix = ibp->rvp.gid_prefix;
+ grh->dgid.global.interface_id =
+ cpu_to_be64(ppd->guids[HFI1_PORT_GUID_INDEX]);
+}
+
+static inline int hfi1_get_16b_padding(u32 hdr_size, u32 payload)
+{
+ return -(hdr_size + payload + (SIZE_OF_CRC << 2) +
+ SIZE_OF_LT) & 0x7;
+}
+
+static inline void hfi1_make_ib_hdr(struct ib_header *hdr,
+ u16 lrh0, u16 len,
+ u16 dlid, u16 slid)
+{
+ hdr->lrh[0] = cpu_to_be16(lrh0);
+ hdr->lrh[1] = cpu_to_be16(dlid);
+ hdr->lrh[2] = cpu_to_be16(len);
+ hdr->lrh[3] = cpu_to_be16(slid);
+}
+
+static inline void hfi1_make_16b_hdr(struct hfi1_16b_header *hdr,
+ u32 slid, u32 dlid,
+ u16 len, u16 pkey,
+ u8 becn, u8 fecn, u8 l4,
+ u8 sc)
+{
+ u32 lrh0 = 0;
+ u32 lrh1 = 0x40000000;
+ u32 lrh2 = 0;
+ u32 lrh3 = 0;
+
+ lrh0 = (lrh0 & ~OPA_16B_BECN_MASK) | (becn << OPA_16B_BECN_SHIFT);
+ lrh0 = (lrh0 & ~OPA_16B_LEN_MASK) | (len << OPA_16B_LEN_SHIFT);
+ lrh0 = (lrh0 & ~OPA_16B_LID_MASK) | (slid & OPA_16B_LID_MASK);
+ lrh1 = (lrh1 & ~OPA_16B_FECN_MASK) | (fecn << OPA_16B_FECN_SHIFT);
+ lrh1 = (lrh1 & ~OPA_16B_SC_MASK) | (sc << OPA_16B_SC_SHIFT);
+ lrh1 = (lrh1 & ~OPA_16B_LID_MASK) | (dlid & OPA_16B_LID_MASK);
+ lrh2 = (lrh2 & ~OPA_16B_SLID_MASK) |
+ ((slid >> OPA_16B_SLID_SHIFT) << OPA_16B_SLID_HIGH_SHIFT);
+ lrh2 = (lrh2 & ~OPA_16B_DLID_MASK) |
+ ((dlid >> OPA_16B_DLID_SHIFT) << OPA_16B_DLID_HIGH_SHIFT);
+ lrh2 = (lrh2 & ~OPA_16B_PKEY_MASK) | (pkey << OPA_16B_PKEY_SHIFT);
+ lrh2 = (lrh2 & ~OPA_16B_L4_MASK) | l4;
+
+ hdr->lrh[0] = lrh0;
+ hdr->lrh[1] = lrh1;
+ hdr->lrh[2] = lrh2;
+ hdr->lrh[3] = lrh3;
+}
#endif /* _HFI1_KERNEL_H */
diff --git a/drivers/infiniband/hw/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
index dea3fa0..40f6104 100644
--- a/drivers/infiniband/hw/hfi1/mad.c
+++ b/drivers/infiniband/hw/hfi1/mad.c
@@ -373,12 +373,10 @@ void hfi1_handle_trap_timer(unsigned long data)
* Send a bad P_Key trap (ch. 14.3.8).
*/
void hfi1_bad_pkey(struct hfi1_ibport *ibp, u32 key, u32 sl,
- u32 qp1, u32 qp2, u16 lid1, u16 lid2)
+ u32 qp1, u32 qp2, u32 lid1, u32 lid2)
{
struct trap_node *trap;
u32 lid = ppd_from_ibp(ibp)->lid;
- u32 _lid1 = lid1;
- u32 _lid2 = lid2;
ibp->rvp.n_pkt_drops++;
ibp->rvp.pkey_violations++;
@@ -389,8 +387,8 @@ void hfi1_bad_pkey(struct hfi1_ibport *ibp, u32 key, u32 sl,
return;
/* Send violation trap */
- trap->data.ntc_257_258.lid1 = cpu_to_be32(_lid1);
- trap->data.ntc_257_258.lid2 = cpu_to_be32(_lid2);
+ trap->data.ntc_257_258.lid1 = cpu_to_be32(lid1);
+ trap->data.ntc_257_258.lid2 = cpu_to_be32(lid2);
trap->data.ntc_257_258.key = cpu_to_be32(key);
trap->data.ntc_257_258.sl = sl << 3;
trap->data.ntc_257_258.qp1 = cpu_to_be32(qp1);
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index d252f8f..6839bfa 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -649,7 +649,7 @@ static void ruc_loopback(struct rvt_qp *sqp)
* @ibp: a pointer to the IB port
* @hdr: a pointer to the GRH header being constructed
* @grh: the global route address to send to
- * @hwords: the number of 32 bit words of header being sent
+ * @hwords: size of the header after the GRH, in dwords
* @nwords: the number of 32 bit words of data being sent
*
* Return the size of the header in 32 bit words.
@@ -661,7 +661,7 @@ u32 hfi1_make_grh(struct hfi1_ibport *ibp, struct ib_grh *hdr,
cpu_to_be32((IB_GRH_VERSION << IB_GRH_VERSION_SHIFT) |
(grh->traffic_class << IB_GRH_TCLASS_SHIFT) |
(grh->flow_label << IB_GRH_FLOW_SHIFT));
- hdr->paylen = cpu_to_be16((hwords - 2 + nwords + SIZE_OF_CRC) << 2);
+ hdr->paylen = cpu_to_be16((hwords + nwords) << 2);
/* next_hdr is defined by C8-7 in ch. 8.4.1 */
hdr->next_hdr = IB_GRH_NEXT_HDR;
hdr->hop_limit = grh->hop_limit;
diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
index b708376..2ba74fd 100644
--- a/drivers/infiniband/hw/hfi1/ud.c
+++ b/drivers/infiniband/hw/hfi1/ud.c
@@ -53,6 +53,12 @@
#include "verbs_txreq.h"
#include "qp.h"
+/* We support only two types - 9B and 16B for now */
+static const hfi1_make_req hfi1_make_ud_req_tbl[2] = {
+ [HFI1_PKT_TYPE_9B] = &hfi1_make_ud_req_9B,
+ [HFI1_PKT_TYPE_16B] = &hfi1_make_ud_req_16B
+};
+
/**
* ud_loopback - handle send on loopback QPs
* @sqp: the sending QP
@@ -67,6 +73,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
{
struct hfi1_ibport *ibp = to_iport(sqp->ibqp.device, sqp->port_num);
struct hfi1_pportdata *ppd;
+ struct hfi1_qp_priv *priv = sqp->priv;
struct rvt_qp *qp;
struct rdma_ah_attr *ah_attr;
unsigned long flags;
@@ -102,7 +109,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
if (qp->ibqp.qp_num > 1) {
u16 pkey;
- u16 slid;
+ u32 slid;
u8 sc5 = ibp->sl_to_sc[rdma_ah_get_sl(ah_attr)];
pkey = hfi1_get_pkey(ibp, sqp->s_pkey_index);
@@ -176,9 +183,33 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
struct ib_grh grh;
- const struct ib_global_route *grd = rdma_ah_read_grh(ah_attr);
+ struct ib_global_route grd = *(rdma_ah_read_grh(ah_attr));
+
+ /*
+ * For loopback packets with extended LIDs, the
+ * sgid_index in the GRH is 0 and the dgid is
+ * the OPA GID of the sender. While creating a response
+ * to the loopback packet, IB core creates the new
+ * sgid_index from the DGID and that will be the
+ * OPA_GID_INDEX. The new dgid is from the sgid
+ * index and that will be in the IB GID format.
+ *
+ * We now have a case where the sent packet had a
+ * different sgid_index and dgid compared to the
+ * one that was received in response.
+ *
+ * Fix this inconsistency.
+ */
+ if (priv->hdr_type == HFI1_PKT_TYPE_16B) {
+ if (grd.sgid_index == 0)
+ grd.sgid_index = OPA_GID_INDEX;
- hfi1_make_grh(ibp, &grh, grd, 0, 0);
+ if (ib_is_opa_gid(&grd.dgid))
+ grd.dgid.global.interface_id =
+ cpu_to_be64(ppd->guids[HFI1_PORT_GUID_INDEX]);
+ }
+
+ hfi1_make_grh(ibp, &grh, &grd, 0, 0);
hfi1_copy_sge(&qp->r_sge, &grh,
sizeof(grh), true, false);
wc.wc_flags |= IB_WC_GRH;
@@ -235,7 +266,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
wc.pkey_index = 0;
}
wc.slid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
- ((1 << ppd->lmc) - 1));
+ ((1 << ppd->lmc) - 1));
/* Check for loopback when the port lid is not set */
if (wc.slid == 0 && sqp->ibqp.qp_type == IB_QPT_GSI)
wc.slid = be16_to_cpu(IB_LID_PERMISSIVE);
@@ -252,6 +283,183 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
rcu_read_unlock();
}
+static void hfi1_make_bth_deth(struct rvt_qp *qp, struct rvt_swqe *wqe,
+ struct ib_other_headers *ohdr,
+ u16 *pkey, u32 extra_bytes, bool bypass)
+{
+ u32 bth0;
+ struct hfi1_ibport *ibp;
+
+ ibp = to_iport(qp->ibqp.device, qp->port_num);
+ if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) {
+ ohdr->u.ud.imm_data = wqe->wr.ex.imm_data;
+ bth0 = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE << 24;
+ } else {
+ bth0 = IB_OPCODE_UD_SEND_ONLY << 24;
+ }
+
+ if (wqe->wr.send_flags & IB_SEND_SOLICITED)
+ bth0 |= IB_BTH_SOLICITED;
+ bth0 |= extra_bytes << 20;
+ if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_SMI)
+ *pkey = hfi1_get_pkey(ibp, wqe->ud_wr.pkey_index);
+ else
+ *pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+ if (!bypass)
+ bth0 |= *pkey;
+ ohdr->bth[0] = cpu_to_be32(bth0);
+ ohdr->bth[1] = cpu_to_be32(wqe->ud_wr.remote_qpn);
+ ohdr->bth[2] = cpu_to_be32(mask_psn(wqe->psn));
+ /*
+ * Qkeys with the high order bit set mean use the
+ * qkey from the QP context instead of the WR (see 10.2.5).
+ */
+ ohdr->u.ud.deth[0] = cpu_to_be32((int)wqe->ud_wr.remote_qkey < 0 ?
+ qp->qkey : wqe->ud_wr.remote_qkey);
+ ohdr->u.ud.deth[1] = cpu_to_be32(qp->ibqp.qp_num);
+}
+
+void hfi1_make_ud_req_9B(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
+ struct rvt_swqe *wqe)
+{
+ u32 nwords, extra_bytes;
+ u16 len, slid, dlid, pkey;
+ u16 lrh0 = 0;
+ u8 sc5;
+ struct hfi1_qp_priv *priv = qp->priv;
+ struct ib_other_headers *ohdr;
+ struct rdma_ah_attr *ah_attr;
+ struct hfi1_pportdata *ppd;
+ struct hfi1_ibport *ibp;
+ struct ib_grh *grh;
+
+ ibp = to_iport(qp->ibqp.device, qp->port_num);
+ ppd = ppd_from_ibp(ibp);
+ ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
+
+ extra_bytes = -wqe->length & 3;
+ nwords = ((wqe->length + extra_bytes) >> 2) + SIZE_OF_CRC;
+ /* header size in dwords LRH+BTH+DETH = (8+12+8)/4. */
+ qp->s_hdrwords = 7;
+ if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM)
+ qp->s_hdrwords++;
+
+ if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
+ grh = &ps->s_txreq->phdr.hdr.ibh.u.l.grh;
+ qp->s_hdrwords += hfi1_make_grh(ibp, grh,
+ rdma_ah_read_grh(ah_attr),
+ qp->s_hdrwords - 2, nwords);
+ lrh0 = HFI1_LRH_GRH;
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
+ } else {
+ lrh0 = HFI1_LRH_BTH;
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
+ }
+
+ sc5 = ibp->sl_to_sc[rdma_ah_get_sl(ah_attr)];
+ lrh0 |= (rdma_ah_get_sl(ah_attr) & 0xf) << 4;
+ if (qp->ibqp.qp_type == IB_QPT_SMI) {
+ lrh0 |= 0xF000; /* Set VL (see ch. 13.5.3.1) */
+ priv->s_sc = 0xf;
+ } else {
+ lrh0 |= (sc5 & 0xf) << 12;
+ priv->s_sc = sc5;
+ }
+
+ dlid = opa_get_lid(rdma_ah_get_dlid(ah_attr), 9B);
+ if (dlid == be16_to_cpu(IB_LID_PERMISSIVE)) {
+ slid = be16_to_cpu(IB_LID_PERMISSIVE);
+ } else {
+ u16 lid = (u16)ppd->lid;
+
+ if (lid) {
+ lid |= rdma_ah_get_path_bits(ah_attr) &
+ ((1 << ppd->lmc) - 1);
+ slid = lid;
+ } else {
+ slid = be16_to_cpu(IB_LID_PERMISSIVE);
+ }
+ }
+ hfi1_make_bth_deth(qp, wqe, ohdr, &pkey, extra_bytes, false);
+ len = qp->s_hdrwords + nwords;
+
+ /* Setup the packet */
+ ps->s_txreq->phdr.hdr.hdr_type = HFI1_PKT_TYPE_9B;
+ hfi1_make_ib_hdr(&ps->s_txreq->phdr.hdr.ibh,
+ lrh0, len, dlid, slid);
+}
+
+void hfi1_make_ud_req_16B(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
+ struct rvt_swqe *wqe)
+{
+ struct hfi1_qp_priv *priv = qp->priv;
+ struct ib_other_headers *ohdr;
+ struct rdma_ah_attr *ah_attr;
+ struct hfi1_pportdata *ppd;
+ struct hfi1_ibport *ibp;
+ u32 dlid, slid, nwords, extra_bytes;
+ u16 len, pkey;
+ u8 l4, sc5;
+
+ ibp = to_iport(qp->ibqp.device, qp->port_num);
+ ppd = ppd_from_ibp(ibp);
+ ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
+ /* header size in dwords 16B LRH+BTH+DETH = (16+12+8)/4. */
+ qp->s_hdrwords = 9;
+ if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM)
+ qp->s_hdrwords++;
+
+ /* SW provides space for CRC and LT for bypass packets. */
+ extra_bytes = hfi1_get_16b_padding((qp->s_hdrwords << 2),
+ wqe->length);
+ nwords = ((wqe->length + extra_bytes + SIZE_OF_LT) >> 2) + SIZE_OF_CRC;
+
+ if ((rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) &&
+ hfi1_check_mcast(rdma_ah_get_dlid(ah_attr))) {
+ struct ib_grh *grh;
+ struct ib_global_route *grd = rdma_ah_retrieve_grh(ah_attr);
+ /*
+ * Ensure OPA GIDs are transformed to IB gids
+ * before creating the GRH.
+ */
+ if (grd->sgid_index == OPA_GID_INDEX) {
+ dd_dev_warn(ppd->dd, "Bad sgid_index. sgid_index: %d\n",
+ grd->sgid_index);
+ grd->sgid_index = 0;
+ }
+ grh = &ps->s_txreq->phdr.hdr.opah.u.l.grh;
+ qp->s_hdrwords += hfi1_make_grh(ibp, grh, grd,
+ qp->s_hdrwords - 4, nwords);
+ ohdr = &ps->s_txreq->phdr.hdr.opah.u.l.oth;
+ l4 = OPA_16B_L4_IB_GLOBAL;
+ } else {
+ ohdr = &ps->s_txreq->phdr.hdr.opah.u.oth;
+ l4 = OPA_16B_L4_IB_LOCAL;
+ }
+
+ sc5 = ibp->sl_to_sc[rdma_ah_get_sl(ah_attr)];
+ if (qp->ibqp.qp_type == IB_QPT_SMI)
+ priv->s_sc = 0xf;
+ else
+ priv->s_sc = sc5;
+
+ dlid = opa_get_lid(rdma_ah_get_dlid(ah_attr), 16B);
+ if (!ppd->lid)
+ slid = be32_to_cpu(OPA_LID_PERMISSIVE);
+ else
+ slid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
+ ((1 << ppd->lmc) - 1));
+
+ hfi1_make_bth_deth(qp, wqe, ohdr, &pkey, extra_bytes, true);
+ /* Convert dwords to flits */
+ len = (qp->s_hdrwords + nwords) >> 1;
+
+ /* Setup the packet */
+ ps->s_txreq->phdr.hdr.hdr_type = HFI1_PKT_TYPE_16B;
+ hfi1_make_16b_hdr(&ps->s_txreq->phdr.hdr.opah,
+ slid, dlid, len, pkey, 0, 0, l4, priv->s_sc);
+}
+
/**
* hfi1_make_ud_req - construct a UD request packet
* @qp: the QP
@@ -263,18 +471,12 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
{
struct hfi1_qp_priv *priv = qp->priv;
- struct ib_other_headers *ohdr;
struct rdma_ah_attr *ah_attr;
struct hfi1_pportdata *ppd;
struct hfi1_ibport *ibp;
struct rvt_swqe *wqe;
- u32 nwords;
- u32 extra_bytes;
- u32 bth0;
- u16 lrh0;
- u16 lid;
int next_cur;
- u8 sc5;
+ u32 lid;
ps->s_txreq = get_txreq(ps->dev, qp);
if (IS_ERR(ps->s_txreq))
@@ -311,13 +513,14 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
ibp = to_iport(qp->ibqp.device, qp->port_num);
ppd = ppd_from_ibp(ibp);
ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
- if (rdma_ah_get_dlid(ah_attr) < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
- rdma_ah_get_dlid(ah_attr) == be16_to_cpu(IB_LID_PERMISSIVE)) {
+ priv->hdr_type = hfi1_get_hdr_type(ppd->lid, ah_attr);
+ if ((!hfi1_check_mcast(rdma_ah_get_dlid(ah_attr))) ||
+ (rdma_ah_get_dlid(ah_attr) == be32_to_cpu(OPA_LID_PERMISSIVE))) {
lid = rdma_ah_get_dlid(ah_attr) & ~((1 << ppd->lmc) - 1);
if (unlikely(!loopback &&
- (lid == ppd->lid ||
- (lid == be16_to_cpu(IB_LID_PERMISSIVE) &&
- qp->ibqp.qp_type == IB_QPT_GSI)))) {
+ ((lid == ppd->lid) ||
+ ((lid == be32_to_cpu(OPA_LID_PERMISSIVE)) &&
+ (qp->ibqp.qp_type == IB_QPT_GSI))))) {
unsigned long tflags = ps->flags;
/*
* If DMAs are in progress, we can't generate
@@ -341,11 +544,6 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
}
qp->s_cur = next_cur;
- extra_bytes = -wqe->length & 3;
- nwords = (wqe->length + extra_bytes) >> 2;
-
- /* header size in 32-bit words LRH+BTH+DETH = (8+12+8)/4. */
- qp->s_hdrwords = 7;
ps->s_txreq->s_cur_size = wqe->length;
ps->s_txreq->ss = &qp->s_sge;
qp->s_srate = rdma_ah_get_static_rate(ah_attr);
@@ -356,78 +554,12 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
qp->s_sge.num_sge = wqe->wr.num_sge;
qp->s_sge.total_len = wqe->length;
- if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
- /* Header size in 32-bit words. */
- qp->s_hdrwords +=
- hfi1_make_grh(ibp,
- &ps->s_txreq->phdr.hdr.ibh.u.l.grh,
- rdma_ah_read_grh(ah_attr),
- qp->s_hdrwords, nwords);
- lrh0 = HFI1_LRH_GRH;
- ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
- /*
- * Don't worry about sending to locally attached multicast
- * QPs. It is unspecified by the spec. what happens.
- */
- } else {
- /* Header size in 32-bit words. */
- lrh0 = HFI1_LRH_BTH;
- ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
- }
- if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) {
- qp->s_hdrwords++;
- ohdr->u.ud.imm_data = wqe->wr.ex.imm_data;
- bth0 = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE << 24;
- } else {
- bth0 = IB_OPCODE_UD_SEND_ONLY << 24;
- }
- sc5 = ibp->sl_to_sc[rdma_ah_get_sl(ah_attr)];
- lrh0 |= (rdma_ah_get_sl(ah_attr) & 0xf) << 4;
- if (qp->ibqp.qp_type == IB_QPT_SMI) {
- lrh0 |= 0xF000; /* Set VL (see ch. 13.5.3.1) */
- priv->s_sc = 0xf;
- } else {
- lrh0 |= (sc5 & 0xf) << 12;
- priv->s_sc = sc5;
- }
+ /* Make the appropriate header */
+ hfi1_make_ud_req_tbl[priv->hdr_type](qp, ps, qp->s_wqe);
priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
ps->s_txreq->sde = priv->s_sde;
priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
ps->s_txreq->psc = priv->s_sendcontext;
- ps->s_txreq->phdr.hdr.ibh.lrh[0] = cpu_to_be16(lrh0);
- ps->s_txreq->phdr.hdr.ibh.lrh[1] =
- cpu_to_be16(rdma_ah_get_dlid(ah_attr));
- ps->s_txreq->phdr.hdr.ibh.lrh[2] =
- cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
- if (rdma_ah_get_dlid(ah_attr) == be16_to_cpu(IB_LID_PERMISSIVE)) {
- ps->s_txreq->phdr.hdr.ibh.lrh[3] = IB_LID_PERMISSIVE;
- } else {
- lid = ppd->lid;
- if (lid) {
- lid |= rdma_ah_get_path_bits(ah_attr) &
- ((1 << ppd->lmc) - 1);
- ps->s_txreq->phdr.hdr.ibh.lrh[3] = cpu_to_be16(lid);
- } else {
- ps->s_txreq->phdr.hdr.ibh.lrh[3] = IB_LID_PERMISSIVE;
- }
- }
- if (wqe->wr.send_flags & IB_SEND_SOLICITED)
- bth0 |= IB_BTH_SOLICITED;
- bth0 |= extra_bytes << 20;
- if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_SMI)
- bth0 |= hfi1_get_pkey(ibp, wqe->ud_wr.pkey_index);
- else
- bth0 |= hfi1_get_pkey(ibp, qp->s_pkey_index);
- ohdr->bth[0] = cpu_to_be32(bth0);
- ohdr->bth[1] = cpu_to_be32(wqe->ud_wr.remote_qpn);
- ohdr->bth[2] = cpu_to_be32(mask_psn(wqe->psn));
- /*
- * Qkeys with the high order bit set mean use the
- * qkey from the QP context instead of the WR (see 10.2.5).
- */
- ohdr->u.ud.deth[0] = cpu_to_be32((int)wqe->ud_wr.remote_qkey < 0 ?
- qp->qkey : wqe->ud_wr.remote_qkey);
- ohdr->u.ud.deth[1] = cpu_to_be32(qp->ibqp.qp_num);
/* disarm any ahg */
priv->s_ahg->ahgcount = 0;
priv->s_ahg->ahgidx = 0;
@@ -497,6 +629,64 @@ int hfi1_lookup_pkey_idx(struct hfi1_ibport *ibp, u16 pkey)
return -1;
}
+void return_cnp_16B(struct hfi1_ibport *ibp, struct rvt_qp *qp,
+ u32 remote_qpn, u32 pkey, u32 slid, u32 dlid,
+ u8 sc5, const struct ib_grh *old_grh)
+{
+ u64 pbc, pbc_flags = 0;
+ u32 bth0, plen, vl, hwords = 7;
+ u16 len;
+ u8 l4;
+ struct hfi1_16b_header hdr;
+ struct ib_other_headers *ohdr;
+ struct pio_buf *pbuf;
+ struct send_context *ctxt = qp_to_send_context(qp, sc5);
+ struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+ u32 nwords;
+
+ /* Populate length */
+ nwords = ((hfi1_get_16b_padding(hwords << 2, 0) +
+ SIZE_OF_LT) >> 2) + SIZE_OF_CRC;
+ if (old_grh) {
+ struct ib_grh *grh = &hdr.u.l.grh;
+
+ grh->version_tclass_flow = old_grh->version_tclass_flow;
+ grh->paylen = cpu_to_be16((hwords - 4 + nwords) << 2);
+ grh->hop_limit = 0xff;
+ grh->sgid = old_grh->dgid;
+ grh->dgid = old_grh->sgid;
+ ohdr = &hdr.u.l.oth;
+ l4 = OPA_16B_L4_IB_GLOBAL;
+ hwords += sizeof(struct ib_grh) / sizeof(u32);
+ } else {
+ ohdr = &hdr.u.oth;
+ l4 = OPA_16B_L4_IB_LOCAL;
+ }
+
+ /* BIT 16 to 19 is TVER. Bit 20 to 22 is pad cnt */
+ bth0 = (IB_OPCODE_CNP << 24) | (1 << 16) |
+ (hfi1_get_16b_padding(hwords << 2, 0) << 20);
+ ohdr->bth[0] = cpu_to_be32(bth0);
+
+ ohdr->bth[1] = cpu_to_be32(remote_qpn);
+ ohdr->bth[2] = 0; /* PSN 0 */
+
+ /* Convert dwords to flits */
+ len = (hwords + nwords) >> 1;
+ hfi1_make_16b_hdr(&hdr, slid, dlid, len, pkey, 1, 0, l4, sc5);
+
+ plen = 2 /* PBC */ + hwords + nwords;
+ pbc_flags |= PBC_PACKET_BYPASS | PBC_INSERT_BYPASS_ICRC;
+ vl = sc_to_vlt(ppd->dd, sc5);
+ pbc = create_pbc(ppd, pbc_flags, qp->srate_mbps, vl, plen);
+ if (ctxt) {
+ pbuf = sc_buffer_alloc(ctxt, plen, NULL, NULL);
+ if (pbuf)
+ ppd->dd->pio_inline_send(ppd->dd, pbuf, pbc,
+ &hdr, hwords);
+ }
+}
+
void return_cnp(struct hfi1_ibport *ibp, struct rvt_qp *qp, u32 remote_qpn,
u32 pkey, u32 slid, u32 dlid, u8 sc5,
const struct ib_grh *old_grh)
@@ -535,11 +725,7 @@ void return_cnp(struct hfi1_ibport *ibp, struct rvt_qp *qp, u32 remote_qpn,
ohdr->bth[1] = cpu_to_be32(remote_qpn | (1 << IB_BECN_SHIFT));
ohdr->bth[2] = 0; /* PSN 0 */
- hdr.lrh[0] = cpu_to_be16(lrh0);
- hdr.lrh[1] = cpu_to_be16(dlid);
- hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
- hdr.lrh[3] = cpu_to_be16(slid);
-
+ hfi1_make_ib_hdr(&hdr, lrh0, hwords + SIZE_OF_CRC, dlid, slid);
plen = 2 /* PBC */ + hwords;
pbc_flags |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
vl = sc_to_vlt(ppd->dd, sc5);
@@ -672,18 +858,33 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
void *data = packet->payload;
u32 tlen = packet->tlen;
struct rvt_qp *qp = packet->qp;
- u8 sc5 = hfi1_9B_get_sc5(hdr, packet->rhf);
+ u8 sc5 = packet->sc;
u8 sl_from_sc;
- u8 extra_bytes = packet->pad;
u8 opcode = packet->opcode;
u8 sl = packet->sl;
u32 dlid = packet->dlid;
u32 slid = packet->slid;
+ u8 extra_bytes;
+ bool dlid_is_permissive;
+ bool slid_is_permissive;
+ extra_bytes = packet->pad + packet->extra_byte + (SIZE_OF_CRC << 2);
qkey = ib_get_qkey(ohdr);
src_qp = ib_get_sqpn(ohdr);
- pkey = ib_bth_get_pkey(ohdr);
- extra_bytes += (SIZE_OF_CRC << 2);
+
+ if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+ u32 permissive_lid =
+ opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B);
+
+ pkey = hfi1_16B_get_pkey(packet->hdr);
+ dlid_is_permissive = (dlid == permissive_lid);
+ slid_is_permissive = (slid == permissive_lid);
+ } else {
+ hdr = packet->hdr;
+ pkey = ib_bth_get_pkey(ohdr);
+ dlid_is_permissive = (dlid == be16_to_cpu(IB_LID_PERMISSIVE));
+ slid_is_permissive = (slid == be16_to_cpu(IB_LID_PERMISSIVE));
+ }
sl_from_sc = ibp->sc_to_sl[sc5];
process_ecn(qp, packet, (opcode != IB_OPCODE_CNP));
@@ -701,8 +902,7 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
* and the QKEY matches (see 9.6.1.4.1 and 9.6.1.5.1).
*/
if (qp->ibqp.qp_num) {
- if (unlikely(hdr->lrh[1] == IB_LID_PERMISSIVE ||
- hdr->lrh[3] == IB_LID_PERMISSIVE))
+ if (unlikely(dlid_is_permissive || slid_is_permissive))
goto drop;
if (qp->ibqp.qp_num > 1) {
if (unlikely(rcv_pkey_check(ppd, pkey, sc5, slid))) {
@@ -740,8 +940,7 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
if (tlen > 2048)
goto drop;
- if ((hdr->lrh[1] == IB_LID_PERMISSIVE ||
- hdr->lrh[3] == IB_LID_PERMISSIVE) &&
+ if ((dlid_is_permissive || slid_is_permissive) &&
smp->mgmt_class != IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
goto drop;
@@ -794,7 +993,18 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
goto drop;
}
if (packet->grh) {
- hfi1_copy_sge(&qp->r_sge, &hdr->u.l.grh,
+ hfi1_copy_sge(&qp->r_sge, packet->grh,
+ sizeof(struct ib_grh), true, false);
+ wc.wc_flags |= IB_WC_GRH;
+ } else if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+ struct ib_grh grh;
+ /*
+ * Assuming we only created 16B on the send side
+ * if we want to use large LIDs, since GRH was stripped
+ * out when creating 16B, add back the GRH here.
+ */
+ hfi1_make_ext_grh(packet, &grh, slid, dlid);
+ hfi1_copy_sge(&qp->r_sge, &grh,
sizeof(struct ib_grh), true, false);
wc.wc_flags |= IB_WC_GRH;
} else {
@@ -827,14 +1037,15 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
} else {
wc.pkey_index = 0;
}
-
+ if (slid_is_permissive)
+ slid = be32_to_cpu(OPA_LID_PERMISSIVE);
wc.slid = slid;
wc.sl = sl_from_sc;
/*
* Save the LMC lower bits if the destination LID is a unicast LID.
*/
- wc.dlid_path_bits = dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE) ? 0 :
+ wc.dlid_path_bits = hfi1_check_mcast(dlid) ? 0 :
dlid & ((1 << ppd_from_ibp(ibp)->lmc) - 1);
wc.port_num = qp->port_num;
/* Signal completion event if the solicited bit is set. */
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index d3dd0c0..4928ee4 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -259,7 +259,7 @@ static inline int hfi1_send_ok(struct rvt_qp *qp)
* This must be called with s_lock held.
*/
void hfi1_bad_pkey(struct hfi1_ibport *ibp, u32 key, u32 sl,
- u32 qp1, u32 qp2, u16 lid1, u16 lid2);
+ u32 qp1, u32 qp2, u32 lid1, u32 lid2);
void hfi1_cap_mask_chg(struct rvt_dev_info *rdi, u8 port_num);
void hfi1_sys_guid_chg(struct hfi1_ibport *ibp);
void hfi1_node_desc_chg(struct hfi1_ibport *ibp);
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index 9ae126f..e6e90f1 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -54,6 +54,7 @@
#define OPA_MAKE_ID(x) (cpu_to_be64(OPA_SPECIAL_OUI << 40 | (x)))
#define OPA_TO_IB_UCAST_LID(x) (((x) >= be16_to_cpu(IB_MULTICAST_LID_BASE)) \
? 0 : x)
+#define OPA_GID_INDEX 0x1
/**
* 0xF8 - 4 bits of multicast range and 1 bit for collective range
* Example: For 24 bit LID space,
--
* [PATCH for-next 23/27] IB/hfi1: Add 16B trace support
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (20 preceding siblings ...)
2017-08-04 20:54 ` [PATCH for-next 22/27] IB/hfi1: Add 16B UD support Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 24/27] IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids Dennis Dalessandro
` (3 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Add trace support for 16B bypass packets on both the send and
receive paths.
Sample input header trace:
<idle>-0 [000] d.h. 271742.509477: input_ibhdr: [0000:05:00.0] (16B)
len:24 sc:0 dlid:0xf0000b slid:0x10002 age:0 becn:0 fecn:0 l4:10 rc:0
sc:0 pkey:0x8001 entropy:0x0000 op:0x65,UD_SEND_ONLY_WITH_IMMEDIATE se:0
m:1 pad:3 tver:0 qpn:0xffffff a:0 psn:0x00000001 hlen:248 deth qkey
0x01234567 sqpn 0x000004
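The 16B fields shown above are pulled straight out of the four LRH dwords
with the mask/shift helpers added in this series; a small stand-alone
decoder for a few of them (masks copied from hfi.h in this series, header
words taken in host order, illustrative only):

#include <stdint.h>
#include <stdio.h>

#define OPA_16B_LEN_MASK     0x7FF00000u
#define OPA_16B_LEN_SHIFT    20
#define OPA_16B_RC_MASK      0x0E000000u
#define OPA_16B_RC_SHIFT     25
#define OPA_16B_AGE_MASK     0x00FF0000u
#define OPA_16B_AGE_SHIFT    16
#define OPA_16B_ENTROPY_MASK 0x0000FFFFu
#define OPA_16B_PKEY_MASK    0xFFFF0000u
#define OPA_16B_PKEY_SHIFT   16

/* Decode a few of the trace-visible fields from the 16B LRH dwords. */
static void decode_16b_lrh(const uint32_t lrh[4])
{
        printf("len:%u rc:%u age:%u entropy:0x%04x pkey:0x%04x\n",
               (unsigned)((lrh[0] & OPA_16B_LEN_MASK) >> OPA_16B_LEN_SHIFT),
               (unsigned)((lrh[1] & OPA_16B_RC_MASK) >> OPA_16B_RC_SHIFT),
               (unsigned)((lrh[3] & OPA_16B_AGE_MASK) >> OPA_16B_AGE_SHIFT),
               (unsigned)(lrh[3] & OPA_16B_ENTROPY_MASK),
               (unsigned)((lrh[2] & OPA_16B_PKEY_MASK) >> OPA_16B_PKEY_SHIFT));
}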
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/hfi.h | 25 ++
drivers/infiniband/hw/hfi1/trace.c | 153 ++++++++++++-
drivers/infiniband/hw/hfi1/trace_ibhdrs.h | 351 ++++++++++++++++++++---------
3 files changed, 406 insertions(+), 123 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 7e21192..b07f42c 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -381,6 +381,11 @@ struct hfi1_packet {
#define OPA_16B_PKEY_SHIFT 16
#define OPA_16B_LEN_MASK 0x7FF00000ull
#define OPA_16B_LEN_SHIFT 20
+#define OPA_16B_RC_MASK 0xE000000ull
+#define OPA_16B_RC_SHIFT 25
+#define OPA_16B_AGE_MASK 0xFF0000ull
+#define OPA_16B_AGE_SHIFT 16
+#define OPA_16B_ENTROPY_MASK 0xFFFFull
/*
* OPA 16B L2/L4 Encodings
@@ -434,6 +439,26 @@ static inline u16 hfi1_16B_get_pkey(struct hfi1_16b_header *hdr)
return (u16)((hdr->lrh[2] & OPA_16B_PKEY_MASK) >> OPA_16B_PKEY_SHIFT);
}
+static inline u8 hfi1_16B_get_rc(struct hfi1_16b_header *hdr)
+{
+ return (u8)((hdr->lrh[1] & OPA_16B_RC_MASK) >> OPA_16B_RC_SHIFT);
+}
+
+static inline u8 hfi1_16B_get_age(struct hfi1_16b_header *hdr)
+{
+ return (u8)((hdr->lrh[3] & OPA_16B_AGE_MASK) >> OPA_16B_AGE_SHIFT);
+}
+
+static inline u16 hfi1_16B_get_len(struct hfi1_16b_header *hdr)
+{
+ return (u16)((hdr->lrh[0] & OPA_16B_LEN_MASK) >> OPA_16B_LEN_SHIFT);
+}
+
+static inline u16 hfi1_16B_get_entropy(struct hfi1_16b_header *hdr)
+{
+ return (u16)(hdr->lrh[3] & OPA_16B_ENTROPY_MASK);
+}
+
/*
* BTH
*/
diff --git a/drivers/infiniband/hw/hfi1/trace.c b/drivers/infiniband/hw/hfi1/trace.c
index b80b74d..9938bb9 100644
--- a/drivers/infiniband/hw/hfi1/trace.c
+++ b/drivers/infiniband/hw/hfi1/trace.c
@@ -47,7 +47,7 @@
#define CREATE_TRACE_POINTS
#include "trace.h"
-u8 hfi1_trace_ib_hdr_len(struct ib_header *hdr)
+static u8 __get_ib_hdr_len(struct ib_header *hdr)
{
struct ib_other_headers *ohdr;
u8 opcode;
@@ -61,9 +61,60 @@ u8 hfi1_trace_ib_hdr_len(struct ib_header *hdr)
0 : hdr_len_by_opcode[opcode] - (12 + 8);
}
+static u8 __get_16b_hdr_len(struct hfi1_16b_header *hdr)
+{
+ struct ib_other_headers *ohdr;
+ u8 opcode;
+
+ if (hfi1_16B_get_l4(hdr) == OPA_16B_L4_IB_LOCAL)
+ ohdr = &hdr->u.oth;
+ else
+ ohdr = &hdr->u.l.oth;
+ opcode = ib_bth_get_opcode(ohdr);
+ return hdr_len_by_opcode[opcode] == 0 ?
+ 0 : hdr_len_by_opcode[opcode] - (12 + 8 + 8);
+}
+
+u8 hfi1_trace_packet_hdr_len(struct hfi1_packet *packet)
+{
+ if (packet->etype != RHF_RCV_TYPE_BYPASS)
+ return __get_ib_hdr_len(packet->hdr);
+ else
+ return __get_16b_hdr_len(packet->hdr);
+}
+
+u8 hfi1_trace_opa_hdr_len(struct hfi1_opa_header *opa_hdr)
+{
+ if (!opa_hdr->hdr_type)
+ return __get_ib_hdr_len(&opa_hdr->ibh);
+ else
+ return __get_16b_hdr_len(&opa_hdr->opah);
+}
+
const char *hfi1_trace_get_packet_str(struct hfi1_packet *packet)
{
- return "IB";
+ if (packet->etype != RHF_RCV_TYPE_BYPASS)
+ return "IB";
+
+ switch (hfi1_16B_get_l2(packet->hdr)) {
+ case 0:
+ return "0";
+ case 1:
+ return "1";
+ case 2:
+ return "16B";
+ case 3:
+ return "9B";
+ }
+ return "";
+}
+
+const char *hfi1_trace_get_packet_type_str(u8 l4)
+{
+ if (l4)
+ return "16B";
+ else
+ return "9B";
}
#define IMM_PRN "imm:%d"
@@ -89,10 +140,10 @@ u8 hfi1_trace_ib_hdr_len(struct ib_header *hdr)
return "";
}
-void hfi1_trace_parse_bth(struct ib_other_headers *ohdr,
- u8 *ack, u8 *becn, u8 *fecn, u8 *mig,
- u8 *se, u8 *pad, u8 *opcode, u8 *tver,
- u16 *pkey, u32 *psn, u32 *qpn)
+void hfi1_trace_parse_9b_bth(struct ib_other_headers *ohdr,
+ u8 *ack, u8 *becn, u8 *fecn, u8 *mig,
+ u8 *se, u8 *pad, u8 *opcode, u8 *tver,
+ u16 *pkey, u32 *psn, u32 *qpn)
{
*ack = ib_bth_get_ackreq(ohdr);
*becn = ib_bth_get_becn(ohdr);
@@ -107,8 +158,22 @@ void hfi1_trace_parse_bth(struct ib_other_headers *ohdr,
*qpn = ib_bth_get_qpn(ohdr);
}
+void hfi1_trace_parse_16b_bth(struct ib_other_headers *ohdr,
+ u8 *ack, u8 *mig, u8 *opcode,
+ u8 *pad, u8 *se, u8 *tver,
+ u32 *psn, u32 *qpn)
+{
+ *ack = ib_bth_get_ackreq(ohdr);
+ *mig = ib_bth_get_migreq(ohdr);
+ *opcode = ib_bth_get_opcode(ohdr);
+ *pad = ib_bth_get_pad(ohdr);
+ *se = ib_bth_get_se(ohdr);
+ *tver = ib_bth_get_tver(ohdr);
+ *psn = ib_bth_get_psn(ohdr);
+ *qpn = ib_bth_get_qpn(ohdr);
+}
+
void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
- struct ib_other_headers **ohdr,
u8 *lnh, u8 *lver, u8 *sl, u8 *sc,
u16 *len, u32 *dlid, u32 *slid)
{
@@ -119,11 +184,79 @@ void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
*len = ib_get_len(hdr);
*dlid = ib_get_dlid(hdr);
*slid = ib_get_slid(hdr);
+}
+
+void hfi1_trace_parse_16b_hdr(struct hfi1_16b_header *hdr,
+ u8 *age, u8 *becn, u8 *fecn,
+ u8 *l4, u8 *rc, u8 *sc,
+ u16 *entropy, u16 *len, u16 *pkey,
+ u32 *dlid, u32 *slid)
+{
+ *age = hfi1_16B_get_age(hdr);
+ *becn = hfi1_16B_get_becn(hdr);
+ *fecn = hfi1_16B_get_fecn(hdr);
+ *l4 = hfi1_16B_get_l4(hdr);
+ *rc = hfi1_16B_get_rc(hdr);
+ *sc = hfi1_16B_get_sc(hdr);
+ *entropy = hfi1_16B_get_entropy(hdr);
+ *len = hfi1_16B_get_len(hdr);
+ *pkey = hfi1_16B_get_pkey(hdr);
+ *dlid = hfi1_16B_get_dlid(hdr);
+ *slid = hfi1_16B_get_slid(hdr);
+}
+
+#define LRH_PRN "len:%d sc:%d dlid:0x%.4x slid:0x%.4x "
+#define LRH_9B_PRN "lnh:%d,%s lver:%d sl:%d"
+#define LRH_16B_PRN "age:%d becn:%d fecn:%d l4:%d " \
+ "rc:%d sc:%d pkey:0x%.4x entropy:0x%.4x"
+const char *hfi1_trace_fmt_lrh(struct trace_seq *p, bool bypass,
+ u8 age, u8 becn, u8 fecn, u8 l4,
+ u8 lnh, const char *lnh_name, u8 lver,
+ u8 rc, u8 sc, u8 sl, u16 entropy,
+ u16 len, u16 pkey, u32 dlid, u32 slid)
+{
+ const char *ret = trace_seq_buffer_ptr(p);
+
+ trace_seq_printf(p, LRH_PRN, len, sc, dlid, slid);
+
+ if (bypass)
+ trace_seq_printf(p, LRH_16B_PRN,
+ age, becn, fecn, l4, rc, sc, pkey, entropy);
- if (*lnh == HFI1_LRH_BTH)
- *ohdr = &hdr->u.oth;
else
- *ohdr = &hdr->u.l.oth;
+ trace_seq_printf(p, LRH_9B_PRN,
+ lnh, lnh_name, lver, sl);
+ trace_seq_putc(p, 0);
+
+ return ret;
+}
+
+#define BTH_9B_PRN \
+ "op:0x%.2x,%s se:%d m:%d pad:%d tver:%d pkey:0x%.4x " \
+ "f:%d b:%d qpn:0x%.6x a:%d psn:0x%.8x"
+#define BTH_16B_PRN \
+ "op:0x%.2x,%s se:%d m:%d pad:%d tver:%d " \
+ "qpn:0x%.6x a:%d psn:0x%.8x"
+const char *hfi1_trace_fmt_bth(struct trace_seq *p, bool bypass,
+ u8 ack, u8 becn, u8 fecn, u8 mig,
+ u8 se, u8 pad, u8 opcode, const char *opname,
+ u8 tver, u16 pkey, u32 psn, u32 qpn)
+{
+ const char *ret = trace_seq_buffer_ptr(p);
+
+ if (bypass)
+ trace_seq_printf(p, BTH_16B_PRN,
+ opcode, opname,
+ se, mig, pad, tver, qpn, ack, psn);
+
+ else
+ trace_seq_printf(p, BTH_9B_PRN,
+ opcode, opname,
+ se, mig, pad, tver, pkey, fecn, becn,
+ qpn, ack, psn);
+ trace_seq_putc(p, 0);
+
+ return ret;
}
const char *parse_everbs_hdrs(
diff --git a/drivers/infiniband/hw/hfi1/trace_ibhdrs.h b/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
index 7324025..6721f84 100644
--- a/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
+++ b/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
@@ -95,17 +95,39 @@
ib_opcode_name(UD_SEND_ONLY_WITH_IMMEDIATE), \
ib_opcode_name(CNP))
+u8 ibhdr_exhdr_len(struct ib_header *hdr);
const char *parse_everbs_hdrs(struct trace_seq *p, u8 opcode, void *ehdrs);
-u8 hfi1_trace_ib_hdr_len(struct ib_header *hdr);
+u8 hfi1_trace_opa_hdr_len(struct hfi1_opa_header *opah);
+u8 hfi1_trace_packet_hdr_len(struct hfi1_packet *packet);
+const char *hfi1_trace_get_packet_type_str(u8 l4);
const char *hfi1_trace_get_packet_str(struct hfi1_packet *packet);
-void hfi1_trace_parse_bth(struct ib_other_headers *ohdr,
- u8 *ack, u8 *becn, u8 *fecn, u8 *mig,
- u8 *se, u8 *pad, u8 *opcode, u8 *tver,
- u16 *pkey, u32 *psn, u32 *qpn);
+void hfi1_trace_parse_9b_bth(struct ib_other_headers *ohdr,
+ u8 *ack, u8 *becn, u8 *fecn, u8 *mig,
+ u8 *se, u8 *pad, u8 *opcode, u8 *tver,
+ u16 *pkey, u32 *psn, u32 *qpn);
void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
- struct ib_other_headers **ohdr,
u8 *lnh, u8 *lver, u8 *sl, u8 *sc,
u16 *len, u32 *dlid, u32 *slid);
+void hfi1_trace_parse_16b_bth(struct ib_other_headers *ohdr,
+ u8 *ack, u8 *mig, u8 *opcode,
+ u8 *pad, u8 *se, u8 *tver,
+ u32 *psn, u32 *qpn);
+void hfi1_trace_parse_16b_hdr(struct hfi1_16b_header *hdr,
+ u8 *age, u8 *becn, u8 *fecn,
+ u8 *l4, u8 *rc, u8 *sc,
+ u16 *entropy, u16 *len, u16 *pkey,
+ u32 *dlid, u32 *slid);
+
+const char *hfi1_trace_fmt_lrh(struct trace_seq *p, bool bypass,
+ u8 age, u8 becn, u8 fecn, u8 l4,
+ u8 lnh, const char *lnh_name, u8 lver,
+ u8 rc, u8 sc, u8 sl, u16 entropy,
+ u16 len, u16 pkey, u32 dlid, u32 slid);
+
+const char *hfi1_trace_fmt_bth(struct trace_seq *p, bool bypass,
+ u8 ack, u8 becn, u8 fecn, u8 mig,
+ u8 se, u8 pad, u8 opcode, const char *opname,
+ u8 tver, u16 pkey, u32 psn, u32 qpn);
#define __parse_ib_ehdrs(op, ehdrs) parse_everbs_hdrs(p, op, ehdrs)
@@ -114,13 +136,8 @@ void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
__print_symbolic(lrh, \
lrh_name(LRH_BTH), \
lrh_name(LRH_GRH))
-
-#define LRH_PRN "len:%d sc:%d dlid:0x%.4x slid:0x%.4x"
-#define LRH_9B_PRN "lnh:%d,%s lver:%d sl:%d "
-#define BTH_PRN \
- "op:0x%.2x,%s se:%d m:%d pad:%d tver:%d pkey:0x%.4x " \
- "f:%d b:%d qpn:0x%.6x a:%d psn:0x%.8x"
-#define EHDR_PRN "hlen:%d %s"
+#define PKT_ENTRY(pkt) __string(ptype, hfi1_trace_get_packet_str(packet))
+#define PKT_ASSIGN(pkt) __assign_str(ptype, hfi1_trace_get_packet_str(packet))
DECLARE_EVENT_CLASS(hfi1_input_ibhdr_template,
TP_PROTO(struct hfi1_devdata *dd,
@@ -129,75 +146,125 @@ void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
TP_ARGS(dd, packet, sc5),
TP_STRUCT__entry(
DD_DEV_ENTRY(dd)
+ PKT_ENTRY(packet)
+ __field(bool, bypass)
+ __field(u8, ack)
+ __field(u8, age)
+ __field(u8, becn)
+ __field(u8, fecn)
+ __field(u8, l4)
__field(u8, lnh)
__field(u8, lver)
- __field(u8, sl)
- __field(u16, len)
- __field(u32, dlid)
- __field(u8, sc)
- __field(u32, slid)
- __field(u8, opcode)
- __field(u8, se)
__field(u8, mig)
+ __field(u8, opcode)
__field(u8, pad)
+ __field(u8, rc)
+ __field(u8, sc)
+ __field(u8, se)
+ __field(u8, sl)
__field(u8, tver)
+ __field(u16, entropy)
+ __field(u16, len)
__field(u16, pkey)
- __field(u8, fecn)
- __field(u8, becn)
- __field(u32, qpn)
- __field(u8, ack)
+ __field(u32, dlid)
__field(u32, psn)
+ __field(u32, qpn)
+ __field(u32, slid)
/* extended headers */
__dynamic_array(u8, ehdrs,
- hfi1_trace_ib_hdr_len(packet->hdr))
+ hfi1_trace_packet_hdr_len(packet))
),
TP_fast_assign(
- struct ib_other_headers *ohdr;
+ DD_DEV_ASSIGN(dd);
+ PKT_ASSIGN(packet);
- DD_DEV_ASSIGN(dd);
+ if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+ __entry->bypass = true;
+ hfi1_trace_parse_16b_hdr(packet->hdr,
+ &__entry->age,
+ &__entry->becn,
+ &__entry->fecn,
+ &__entry->l4,
+ &__entry->rc,
+ &__entry->sc,
+ &__entry->entropy,
+ &__entry->len,
+ &__entry->pkey,
+ &__entry->dlid,
+ &__entry->slid);
- hfi1_trace_parse_9b_hdr(packet->hdr, sc5,
- &ohdr,
- &__entry->lnh,
- &__entry->lver,
- &__entry->sl,
- &__entry->sc,
- &__entry->len,
- &__entry->dlid,
- &__entry->slid);
+ hfi1_trace_parse_16b_bth(packet->ohdr,
+ &__entry->ack,
+ &__entry->mig,
+ &__entry->opcode,
+ &__entry->pad,
+ &__entry->se,
+ &__entry->tver,
+ &__entry->psn,
+ &__entry->qpn);
+ } else {
+ __entry->bypass = false;
+ hfi1_trace_parse_9b_hdr(packet->hdr, sc5,
+ &__entry->lnh,
+ &__entry->lver,
+ &__entry->sl,
+ &__entry->sc,
+ &__entry->len,
+ &__entry->dlid,
+ &__entry->slid);
- hfi1_trace_parse_bth(ohdr, &__entry->ack,
- &__entry->becn, &__entry->fecn,
- &__entry->mig, &__entry->se,
- &__entry->pad, &__entry->opcode,
- &__entry->tver, &__entry->pkey,
- &__entry->psn, &__entry->qpn);
- /* extended headers */
- memcpy(__get_dynamic_array(ehdrs), &ohdr->u,
- __get_dynamic_array_len(ehdrs));
+ hfi1_trace_parse_9b_bth(packet->ohdr,
+ &__entry->ack,
+ &__entry->becn,
+ &__entry->fecn,
+ &__entry->mig,
+ &__entry->se,
+ &__entry->pad,
+ &__entry->opcode,
+ &__entry->tver,
+ &__entry->pkey,
+ &__entry->psn,
+ &__entry->qpn);
+ }
+ /* extended headers */
+ memcpy(__get_dynamic_array(ehdrs),
+ &packet->ohdr->u,
+ __get_dynamic_array_len(ehdrs));
),
- TP_printk("[%s] (IB) " LRH_PRN " " LRH_9B_PRN " "
- BTH_PRN " " EHDR_PRN,
+ TP_printk("[%s] (%s) %s %s hlen:%d %s",
__get_str(dev),
- __entry->len,
- __entry->sc,
- __entry->dlid,
- __entry->slid,
- __entry->lnh, show_lnh(__entry->lnh),
- __entry->lver,
- __entry->sl,
- /* BTH */
- __entry->opcode, show_ib_opcode(__entry->opcode),
- __entry->se,
- __entry->mig,
- __entry->pad,
- __entry->tver,
- __entry->pkey,
- __entry->fecn,
- __entry->becn,
- __entry->qpn,
- __entry->ack,
- __entry->psn,
+ __get_str(ptype),
+ hfi1_trace_fmt_lrh(p,
+ __entry->bypass,
+ __entry->age,
+ __entry->becn,
+ __entry->fecn,
+ __entry->l4,
+ __entry->lnh,
+ show_lnh(__entry->lnh),
+ __entry->lver,
+ __entry->rc,
+ __entry->sc,
+ __entry->sl,
+ __entry->entropy,
+ __entry->len,
+ __entry->pkey,
+ __entry->dlid,
+ __entry->slid),
+ hfi1_trace_fmt_bth(p,
+ __entry->bypass,
+ __entry->ack,
+ __entry->becn,
+ __entry->fecn,
+ __entry->mig,
+ __entry->se,
+ __entry->pad,
+ __entry->opcode,
+ show_ib_opcode(__entry->opcode),
+ __entry->tver,
+ __entry->pkey,
+ __entry->psn,
+ __entry->qpn),
/* extended headers */
__get_dynamic_array_len(ehdrs),
__parse_ib_ehdrs(
@@ -213,78 +280,136 @@ void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
DECLARE_EVENT_CLASS(hfi1_output_ibhdr_template,
TP_PROTO(struct hfi1_devdata *dd,
- struct hfi1_opa_header *opah,
- bool sc5),
+ struct hfi1_opa_header *opah, bool sc5),
TP_ARGS(dd, opah, sc5),
TP_STRUCT__entry(
DD_DEV_ENTRY(dd)
+ __field(bool, bypass)
+ __field(u8, ack)
+ __field(u8, age)
+ __field(u8, becn)
+ __field(u8, fecn)
+ __field(u8, l4)
__field(u8, lnh)
__field(u8, lver)
- __field(u8, sl)
- __field(u16, len)
- __field(u32, dlid)
- __field(u8, sc)
- __field(u32, slid)
- __field(u8, opcode)
- __field(u8, se)
__field(u8, mig)
+ __field(u8, opcode)
__field(u8, pad)
+ __field(u8, rc)
+ __field(u8, sc)
+ __field(u8, se)
+ __field(u8, sl)
__field(u8, tver)
+ __field(u16, entropy)
+ __field(u16, len)
__field(u16, pkey)
- __field(u8, fecn)
- __field(u8, becn)
- __field(u32, qpn)
- __field(u8, ack)
+ __field(u32, dlid)
__field(u32, psn)
+ __field(u32, qpn)
+ __field(u32, slid)
/* extended headers */
__dynamic_array(u8, ehdrs,
- hfi1_trace_ib_hdr_len(&opah->ibh))
+ hfi1_trace_opa_hdr_len(opah))
),
TP_fast_assign(
struct ib_other_headers *ohdr;
- struct ib_header *hdr = &opah->ibh;
DD_DEV_ASSIGN(dd);
- hfi1_trace_parse_9b_hdr(hdr, sc5,
- &ohdr, &__entry->lnh,
- &__entry->lver, &__entry->sl,
- &__entry->sc, &__entry->len,
- &__entry->dlid, &__entry->slid);
+ if (opah->hdr_type) {
+ __entry->bypass = true;
+ hfi1_trace_parse_16b_hdr(&opah->opah,
+ &__entry->age,
+ &__entry->becn,
+ &__entry->fecn,
+ &__entry->l4,
+ &__entry->rc,
+ &__entry->sc,
+ &__entry->entropy,
+ &__entry->len,
+ &__entry->pkey,
+ &__entry->dlid,
+ &__entry->slid);
- hfi1_trace_parse_bth(ohdr, &__entry->ack,
- &__entry->becn, &__entry->fecn,
- &__entry->mig, &__entry->se,
- &__entry->pad, &__entry->opcode,
- &__entry->tver, &__entry->pkey,
- &__entry->psn, &__entry->qpn);
+ if (entry->l4 == OPA_16B_L4_IB_LOCAL)
+ ohdr = &opah->opah.u.oth;
+ else
+ ohdr = &opah->opah.u.l.oth;
+ hfi1_trace_parse_16b_bth(ohdr,
+ &__entry->ack,
+ &__entry->mig,
+ &__entry->opcode,
+ &__entry->pad,
+ &__entry->se,
+ &__entry->tver,
+ &__entry->psn,
+ &__entry->qpn);
+ } else {
+ __entry->bypass = false;
+ hfi1_trace_parse_9b_hdr(&opah->ibh, sc5,
+ &__entry->lnh,
+ &__entry->lver,
+ &__entry->sl,
+ &__entry->sc,
+ &__entry->len,
+ &__entry->dlid,
+ &__entry->slid);
+ if (entry->lnh == HFI1_LRH_BTH)
+ ohdr = &opah->ibh.u.oth;
+ else
+ ohdr = &opah->ibh.u.l.oth;
+ hfi1_trace_parse_9b_bth(ohdr,
+ &__entry->ack,
+ &__entry->becn,
+ &__entry->fecn,
+ &__entry->mig,
+ &__entry->se,
+ &__entry->pad,
+ &__entry->opcode,
+ &__entry->tver,
+ &__entry->pkey,
+ &__entry->psn,
+ &__entry->qpn);
+ }
/* extended headers */
memcpy(__get_dynamic_array(ehdrs),
&ohdr->u, __get_dynamic_array_len(ehdrs));
),
- TP_printk("[%s] (IB) " LRH_PRN " " LRH_9B_PRN " "
- BTH_PRN " " EHDR_PRN,
+ TP_printk("[%s] (%s) %s %s hlen:%d %s",
__get_str(dev),
- __entry->len,
- __entry->sc,
- __entry->dlid,
- __entry->slid,
- __entry->lnh, show_lnh(__entry->lnh),
- __entry->lver,
- __entry->sl,
- /* BTH */
- __entry->opcode, show_ib_opcode(__entry->opcode),
- __entry->se,
- __entry->mig,
- __entry->pad,
- __entry->tver,
- __entry->pkey,
- __entry->fecn,
- __entry->becn,
- __entry->qpn,
- __entry->ack,
- __entry->psn,
+ hfi1_trace_get_packet_type_str(__entry->l4),
+ hfi1_trace_fmt_lrh(p,
+ __entry->bypass,
+ __entry->age,
+ __entry->becn,
+ __entry->fecn,
+ __entry->l4,
+ __entry->lnh,
+ show_lnh(__entry->lnh),
+ __entry->lver,
+ __entry->rc,
+ __entry->sc,
+ __entry->sl,
+ __entry->entropy,
+ __entry->len,
+ __entry->pkey,
+ __entry->dlid,
+ __entry->slid),
+ hfi1_trace_fmt_bth(p,
+ __entry->bypass,
+ __entry->ack,
+ __entry->becn,
+ __entry->fecn,
+ __entry->mig,
+ __entry->se,
+ __entry->pad,
+ __entry->opcode,
+ show_ib_opcode(__entry->opcode),
+ __entry->tver,
+ __entry->pkey,
+ __entry->psn,
+ __entry->qpn),
/* extended headers */
__get_dynamic_array_len(ehdrs),
__parse_ib_ehdrs(
--
* [PATCH for-next 24/27] IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (21 preceding siblings ...)
2017-08-04 20:54 ` [PATCH for-next 23/27] IB/hfi1: Add 16B trace support Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 25/27] IB/hfi1: Add 16B RC/UC support Dennis Dalessandro
` (2 subsequent siblings)
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Increase the lid used in the hfi1 driver to 32 bits. qib continues
to use 16-bit lids.
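The cut-over used throughout the patch is the 9B multicast LID base: a LID
at or above be16_to_cpu(IB_MULTICAST_LID_BASE) (0xC000) cannot be carried in
a 9B LRH, so set_lidlmc() below programs 0 into the DLID/SLID check CSRs for
such LIDs. A minimal sketch of that check (my own example, not driver code;
lid_is_extended() and lid_for_9b_csr() are hypothetical helper names):
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#define IB_MCAST_LID_BASE 0xC000u	/* be16_to_cpu(IB_MULTICAST_LID_BASE) */
/* true if the LID cannot be carried in a 9B (16-bit unicast) LRH */
static bool lid_is_extended(uint32_t lid)
{
	return lid >= IB_MCAST_LID_BASE;
}
/* mirrors the set_lidlmc() change: program 0 for extended LIDs so no 9B
 * packets go out with a truncated source/destination LID */
static uint32_t lid_for_9b_csr(uint32_t lid)
{
	return lid_is_extended(lid) ? 0 : lid;
}
int main(void)
{
	printf("0x1000 -> 0x%x, 0xf0000b -> 0x%x\n",
	       lid_for_9b_csr(0x1000), lid_for_9b_csr(0xF0000B));
	return 0;
}
Running it prints "0x1000 -> 0x1000, 0xf0000b -> 0x0", i.e. only LIDs below
the multicast base are programmed into the 9B checks.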
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/chip.c | 12 +++--
drivers/infiniband/hw/hfi1/hfi.h | 2 -
drivers/infiniband/hw/hfi1/mad.c | 91 ++++++++++++++++++++++++++++++-----
drivers/infiniband/hw/hfi1/verbs.c | 23 ---------
drivers/infiniband/hw/hfi1/verbs.h | 2 -
drivers/infiniband/hw/qib/qib_mad.c | 4 +-
include/rdma/rdma_vt.h | 2 -
7 files changed, 93 insertions(+), 43 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 1023701..24fda52 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -10067,10 +10067,16 @@ static void set_lidlmc(struct hfi1_pportdata *ppd)
struct hfi1_devdata *dd = ppd->dd;
u32 mask = ~((1U << ppd->lmc) - 1);
u64 c1 = read_csr(ppd->dd, DCC_CFG_PORT_CONFIG1);
+ u32 lid;
+ /*
+ * Program 0 in CSR if port lid is extended. This prevents
+ * 9B packets being sent out for large lids.
+ */
+ lid = (ppd->lid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) ? 0 : ppd->lid;
c1 &= ~(DCC_CFG_PORT_CONFIG1_TARGET_DLID_SMASK
| DCC_CFG_PORT_CONFIG1_DLID_MASK_SMASK);
- c1 |= ((ppd->lid & DCC_CFG_PORT_CONFIG1_TARGET_DLID_MASK)
+ c1 |= ((lid & DCC_CFG_PORT_CONFIG1_TARGET_DLID_MASK)
<< DCC_CFG_PORT_CONFIG1_TARGET_DLID_SHIFT) |
((mask & DCC_CFG_PORT_CONFIG1_DLID_MASK_MASK)
<< DCC_CFG_PORT_CONFIG1_DLID_MASK_SHIFT);
@@ -10081,7 +10087,7 @@ static void set_lidlmc(struct hfi1_pportdata *ppd)
*/
sreg = ((mask & SEND_CTXT_CHECK_SLID_MASK_MASK) <<
SEND_CTXT_CHECK_SLID_MASK_SHIFT) |
- (((ppd->lid & mask) & SEND_CTXT_CHECK_SLID_VALUE_MASK) <<
+ (((lid & mask) & SEND_CTXT_CHECK_SLID_VALUE_MASK) <<
SEND_CTXT_CHECK_SLID_VALUE_SHIFT);
for (i = 0; i < dd->chip_send_contexts; i++) {
@@ -10091,7 +10097,7 @@ static void set_lidlmc(struct hfi1_pportdata *ppd)
}
/* Now we have to do the same thing for the sdma engines */
- sdma_update_lmc(dd, mask, ppd->lid);
+ sdma_update_lmc(dd, mask, lid);
}
static const char *state_completed_string(u32 completed)
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index b07f42c..52cae11 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -718,7 +718,7 @@ struct hfi1_pportdata {
u32 ibmaxlen;
u32 current_egress_rate; /* units [10^6 bits/sec] */
/* LID programmed for this instance */
- u16 lid;
+ u32 lid;
/* list of pkeys programmed; 0 if not set */
u16 pkeys[MAX_PKEY_VALUES];
u16 link_width_supported;
diff --git a/drivers/infiniband/hw/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
index 40f6104..1be1d74 100644
--- a/drivers/infiniband/hw/hfi1/mad.c
+++ b/drivers/infiniband/hw/hfi1/mad.c
@@ -234,6 +234,61 @@ static void subn_handle_opa_trap_repress(struct hfi1_ibport *ibp,
spin_unlock_irqrestore(&ibp->rvp.lock, flags);
}
+static void hfi1_update_sm_ah_attr(struct hfi1_ibport *ibp,
+ struct rdma_ah_attr *attr, u32 dlid)
+{
+ rdma_ah_set_dlid(attr, dlid);
+ rdma_ah_set_port_num(attr, ppd_from_ibp(ibp)->port);
+ if (dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) {
+ struct ib_global_route *grh = rdma_ah_retrieve_grh(attr);
+
+ rdma_ah_set_ah_flags(attr, IB_AH_GRH);
+ grh->sgid_index = 0;
+ grh->hop_limit = 1;
+ grh->dgid.global.subnet_prefix =
+ ibp->rvp.gid_prefix;
+ grh->dgid.global.interface_id = OPA_MAKE_ID(dlid);
+ }
+}
+
+static int hfi1_modify_qp0_ah(struct hfi1_ibport *ibp,
+ struct rvt_ah *ah, u32 dlid)
+{
+ struct rdma_ah_attr attr;
+ struct rvt_qp *qp0;
+ int ret = -EINVAL;
+
+ memset(&attr, 0, sizeof(attr));
+ attr.type = ah->ibah.type;
+ hfi1_update_sm_ah_attr(ibp, &attr, dlid);
+ rcu_read_lock();
+ qp0 = rcu_dereference(ibp->rvp.qp[0]);
+ if (qp0)
+ ret = rdma_modify_ah(&ah->ibah, &attr);
+ rcu_read_unlock();
+ return ret;
+}
+
+static struct ib_ah *hfi1_create_qp0_ah(struct hfi1_ibport *ibp, u32 dlid)
+{
+ struct rdma_ah_attr attr;
+ struct ib_ah *ah = ERR_PTR(-EINVAL);
+ struct rvt_qp *qp0;
+ struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+ struct hfi1_devdata *dd = dd_from_ppd(ppd);
+ u8 port_num = ppd->port;
+
+ memset(&attr, 0, sizeof(attr));
+ attr.type = rdma_ah_find_type(&dd->verbs_dev.rdi.ibdev, port_num);
+ hfi1_update_sm_ah_attr(ibp, &attr, dlid);
+ rcu_read_lock();
+ qp0 = rcu_dereference(ibp->rvp.qp[0]);
+ if (qp0)
+ ah = rdma_create_ah(qp0->ibqp.pd, &attr);
+ rcu_read_unlock();
+ return ah;
+}
+
static void send_trap(struct hfi1_ibport *ibp, struct trap_node *trap)
{
struct ib_mad_send_buf *send_buf;
@@ -1283,8 +1338,8 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
struct hfi1_ibport *ibp;
u8 clientrereg;
unsigned long flags;
- u32 smlid, opa_lid; /* tmp vars to hold LID values */
- u16 lid;
+ u32 smlid;
+ u32 lid;
u8 ls_old, ls_new, ps_new;
u8 vls;
u8 msl;
@@ -1301,22 +1356,20 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
return reply((struct ib_mad_hdr *)smp);
}
- opa_lid = be32_to_cpu(pi->lid);
- if (opa_lid & 0xFFFF0000) {
- pr_warn("OPA_PortInfo lid out of range: %X\n", opa_lid);
+ lid = be32_to_cpu(pi->lid);
+ if (lid & 0xFF000000) {
+ pr_warn("OPA_PortInfo lid out of range: %X\n", lid);
smp->status |= IB_SMP_INVALID_FIELD;
goto get_only;
}
- lid = (u16)(opa_lid & 0x0000FFFF);
smlid = be32_to_cpu(pi->sm_lid);
- if (smlid & 0xFFFF0000) {
+ if (smlid & 0xFF000000) {
pr_warn("OPA_PortInfo SM lid out of range: %X\n", smlid);
smp->status |= IB_SMP_INVALID_FIELD;
goto get_only;
}
- smlid &= 0x0000FFFF;
clientrereg = (pi->clientrereg_subnettimeout &
OPA_PI_MASK_CLIENT_REREGISTER);
@@ -1331,12 +1384,16 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
ls_old = driver_lstate(ppd);
ibp->rvp.mkey = pi->mkey;
- ibp->rvp.gid_prefix = pi->subnet_prefix;
+ if (ibp->rvp.gid_prefix != pi->subnet_prefix) {
+ ibp->rvp.gid_prefix = pi->subnet_prefix;
+ event.event = IB_EVENT_GID_CHANGE;
+ ib_dispatch_event(&event);
+ }
ibp->rvp.mkey_lease_period = be16_to_cpu(pi->mkey_lease_period);
/* Must be a valid unicast LID address. */
if ((lid == 0 && ls_old > IB_PORT_INIT) ||
- lid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) {
+ (hfi1_is_16B_mcast(lid))) {
smp->status |= IB_SMP_INVALID_FIELD;
pr_warn("SubnSet(OPA_PortInfo) lid invalid 0x%x\n",
lid);
@@ -1349,6 +1406,16 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
hfi1_set_lid(ppd, lid, pi->mkeyprotect_lmc & OPA_PI_MASK_LMC);
event.event = IB_EVENT_LID_CHANGE;
ib_dispatch_event(&event);
+
+ if (HFI1_PORT_GUID_INDEX + 1 < HFI1_GUIDS_PER_PORT) {
+ /* Manufacture GID from LID to support extended
+ * addresses
+ */
+ ppd->guids[HFI1_PORT_GUID_INDEX + 1] =
+ be64_to_cpu(OPA_MAKE_ID(lid));
+ event.event = IB_EVENT_GID_CHANGE;
+ ib_dispatch_event(&event);
+ }
}
msl = pi->smsl & OPA_PI_MASK_SMSL;
@@ -1359,7 +1426,7 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
/* Must be a valid unicast LID address. */
if ((smlid == 0 && ls_old > IB_PORT_INIT) ||
- smlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) {
+ (hfi1_is_16B_mcast(smlid))) {
smp->status |= IB_SMP_INVALID_FIELD;
pr_warn("SubnSet(OPA_PortInfo) smlid invalid 0x%x\n", smlid);
} else if (smlid != ibp->rvp.sm_lid || msl != ibp->rvp.sm_sl) {
@@ -1367,7 +1434,7 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
spin_lock_irqsave(&ibp->rvp.lock, flags);
if (ibp->rvp.sm_ah) {
if (smlid != ibp->rvp.sm_lid)
- rdma_ah_set_dlid(&ibp->rvp.sm_ah->attr, smlid);
+ hfi1_modify_qp0_ah(ibp, ibp->rvp.sm_ah, smlid);
if (msl != ibp->rvp.sm_sl)
rdma_ah_set_sl(&ibp->rvp.sm_ah->attr, msl);
}
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 62526f5..6c9fa43 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1394,7 +1394,7 @@ static int query_port(struct rvt_dev_info *rdi, u8 port_num,
struct hfi1_ibdev *verbs_dev = dev_from_rdi(rdi);
struct hfi1_devdata *dd = dd_from_dev(verbs_dev);
struct hfi1_pportdata *ppd = &dd->pport[port_num - 1];
- u16 lid = ppd->lid;
+ u32 lid = ppd->lid;
/* props being zeroed by the caller, avoid zeroing it here */
props->lid = lid ? lid : 0;
@@ -1555,27 +1555,6 @@ static void hfi1_notify_new_ah(struct ib_device *ibdev,
ah->log_pmtu = ilog2(dd->vld[ah->vl].mtu);
}
-struct ib_ah *hfi1_create_qp0_ah(struct hfi1_ibport *ibp, u16 dlid)
-{
- struct rdma_ah_attr attr;
- struct ib_ah *ah = ERR_PTR(-EINVAL);
- struct rvt_qp *qp0;
- struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
- struct hfi1_devdata *dd = dd_from_ppd(ppd);
- u8 port_num = ppd->port;
-
- memset(&attr, 0, sizeof(attr));
- attr.type = rdma_ah_find_type(&dd->verbs_dev.rdi.ibdev, port_num);
- rdma_ah_set_dlid(&attr, dlid);
- rdma_ah_set_port_num(&attr, ppd_from_ibp(ibp)->port);
- rcu_read_lock();
- qp0 = rcu_dereference(ibp->rvp.qp[0]);
- if (qp0)
- ah = rdma_create_ah(qp0->ibqp.pd, &attr);
- rcu_read_unlock();
- return ah;
-}
-
/**
* hfi1_get_npkeys - return the size of the PKEY table for context 0
* @dd: the hfi1_ib device
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index 4928ee4..ab1618e 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -334,8 +334,6 @@ void hfi1_rc_hdrerr(
u8 ah_to_sc(struct ib_device *ibdev, struct rdma_ah_attr *ah_attr);
-struct ib_ah *hfi1_create_qp0_ah(struct hfi1_ibport *ibp, u16 dlid);
-
void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah);
void hfi1_ud_rcv(struct hfi1_packet *packet);
diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 6b9b43b..549c719 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -105,7 +105,7 @@ static void qib_send_trap(struct qib_ibport *ibp, void *data, unsigned len)
if (ibp->rvp.sm_lid != be16_to_cpu(IB_LID_PERMISSIVE)) {
struct ib_ah *ah;
- ah = qib_create_qp0_ah(ibp, ibp->rvp.sm_lid);
+ ah = qib_create_qp0_ah(ibp, (u16)ibp->rvp.sm_lid);
if (IS_ERR(ah))
ret = PTR_ERR(ah);
else {
@@ -496,7 +496,7 @@ static int subn_get_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
pip->mkey = ibp->rvp.mkey;
pip->gid_prefix = ibp->rvp.gid_prefix;
pip->lid = cpu_to_be16(ppd->lid);
- pip->sm_lid = cpu_to_be16(ibp->rvp.sm_lid);
+ pip->sm_lid = cpu_to_be16((u16)ibp->rvp.sm_lid);
pip->cap_mask = cpu_to_be32(ibp->rvp.port_cap_flags);
/* pip->diag_code; */
pip->mkey_lease_period = cpu_to_be16(ibp->rvp.mkey_lease_period);
diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h
index fdfac0f..1d94f3c 100644
--- a/include/rdma/rdma_vt.h
+++ b/include/rdma/rdma_vt.h
@@ -91,7 +91,7 @@ struct rvt_ibport {
__be16 pma_counter_select[5];
u16 pma_tag;
u16 mkey_lease_period;
- u16 sm_lid;
+ u32 sm_lid;
u8 sm_sl;
u8 mkeyprot;
u8 subnet_timeout;
--
* [PATCH for-next 25/27] IB/hfi1: Add 16B RC/UC support
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (22 preceding siblings ...)
2017-08-04 20:54 ` [PATCH for-next 24/27] IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 26/27] IB/hfi1: Enhance PIO/SDMA send for 16B Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 27/27] IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs Dennis Dalessandro
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Add 16B bypass packet support for RC/UC traffic types.
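The two differences handled here are header size and tail alignment: a 9B
LRH+BTH is (8+12)/4 = 5 words while a 16B LRH+BTH is (16+12)/4 = 7 words,
and 16B packets carry an LT byte and must be flit aligned, so the receive
path now validates lengths against extra_bytes (pad + LT + ICRC) rather
than a fixed "pad + 4". A rough standalone sketch of that arithmetic (my
own illustration, assuming SIZE_OF_CRC is one 32-bit word, SIZE_OF_LT is
one byte, and pad_to_flit() is a hypothetical stand-in, not the driver's
helper):
#include <stdint.h>
#include <stdio.h>
#define SIZE_OF_CRC 1	/* ICRC, in 32-bit words */
#define SIZE_OF_LT  1	/* 16B tail LT byte */
/* pad bytes so header + payload + ICRC + LT ends on an 8-byte flit */
static uint32_t pad_to_flit(uint32_t hdr_bytes, uint32_t data_bytes)
{
	return -(hdr_bytes + data_bytes + (SIZE_OF_CRC << 2) + SIZE_OF_LT) & 0x7u;
}
int main(void)
{
	printf("9B hwords:%d 16B hwords:%d pad(16B hdr + 100 byte payload):%u\n",
	       (8 + 12) / 4, (16 + 12) / 4, pad_to_flit(16 + 12, 100));
	return 0;
}
With a 100-byte payload the 16B header needs 3 pad bytes so that the ICRC
and LT byte land on an 8-byte flit boundary (28 + 100 + 4 + 1 + 3 = 136).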
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/hfi.h | 4
drivers/infiniband/hw/hfi1/rc.c | 390 +++++++++++++++++++++++-------------
drivers/infiniband/hw/hfi1/ruc.c | 201 ++++++++++++++-----
drivers/infiniband/hw/hfi1/uc.c | 40 +++-
drivers/infiniband/hw/hfi1/verbs.h | 3
5 files changed, 445 insertions(+), 193 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 52cae11..1dfbf16 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -459,6 +459,8 @@ static inline u16 hfi1_16B_get_entropy(struct hfi1_16b_header *hdr)
return (u16)(hdr->lrh[3] & OPA_16B_ENTROPY_MASK);
}
+#define OPA_16B_MAKE_QW(low_dw, high_dw) (((u64)(high_dw) << 32) | (low_dw))
+
/*
* BTH
*/
@@ -1538,7 +1540,7 @@ static inline u32 egress_cycles(u32 len, u32 rate)
}
void set_link_ipg(struct hfi1_pportdata *ppd);
-void process_becn(struct hfi1_pportdata *ppd, u8 sl, u16 rlid, u32 lqpn,
+void process_becn(struct hfi1_pportdata *ppd, u8 sl, u32 rlid, u32 lqpn,
u32 rqpn, u8 svc_type);
void return_cnp(struct hfi1_ibport *ibp, struct rvt_qp *qp, u32 remote_qpn,
u32 pkey, u32 slid, u32 dlid, u8 sc5,
diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index e3dbf6d..99defcc 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -100,8 +100,12 @@ static int make_rc_ack(struct hfi1_ibdev *dev, struct rvt_qp *qp,
if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK))
goto bail;
- /* header size in 32-bit words LRH+BTH = (8+12)/4. */
- hwords = 5;
+ if (priv->hdr_type == HFI1_PKT_TYPE_9B)
+ /* header size in 32-bit words LRH+BTH = (8+12)/4. */
+ hwords = 5;
+ else
+ /* header size in 32-bit words 16B LRH+BTH = (16+12)/4. */
+ hwords = 7;
switch (qp->s_ack_state) {
case OP(RDMA_READ_RESPONSE_LAST):
@@ -258,8 +262,7 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
struct ib_other_headers *ohdr;
struct rvt_sge_state *ss;
struct rvt_swqe *wqe;
- /* header size in 32-bit words LRH+BTH = (8+12)/4. */
- u32 hwords = 5;
+ u32 hwords;
u32 len;
u32 bth0 = 0;
u32 bth2;
@@ -273,9 +276,23 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
if (IS_ERR(ps->s_txreq))
goto bail_no_tx;
- ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
- if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
- ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
+ ps->s_txreq->phdr.hdr.hdr_type = priv->hdr_type;
+ if (priv->hdr_type == HFI1_PKT_TYPE_9B) {
+ /* header size in 32-bit words LRH+BTH = (8+12)/4. */
+ hwords = 5;
+ if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
+ else
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
+ } else {
+ /* header size in 32-bit words 16B LRH+BTH = (16+12)/4. */
+ hwords = 7;
+ if ((rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH) &&
+ (hfi1_check_mcast(rdma_ah_get_dlid(&qp->remote_ah_attr))))
+ ohdr = &ps->s_txreq->phdr.hdr.opah.u.l.oth;
+ else
+ ohdr = &ps->s_txreq->phdr.hdr.opah.u.oth;
+ }
/* Sending responses has higher priority over sending requests. */
if ((qp->s_flags & RVT_S_RESP_PENDING) &&
@@ -703,6 +720,154 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
return 0;
}
+static inline void hfi1_make_bth_aeth(struct rvt_qp *qp,
+ struct ib_other_headers *ohdr,
+ u32 bth0, u32 bth1)
+{
+ if (qp->r_nak_state)
+ ohdr->u.aeth = cpu_to_be32((qp->r_msn & IB_MSN_MASK) |
+ (qp->r_nak_state <<
+ IB_AETH_CREDIT_SHIFT));
+ else
+ ohdr->u.aeth = rvt_compute_aeth(qp);
+
+ ohdr->bth[0] = cpu_to_be32(bth0);
+ ohdr->bth[1] = cpu_to_be32(bth1 | qp->remote_qpn);
+ ohdr->bth[2] = cpu_to_be32(mask_psn(qp->r_ack_psn));
+}
+
+static inline void hfi1_queue_rc_ack(struct rvt_qp *qp, bool is_fecn)
+{
+ struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
+ unsigned long flags;
+
+ spin_lock_irqsave(&qp->s_lock, flags);
+ if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK))
+ goto unlock;
+ this_cpu_inc(*ibp->rvp.rc_qacks);
+ qp->s_flags |= RVT_S_ACK_PENDING | RVT_S_RESP_PENDING;
+ qp->s_nak_state = qp->r_nak_state;
+ qp->s_ack_psn = qp->r_ack_psn;
+ if (is_fecn)
+ qp->s_flags |= RVT_S_ECN;
+
+ /* Schedule the send tasklet. */
+ hfi1_schedule_send(qp);
+unlock:
+ spin_unlock_irqrestore(&qp->s_lock, flags);
+}
+
+static inline void hfi1_make_rc_ack_9B(struct rvt_qp *qp,
+ struct hfi1_opa_header *opa_hdr,
+ u8 sc5, bool is_fecn,
+ u64 *pbc_flags, u32 *hwords,
+ u32 *nwords)
+{
+ struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
+ struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+ struct ib_header *hdr = &opa_hdr->ibh;
+ struct ib_other_headers *ohdr;
+ u16 lrh0 = HFI1_LRH_BTH;
+ u16 pkey;
+ u32 bth0, bth1;
+
+ opa_hdr->hdr_type = HFI1_PKT_TYPE_9B;
+ ohdr = &hdr->u.oth;
+ /* header size in 32-bit words LRH+BTH+AETH = (8+12+4)/4 */
+ *hwords = 6;
+
+ if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
+ *hwords += hfi1_make_grh(ibp, &hdr->u.l.grh,
+ rdma_ah_read_grh(&qp->remote_ah_attr),
+ *hwords - 2, SIZE_OF_CRC);
+ ohdr = &hdr->u.l.oth;
+ lrh0 = HFI1_LRH_GRH;
+ }
+ /* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
+ *pbc_flags |= ((!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT);
+
+ /* read pkey_index w/o lock (its atomic) */
+ pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+
+ lrh0 |= (sc5 & IB_SC_MASK) << IB_SC_SHIFT |
+ (rdma_ah_get_sl(&qp->remote_ah_attr) & IB_SL_MASK) <<
+ IB_SL_SHIFT;
+
+ hfi1_make_ib_hdr(hdr, lrh0, *hwords + SIZE_OF_CRC,
+ opa_get_lid(rdma_ah_get_dlid(&qp->remote_ah_attr), 9B),
+ ppd->lid | rdma_ah_get_path_bits(&qp->remote_ah_attr));
+
+ bth0 = pkey | (OP(ACKNOWLEDGE) << 24);
+ if (qp->s_mig_state == IB_MIG_MIGRATED)
+ bth0 |= IB_BTH_MIG_REQ;
+ bth1 = (!!is_fecn) << IB_BECN_SHIFT;
+ hfi1_make_bth_aeth(qp, ohdr, bth0, bth1);
+}
+
+static inline void hfi1_make_rc_ack_16B(struct rvt_qp *qp,
+ struct hfi1_opa_header *opa_hdr,
+ u8 sc5, bool is_fecn,
+ u64 *pbc_flags, u32 *hwords,
+ u32 *nwords)
+{
+ struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
+ struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+ struct hfi1_16b_header *hdr = &opa_hdr->opah;
+ struct ib_other_headers *ohdr;
+ u32 bth0, bth1 = 0;
+ u16 len, pkey;
+ u8 becn = !!is_fecn;
+ u8 l4 = OPA_16B_L4_IB_LOCAL;
+ u8 extra_bytes;
+
+ opa_hdr->hdr_type = HFI1_PKT_TYPE_16B;
+ ohdr = &hdr->u.oth;
+ /* header size in 32-bit words 16B LRH+BTH+AETH = (16+12+4)/4 */
+ *hwords = 8;
+ extra_bytes = hfi1_get_16b_padding(*hwords << 2, 0);
+ *nwords = SIZE_OF_CRC + ((extra_bytes + SIZE_OF_LT) >> 2);
+
+ if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH) &&
+ hfi1_check_mcast(rdma_ah_get_dlid(&qp->remote_ah_attr))) {
+ *hwords += hfi1_make_grh(ibp, &hdr->u.l.grh,
+ rdma_ah_read_grh(&qp->remote_ah_attr),
+ *hwords - 4, *nwords);
+ ohdr = &hdr->u.l.oth;
+ l4 = OPA_16B_L4_IB_GLOBAL;
+ }
+ *pbc_flags |= PBC_PACKET_BYPASS | PBC_INSERT_BYPASS_ICRC;
+
+ /* read pkey_index w/o lock (its atomic) */
+ pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+
+ /* Convert dwords to flits */
+ len = (*hwords + *nwords) >> 1;
+
+ hfi1_make_16b_hdr(hdr,
+ ppd->lid | rdma_ah_get_path_bits(&qp->remote_ah_attr),
+ opa_get_lid(rdma_ah_get_dlid(&qp->remote_ah_attr),
+ 16B),
+ len, pkey, becn, 0, l4, sc5);
+
+ bth0 = pkey | (OP(ACKNOWLEDGE) << 24);
+ bth0 |= extra_bytes << 20;
+ if (qp->s_mig_state == IB_MIG_MIGRATED)
+ bth1 = OPA_BTH_MIG_REQ;
+ hfi1_make_bth_aeth(qp, ohdr, bth0, bth1);
+}
+
+typedef void (*hfi1_make_rc_ack)(struct rvt_qp *qp,
+ struct hfi1_opa_header *opa_hdr,
+ u8 sc5, bool is_fecn,
+ u64 *pbc_flags, u32 *hwords,
+ u32 *nwords);
+
+/* We support only two types - 9B and 16B for now */
+static const hfi1_make_rc_ack hfi1_make_rc_ack_tbl[2] = {
+ [HFI1_PKT_TYPE_9B] = &hfi1_make_rc_ack_9B,
+ [HFI1_PKT_TYPE_16B] = &hfi1_make_rc_ack_16B
+};
+
/**
* hfi1_send_rc_ack - Construct an ACK packet and send it
* @qp: a pointer to the QP
@@ -711,87 +876,48 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
* Note that RDMA reads and atomics are handled in the
* send side QP state and send engine.
*/
-void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
- int is_fecn)
+void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd,
+ struct rvt_qp *qp, bool is_fecn)
{
struct hfi1_ibport *ibp = rcd_to_iport(rcd);
+ struct hfi1_qp_priv *priv = qp->priv;
struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+ u8 sc5 = ibp->sl_to_sc[rdma_ah_get_sl(&qp->remote_ah_attr)];
u64 pbc, pbc_flags = 0;
- u16 lrh0;
- u16 sc5;
- u32 bth0;
- u32 hwords;
- u32 vl, plen;
- struct send_context *sc;
+ u32 hwords = 0;
+ u32 nwords = 0;
+ u32 plen;
struct pio_buf *pbuf;
- struct hfi1_opa_header opah;
- struct ib_header *hdr;
- struct ib_other_headers *ohdr;
- unsigned long flags;
+ struct hfi1_opa_header opa_hdr;
/* clear the defer count */
qp->r_adefered = 0;
/* Don't send ACK or NAK if a RDMA read or atomic is pending. */
- if (qp->s_flags & RVT_S_RESP_PENDING)
- goto queue_ack;
+ if (qp->s_flags & RVT_S_RESP_PENDING) {
+ hfi1_queue_rc_ack(qp, is_fecn);
+ return;
+ }
/* Ensure s_rdma_ack_cnt changes are committed */
smp_read_barrier_depends();
- if (qp->s_rdma_ack_cnt)
- goto queue_ack;
-
- /* Construct the header */
- opah.hdr_type = 0;
- hdr = &opah.ibh;
-
- /* header size in 32-bit words LRH+BTH+AETH = (8+12+4)/4 */
- hwords = 6;
- if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
- hwords += hfi1_make_grh(ibp, &hdr->u.l.grh,
- rdma_ah_read_grh(&qp->remote_ah_attr),
- hwords, 0);
- ohdr = &hdr->u.l.oth;
- lrh0 = HFI1_LRH_GRH;
- } else {
- ohdr = &hdr->u.oth;
- lrh0 = HFI1_LRH_BTH;
+ if (qp->s_rdma_ack_cnt) {
+ hfi1_queue_rc_ack(qp, is_fecn);
+ return;
}
- /* read pkey_index w/o lock (its atomic) */
- bth0 = hfi1_get_pkey(ibp, qp->s_pkey_index) | (OP(ACKNOWLEDGE) << 24);
- if (qp->s_mig_state == IB_MIG_MIGRATED)
- bth0 |= IB_BTH_MIG_REQ;
- if (qp->r_nak_state)
- ohdr->u.aeth = cpu_to_be32((qp->r_msn & IB_MSN_MASK) |
- (qp->r_nak_state <<
- IB_AETH_CREDIT_SHIFT));
- else
- ohdr->u.aeth = rvt_compute_aeth(qp);
- sc5 = ibp->sl_to_sc[rdma_ah_get_sl(&qp->remote_ah_attr)];
- /* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
- pbc_flags |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
- lrh0 |= (sc5 & 0xf) << 12 | (rdma_ah_get_sl(&qp->remote_ah_attr)
- & 0xf) << 4;
- hdr->lrh[0] = cpu_to_be16(lrh0);
- hdr->lrh[1] = cpu_to_be16(rdma_ah_get_dlid(&qp->remote_ah_attr));
- hdr->lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
- hdr->lrh[3] = cpu_to_be16(ppd->lid |
- rdma_ah_get_path_bits(&qp->remote_ah_attr));
- ohdr->bth[0] = cpu_to_be32(bth0);
- ohdr->bth[1] = cpu_to_be32(qp->remote_qpn);
- ohdr->bth[1] |= cpu_to_be32((!!is_fecn) << IB_BECN_SHIFT);
- ohdr->bth[2] = cpu_to_be32(mask_psn(qp->r_ack_psn));
/* Don't try to send ACKs if the link isn't ACTIVE */
if (driver_lstate(ppd) != IB_PORT_ACTIVE)
return;
- sc = rcd->sc;
- plen = 2 /* PBC */ + hwords;
- vl = sc_to_vlt(ppd->dd, sc5);
- pbc = create_pbc(ppd, pbc_flags, qp->srate_mbps, vl, plen);
+ /* Make the appropriate header */
+ hfi1_make_rc_ack_tbl[priv->hdr_type](qp, &opa_hdr, sc5, is_fecn,
+ &pbc_flags, &hwords, &nwords);
- pbuf = sc_buffer_alloc(sc, plen, NULL, NULL);
+ plen = 2 /* PBC */ + hwords + nwords;
+ pbc = create_pbc(ppd, pbc_flags, qp->srate_mbps,
+ sc_to_vlt(ppd->dd, sc5), plen);
+ pbuf = sc_buffer_alloc(rcd->sc, plen, NULL, NULL);
if (!pbuf) {
/*
* We have no room to send at the moment. Pass
@@ -799,32 +925,18 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
* so that when enough buffer space becomes available,
* the ACK is sent ahead of other outgoing packets.
*/
- goto queue_ack;
+ hfi1_queue_rc_ack(qp, is_fecn);
+ return;
}
-
trace_ack_output_ibhdr(dd_from_ibdev(qp->ibqp.device),
- &opah, ib_is_sc5(sc5));
+ &opa_hdr, ib_is_sc5(sc5));
/* write the pbc and data */
- ppd->dd->pio_inline_send(ppd->dd, pbuf, pbc, hdr, hwords);
-
+ ppd->dd->pio_inline_send(ppd->dd, pbuf, pbc,
+ (priv->hdr_type == HFI1_PKT_TYPE_9B ?
+ (void *)&opa_hdr.ibh :
+ (void *)&opa_hdr.opah), hwords);
return;
-
-queue_ack:
- spin_lock_irqsave(&qp->s_lock, flags);
- if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK))
- goto unlock;
- this_cpu_inc(*ibp->rvp.rc_qacks);
- qp->s_flags |= RVT_S_ACK_PENDING | RVT_S_RESP_PENDING;
- qp->s_nak_state = qp->r_nak_state;
- qp->s_ack_psn = qp->r_ack_psn;
- if (is_fecn)
- qp->s_flags |= RVT_S_ECN;
-
- /* Schedule the send engine. */
- hfi1_schedule_send(qp);
-unlock:
- spin_unlock_irqrestore(&qp->s_lock, flags);
}
/**
@@ -992,8 +1104,10 @@ static void reset_sending_psn(struct rvt_qp *qp, u32 psn)
void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah)
{
struct ib_other_headers *ohdr;
- struct ib_header *hdr = &opah->ibh;
+ struct hfi1_qp_priv *priv = qp->priv;
struct rvt_swqe *wqe;
+ struct ib_header *hdr = NULL;
+ struct hfi1_16b_header *hdr_16b = NULL;
u32 opcode;
u32 psn;
@@ -1002,10 +1116,22 @@ void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah)
return;
/* Find out where the BTH is */
- if (ib_get_lnh(hdr) == HFI1_LRH_BTH)
- ohdr = &hdr->u.oth;
- else
- ohdr = &hdr->u.l.oth;
+ if (priv->hdr_type == HFI1_PKT_TYPE_9B) {
+ hdr = &opah->ibh;
+ if (ib_get_lnh(hdr) == HFI1_LRH_BTH)
+ ohdr = &hdr->u.oth;
+ else
+ ohdr = &hdr->u.l.oth;
+ } else {
+ u8 l4;
+
+ hdr_16b = &opah->opah;
+ l4 = hfi1_16B_get_l4(hdr_16b);
+ if (l4 == OPA_16B_L4_IB_LOCAL)
+ ohdr = &hdr_16b->u.oth;
+ else
+ ohdr = &hdr_16b->u.l.oth;
+ }
opcode = ib_bth_get_opcode(ohdr);
if (opcode >= OP(RDMA_READ_RESPONSE_FIRST) &&
@@ -1405,36 +1531,34 @@ static void rdma_seq_err(struct rvt_qp *qp, struct hfi1_ibport *ibp, u32 psn,
/**
* rc_rcv_resp - process an incoming RC response packet
- * @ibp: the port this packet came in on
- * @ohdr: the other headers for this packet
- * @data: the packet data
- * @tlen: the packet length
- * @qp: the QP for this packet
- * @opcode: the opcode for this packet
- * @psn: the packet sequence number for this packet
- * @hdrsize: the header length
- * @pmtu: the path MTU
+ * @packet: data packet information
*
* This is called from hfi1_rc_rcv() to process an incoming RC response
* packet for the given QP.
* Called at interrupt level.
*/
-static void rc_rcv_resp(struct hfi1_ibport *ibp,
- struct ib_other_headers *ohdr,
- void *data, u32 tlen, struct rvt_qp *qp,
- u32 opcode, u32 psn, u32 hdrsize, u32 pmtu,
- struct hfi1_ctxtdata *rcd)
+static void rc_rcv_resp(struct hfi1_packet *packet)
{
+ struct hfi1_ctxtdata *rcd = packet->rcd;
+ void *data = packet->payload;
+ u32 tlen = packet->tlen;
+ struct rvt_qp *qp = packet->qp;
+ struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
+ struct ib_other_headers *ohdr = packet->ohdr;
struct rvt_swqe *wqe;
enum ib_wc_status status;
unsigned long flags;
int diff;
- u32 pad;
- u32 aeth;
u64 val;
+ u32 aeth;
+ u32 psn = ib_bth_get_psn(packet->ohdr);
+ u32 pmtu = qp->pmtu;
+ u16 hdrsize = packet->hlen;
+ u8 opcode = packet->opcode;
+ u8 pad = packet->pad;
+ u8 extra_bytes = pad + packet->extra_byte + (SIZE_OF_CRC << 2);
spin_lock_irqsave(&qp->s_lock, flags);
-
trace_hfi1_ack(qp, psn);
/* Ignore invalid responses. */
@@ -1500,7 +1624,7 @@ static void rc_rcv_resp(struct hfi1_ibport *ibp,
if (unlikely(wqe->wr.opcode != IB_WR_RDMA_READ))
goto ack_op_err;
read_middle:
- if (unlikely(tlen != (hdrsize + pmtu + 4)))
+ if (unlikely(tlen != (hdrsize + pmtu + extra_bytes)))
goto ack_len_err;
if (unlikely(pmtu >= qp->s_rdma_read_len))
goto ack_len_err;
@@ -1532,13 +1656,11 @@ static void rc_rcv_resp(struct hfi1_ibport *ibp,
aeth = be32_to_cpu(ohdr->u.aeth);
if (!do_rc_ack(qp, aeth, psn, opcode, 0, rcd))
goto ack_done;
- /* Get the number of bytes the message was padded by. */
- pad = ib_bth_get_pad(ohdr);
/*
* Check that the data size is >= 0 && <= pmtu.
* Remember to account for ICRC (4).
*/
- if (unlikely(tlen < (hdrsize + pad + 4)))
+ if (unlikely(tlen < (hdrsize + extra_bytes)))
goto ack_len_err;
/*
* If this is a response to a resent RDMA read, we
@@ -1556,16 +1678,14 @@ static void rc_rcv_resp(struct hfi1_ibport *ibp,
goto ack_seq_err;
if (unlikely(wqe->wr.opcode != IB_WR_RDMA_READ))
goto ack_op_err;
- /* Get the number of bytes the message was padded by. */
- pad = ib_bth_get_pad(ohdr);
/*
* Check that the data size is >= 1 && <= pmtu.
* Remember to account for ICRC (4).
*/
- if (unlikely(tlen <= (hdrsize + pad + 4)))
+ if (unlikely(tlen <= (hdrsize + extra_bytes)))
goto ack_len_err;
read_last:
- tlen -= hdrsize + pad + 4;
+ tlen -= hdrsize + extra_bytes;
if (unlikely(tlen != qp->s_rdma_read_len))
goto ack_len_err;
aeth = be32_to_cpu(ohdr->u.aeth);
@@ -1850,7 +1970,7 @@ static void log_cca_event(struct hfi1_pportdata *ppd, u8 sl, u32 rlid,
spin_unlock_irqrestore(&ppd->cc_log_lock, flags);
}
-void process_becn(struct hfi1_pportdata *ppd, u8 sl, u16 rlid, u32 lqpn,
+void process_becn(struct hfi1_pportdata *ppd, u8 sl, u32 rlid, u32 lqpn,
u32 rqpn, u8 svc_type)
{
struct cca_timer *cca_timer;
@@ -1907,12 +2027,7 @@ void process_becn(struct hfi1_pportdata *ppd, u8 sl, u16 rlid, u32 lqpn,
/**
* hfi1_rc_rcv - process an incoming RC packet
- * @rcd: the context pointer
- * @hdr: the header of this packet
- * @rcv_flags: flags relevant to rcv processing
- * @data: the packet data
- * @tlen: the packet length
- * @qp: the QP for this packet
+ * @packet: data packet information
*
* This is called from qp_rcv() to process an incoming RC packet
* for the given QP.
@@ -1926,10 +2041,10 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
struct rvt_qp *qp = packet->qp;
struct hfi1_ibport *ibp = rcd_to_iport(rcd);
struct ib_other_headers *ohdr = packet->ohdr;
- u32 bth0;
+ u32 bth0 = be32_to_cpu(ohdr->bth[0]);
u32 opcode = packet->opcode;
u32 hdrsize = packet->hlen;
- u32 psn;
+ u32 psn = ib_bth_get_psn(packet->ohdr);
u32 pad = packet->pad;
struct ib_wc wc;
u32 pmtu = qp->pmtu;
@@ -1940,15 +2055,14 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
bool is_fecn = false;
bool copy_last = false;
u32 rkey;
+ u8 extra_bytes = pad + packet->extra_byte + (SIZE_OF_CRC << 2);
lockdep_assert_held(&qp->r_lock);
- bth0 = be32_to_cpu(ohdr->bth[0]);
if (hfi1_ruc_check_hdr(ibp, packet))
return;
is_fecn = process_ecn(qp, packet, false);
- psn = ib_bth_get_psn(ohdr);
/*
* Process responses (ACKs) before anything else. Note that the
@@ -1958,8 +2072,7 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
*/
if (opcode >= OP(RDMA_READ_RESPONSE_FIRST) &&
opcode <= OP(ATOMIC_ACKNOWLEDGE)) {
- rc_rcv_resp(ibp, ohdr, data, tlen, qp, opcode, psn,
- hdrsize, pmtu, rcd);
+ rc_rcv_resp(packet);
if (is_fecn)
goto send_ack;
return;
@@ -2026,7 +2139,12 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
case OP(RDMA_WRITE_MIDDLE):
send_middle:
/* Check for invalid length PMTU or posted rwqe len. */
- if (unlikely(tlen != (hdrsize + pmtu + 4)))
+ /*
+ * There will be no padding for 9B packet but 16B packets
+ * will come in with some padding since we always add
+ * CRC and LT bytes which will need to be flit aligned
+ */
+ if (unlikely(tlen != (hdrsize + pmtu + extra_bytes)))
goto nack_inv;
qp->r_rcv_len += pmtu;
if (unlikely(qp->r_rcv_len > qp->r_len))
@@ -2080,10 +2198,10 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
send_last:
/* Check for invalid length. */
/* LAST len should be >= 1 */
- if (unlikely(tlen < (hdrsize + pad + 4)))
+ if (unlikely(tlen < (hdrsize + extra_bytes)))
goto nack_inv;
- /* Don't count the CRC. */
- tlen -= (hdrsize + pad + 4);
+ /* Don't count the CRC(and padding and LT byte for 16B). */
+ tlen -= (hdrsize + extra_bytes);
wc.byte_len = tlen + qp->r_rcv_len;
if (unlikely(wc.byte_len > qp->r_len))
goto nack_inv;
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index 6839bfa..b3291f0 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -735,73 +735,186 @@ static inline void build_ahg(struct rvt_qp *qp, u32 npsn)
}
}
-void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
- u32 bth0, u32 bth2, int middle,
- struct hfi1_pkt_state *ps)
+static inline void hfi1_make_ruc_bth(struct rvt_qp *qp,
+ struct ib_other_headers *ohdr,
+ u32 bth0, u32 bth1, u32 bth2)
+{
+ bth1 |= qp->remote_qpn;
+ ohdr->bth[0] = cpu_to_be32(bth0);
+ ohdr->bth[1] = cpu_to_be32(bth1);
+ ohdr->bth[2] = cpu_to_be32(bth2);
+}
+
+static inline void hfi1_make_ruc_header_16B(struct rvt_qp *qp,
+ struct ib_other_headers *ohdr,
+ u32 bth0, u32 bth2, int middle,
+ struct hfi1_pkt_state *ps)
+{
+ struct hfi1_qp_priv *priv = qp->priv;
+ struct hfi1_ibport *ibp = ps->ibp;
+ struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+ u32 bth1 = 0;
+ u32 slid;
+ u16 pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+ u8 l4 = OPA_16B_L4_IB_LOCAL;
+ u8 extra_bytes = hfi1_get_16b_padding((qp->s_hdrwords << 2),
+ ps->s_txreq->s_cur_size);
+ u32 nwords = SIZE_OF_CRC + ((ps->s_txreq->s_cur_size +
+ extra_bytes + SIZE_OF_LT) >> 2);
+ u8 becn = 0;
+
+ if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH) &&
+ hfi1_check_mcast(rdma_ah_get_dlid(&qp->remote_ah_attr))) {
+ struct ib_grh *grh;
+ struct ib_global_route *grd =
+ rdma_ah_retrieve_grh(&qp->remote_ah_attr);
+ int hdrwords;
+
+ /*
+ * Ensure OPA GIDs are transformed to IB gids
+ * before creating the GRH.
+ */
+ if (grd->sgid_index == OPA_GID_INDEX)
+ grd->sgid_index = 0;
+ grh = &ps->s_txreq->phdr.hdr.opah.u.l.grh;
+ l4 = OPA_16B_L4_IB_GLOBAL;
+ hdrwords = qp->s_hdrwords - 4;
+ qp->s_hdrwords += hfi1_make_grh(ibp, grh, grd,
+ hdrwords, nwords);
+ middle = 0;
+ }
+
+ if (qp->s_mig_state == IB_MIG_MIGRATED)
+ bth1 |= OPA_BTH_MIG_REQ;
+ else
+ middle = 0;
+
+ if (middle)
+ build_ahg(qp, bth2);
+ else
+ qp->s_flags &= ~RVT_S_AHG_VALID;
+
+ bth0 |= pkey;
+ bth0 |= extra_bytes << 20;
+ if (qp->s_flags & RVT_S_ECN) {
+ qp->s_flags &= ~RVT_S_ECN;
+ /* we recently received a FECN, so return a BECN */
+ becn = 1;
+ }
+ hfi1_make_ruc_bth(qp, ohdr, bth0, bth1, bth2);
+
+ if (!ppd->lid)
+ slid = be32_to_cpu(OPA_LID_PERMISSIVE);
+ else
+ slid = ppd->lid |
+ (rdma_ah_get_path_bits(&qp->remote_ah_attr) &
+ ((1 << ppd->lmc) - 1));
+
+ hfi1_make_16b_hdr(&ps->s_txreq->phdr.hdr.opah,
+ slid,
+ opa_get_lid(rdma_ah_get_dlid(&qp->remote_ah_attr),
+ 16B),
+ (qp->s_hdrwords + nwords) >> 1,
+ pkey, becn, 0, l4, priv->s_sc);
+}
+
+static inline void hfi1_make_ruc_header_9B(struct rvt_qp *qp,
+ struct ib_other_headers *ohdr,
+ u32 bth0, u32 bth2, int middle,
+ struct hfi1_pkt_state *ps)
{
struct hfi1_qp_priv *priv = qp->priv;
struct hfi1_ibport *ibp = ps->ibp;
- u16 lrh0;
- u32 nwords;
- u32 extra_bytes;
- u32 bth1;
-
- /* Construct the header. */
- extra_bytes = -ps->s_txreq->s_cur_size & 3;
- nwords = (ps->s_txreq->s_cur_size + extra_bytes) >> 2;
- lrh0 = HFI1_LRH_BTH;
+ struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+ u32 bth1 = 0;
+ u16 pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+ u16 lrh0 = HFI1_LRH_BTH;
+ u16 slid;
+ u8 extra_bytes = -ps->s_txreq->s_cur_size & 3;
+ u32 nwords = SIZE_OF_CRC + ((ps->s_txreq->s_cur_size +
+ extra_bytes) >> 2);
+
if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
- qp->s_hdrwords +=
- hfi1_make_grh(ibp,
- &ps->s_txreq->phdr.hdr.ibh.u.l.grh,
- &qp->remote_ah_attr.grh,
- qp->s_hdrwords, nwords);
+ struct ib_grh *grh = &ps->s_txreq->phdr.hdr.ibh.u.l.grh;
+ int hdrwords = qp->s_hdrwords - 2;
+
lrh0 = HFI1_LRH_GRH;
+ qp->s_hdrwords +=
+ hfi1_make_grh(ibp, grh,
+ rdma_ah_read_grh(&qp->remote_ah_attr),
+ hdrwords, nwords);
middle = 0;
}
lrh0 |= (priv->s_sc & 0xf) << 12 |
(rdma_ah_get_sl(&qp->remote_ah_attr) & 0xf) << 4;
- /*
- * reset s_ahg/AHG fields
- *
- * This insures that the ahgentry/ahgcount
- * are at a non-AHG default to protect
- * build_verbs_tx_desc() from using
- * an include ahgidx.
- *
- * build_ahg() will modify as appropriate
- * to use the AHG feature.
- */
- priv->s_ahg->tx_flags = 0;
- priv->s_ahg->ahgcount = 0;
- priv->s_ahg->ahgidx = 0;
+
if (qp->s_mig_state == IB_MIG_MIGRATED)
bth0 |= IB_BTH_MIG_REQ;
else
middle = 0;
+
if (middle)
build_ahg(qp, bth2);
else
qp->s_flags &= ~RVT_S_AHG_VALID;
- ps->s_txreq->phdr.hdr.ibh.lrh[0] = cpu_to_be16(lrh0);
- ps->s_txreq->phdr.hdr.ibh.lrh[1] =
- cpu_to_be16(rdma_ah_get_dlid(&qp->remote_ah_attr));
- ps->s_txreq->phdr.hdr.ibh.lrh[2] =
- cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
- ps->s_txreq->phdr.hdr.ibh.lrh[3] =
- cpu_to_be16(ppd_from_ibp(ibp)->lid |
- rdma_ah_get_path_bits(&qp->remote_ah_attr));
- bth0 |= hfi1_get_pkey(ibp, qp->s_pkey_index);
+
+ bth0 |= pkey;
bth0 |= extra_bytes << 20;
- ohdr->bth[0] = cpu_to_be32(bth0);
- bth1 = qp->remote_qpn;
if (qp->s_flags & RVT_S_ECN) {
qp->s_flags &= ~RVT_S_ECN;
/* we recently received a FECN, so return a BECN */
bth1 |= (IB_BECN_MASK << IB_BECN_SHIFT);
}
- ohdr->bth[1] = cpu_to_be32(bth1);
- ohdr->bth[2] = cpu_to_be32(bth2);
+ hfi1_make_ruc_bth(qp, ohdr, bth0, bth1, bth2);
+
+ if (!ppd->lid)
+ slid = be16_to_cpu(IB_LID_PERMISSIVE);
+ else
+ slid = ppd->lid |
+ (rdma_ah_get_path_bits(&qp->remote_ah_attr) &
+ ((1 << ppd->lmc) - 1));
+ hfi1_make_ib_hdr(&ps->s_txreq->phdr.hdr.ibh,
+ lrh0,
+ qp->s_hdrwords + nwords,
+ opa_get_lid(rdma_ah_get_dlid(&qp->remote_ah_attr), 9B),
+ ppd_from_ibp(ibp)->lid |
+ rdma_ah_get_path_bits(&qp->remote_ah_attr));
+}
+
+typedef void (*hfi1_make_ruc_hdr)(struct rvt_qp *qp,
+ struct ib_other_headers *ohdr,
+ u32 bth0, u32 bth2, int middle,
+ struct hfi1_pkt_state *ps);
+
+/* We support only two types - 9B and 16B for now */
+static const hfi1_make_ruc_hdr hfi1_ruc_header_tbl[2] = {
+ [HFI1_PKT_TYPE_9B] = &hfi1_make_ruc_header_9B,
+ [HFI1_PKT_TYPE_16B] = &hfi1_make_ruc_header_16B
+};
+
+void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
+ u32 bth0, u32 bth2, int middle,
+ struct hfi1_pkt_state *ps)
+{
+ struct hfi1_qp_priv *priv = qp->priv;
+
+ /*
+ * reset s_ahg/AHG fields
+ *
+ * This insures that the ahgentry/ahgcount
+ * are at a non-AHG default to protect
+ * build_verbs_tx_desc() from using
+ * an include ahgidx.
+ *
+ * build_ahg() will modify as appropriate
+ * to use the AHG feature.
+ */
+ priv->s_ahg->tx_flags = 0;
+ priv->s_ahg->ahgcount = 0;
+ priv->s_ahg->ahgidx = 0;
+
+ /* Make the appropriate header */
+ hfi1_ruc_header_tbl[priv->hdr_type](qp, ohdr, bth0, bth2, middle, ps);
}
/* when sending, force a reschedule every one of these periods */
diff --git a/drivers/infiniband/hw/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
index e0bb766..0b64617 100644
--- a/drivers/infiniband/hw/hfi1/uc.c
+++ b/drivers/infiniband/hw/hfi1/uc.c
@@ -65,7 +65,7 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
struct hfi1_qp_priv *priv = qp->priv;
struct ib_other_headers *ohdr;
struct rvt_swqe *wqe;
- u32 hwords = 5;
+ u32 hwords;
u32 bth0 = 0;
u32 len;
u32 pmtu = qp->pmtu;
@@ -93,9 +93,23 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
goto done_free_tx;
}
- ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
- if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
- ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
+ ps->s_txreq->phdr.hdr.hdr_type = priv->hdr_type;
+ if (priv->hdr_type == HFI1_PKT_TYPE_9B) {
+ /* header size in 32-bit words LRH+BTH = (8+12)/4. */
+ hwords = 5;
+ if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
+ else
+ ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
+ } else {
+ /* header size in 32-bit words 16B LRH+BTH = (16+12)/4. */
+ hwords = 7;
+ if ((rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH) &&
+ (hfi1_check_mcast(rdma_ah_get_dlid(&qp->remote_ah_attr))))
+ ohdr = &ps->s_txreq->phdr.hdr.opah.u.l.oth;
+ else
+ ohdr = &ps->s_txreq->phdr.hdr.opah.u.oth;
+ }
/* Get the next send request. */
wqe = rvt_get_swqe_ptr(qp, qp->s_cur);
@@ -309,6 +323,7 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
u32 pmtu = qp->pmtu;
struct ib_reth *reth;
int ret;
+ u8 extra_bytes = pad + packet->extra_byte + (SIZE_OF_CRC << 2);
if (hfi1_ruc_check_hdr(ibp, packet))
return;
@@ -408,7 +423,12 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
/* FALLTHROUGH */
case OP(SEND_MIDDLE):
/* Check for invalid length PMTU or posted rwqe len. */
- if (unlikely(tlen != (hdrsize + pmtu + 4)))
+ /*
+ * There will be no padding for 9B packet but 16B packets
+ * will come in with some padding since we always add
+ * CRC and LT bytes which will need to be flit aligned
+ */
+ if (unlikely(tlen != (hdrsize + pmtu + extra_bytes)))
goto rewind;
qp->r_rcv_len += pmtu;
if (unlikely(qp->r_rcv_len > qp->r_len))
@@ -428,10 +448,10 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
send_last:
/* Check for invalid length. */
/* LAST len should be >= 1 */
- if (unlikely(tlen < (hdrsize + pad + 4)))
+ if (unlikely(tlen < (hdrsize + extra_bytes)))
goto rewind;
/* Don't count the CRC. */
- tlen -= (hdrsize + pad + 4);
+ tlen -= (hdrsize + extra_bytes);
wc.byte_len = tlen + qp->r_rcv_len;
if (unlikely(wc.byte_len > qp->r_len))
goto rewind;
@@ -524,7 +544,7 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
if (unlikely(tlen < (hdrsize + pad + 4)))
goto drop;
/* Don't count the CRC. */
- tlen -= (hdrsize + pad + 4);
+ tlen -= (hdrsize + extra_bytes);
if (unlikely(tlen + qp->r_rcv_len != qp->r_len))
goto drop;
if (test_and_clear_bit(RVT_R_REWIND_SGE, &qp->r_aflags)) {
@@ -544,14 +564,12 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
case OP(RDMA_WRITE_LAST):
rdma_last:
- /* Get the number of bytes the message was padded by. */
- pad = ib_bth_get_pad(ohdr);
/* Check for invalid length. */
/* LAST len should be >= 1 */
if (unlikely(tlen < (hdrsize + pad + 4)))
goto drop;
/* Don't count the CRC. */
- tlen -= (hdrsize + pad + 4);
+ tlen -= (hdrsize + extra_bytes);
if (unlikely(tlen + qp->r_rcv_len != qp->r_len))
goto drop;
hfi1_copy_sge(&qp->r_sge, data, tlen, true, false);
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index ab1618e..b164298 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -373,7 +373,8 @@ void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
void hfi1_send_complete(struct rvt_qp *qp, struct rvt_swqe *wqe,
enum ib_wc_status status);
-void hfi1_send_rc_ack(struct hfi1_ctxtdata *, struct rvt_qp *qp, int is_fecn);
+void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
+ bool is_fecn);
int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps);
--
* [PATCH for-next 26/27] IB/hfi1: Enhance PIO/SDMA send for 16B
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (23 preceding siblings ...)
2017-08-04 20:54 ` [PATCH for-next 25/27] IB/hfi1: Add 16B RC/UC support Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 27/27] IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs Dennis Dalessandro
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
PIO/SDMA send logic now uses the hdr_type field to determine the type
of packet that has been constructed. Based on the hdr_type, it derives
the PBC flags, the padding count and the extra LT trailing byte.
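For reference, a minimal sketch of that hdr_type-driven branching, using only
names that appear in the diff below (ps, hdrwords, len, dwords, sc5, pbc and
plen are the locals of hfi1_verbs_send_dma()/hfi1_verbs_send_pio()); this is a
simplified outline, not the driver code itself:

	/*
	 * Sketch only: pick bypass (16B) vs. 9B handling from hdr_type.
	 * 16B packets pad the payload to a flit and then carry an ICRC
	 * plus a trailing LT byte; 9B packets only round up to a dword.
	 */
	if (ps->s_txreq->phdr.hdr.hdr_type) {
		u8 extra = hfi1_get_16b_padding(hdrwords << 2, len);

		dwords = (len + extra + (SIZE_OF_CRC << 2) + SIZE_OF_LT) >> 2;
		pbc |= PBC_PACKET_BYPASS | PBC_INSERT_BYPASS_ICRC;
	} else {
		dwords = (len + 3) >> 2;
		pbc |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
	}
	plen = hdrwords + dwords + 2;	/* plus 2 dwords for the PBC */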
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/hfi.h | 2
drivers/infiniband/hw/hfi1/user_sdma.c | 7 +
drivers/infiniband/hw/hfi1/verbs.c | 197 +++++++++++++++++++++-----------
drivers/infiniband/hw/hfi1/verbs.h | 1
4 files changed, 135 insertions(+), 72 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 1dfbf16..dff3d3f 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1558,7 +1558,7 @@ typedef void (*hfi1_handle_cnp)(struct hfi1_ibport *ibp, struct rvt_qp *qp,
[HFI1_PKT_TYPE_16B] = &return_cnp_16B
};
#define PKEY_CHECK_INVALID -1
-int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
+int egress_pkey_check(struct hfi1_pportdata *ppd, u32 slid, u16 pkey,
u8 sc5, int8_t s_pkey_index);
#define PACKET_EGRESS_TIMEOUT 350
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 8f02707..aae1f40 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -502,6 +502,8 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
struct sdma_req_info info;
struct user_sdma_request *req;
u8 opcode, sc, vl;
+ u16 pkey;
+ u32 slid;
int req_queued = 0;
u16 dlid;
u32 selector;
@@ -635,8 +637,9 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
}
/* Checking P_KEY for requests from user-space */
- if (egress_pkey_check(dd->pport, req->hdr.lrh, req->hdr.bth, sc,
- PKEY_CHECK_INVALID)) {
+ pkey = (u16)be32_to_cpu(req->hdr.bth[0]);
+ slid = be16_to_cpu(req->hdr.lrh[3]);
+ if (egress_pkey_check(dd->pport, slid, pkey, sc, PKEY_CHECK_INVALID)) {
ret = -EINVAL;
goto free_req;
}
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 6c9fa43..eee2320 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -506,24 +506,6 @@ void hfi1_copy_sge(
}
}
-static u8 get_opcode(struct hfi1_opa_header *hdr)
-{
- struct ib_other_headers *ohdr;
-
- if (hdr->hdr_type) {
- if (hfi1_16B_get_l4(&hdr->opah) == OPA_16B_L4_IB_LOCAL)
- ohdr = &hdr->opah.u.oth;
- else
- ohdr = &hdr->opah.u.l.oth;
- } else {
- if (ib_get_lnh(&hdr->ibh) == HFI1_LRH_BTH)
- ohdr = &hdr->ibh.u.oth;
- else
- ohdr = &hdr->ibh.u.l.oth;
- }
- return ib_bth_get_opcode(ohdr);
-}
-
/*
* Make sure the QP is ready and able to accept the given opcode.
*/
@@ -830,12 +812,27 @@ static int build_verbs_tx_desc(
int ret = 0;
struct hfi1_sdma_header *phdr = &tx->phdr;
u16 hdrbytes = tx->hdr_dwords << 2;
+ u32 *hdr;
+ u8 extra_bytes = 0;
+ static char trail_buf[12]; /* CRC = 4, LT = 1, Pad = 0 to 7 bytes */
+ if (tx->phdr.hdr.hdr_type) {
+ /*
+ * hdrbytes accounts for PBC. Need to subtract 8 bytes
+ * before calculating padding.
+ */
+ extra_bytes = hfi1_get_16b_padding(hdrbytes - 8, length) +
+ (SIZE_OF_CRC << 2) + SIZE_OF_LT;
+ hdr = (u32 *)&phdr->hdr.opah;
+ } else {
+ hdr = (u32 *)&phdr->hdr.ibh;
+ }
if (!ahg_info->ahgcount) {
ret = sdma_txinit_ahg(
&tx->txreq,
ahg_info->tx_flags,
- hdrbytes + length,
+ hdrbytes + length +
+ extra_bytes,
ahg_info->ahgidx,
0,
NULL,
@@ -865,8 +862,17 @@ static int build_verbs_tx_desc(
goto bail_txadd;
}
/* add the ulp payload - if any. tx->ss can be NULL for acks */
- if (tx->ss)
+ if (tx->ss) {
ret = build_verbs_ulp_payload(sde, length, tx);
+ if (ret)
+ goto bail_txadd;
+ }
+
+ /* add icrc, lt byte, and padding to flit */
+ if (extra_bytes != 0)
+ ret = sdma_txadd_kvaddr(sde->dd, &tx->txreq,
+ trail_buf, extra_bytes);
+
bail_txadd:
return ret;
}
@@ -878,26 +884,42 @@ int hfi1_verbs_send_dma(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
struct hfi1_ahg_info *ahg_info = priv->s_ahg;
u32 hdrwords = qp->s_hdrwords;
u32 len = ps->s_txreq->s_cur_size;
- u32 plen = hdrwords + ((len + 3) >> 2) + 2; /* includes pbc */
+ u32 plen;
struct hfi1_ibdev *dev = ps->dev;
struct hfi1_pportdata *ppd = ps->ppd;
struct verbs_txreq *tx;
u8 sc5 = priv->s_sc;
-
int ret;
+ u32 dwords;
+ bool bypass = false;
+
+ if (ps->s_txreq->phdr.hdr.hdr_type) {
+ u8 extra_bytes = hfi1_get_16b_padding((hdrwords << 2), len);
+
+ dwords = (len + extra_bytes + (SIZE_OF_CRC << 2) +
+ SIZE_OF_LT) >> 2;
+ bypass = true;
+ } else {
+ dwords = (len + 3) >> 2;
+ }
+ plen = hdrwords + dwords + 2;
tx = ps->s_txreq;
if (!sdma_txreq_built(&tx->txreq)) {
if (likely(pbc == 0)) {
u32 vl = sc_to_vlt(dd_from_ibdev(qp->ibqp.device), sc5);
- u8 opcode = get_opcode(&tx->phdr.hdr);
/* No vl15 here */
- /* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
- pbc |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
+ /* set PBC_DC_INFO bit (aka SC[4]) in pbc */
+ if (ps->s_txreq->phdr.hdr.hdr_type)
+ pbc |= PBC_PACKET_BYPASS |
+ PBC_INSERT_BYPASS_ICRC;
+ else
+ pbc |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
- if (unlikely(hfi1_dbg_fault_opcode(qp, opcode, false)))
- pbc = hfi1_fault_tx(qp, opcode, pbc);
+ if (unlikely(hfi1_dbg_fault_opcode(qp, ps->opcode,
+ false)))
+ pbc = hfi1_fault_tx(qp, ps->opcode, pbc);
pbc = create_pbc(ppd,
pbc,
qp->srate_mbps,
@@ -1000,10 +1022,10 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
u32 hdrwords = qp->s_hdrwords;
struct rvt_sge_state *ss = ps->s_txreq->ss;
u32 len = ps->s_txreq->s_cur_size;
- u32 dwords = (len + 3) >> 2;
- u32 plen = hdrwords + dwords + 2; /* includes pbc */
+ u32 dwords;
+ u32 plen;
struct hfi1_pportdata *ppd = ps->ppd;
- u32 *hdr = (u32 *)&ps->s_txreq->phdr.hdr;
+ u32 *hdr;
u8 sc5;
unsigned long flags = 0;
struct send_context *sc;
@@ -1011,6 +1033,23 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
int wc_status = IB_WC_SUCCESS;
int ret = 0;
pio_release_cb cb = NULL;
+ u32 lrh0_16b;
+ bool bypass = false;
+ u8 extra_bytes = 0;
+
+ if (ps->s_txreq->phdr.hdr.hdr_type) {
+ u8 pad_size = hfi1_get_16b_padding((hdrwords << 2), len);
+
+ extra_bytes = pad_size + (SIZE_OF_CRC << 2) + SIZE_OF_LT;
+ dwords = (len + extra_bytes) >> 2;
+ hdr = (u32 *)&ps->s_txreq->phdr.hdr.opah;
+ lrh0_16b = ps->s_txreq->phdr.hdr.opah.lrh[0];
+ bypass = true;
+ } else {
+ dwords = (len + 3) >> 2;
+ hdr = (u32 *)&ps->s_txreq->phdr.hdr.ibh;
+ }
+ plen = hdrwords + dwords + 2;
/* only RC/UC use complete */
switch (qp->ibqp.qp_type) {
@@ -1028,13 +1067,14 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
if (likely(pbc == 0)) {
u8 vl = sc_to_vlt(dd_from_ibdev(qp->ibqp.device), sc5);
- struct verbs_txreq *tx = ps->s_txreq;
- u8 opcode = get_opcode(&tx->phdr.hdr);
- /* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
- pbc |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
- if (unlikely(hfi1_dbg_fault_opcode(qp, opcode, false)))
- pbc = hfi1_fault_tx(qp, opcode, pbc);
+ /* set PBC_DC_INFO bit (aka SC[4]) in pbc */
+ if (ps->s_txreq->phdr.hdr.hdr_type)
+ pbc |= PBC_PACKET_BYPASS | PBC_INSERT_BYPASS_ICRC;
+ else
+ pbc |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
+ if (unlikely(hfi1_dbg_fault_opcode(qp, ps->opcode, false)))
+ pbc = hfi1_fault_tx(qp, ps->opcode, pbc);
pbc = create_pbc(ppd, pbc, qp->srate_mbps, vl, plen);
}
if (cb)
@@ -1071,11 +1111,12 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
}
}
- if (len == 0) {
+ if (dwords == 0) {
pio_copy(ppd->dd, pbuf, pbc, hdr, hdrwords);
} else {
+ seg_pio_copy_start(pbuf, pbc,
+ hdr, hdrwords * 4);
if (ss) {
- seg_pio_copy_start(pbuf, pbc, hdr, hdrwords * 4);
while (len) {
void *addr = ss->sge.vaddr;
u32 slen = ss->sge.length;
@@ -1086,8 +1127,20 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
seg_pio_copy_mid(pbuf, addr, slen);
len -= slen;
}
- seg_pio_copy_end(pbuf);
}
+ /*
+ * Bypass packet will need to copy additional
+ * bytes to accommodate for CRC and LT bytes
+ */
+ if (extra_bytes) {
+ u8 *empty_buf;
+
+ empty_buf = kcalloc(extra_bytes, sizeof(u8),
+ GFP_KERNEL);
+ seg_pio_copy_mid(pbuf, empty_buf, extra_bytes);
+ kfree(empty_buf);
+ }
+ seg_pio_copy_end(pbuf);
}
trace_pio_output_ibhdr(dd_from_ibdev(qp->ibqp.device),
@@ -1137,10 +1190,10 @@ static inline int egress_pkey_matches_entry(u16 pkey, u16 ent)
/**
* egress_pkey_check - check P_KEY of a packet
- * @ppd: Physical IB port data
- * @lrh: Local route header
- * @bth: Base transport header
- * @sc5: SC for packet
+ * @ppd: Physical IB port data
+ * @slid: SLID for packet
+ * @pkey: PKEY for header
+ * @sc5: SC for packet
* @s_pkey_index: It will be used for look up optimization for kernel contexts
* only. If it is negative value, then it means user contexts is calling this
* function.
@@ -1149,19 +1202,16 @@ static inline int egress_pkey_matches_entry(u16 pkey, u16 ent)
*
* Return: 0 on success, otherwise, 1
*/
-int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
+int egress_pkey_check(struct hfi1_pportdata *ppd, u32 slid, u16 pkey,
u8 sc5, int8_t s_pkey_index)
{
struct hfi1_devdata *dd;
int i;
- u16 pkey;
int is_user_ctxt_mechanism = (s_pkey_index < 0);
if (!(ppd->part_enforce & HFI1_PART_ENFORCE_OUT))
return 0;
- pkey = (u16)be32_to_cpu(bth[0]);
-
/* If SC15, pkey[0:14] must be 0x7fff */
if ((sc5 == 0xf) && ((pkey & PKEY_LOW_15_MASK) != PKEY_LOW_15_MASK))
goto bad;
@@ -1194,8 +1244,6 @@ int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
dd = ppd->dd;
if (!(dd->err_info_xmit_constraint.status &
OPA_EI_STATUS_SMASK)) {
- u16 slid = be16_to_cpu(lrh[3]);
-
dd->err_info_xmit_constraint.status |=
OPA_EI_STATUS_SMASK;
dd->err_info_xmit_constraint.slid = slid;
@@ -1212,11 +1260,11 @@ int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
* and size
*/
static inline send_routine get_send_routine(struct rvt_qp *qp,
- struct verbs_txreq *tx)
+ struct hfi1_pkt_state *ps)
{
struct hfi1_devdata *dd = dd_from_ibdev(qp->ibqp.device);
struct hfi1_qp_priv *priv = qp->priv;
- struct hfi1_opa_header *h = &tx->phdr.hdr;
+ struct verbs_txreq *tx = ps->s_txreq;
if (unlikely(!(dd->flags & HFI1_HAS_SEND_DMA)))
return dd->process_pio_send;
@@ -1228,11 +1276,9 @@ static inline send_routine get_send_routine(struct rvt_qp *qp,
break;
case IB_QPT_UC:
case IB_QPT_RC: {
- u8 op = get_opcode(h);
-
if (piothreshold &&
tx->s_cur_size <= min(piothreshold, qp->pmtu) &&
- (BIT(op & OPMASK) & pio_opmask[op >> 5]) &&
+ (BIT(ps->opcode & OPMASK) & pio_opmask[ps->opcode >> 5]) &&
iowait_sdma_pending(&priv->s_iowait) == 0 &&
!sdma_txreq_built(&tx->txreq))
return dd->process_pio_send;
@@ -1257,25 +1303,38 @@ int hfi1_verbs_send(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
struct hfi1_devdata *dd = dd_from_ibdev(qp->ibqp.device);
struct hfi1_qp_priv *priv = qp->priv;
struct ib_other_headers *ohdr;
- struct ib_header *hdr;
send_routine sr;
int ret;
- u8 lnh;
+ u16 pkey;
+ u32 slid;
- hdr = &ps->s_txreq->phdr.hdr.ibh;
/* locate the pkey within the headers */
- lnh = ib_get_lnh(hdr);
- if (lnh == HFI1_LRH_GRH)
- ohdr = &hdr->u.l.oth;
- else
- ohdr = &hdr->u.oth;
-
- sr = get_send_routine(qp, ps->s_txreq);
- ret = egress_pkey_check(dd->pport,
- hdr->lrh,
- ohdr->bth,
- priv->s_sc,
- qp->s_pkey_index);
+ if (ps->s_txreq->phdr.hdr.hdr_type) {
+ struct hfi1_16b_header *hdr = &ps->s_txreq->phdr.hdr.opah;
+ u8 l4 = hfi1_16B_get_l4(hdr);
+
+ if (l4 == OPA_16B_L4_IB_GLOBAL)
+ ohdr = &hdr->u.l.oth;
+ else
+ ohdr = &hdr->u.oth;
+ slid = hfi1_16B_get_slid(hdr);
+ pkey = hfi1_16B_get_pkey(hdr);
+ } else {
+ struct ib_header *hdr = &ps->s_txreq->phdr.hdr.ibh;
+ u8 lnh = ib_get_lnh(hdr);
+
+ if (lnh == HFI1_LRH_GRH)
+ ohdr = &hdr->u.l.oth;
+ else
+ ohdr = &hdr->u.oth;
+ slid = ib_get_slid(hdr);
+ pkey = ib_bth_get_pkey(ohdr);
+ }
+
+ ps->opcode = ib_bth_get_opcode(ohdr);
+ sr = get_send_routine(qp, ps);
+ ret = egress_pkey_check(dd->pport, slid, pkey,
+ priv->s_sc, qp->s_pkey_index);
if (unlikely(ret)) {
/*
* The value we are returning here does not get propagated to
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index b164298..56ead87 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -163,6 +163,7 @@ struct hfi1_pkt_state {
unsigned long timeout;
unsigned long timeout_int;
int cpu;
+ u8 opcode;
bool in_thread;
bool pkts_sent;
};
--
* [PATCH for-next 27/27] IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
` (24 preceding siblings ...)
2017-08-04 20:54 ` [PATCH for-next 26/27] IB/hfi1: Enhance PIO/SDMA send for 16B Dennis Dalessandro
@ 2017-08-04 20:54 ` Dennis Dalessandro
25 siblings, 0 replies; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-04 20:54 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt,
Dasaratharaman Chandramouli
From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Enabling this bit helps core components query for extended address
support using the rdma_cap_opa_ah interface.
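As an illustration, a core component gated on this capability looks like the
following; the snippet mirrors the ib_uverbs_query_port() handling quoted
later in this thread, so the names are taken from that patch rather than
invented here:

	/*
	 * Report LIDs through the unchanged 16-bit user ABI, converting
	 * OPA extended LIDs only when the port advertises
	 * RDMA_CORE_CAP_OPA_AH (queried via rdma_cap_opa_ah()).
	 */
	if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
		resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
		resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
	} else {
		resp.lid = (u16)attr.lid;
		resp.sm_lid = (u16)attr.sm_lid;
	}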
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/verbs.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index eee2320..e340b02 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1912,7 +1912,8 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
dd->verbs_dev.rdi.dparms.psn_mask = PSN_MASK;
dd->verbs_dev.rdi.dparms.psn_shift = PSN_SHIFT;
dd->verbs_dev.rdi.dparms.psn_modify_mask = PSN_MODIFY_MASK;
- dd->verbs_dev.rdi.dparms.core_cap_flags = RDMA_CORE_PORT_INTEL_OPA;
+ dd->verbs_dev.rdi.dparms.core_cap_flags = RDMA_CORE_PORT_INTEL_OPA |
+ RDMA_CORE_CAP_OPA_AH;
dd->verbs_dev.rdi.dparms.max_mad_size = OPA_MGMT_MAD_SIZE;
dd->verbs_dev.rdi.driver_f.qp_priv_alloc = qp_priv_alloc;
--
* Re: [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <20170804205320.17853.77236.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-08-06 8:18 ` Leon Romanovsky
[not found] ` <20170806081857.GC3636-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
0 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2017-08-06 8:18 UTC (permalink / raw)
To: Dennis Dalessandro
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
On Fri, Aug 04, 2017 at 01:53:21PM -0700, Dennis Dalessandro wrote:
> From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> sm_lid field in struct ib_port_attr is increased to 32 bits. This
> enables core components to use larger LIDs if needed.
> The user ABI is unchanged and return 16 bit values when queried.
>
> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
> drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
> include/rdma/ib_verbs.h | 2 +-
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index 7ef74b0..38dce45 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
> resp.bad_pkey_cntr = attr.bad_pkey_cntr;
> resp.qkey_viol_cntr = attr.qkey_viol_cntr;
> resp.pkey_tbl_len = attr.pkey_tbl_len;
> - resp.sm_lid = attr.sm_lid;
> - if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
> + if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
> resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
> - else
> + resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
> + } else {
> resp.lid = (u16)attr.lid;
> + resp.sm_lid = (u16)attr.sm_lid;
I see that lid already has a cast from u32 to u16, and now sm_lid does as well.
Do we have a more elegant way to achieve that? A comment for future
developers would be good too.
> + }
> resp.lmc = attr.lmc;
> resp.max_vl_num = attr.max_vl_num;
> resp.sm_sl = attr.sm_sl;
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 4eccf89..5f4f2d3 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -549,7 +549,7 @@ struct ib_port_attr {
> u32 bad_pkey_cntr;
> u32 qkey_viol_cntr;
> u16 pkey_tbl_len;
> - u16 sm_lid;
> + u32 sm_lid;
> u32 lid;
> u8 lmc;
> u8 max_vl_num;
>
> --
* Re: [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <20170806081857.GC3636-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-08-06 8:22 ` Leon Romanovsky
[not found] ` <20170806082217.GE3636-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
0 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2017-08-06 8:22 UTC (permalink / raw)
To: Dennis Dalessandro
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
On Sun, Aug 06, 2017 at 11:18:57AM +0300, Leon Romanovsky wrote:
> On Fri, Aug 04, 2017 at 01:53:21PM -0700, Dennis Dalessandro wrote:
> > From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >
> > sm_lid field in struct ib_port_attr is increased to 32 bits. This
> > enables core components to use larger LIDs if needed.
> > The user ABI is unchanged and return 16 bit values when queried.
> >
> > Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > ---
> > drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
> > include/rdma/ib_verbs.h | 2 +-
> > 2 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > index 7ef74b0..38dce45 100644
> > --- a/drivers/infiniband/core/uverbs_cmd.c
> > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > @@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
> > resp.bad_pkey_cntr = attr.bad_pkey_cntr;
> > resp.qkey_viol_cntr = attr.qkey_viol_cntr;
> > resp.pkey_tbl_len = attr.pkey_tbl_len;
> > - resp.sm_lid = attr.sm_lid;
> > - if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
> > + if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
> > resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
> > - else
> > + resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
> > + } else {
> > resp.lid = (u16)attr.lid;
> > + resp.sm_lid = (u16)attr.sm_lid;
>
> I see that lid is already has casting from u32 to u16 and now it is sm_lid.
> Do we have more elegant way to achieve that? And comment for future
> developers can be good too.
I see now, you changed lid in patch 11. The same comment there.
>
> > + }
> > resp.lmc = attr.lmc;
> > resp.max_vl_num = attr.max_vl_num;
> > resp.sm_sl = attr.sm_sl;
> > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> > index 4eccf89..5f4f2d3 100644
> > --- a/include/rdma/ib_verbs.h
> > +++ b/include/rdma/ib_verbs.h
> > @@ -549,7 +549,7 @@ struct ib_port_attr {
> > u32 bad_pkey_cntr;
> > u32 qkey_viol_cntr;
> > u16 pkey_tbl_len;
> > - u16 sm_lid;
> > + u32 sm_lid;
> > u32 lid;
> > u8 lmc;
> > u8 max_vl_num;
> >
> > --
* Re: [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <20170806082217.GE3636-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-08-08 14:41 ` Leon Romanovsky
[not found] ` <20170808144146.GF28851-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
0 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2017-08-08 14:41 UTC (permalink / raw)
To: Dennis Dalessandro
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, Don Hiatt, Ira Weiny,
Dasaratharaman Chandramouli
On Sun, Aug 06, 2017 at 11:22:17AM +0300, Leon Romanovsky wrote:
> On Sun, Aug 06, 2017 at 11:18:57AM +0300, Leon Romanovsky wrote:
> > On Fri, Aug 04, 2017 at 01:53:21PM -0700, Dennis Dalessandro wrote:
> > > From: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > >
> > > sm_lid field in struct ib_port_attr is increased to 32 bits. This
> > > enables core components to use larger LIDs if needed.
> > > The user ABI is unchanged and return 16 bit values when queried.
> > >
> > > Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > ---
> > > drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
> > > include/rdma/ib_verbs.h | 2 +-
> > > 2 files changed, 6 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > > index 7ef74b0..38dce45 100644
> > > --- a/drivers/infiniband/core/uverbs_cmd.c
> > > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > > @@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
> > > resp.bad_pkey_cntr = attr.bad_pkey_cntr;
> > > resp.qkey_viol_cntr = attr.qkey_viol_cntr;
> > > resp.pkey_tbl_len = attr.pkey_tbl_len;
> > > - resp.sm_lid = attr.sm_lid;
> > > - if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
> > > + if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
> > > resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
> > > - else
> > > + resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
> > > + } else {
> > > resp.lid = (u16)attr.lid;
> > > + resp.sm_lid = (u16)attr.sm_lid;
> >
> > I see that lid is already has casting from u32 to u16 and now it is sm_lid.
> > Do we have more elegant way to achieve that? And comment for future
> > developers can be good too.
>
> I see now, you changed lid in patch 11. The same comment there.
To be clear on that point: NAK on *current* implementation.
At least, it should be done in a separate function, with a comment explaining
why you are doing it and why it is safe to cast from a wider value to a
narrower one, and with proper checks that you are not losing information
when you cast.
Thanks
* RE: [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <20170808144146.GF28851-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-08-08 16:35 ` Weiny, Ira
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E67CFAF96-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 40+ messages in thread
From: Weiny, Ira @ 2017-08-08 16:35 UTC (permalink / raw)
To: Leon Romanovsky, Dalessandro, Dennis
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Hiatt, Don,
Dasaratharaman Chandramouli
>
> On Sun, Aug 06, 2017 at 11:22:17AM +0300, Leon Romanovsky wrote:
> > On Sun, Aug 06, 2017 at 11:18:57AM +0300, Leon Romanovsky wrote:
> > > On Fri, Aug 04, 2017 at 01:53:21PM -0700, Dennis Dalessandro wrote:
> > > > From: Dasaratharaman Chandramouli
> > > > <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > >
> > > > sm_lid field in struct ib_port_attr is increased to 32 bits. This
> > > > enables core components to use larger LIDs if needed.
> > > > The user ABI is unchanged and return 16 bit values when queried.
> > > >
> > > > Signed-off-by: Dasaratharaman Chandramouli
> > > > <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > ---
> > > > drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
> > > > include/rdma/ib_verbs.h | 2 +-
> > > > 2 files changed, 6 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/core/uverbs_cmd.c
> > > > b/drivers/infiniband/core/uverbs_cmd.c
> > > > index 7ef74b0..38dce45 100644
> > > > --- a/drivers/infiniband/core/uverbs_cmd.c
> > > > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > > > @@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct
> ib_uverbs_file *file,
> > > > resp.bad_pkey_cntr = attr.bad_pkey_cntr;
> > > > resp.qkey_viol_cntr = attr.qkey_viol_cntr;
> > > > resp.pkey_tbl_len = attr.pkey_tbl_len;
> > > > - resp.sm_lid = attr.sm_lid;
> > > > - if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
> > > > + if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
> > > > resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
> > > > - else
> > > > + resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
> > > > + } else {
> > > > resp.lid = (u16)attr.lid;
> > > > + resp.sm_lid = (u16)attr.sm_lid;
> > >
> > > I see that lid is already has casting from u32 to u16 and now it is sm_lid.
> > > Do we have more elegant way to achieve that? And comment for future
> > > developers can be good too.
> >
> > I see now, you changed lid in patch 11. The same comment there.
>
> To be clear on that point: NAK on *current* implementation.
>
> At least, it should be done in separate function, with comment explains why
> are you doing and why it is safe to cast from higher value to lower value and
> proper checks that you are not loosing information when you are casting.
>
A comment is reasonable. However, I'm not sure a separate function is needed. With the current interface there is no way to pass all the information through.
Software which requires these values will need to obtain them from other sources (OPA MAD queries for the SM LID, for example).
Ira
--
* Re: [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E67CFAF96-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2017-08-08 16:49 ` Don Hiatt
[not found] ` <018c6e70-78d5-51ec-993f-35be575a6da1-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-08-09 6:56 ` Leon Romanovsky
1 sibling, 1 reply; 40+ messages in thread
From: Don Hiatt @ 2017-08-08 16:49 UTC (permalink / raw)
To: Weiny, Ira, Leon Romanovsky
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Dasaratharaman Chandramouli
On 8/8/2017 9:35 AM, Weiny, Ira wrote:
>> On Sun, Aug 06, 2017 at 11:22:17AM +0300, Leon Romanovsky wrote:
>>> On Sun, Aug 06, 2017 at 11:18:57AM +0300, Leon Romanovsky wrote:
>>>> On Fri, Aug 04, 2017 at 01:53:21PM -0700, Dennis Dalessandro wrote:
>>>>> From: Dasaratharaman Chandramouli
>>>>> <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>>
>>>>> sm_lid field in struct ib_port_attr is increased to 32 bits. This
>>>>> enables core components to use larger LIDs if needed.
>>>>> The user ABI is unchanged and return 16 bit values when queried.
>>>>>
>>>>> Signed-off-by: Dasaratharaman Chandramouli
>>>>> <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>> Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>> Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>> ---
>>>>> drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
>>>>> include/rdma/ib_verbs.h | 2 +-
>>>>> 2 files changed, 6 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/infiniband/core/uverbs_cmd.c
>>>>> b/drivers/infiniband/core/uverbs_cmd.c
>>>>> index 7ef74b0..38dce45 100644
>>>>> --- a/drivers/infiniband/core/uverbs_cmd.c
>>>>> +++ b/drivers/infiniband/core/uverbs_cmd.c
>>>>> @@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct
>> ib_uverbs_file *file,
>>>>> resp.bad_pkey_cntr = attr.bad_pkey_cntr;
>>>>> resp.qkey_viol_cntr = attr.qkey_viol_cntr;
>>>>> resp.pkey_tbl_len = attr.pkey_tbl_len;
>>>>> - resp.sm_lid = attr.sm_lid;
>>>>> - if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
>>>>> + if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
>>>>> resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
>>>>> - else
>>>>> + resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
>>>>> + } else {
>>>>> resp.lid = (u16)attr.lid;
>>>>> + resp.sm_lid = (u16)attr.sm_lid;
>>>> I see that lid is already has casting from u32 to u16 and now it is sm_lid.
>>>> Do we have more elegant way to achieve that? And comment for future
>>>> developers can be good too.
>>> I see now, you changed lid in patch 11. The same comment there.
>> To be clear on that point: NAK on *current* implementation.
>>
>> At least, it should be done in separate function, with comment explains why
>> are you doing and why it is safe to cast from higher value to lower value and
>> proper checks that you are not loosing information when you are casting.
>>
> A comment is reasonable. However, I'm not sure a separate function is needed. With the current interface there is no way to pass all the information through.
>
> Software which requires the use of these values will need to know to get them from other sources (OPA MAD queries for the SM LID for example.)
>
> Ira
>
Hi Leon,
I had already created a function to do the cast in the 'IB/core: Change
wc.slid from 16 to 32 bits' patch but did not apply it here. Sorry about
that. My plan is to rename the functions to ib_lid_{cpu16,be16} and
introduce them in this patch, with a note in the git log that these
functions exist. I will also add a comment stating why we aren't losing
any information.
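(For illustration only, a minimal sketch of what such helpers could look like;
the names follow the plan above and the WARN-based check follows Leon's
request, so treat this as an assumption rather than the final patch:)

	/*
	 * Sketch: narrow a 32-bit LID for the 16-bit user ABI, warning
	 * if information would actually be lost in the cast.
	 */
	static inline u16 ib_lid_cpu16(u32 lid)
	{
		WARN_ON_ONCE(lid & 0xFFFF0000);
		return (u16)lid;
	}

	static inline __be16 ib_lid_be16(u32 lid)
	{
		WARN_ON_ONCE(lid & 0xFFFF0000);
		return cpu_to_be16((u16)lid);
	}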
Thank you.
don
--
* Re: [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E67CFAF96-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-08-08 16:49 ` Don Hiatt
@ 2017-08-09 6:56 ` Leon Romanovsky
1 sibling, 0 replies; 40+ messages in thread
From: Leon Romanovsky @ 2017-08-09 6:56 UTC (permalink / raw)
To: Weiny, Ira
Cc: Dalessandro, Dennis,
dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Hiatt, Don,
Dasaratharaman Chandramouli
On Tue, Aug 08, 2017 at 04:35:47PM +0000, Weiny, Ira wrote:
> >
> > On Sun, Aug 06, 2017 at 11:22:17AM +0300, Leon Romanovsky wrote:
> > > On Sun, Aug 06, 2017 at 11:18:57AM +0300, Leon Romanovsky wrote:
> > > > On Fri, Aug 04, 2017 at 01:53:21PM -0700, Dennis Dalessandro wrote:
> > > > > From: Dasaratharaman Chandramouli
> > > > > <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > >
> > > > > sm_lid field in struct ib_port_attr is increased to 32 bits. This
> > > > > enables core components to use larger LIDs if needed.
> > > > > The user ABI is unchanged and return 16 bit values when queried.
> > > > >
> > > > > Signed-off-by: Dasaratharaman Chandramouli
> > > > > <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > ---
> > > > > drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
> > > > > include/rdma/ib_verbs.h | 2 +-
> > > > > 2 files changed, 6 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/infiniband/core/uverbs_cmd.c
> > > > > b/drivers/infiniband/core/uverbs_cmd.c
> > > > > index 7ef74b0..38dce45 100644
> > > > > --- a/drivers/infiniband/core/uverbs_cmd.c
> > > > > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > > > > @@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct
> > ib_uverbs_file *file,
> > > > > resp.bad_pkey_cntr = attr.bad_pkey_cntr;
> > > > > resp.qkey_viol_cntr = attr.qkey_viol_cntr;
> > > > > resp.pkey_tbl_len = attr.pkey_tbl_len;
> > > > > - resp.sm_lid = attr.sm_lid;
> > > > > - if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
> > > > > + if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
> > > > > resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
> > > > > - else
> > > > > + resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
> > > > > + } else {
> > > > > resp.lid = (u16)attr.lid;
> > > > > + resp.sm_lid = (u16)attr.sm_lid;
> > > >
> > > > I see that lid is already has casting from u32 to u16 and now it is sm_lid.
> > > > Do we have more elegant way to achieve that? And comment for future
> > > > developers can be good too.
> > >
> > > I see now, you changed lid in patch 11. The same comment there.
> >
> > To be clear on that point: NAK on *current* implementation.
> >
> > At least, it should be done in separate function, with comment explains why
> > are you doing and why it is safe to cast from higher value to lower value and
> > proper checks that you are not loosing information when you are casting.
> >
>
> A comment is reasonable. However, I'm not sure a separate function is needed. With the current interface there is no way to pass all the information through.
>
> Software which requires the use of these values will need to know to get them from other sources (OPA MAD queries for the SM LID for example.)
The function, check, and warning are intended to catch possible buggy
flows and user errors. It is a bit optimistic to assume that all
software uses SMs.
>
> Ira
>
* Re: [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <018c6e70-78d5-51ec-993f-35be575a6da1-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-08-09 6:57 ` Leon Romanovsky
[not found] ` <20170809065730.GC1423-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
0 siblings, 1 reply; 40+ messages in thread
From: Leon Romanovsky @ 2017-08-09 6:57 UTC (permalink / raw)
To: Don Hiatt
Cc: Weiny, Ira, dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Dasaratharaman Chandramouli
On Tue, Aug 08, 2017 at 09:49:17AM -0700, Don Hiatt wrote:
>
>
> On 8/8/2017 9:35 AM, Weiny, Ira wrote:
> > > On Sun, Aug 06, 2017 at 11:22:17AM +0300, Leon Romanovsky wrote:
> > > > On Sun, Aug 06, 2017 at 11:18:57AM +0300, Leon Romanovsky wrote:
> > > > > On Fri, Aug 04, 2017 at 01:53:21PM -0700, Dennis Dalessandro wrote:
> > > > > > From: Dasaratharaman Chandramouli
> > > > > > <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > >
> > > > > > sm_lid field in struct ib_port_attr is increased to 32 bits. This
> > > > > > enables core components to use larger LIDs if needed.
> > > > > > The user ABI is unchanged and return 16 bit values when queried.
> > > > > >
> > > > > > Signed-off-by: Dasaratharaman Chandramouli
> > > > > > <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > > Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > > Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > > Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > > > > ---
> > > > > > drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
> > > > > > include/rdma/ib_verbs.h | 2 +-
> > > > > > 2 files changed, 6 insertions(+), 4 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/infiniband/core/uverbs_cmd.c
> > > > > > b/drivers/infiniband/core/uverbs_cmd.c
> > > > > > index 7ef74b0..38dce45 100644
> > > > > > --- a/drivers/infiniband/core/uverbs_cmd.c
> > > > > > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > > > > > @@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct
> > > ib_uverbs_file *file,
> > > > > > resp.bad_pkey_cntr = attr.bad_pkey_cntr;
> > > > > > resp.qkey_viol_cntr = attr.qkey_viol_cntr;
> > > > > > resp.pkey_tbl_len = attr.pkey_tbl_len;
> > > > > > - resp.sm_lid = attr.sm_lid;
> > > > > > - if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
> > > > > > + if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
> > > > > > resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
> > > > > > - else
> > > > > > + resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
> > > > > > + } else {
> > > > > > resp.lid = (u16)attr.lid;
> > > > > > + resp.sm_lid = (u16)attr.sm_lid;
> > > > > I see that lid is already has casting from u32 to u16 and now it is sm_lid.
> > > > > Do we have more elegant way to achieve that? And comment for future
> > > > > developers can be good too.
> > > > I see now, you changed lid in patch 11. The same comment there.
> > > To be clear on that point: NAK on *current* implementation.
> > >
> > > At least, it should be done in separate function, with comment explains why
> > > are you doing and why it is safe to cast from higher value to lower value and
> > > proper checks that you are not loosing information when you are casting.
> > >
> > A comment is reasonable. However, I'm not sure a separate function is needed. With the current interface there is no way to pass all the information through.
> >
> > Software which requires the use of these values will need to know to get them from other sources (OPA MAD queries for the SM LID for example.)
> >
> > Ira
> >
> Hi Leon,
>
> I had already created a function to do the cast in the 'IB/core: Change
> wc.slid from 16 to 32 bits' patch
> but I did not apply it here. Sorry about that. My plan is to rename the
> functions to ib_lid_{spu16,be16}
> and introduce them in this patch with a comment in the git log that these
> functions exist. I will also add
> a comment stating why we aren't losing any information.
These functions should be introduced before "casting patches".
Thanks
>
> Thank you.
>
> don
>
* Re: [PATCH for-next 00/27] IB/hfi1, rdmavt, core, etc: patches for next 08/04/2017
2017-08-04 20:52 [PATCH for-next 00/27] IB/hfi1, rdmavt, core, etc: patches for next 08/04/2017 Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 01/27] IB/hfi1: Revert egress pkey check enforcement Dennis Dalessandro
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
@ 2017-08-10 17:05 ` Dennis Dalessandro
[not found] ` <f73b2f88-bc1b-92ed-1632-d2f3b1583d60-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2 siblings, 1 reply; 40+ messages in thread
From: Dennis Dalessandro @ 2017-08-10 17:05 UTC (permalink / raw)
To: dledford
Cc: Mike Marciniszyn, Bartlomiej Dudek, Jakub Byczkowski, linux-rdma,
Ira Weiny, Alex Estrin, stable, Michael J. Ruhl, Don Hiatt,
Sebastian Sanchez
On 8/4/2017 4:52 PM, Dennis Dalessandro wrote:
> Dasaratharaman Chandramouli (10):
> IB/core: Convert ah_attr from OPA to IB when copying to user
> IB/srpt: Increase lid and sm_lid to 32 bits
> IB/IPoIB: Increase local_lid to 32 bits
> IB/mad: Change slid in RMPP recv from 16 to 32 bits
> IB/core: Change port_attr.lid size from 16 to 32 bits
> IB/core: Change port_attr.sm_lid from 16 to 32 bits
> IB/CM: Create appropriate path records when handling CM request
> IB/CM: Set appropriate slid and dlid when handling CM request
> IB/rdmavt,hfi1,qib: Enhance rdmavt and hfi1 to use 32 bit lids
> IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs
Leon had some comments on these patches and Don is going to submit an
updated version of these 10. These were submitted previously back in
early June and sat on the list without comment. Hearing no feedback, I
rebased them and sent them with my other patches so we could get the
driver patches out.
In hindsight we probably should have just submitted another revision of
these 10 and sent the following 11 and the other 6 as a separate series.
If that makes things easier we can still do that. Let me know the
easiest way for you to consume these.
> Don Hiatt (11):
> IB/core: Change wc.slid from 16 to 32 bits
> IB/CM: Add OPA Path record support to CM
> IB/rdmavt,hfi1,qib: Modify check_ah() to account for extended LIDs
> IB/hfi1: Add support to receive 16B bypass packets
> IB/hfi1: Add support to send 16B bypass packets
> IB/hfi1: Add support to process 16B header errors
> IB/hfi1: Determine 9B/16B L2 header type based on Address handle
> IB/hfi1: Add 16B UD support
> IB/hfi1: Add 16B trace support
> IB/hfi1: Add 16B RC/UC support
> IB/hfi1: Enhance PIO/SDMA send for 16B
These will still apply, as will the other 6 patches in this series.
-Denny
* Re: [PATCH for-next 00/27] IB/hfi1, rdmavt, core, etc: patches for next 08/04/2017
[not found] ` <f73b2f88-bc1b-92ed-1632-d2f3b1583d60-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-08-10 18:17 ` Don Hiatt
2017-08-18 19:07 ` Doug Ledford
0 siblings, 1 reply; 40+ messages in thread
From: Don Hiatt @ 2017-08-10 18:17 UTC (permalink / raw)
To: Dennis Dalessandro, dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: Mike Marciniszyn, Bartlomiej Dudek, Jakub Byczkowski,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny, Alex Estrin,
stable-u79uwXL29TY76Z2rM5mHXA, Michael J. Ruhl, Sebastian Sanchez
On 8/10/2017 10:05 AM, Dennis Dalessandro wrote:
> On 8/4/2017 4:52 PM, Dennis Dalessandro wrote:
>
>> Dasaratharaman Chandramouli (10):
>> IB/core: Convert ah_attr from OPA to IB when copying to user
>> IB/srpt: Increase lid and sm_lid to 32 bits
>> IB/IPoIB: Increase local_lid to 32 bits
>> IB/mad: Change slid in RMPP recv from 16 to 32 bits
>> IB/core: Change port_attr.lid size from 16 to 32 bits
>> IB/core: Change port_attr.sm_lid from 16 to 32 bits
>> IB/CM: Create appropriate path records when handling CM request
>> IB/CM: Set appropriate slid and dlid when handling CM request
>> IB/rdmavt,hfi1,qib: Enhance rdmavt and hfi1 to use 32 bit lids
>> IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support
>> extended LIDs
>
I submitted a v4 of just the 'extended lid' changes which is 8 patches
in total.
These patches from Denny's original patch series (from Dasa) are still
valid:
IB/CM: Set appropriate slid and dlid when handling CM request
IB/rdmavt,hfi1,qib: Enhance rdmavt and hfi1 to use 32 bit lids
IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support
extended LIDs
> Leon had some comments with these patches and Don is going to submit
> an updated version of these 10. These were submitted previously back
> in early June and sat on the list without comment. Hearing no feedback
> I rebased them and sent with my other patches so we could get the
> driver patches out.
>
> In hindsight we probably should have just submitted another revision
> to these 10 and sent the following 11 and the other 6 as a separate
> series. If that makes things easier we can still do that. Let me know
> the easiest way for you to consume these.
>
>> Don Hiatt (11):
>> IB/core: Change wc.slid from 16 to 32 bits
>> IB/CM: Add OPA Path record support to CM
>> IB/rdmavt,hfi1,qib: Modify check_ah() to account for extended
>> LIDs
>> IB/hfi1: Add support to receive 16B bypass packets
>> IB/hfi1: Add support to send 16B bypass packets
>> IB/hfi1: Add support to process 16B header errors
>> IB/hfi1: Determine 9B/16B L2 header type based on Address handle
>> IB/hfi1: Add 16B UD support
>> IB/hfi1: Add 16B trace support
>> IB/hfi1: Add 16B RC/UC support
>> IB/hfi1: Enhance PIO/SDMA send for 16B
>
> These will still apply, as will the other 6 patches in this series.
>
> -Denny
--
* Re: [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid from 16 to 32 bits
[not found] ` <20170809065730.GC1423-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
@ 2017-08-10 18:18 ` Don Hiatt
0 siblings, 0 replies; 40+ messages in thread
From: Don Hiatt @ 2017-08-10 18:18 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Weiny, Ira, dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Dasaratharaman Chandramouli
On 8/8/2017 11:57 PM, Leon Romanovsky wrote:
> On Tue, Aug 08, 2017 at 09:49:17AM -0700, Don Hiatt wrote:
>>
>> On 8/8/2017 9:35 AM, Weiny, Ira wrote:
>>>> On Sun, Aug 06, 2017 at 11:22:17AM +0300, Leon Romanovsky wrote:
>>>>> On Sun, Aug 06, 2017 at 11:18:57AM +0300, Leon Romanovsky wrote:
>>>>>> On Fri, Aug 04, 2017 at 01:53:21PM -0700, Dennis Dalessandro wrote:
>>>>>>> From: Dasaratharaman Chandramouli
>>>>>>> <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>>>>
>>>>>>> sm_lid field in struct ib_port_attr is increased to 32 bits. This
>>>>>>> enables core components to use larger LIDs if needed.
>>>>>>> The user ABI is unchanged and return 16 bit values when queried.
>>>>>>>
>>>>>>> Signed-off-by: Dasaratharaman Chandramouli
>>>>>>> <dasaratharaman.chandramouli-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>>>> Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>>>> Signed-off-by: Don Hiatt <don.hiatt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>>>> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>>>>>>> ---
>>>>>>> drivers/infiniband/core/uverbs_cmd.c | 8 +++++---
>>>>>>> include/rdma/ib_verbs.h | 2 +-
>>>>>>> 2 files changed, 6 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/infiniband/core/uverbs_cmd.c
>>>>>>> b/drivers/infiniband/core/uverbs_cmd.c
>>>>>>> index 7ef74b0..38dce45 100644
>>>>>>> --- a/drivers/infiniband/core/uverbs_cmd.c
>>>>>>> +++ b/drivers/infiniband/core/uverbs_cmd.c
>>>>>>> @@ -275,11 +275,13 @@ ssize_t ib_uverbs_query_port(struct
>>>> ib_uverbs_file *file,
>>>>>>> resp.bad_pkey_cntr = attr.bad_pkey_cntr;
>>>>>>> resp.qkey_viol_cntr = attr.qkey_viol_cntr;
>>>>>>> resp.pkey_tbl_len = attr.pkey_tbl_len;
>>>>>>> - resp.sm_lid = attr.sm_lid;
>>>>>>> - if (rdma_cap_opa_ah(ib_dev, cmd.port_num))
>>>>>>> + if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
>>>>>>> resp.lid = OPA_TO_IB_UCAST_LID(attr.lid);
>>>>>>> - else
>>>>>>> + resp.sm_lid = OPA_TO_IB_UCAST_LID(attr.sm_lid);
>>>>>>> + } else {
>>>>>>> resp.lid = (u16)attr.lid;
>>>>>>> + resp.sm_lid = (u16)attr.sm_lid;
>>>>>> I see that lid is already has casting from u32 to u16 and now it is sm_lid.
>>>>>> Do we have more elegant way to achieve that? And comment for future
>>>>>> developers can be good too.
>>>>> I see now, you changed lid in patch 11. The same comment there.
>>>> To be clear on that point: NAK on *current* implementation.
>>>>
>>>> At least, it should be done in separate function, with comment explains why
>>>> are you doing and why it is safe to cast from higher value to lower value and
>>>> proper checks that you are not loosing information when you are casting.
>>>>
>>> A comment is reasonable. However, I'm not sure a separate function is needed. With the current interface there is no way to pass all the information through.
>>>
>>> Software which requires the use of these values will need to know to get them from other sources (OPA MAD queries for the SM LID for example.)
>>>
>>> Ira
>>>
>> Hi Leon,
>>
>> I had already created a function to do the cast in the 'IB/core: Change
>> wc.slid from 16 to 32 bits' patch
>> but I did not apply it here. Sorry about that. My plan is to rename the
>> functions to ib_lid_{spu16,be16}
>> and introduce them in this patch with a comment in the git log that these
>> functions exist. I will also add
>> a comment stating why we aren't losing any information.
> These functions should be introduced before "casting patches".
>
> Thanks
Hi Leon,
I re-submitted these patches as v4.
Thank you.
don
--
* Re: [PATCH for-next 00/27] IB/hfi1, rdmavt, core, etc: patches for next 08/04/2017
2017-08-10 18:17 ` Don Hiatt
@ 2017-08-18 19:07 ` Doug Ledford
2017-08-22 18:23 ` Doug Ledford
0 siblings, 1 reply; 40+ messages in thread
From: Doug Ledford @ 2017-08-18 19:07 UTC (permalink / raw)
To: Don Hiatt, Dennis Dalessandro
Cc: Mike Marciniszyn, Bartlomiej Dudek, Jakub Byczkowski, linux-rdma,
Ira Weiny, Alex Estrin, stable, Michael J. Ruhl,
Sebastian Sanchez
On Thu, 2017-08-10 at 11:17 -0700, Don Hiatt wrote:
>
> On 8/10/2017 10:05 AM, Dennis Dalessandro wrote:
> > On 8/4/2017 4:52 PM, Dennis Dalessandro wrote:
> >
> > > Dasaratharaman Chandramouli (10):
> > >   IB/core: Convert ah_attr from OPA to IB when copying to user
> > >   IB/srpt: Increase lid and sm_lid to 32 bits
> > >   IB/IPoIB: Increase local_lid to 32 bits
> > >   IB/mad: Change slid in RMPP recv from 16 to 32 bits
> > >   IB/core: Change port_attr.lid size from 16 to 32 bits
> > >   IB/core: Change port_attr.sm_lid from 16 to 32 bits
> > >   IB/CM: Create appropriate path records when handling CM request
> > >   IB/CM: Set appropriate slid and dlid when handling CM request
> > >   IB/rdmavt,hfi1,qib: Enhance rdmavt and hfi1 to use 32 bit lids
> > >   IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs
>
> I submitted a v4 of just the 'extended lid' changes, which is 8 patches
> in total. These patches from Denny's original patch series (from Dasa)
> are still valid:
>
>   IB/CM: Set appropriate slid and dlid when handling CM request
>   IB/rdmavt,hfi1,qib: Enhance rdmavt and hfi1 to use 32 bit lids
>   IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs
OK, I *think* I've got things figured out, but I need to request that
you guys submit things differently in the future.
In general, if you submit a series of 7 patches that does some "thing",
then whether or not a resubmit adds or reduces the patch count, it
should still be just that "thing". And once you do a submission for a
"thing" as a series, that series should never get subsumed into another
"thing". In order for me to keep v1 versus v2 of any given "thing"
straight and not miss stuff, or waste a ton of time trying to apply
things that are already applied, a "thing" must remain the same "thing"
unless it is clearly withdrawn entirely and submitted as part of a new
"thing".
That said, I have the two series that updated the lid size and added
the support to the CM. I think that means that patches 7 through 16
are now all applied, but I still need to process the others (and
digging that "thing" out of the middle is even harder than if it were
just at the beginning or the end). Let me know if that is not correct.
>
> > Leon had some comments on these patches and Don is going to submit an
> > updated version of these 10. They were submitted previously back in
> > early June and sat on the list without comment. Hearing no feedback,
> > I rebased them and sent them with my other patches so we could get
> > the driver patches out.
> >
> > In hindsight we probably should have just submitted another revision
> > of these 10 and sent the following 11 and the other 6 as a separate
> > series. If that makes things easier we can still do that. Let me know
> > the easiest way for you to consume these.
> >
> > > Don Hiatt (11):
> > >   IB/core: Change wc.slid from 16 to 32 bits
> > >   IB/CM: Add OPA Path record support to CM
> > >   IB/rdmavt,hfi1,qib: Modify check_ah() to account for extended LIDs
> > >   IB/hfi1: Add support to receive 16B bypass packets
> > >   IB/hfi1: Add support to send 16B bypass packets
> > >   IB/hfi1: Add support to process 16B header errors
> > >   IB/hfi1: Determine 9B/16B L2 header type based on Address handle
> > >   IB/hfi1: Add 16B UD support
> > >   IB/hfi1: Add 16B trace support
> > >   IB/hfi1: Add 16B RC/UC support
> > >   IB/hfi1: Enhance PIO/SDMA send for 16B
> >
> > These will still apply, as will the other 6 patches in this series.
> >
> > -Denny
>
>
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
* Re: [PATCH for-next 00/27] IB/hfi1, rdmavt, core, etc: patches for next 08/04/2017
2017-08-18 19:07 ` Doug Ledford
@ 2017-08-22 18:23 ` Doug Ledford
0 siblings, 0 replies; 40+ messages in thread
From: Doug Ledford @ 2017-08-22 18:23 UTC (permalink / raw)
To: Don Hiatt, Dennis Dalessandro
Cc: Mike Marciniszyn, Bartlomiej Dudek, Jakub Byczkowski, linux-rdma,
Ira Weiny, Alex Estrin, stable, Michael J. Ruhl,
Sebastian Sanchez
On Fri, 2017-08-18 at 15:07 -0400, Doug Ledford wrote:
> That said, I have the two series that updated the lid size and added
> the support to the CM. I think that means that patches 7 through 16
> are now all applied, but I still need to process the others (and
> digging that "thing" out of the middle is even harder than if it were
> just at the beginning or the end). Let me know if that is not
> correct.
Patches 1-6 and 17-27 processed and applied. Thanks.
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
Thread overview: 40+ messages
2017-08-04 20:52 [PATCH for-next 00/27] IB/hfi1, rdmavt, core, etc: patches for next 08/04/2017 Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 01/27] IB/hfi1: Revert egress pkey check enforcement Dennis Dalessandro
[not found] ` <20170804204842.17853.14858.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-08-04 20:52 ` [PATCH for-next 02/27] IB/hfi1: Remove pmtu from the QP structure Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 03/27] IB/hfi1: Remove lstate from hfi1_pportdata Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 04/27] IB/hfi1: Use host_link_state to read state when DC is shut down Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 05/27] IB/hfi1: Protect context array set/clear with spinlock Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 06/27] IB/hf1: User context locking is inconsistent Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 07/27] IB/core: Convert ah_attr from OPA to IB when copying to user Dennis Dalessandro
2017-08-04 20:52 ` [PATCH for-next 08/27] IB/srpt: Increase lid and sm_lid to 32 bits Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 09/27] IB/IPoIB: Increase local_lid " Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 10/27] IB/mad: Change slid in RMPP recv from 16 " Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 11/27] IB/core: Change port_attr.lid size " Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 12/27] IB/core: Change port_attr.sm_lid " Dennis Dalessandro
[not found] ` <20170804205320.17853.77236.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2017-08-06 8:18 ` Leon Romanovsky
[not found] ` <20170806081857.GC3636-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-08-06 8:22 ` Leon Romanovsky
[not found] ` <20170806082217.GE3636-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-08-08 14:41 ` Leon Romanovsky
[not found] ` <20170808144146.GF28851-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-08-08 16:35 ` Weiny, Ira
[not found] ` <2807E5FD2F6FDA4886F6618EAC48510E67CFAF96-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-08-08 16:49 ` Don Hiatt
[not found] ` <018c6e70-78d5-51ec-993f-35be575a6da1-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-08-09 6:57 ` Leon Romanovsky
[not found] ` <20170809065730.GC1423-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-08-10 18:18 ` Don Hiatt
2017-08-09 6:56 ` Leon Romanovsky
2017-08-04 20:53 ` [PATCH for-next 13/27] IB/core: Change wc.slid " Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 14/27] IB/CM: Add OPA Path record support to CM Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 15/27] IB/CM: Create appropriate path records when handling CM request Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 16/27] IB/CM: Set appropriate slid and dlid " Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 17/27] IB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs Dennis Dalessandro
2017-08-04 20:53 ` [PATCH for-next 18/27] IB/hfi1: Add support to receive 16B bypass packets Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 19/27] IB/hfi1: Add support to send " Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 20/27] IB/hfi1: Add support to process 16B header errors Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 21/27] IB/hfi1: Determine 9B/16B L2 header type based on Address handle Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 22/27] IB/hfi1: Add 16B UD support Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 23/27] IB/hfi1: Add 16B trace support Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 24/27] IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 25/27] IB/hfi1: Add 16B RC/UC support Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 26/27] IB/hfi1: Enhance PIO/SDMA send for 16B Dennis Dalessandro
2017-08-04 20:54 ` [PATCH for-next 27/27] IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs Dennis Dalessandro
2017-08-10 17:05 ` [PATCH for-next 00/27] IB/hfi1, rdmavt, core, etc: patches for next 08/04/2017 Dennis Dalessandro
[not found] ` <f73b2f88-bc1b-92ed-1632-d2f3b1583d60-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-08-10 18:17 ` Don Hiatt
2017-08-18 19:07 ` Doug Ledford
2017-08-22 18:23 ` Doug Ledford