* [PATCH 5/6] i40iw: Assign MSS only when it is a new MTU
From: Henry Orosco @ 2016-12-06 21:49 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
e1000-rdma-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Mustafa Ismail,
Henry Orosco
In-Reply-To: <20161206214935.41584-1-henry.orosco-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
From: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Currently we are changing the MSS regardless of whether
there is a change or not in MTU. Fix to make the
assignment of MSS dependent on an MTU change.
Signed-off-by: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Henry Orosco <henry.orosco-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/i40iw/i40iw_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_main.c b/drivers/infiniband/hw/i40iw/i40iw_main.c
index 85d8fa6..cf9d288 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_main.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_main.c
@@ -1724,6 +1724,8 @@ static void i40iw_l2param_change(struct i40e_info *ldev, struct i40e_client *cli
for (i = 0; i < I40E_CLIENT_MAX_USER_PRIORITY; i++)
l2params->qs_handle_list[i] = params->qos.prio_qos[i].qs_handle;
+ l2params->mss = (params->mtu) ? params->mtu - I40IW_MTU_TO_MSS : iwdev->mss;
+
INIT_WORK(&work->work, i40iw_l2params_worker);
queue_work(iwdev->param_wq, &work->work);
}
--
2.8.3
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 4/6] i40iw: Fix race condition in terminate timer's handler
From: Henry Orosco @ 2016-12-06 21:49 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
e1000-rdma-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Shiraz Saleem
In-Reply-To: <20161206214935.41584-1-henry.orosco-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
From: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Add a QP reference when terminate timer is started to ensure
the destroy QP doesn't race ahead to free the QP while it is being
referenced in the terminate timer's handler.
Signed-off-by: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/i40iw/i40iw_cm.c | 2 +-
drivers/infiniband/hw/i40iw/i40iw_utils.c | 5 ++++-
drivers/infiniband/hw/i40iw/i40iw_verbs.c | 2 +-
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.c b/drivers/infiniband/hw/i40iw/i40iw_cm.c
index ff95fea..a217d2f 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_cm.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_cm.c
@@ -3471,7 +3471,7 @@ static void i40iw_cm_disconn_true(struct i40iw_qp *iwqp)
*terminate-handler to issue cm_disconn which can re-free
*a QP even after its refcnt=0.
*/
- del_timer(&iwqp->terminate_timer);
+ i40iw_terminate_del_timer(qp);
if (!iwqp->flush_issued) {
iwqp->flush_issued = 1;
issue_flush = 1;
diff --git a/drivers/infiniband/hw/i40iw/i40iw_utils.c b/drivers/infiniband/hw/i40iw/i40iw_utils.c
index 4a08ffb..7d4af77 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_utils.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_utils.c
@@ -823,6 +823,7 @@ static void i40iw_terminate_timeout(unsigned long context)
struct i40iw_sc_qp *qp = (struct i40iw_sc_qp *)&iwqp->sc_qp;
i40iw_terminate_done(qp, 1);
+ i40iw_rem_ref(&iwqp->ibqp);
}
/**
@@ -834,6 +835,7 @@ void i40iw_terminate_start_timer(struct i40iw_sc_qp *qp)
struct i40iw_qp *iwqp;
iwqp = (struct i40iw_qp *)qp->back_qp;
+ i40iw_add_ref(&iwqp->ibqp);
init_timer(&iwqp->terminate_timer);
iwqp->terminate_timer.function = i40iw_terminate_timeout;
iwqp->terminate_timer.expires = jiffies + HZ;
@@ -850,7 +852,8 @@ void i40iw_terminate_del_timer(struct i40iw_sc_qp *qp)
struct i40iw_qp *iwqp;
iwqp = (struct i40iw_qp *)qp->back_qp;
- del_timer(&iwqp->terminate_timer);
+ if (del_timer(&iwqp->terminate_timer))
+ i40iw_rem_ref(&iwqp->ibqp);
}
/**
diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
index 206d72b..18526e6 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_verbs.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
@@ -959,7 +959,7 @@ int i40iw_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
goto exit;
}
if (iwqp->sc_qp.term_flags)
- del_timer(&iwqp->terminate_timer);
+ i40iw_terminate_del_timer(&iwqp->sc_qp);
info.next_iwarp_state = I40IW_QP_STATE_ERROR;
if ((iwqp->hw_tcp_state > I40IW_TCP_STATE_CLOSED) &&
iwdev->iw_status &&
--
2.8.3
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 3/6] i40iw: Fix memory leak in CQP destroy when in reset
From: Henry Orosco @ 2016-12-06 21:49 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
e1000-rdma-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Mustafa Ismail
In-Reply-To: <20161206214935.41584-1-henry.orosco-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
From: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
On a device close, the control QP (CQP) is destroyed by calling
cqp_destroy which destroys the CQP and frees its SD buffer memory.
However, if the reset flag is true, cqp_destroy is never called and
leads to a memory leak on SD buffer memory. Fix this by always calling
cqp_destroy, on device close, regardless of reset. The exception to this
when CQP create fails. In this case, the SD buffer memory is already
freed on an error check and there is no need to call cqp_destroy.
Signed-off-by: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/i40iw/i40iw_main.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_main.c b/drivers/infiniband/hw/i40iw/i40iw_main.c
index 4ce05b8..85d8fa6 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_main.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_main.c
@@ -237,14 +237,11 @@ static irqreturn_t i40iw_irq_handler(int irq, void *data)
*/
static void i40iw_destroy_cqp(struct i40iw_device *iwdev, bool free_hwcqp)
{
- enum i40iw_status_code status = 0;
struct i40iw_sc_dev *dev = &iwdev->sc_dev;
struct i40iw_cqp *cqp = &iwdev->cqp;
- if (free_hwcqp && dev->cqp_ops->cqp_destroy)
- status = dev->cqp_ops->cqp_destroy(dev->cqp);
- if (status)
- i40iw_pr_err("destroy cqp failed");
+ if (free_hwcqp)
+ dev->cqp_ops->cqp_destroy(dev->cqp);
i40iw_free_dma_mem(dev->hw, &cqp->sq);
kfree(cqp->scratch_array);
@@ -1475,7 +1472,7 @@ static void i40iw_deinit_device(struct i40iw_device *iwdev, bool reset)
i40iw_del_hmc_objects(dev, dev->hmc_info, true, reset);
/* fallthrough */
case CQP_CREATED:
- i40iw_destroy_cqp(iwdev, !reset);
+ i40iw_destroy_cqp(iwdev, true);
/* fallthrough */
case INITIAL_STATE:
i40iw_cleanup_cm_core(&iwdev->cm_core);
--
2.8.3
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 2/6] i40iw: Fix QP flush to not hang on empty queues or failure
From: Henry Orosco @ 2016-12-06 21:49 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
e1000-rdma-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Shiraz Saleem
In-Reply-To: <20161206214935.41584-1-henry.orosco-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
From: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
When flush QP and there are no pending work requests, signal completion
to unblock i40iw_drain_sq and i40iw_drain_rq which are waiting on
completion for iwqp->sq_drained and iwqp->sq_drained respectively.
Also, signal completion if flush QP fails to prevent the drain SQ or RQ
from being blocked indefintely.
Signed-off-by: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/i40iw/i40iw.h | 9 ++++++---
drivers/infiniband/hw/i40iw/i40iw_hw.c | 26 ++++++++++++++++++++++++--
2 files changed, 30 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/i40iw/i40iw.h b/drivers/infiniband/hw/i40iw/i40iw.h
index 51b8280..2aab85b 100644
--- a/drivers/infiniband/hw/i40iw/i40iw.h
+++ b/drivers/infiniband/hw/i40iw/i40iw.h
@@ -112,9 +112,12 @@
#define I40IW_DRV_OPT_MCAST_LOGPORT_MAP 0x00000800
#define IW_HMC_OBJ_TYPE_NUM ARRAY_SIZE(iw_hmc_obj_types)
-#define IW_CFG_FPM_QP_COUNT 32768
-#define I40IW_MAX_PAGES_PER_FMR 512
-#define I40IW_MIN_PAGES_PER_FMR 1
+#define IW_CFG_FPM_QP_COUNT 32768
+#define I40IW_MAX_PAGES_PER_FMR 512
+#define I40IW_MIN_PAGES_PER_FMR 1
+#define I40IW_CQP_COMPL_RQ_WQE_FLUSHED 2
+#define I40IW_CQP_COMPL_SQ_WQE_FLUSHED 3
+#define I40IW_CQP_COMPL_RQ_SQ_WQE_FLUSHED 4
#define I40IW_MTU_TO_MSS 40
#define I40IW_DEFAULT_MSS 1460
diff --git a/drivers/infiniband/hw/i40iw/i40iw_hw.c b/drivers/infiniband/hw/i40iw/i40iw_hw.c
index b2854b1..4394a67 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_hw.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_hw.c
@@ -622,6 +622,7 @@ enum i40iw_status_code i40iw_hw_flush_wqes(struct i40iw_device *iwdev,
struct i40iw_qp_flush_info *hw_info;
struct i40iw_cqp_request *cqp_request;
struct cqp_commands_info *cqp_info;
+ struct i40iw_qp *iwqp = (struct i40iw_qp *)qp->back_qp;
cqp_request = i40iw_get_cqp_request(&iwdev->cqp, wait);
if (!cqp_request)
@@ -636,9 +637,30 @@ enum i40iw_status_code i40iw_hw_flush_wqes(struct i40iw_device *iwdev,
cqp_info->in.u.qp_flush_wqes.qp = qp;
cqp_info->in.u.qp_flush_wqes.scratch = (uintptr_t)cqp_request;
status = i40iw_handle_cqp_op(iwdev, cqp_request);
- if (status)
+ if (status) {
i40iw_pr_err("CQP-OP Flush WQE's fail");
- return status;
+ complete(&iwqp->sq_drained);
+ complete(&iwqp->rq_drained);
+ return status;
+ }
+ if (!cqp_request->compl_info.maj_err_code) {
+ switch (cqp_request->compl_info.min_err_code) {
+ case I40IW_CQP_COMPL_RQ_WQE_FLUSHED:
+ complete(&iwqp->sq_drained);
+ break;
+ case I40IW_CQP_COMPL_SQ_WQE_FLUSHED:
+ complete(&iwqp->rq_drained);
+ break;
+ case I40IW_CQP_COMPL_RQ_SQ_WQE_FLUSHED:
+ break;
+ default:
+ complete(&iwqp->sq_drained);
+ complete(&iwqp->rq_drained);
+ break;
+ }
+ }
+
+ return 0;
}
/**
--
2.8.3
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 1/6] i40iw: Fix double free of QP
From: Henry Orosco @ 2016-12-06 21:49 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
e1000-rdma-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Mustafa Ismail,
Henry Orosco
In-Reply-To: <20161206214935.41584-1-henry.orosco-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
From: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
A QP can be double freed if i40iw_cm_disconn() is
called while it is currently being freed by
i40iw_rem_ref(). The fix in i40iw_cm_disconn() will
first check if the QP is already freed before
making another request for the QP to be freed.
Signed-off-by: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Henry Orosco <henry.orosco-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/i40iw/i40iw.h | 2 +-
drivers/infiniband/hw/i40iw/i40iw_cm.c | 20 ++++++++++++++++----
drivers/infiniband/hw/i40iw/i40iw_hw.c | 4 +++-
3 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/i40iw/i40iw.h b/drivers/infiniband/hw/i40iw/i40iw.h
index ef188e6..51b8280 100644
--- a/drivers/infiniband/hw/i40iw/i40iw.h
+++ b/drivers/infiniband/hw/i40iw/i40iw.h
@@ -512,7 +512,7 @@ u32 i40iw_initialize_hw_resources(struct i40iw_device *iwdev);
int i40iw_register_rdma_device(struct i40iw_device *iwdev);
void i40iw_port_ibevent(struct i40iw_device *iwdev);
-int i40iw_cm_disconn(struct i40iw_qp *);
+void i40iw_cm_disconn(struct i40iw_qp *iwqp);
void i40iw_cm_disconn_worker(void *);
int mini_cm_recv_pkt(struct i40iw_cm_core *, struct i40iw_device *,
struct sk_buff *);
diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.c b/drivers/infiniband/hw/i40iw/i40iw_cm.c
index 25af89a..ff95fea 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_cm.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_cm.c
@@ -3359,21 +3359,33 @@ static void i40iw_cm_init_tsa_conn(struct i40iw_qp *iwqp,
* i40iw_cm_disconn - when a connection is being closed
* @iwqp: associate qp for the connection
*/
-int i40iw_cm_disconn(struct i40iw_qp *iwqp)
+void i40iw_cm_disconn(struct i40iw_qp *iwqp)
{
struct disconn_work *work;
struct i40iw_device *iwdev = iwqp->iwdev;
struct i40iw_cm_core *cm_core = &iwdev->cm_core;
+ unsigned long flags;
work = kzalloc(sizeof(*work), GFP_ATOMIC);
if (!work)
- return -ENOMEM; /* Timer will clean up */
-
+ return; /* Timer will clean up */
+
+ spin_lock_irqsave(&iwdev->qptable_lock, flags);
+ if (!iwdev->qp_table[iwqp->ibqp.qp_num]) {
+ spin_unlock_irqrestore(&iwdev->qptable_lock, flags);
+ i40iw_debug(&iwdev->sc_dev, I40IW_DEBUG_CM,
+ "%s qp_id %d is already freed\n",
+ __func__, iwqp->ibqp.qp_num);
+ kfree(work);
+ return;
+ }
i40iw_add_ref(&iwqp->ibqp);
+ spin_unlock_irqrestore(&iwdev->qptable_lock, flags);
+
work->iwqp = iwqp;
INIT_WORK(&work->work, i40iw_disconnect_worker);
queue_work(cm_core->disconn_wq, &work->work);
- return 0;
+ return;
}
/**
diff --git a/drivers/infiniband/hw/i40iw/i40iw_hw.c b/drivers/infiniband/hw/i40iw/i40iw_hw.c
index 5e2c16c..b2854b1 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_hw.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_hw.c
@@ -308,7 +308,9 @@ void i40iw_process_aeq(struct i40iw_device *iwdev)
iwqp = iwdev->qp_table[info->qp_cq_id];
if (!iwqp) {
spin_unlock_irqrestore(&iwdev->qptable_lock, flags);
- i40iw_pr_err("qp_id %d is already freed\n", info->qp_cq_id);
+ i40iw_debug(dev, I40IW_DEBUG_AEQ,
+ "%s qp_id %d is already freed\n",
+ __func__, info->qp_cq_id);
continue;
}
i40iw_add_ref(&iwqp->ibqp);
--
2.8.3
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 0/6] i40iw: Fixes for i40iw
From: Henry Orosco @ 2016-12-06 21:49 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
e1000-rdma-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Henry Orosco
The following patch series has fixes for the i40iw driver.
Mustafa Ismail (4):
i40iw: Fix double free of QP
i40iw: Fix memory leak in CQP destroy when in reset
i40iw: Assign MSS only when it is a new MTU
i40iw: Fix incorrect check for error
Shiraz Saleem (2):
i40iw: Fix QP flush to not hang on empty queues or failure
i40iw: Fix race condition in terminate timer's handler
drivers/infiniband/hw/i40iw/i40iw.h | 11 +++++++----
drivers/infiniband/hw/i40iw/i40iw_cm.c | 22 +++++++++++++++++-----
drivers/infiniband/hw/i40iw/i40iw_hw.c | 30 +++++++++++++++++++++++++++---
drivers/infiniband/hw/i40iw/i40iw_main.c | 11 +++++------
drivers/infiniband/hw/i40iw/i40iw_puda.c | 2 +-
drivers/infiniband/hw/i40iw/i40iw_utils.c | 5 ++++-
drivers/infiniband/hw/i40iw/i40iw_verbs.c | 2 +-
7 files changed, 62 insertions(+), 21 deletions(-)
--
2.8.3
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: Enabling peer to peer device transactions for PCIe devices
From: Logan Gunthorpe @ 2016-12-06 21:47 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Stephen Bates, Dan Williams, Haggai Eran,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-nvdimm-y27Ovi1pjclAfugRpC6u6w@public.gmane.org,
christian.koenig-5C7GfCeVMHo@public.gmane.org,
Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org,
John.Bridgman-5C7GfCeVMHo@public.gmane.org,
Alexander.Deucher-5C7GfCeVMHo@public.gmane.org,
Linux-media-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
Max Gurtovoy, linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
"serguei.sagalovitch-5C7GfCeVMHo@public.gmane.org" <serguei.sagalov>
In-Reply-To: <20161206172838.GB19318-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Hey,
> Okay, so clearly this needs a kernel side NVMe specific allocator
> and locking so users don't step on each other..
Yup, ideally. That's why device dax isn't ideal for this application: it
doesn't provide any way to prevent users from stepping on each other.
> Or as Christoph says some kind of general mechanism to get these
> bounce buffers..
Yeah, I imagine a general allocate from BAR/region system would be very
useful.
> Ah, I see.
>
> As a first draft I'd stick with some kind of API built into the
> /dev/nvmeX that backs the filesystem. The user app would fstat the
> target file, open /dev/block/MAJOR(st_dev):MINOR(st_dev), do some
> ioctl to get a CMB mmap, and then proceed from there..
>
> When that is all working kernel-side, it would make sense to look at a
> more general mechanism that could be used unprivileged??
That makes a lot of sense to me. I suggested mmapping the char device
because it's really easy, but I can see that an ioctl on the block
device does seem more general and device agnostic.
> This is similar to the GPU issues too.. On NVMe you don't need to pin
> the pages, you just need to lock that VMA so it doesn't get freed from
> the NVMe CMB allocator while the IO is running...
> Probably in the long run the get_user_pages is going to have to be
> pushed down into drivers.. Future MMU coherent IO hardware also does
> not need the pinning or other overheads.
Yup. Yup.
Logan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
From: Jason Gunthorpe @ 2016-12-06 21:39 UTC (permalink / raw)
To: Or Gerlitz
Cc: Doug Ledford, Hefty, Sean, Steve Wise,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Liran Liss,
Matan Barak, Leon Romanovsky
In-Reply-To: <CAJ3xEMhYs0jtXYDqAXEw17Fk4padG903J3enL+uNPg_fNk-9Uw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Tue, Dec 06, 2016 at 11:26:51PM +0200, Or Gerlitz wrote:
> Do you want the change along what was proposed by Liran (allow
> user-space to query the supported protocols per port) or you are
> okay
Why doesn't mellanox come up with a plan to actually make all of this
work in some sensible way? So far only Mellanox needs this.
I've already proposed disallowing multiprotocol struct ib_devices.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH rdma-next 01/10] IB/core: Add raw packet protocol
From: Or Gerlitz @ 2016-12-06 21:26 UTC (permalink / raw)
To: Doug Ledford
Cc: Jason Gunthorpe, Hefty, Sean, Steve Wise,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Liran Liss,
Matan Barak, Leon Romanovsky
In-Reply-To: <HE1PR0501MB2812A66403A8C19AF83742EDB1830-692Kmc8YnlIVrnpjwTCbp8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
On Mon, Dec 5, 2016 at 10:14 PM, Liran Liss <liranl-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
>>> Anyway, returning to the initial matter at hand: I would like to start
>>> with each port reporting a bit mask of the supported protocols on that
>>> link (RoCE v1/v2, Raw Ethernet, iWARP, etc.) It will be used for
>>> reporting device capabilities in general for tools, as well as by
>>> applications that don't use rdmacm.
>> Why don't we start by defining how it is supposed to even work and how to fix
>> the RDMA CM before adding even more random capability bits?
> Extensions to RDMA CM is an important topic, which we can continue to discuss as devices that require it are introduced.
> Protocol capabilities are needed for information purposes as well as for applications that do not use RDMA CM.
> They are not "random", but depict what a device can do.
Doug,
Can we get a maintainer say here? I didn't see complaints on the mlx5
IB driver patch that opens IB device also in the lack of RoCE support.
We have a valid real life use-case which needs this driver patch.
Per your request, I made also some core patch to expose QPTs to
user-space and we went into this long discussion.
Do you want the change along what was proposed by Liran (allow
user-space to query the supported protocols per port) or you are okay
with taking only the mlx5 patches for now and discuss the rest later
or something else?
Please let us know
Or.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: buildlib/cbuild missing yaml on SLE12SP2
From: Jason Gunthorpe @ 2016-12-06 20:15 UTC (permalink / raw)
To: Steve Wise; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <02cc01d24ffb$79ad93d0$6d08bb70$@opengridcomputing.com>
On Tue, Dec 06, 2016 at 02:01:04PM -0600, Steve Wise wrote:
> Hey Jason,
>
> I'm trying to run buildlib/cbuild build-images travis on rdma-core, and I'm
> missing yaml. But I think I have it installed. At least I have perl_YAML
> installed. This is with SLE12 SP2 RC1.
You need python-yaml
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* buildlib/cbuild missing yaml on SLE12SP2
From: Steve Wise @ 2016-12-06 20:01 UTC (permalink / raw)
To: 'Jason Gunthorpe'; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hey Jason,
I'm trying to run buildlib/cbuild build-images travis on rdma-core, and I'm
missing yaml. But I think I have it installed. At least I have perl_YAML
installed. This is with SLE12 SP2 RC1.
stevo4-asicdesigners-com:/usr/local/src/rdma-core # zypper search yaml
Loading repository data...
Reading installed packages...
S | Name | Summary | Type
--+----------------------+--------------------------------------------+---------
--
| libyaml | A YAML 1.1 parser and emitter written in C |
srcpackage
i | libyaml-0-2 | Shared library from libyaml | package
i | libyaml-devel | Development files for libyaml | package
| perl-Bootloader-YAML | YAML interface for perl-Bootloader | package
i | perl-YAML | YAML Ain't Markup Language (tm) | package
| perl-YAML | YAML Ain't Markup Language (tm) |
srcpackage
i | perl-YAML-LibYAML | YAML::LibYAML Perl module | package
| perl-YAML-LibYAML | YAML::LibYAML Perl module |
srcpackage
| perl-YAML-Syck | Fast, lightweight YAML loader and dumper | package
| perl-YAML-Syck | Fast, lightweight YAML loader and dumper |
srcpackage
stevo4-asicdesigners-com:/usr/local/src/rdma-core # rpm -ql perl-YAML
/usr/lib/perl5/vendor_perl/5.18.2/Test
/usr/lib/perl5/vendor_perl/5.18.2/Test/YAML.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML
/usr/lib/perl5/vendor_perl/5.18.2/YAML.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Any.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Dumper
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Dumper.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Dumper/Base.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Error.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Loader
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Loader.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Loader/Base.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Marshall.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Mo.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Node.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Tag.pm
/usr/lib/perl5/vendor_perl/5.18.2/YAML/Types.pm
/usr/lib/perl5/vendor_perl/5.18.2/x86_64-linux-thread-multi
/usr/share/doc/packages/perl-YAML
/usr/share/doc/packages/perl-YAML/Changes
/usr/share/doc/packages/perl-YAML/LICENSE
/usr/share/doc/packages/perl-YAML/README
/usr/share/man/man3/Test::YAML.3pm.gz
/usr/share/man/man3/YAML.3pm.gz
/usr/share/man/man3/YAML::Any.3pm.gz
/usr/share/man/man3/YAML::Dumper.3pm.gz
/usr/share/man/man3/YAML::Dumper::Base.3pm.gz
/usr/share/man/man3/YAML::Error.3pm.gz
/usr/share/man/man3/YAML::Loader.3pm.gz
/usr/share/man/man3/YAML::Loader::Base.3pm.gz
/usr/share/man/man3/YAML::Marshall.3pm.gz
/usr/share/man/man3/YAML::Node.3pm.gz
/usr/share/man/man3/YAML::Tag.3pm.gz
/usr/share/man/man3/YAML::Types.3pm.gz
Any ideas?
Thanks,
Steve.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH] libcxgb4: set *bad_wr in post_send/recv error paths
From: Steve Wise @ 2016-12-06 19:49 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA, leon-DgEjT+Ai2ygdnm+yROfE0A
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
providers/cxgb4/qp.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/providers/cxgb4/qp.c b/providers/cxgb4/qp.c
index 376a00a..3b448dd 100644
--- a/providers/cxgb4/qp.c
+++ b/providers/cxgb4/qp.c
@@ -303,11 +303,13 @@ int c4iw_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
pthread_spin_lock(&qhp->lock);
if (t4_wq_in_error(&qhp->wq)) {
pthread_spin_unlock(&qhp->lock);
+ *bad_wr = wr;
return -EINVAL;
}
num_wrs = t4_sq_avail(&qhp->wq);
if (num_wrs == 0) {
pthread_spin_unlock(&qhp->lock);
+ *bad_wr = wr;
return -ENOMEM;
}
while (wr) {
@@ -403,12 +405,14 @@ int c4iw_post_receive(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
pthread_spin_lock(&qhp->lock);
if (t4_wq_in_error(&qhp->wq)) {
pthread_spin_unlock(&qhp->lock);
+ *bad_wr = wr;
return -EINVAL;
}
INC_STAT(recv);
num_wrs = t4_rq_avail(&qhp->wq);
if (num_wrs == 0) {
pthread_spin_unlock(&qhp->lock);
+ *bad_wr = wr;
return -ENOMEM;
}
while (wr) {
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH infiniband-diags] Add Bull device ID support to device white lists
From: Hal Rosenstock @ 2016-12-06 19:22 UTC (permalink / raw)
To: Weiny, Ira; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Signed-off-by: Hal Rosenstock <hal-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
diff --git a/libibnetdisc/src/ibnetdisc.c b/libibnetdisc/src/ibnetdisc.c
index 6ef7805..223e097 100644
--- a/libibnetdisc/src/ibnetdisc.c
+++ b/libibnetdisc/src/ibnetdisc.c
@@ -197,7 +197,7 @@ static int is_mlnx_ext_port_info_supported(ibnd_port_t * port)
{
uint16_t devid = (uint16_t) mad_get_field(port->node->info, 0, IB_NODE_DEVID_F);
- if ((devid >= 0xc738 && devid <= 0xc73b) || devid == 0xcb20 || devid == 0xcf08)
+ if ((devid >= 0xc738 && devid <= 0xc73b) || devid == 0xcb20 || devid == 0xcf08 || devid == 0x1b02)
return 1;
if (devid >= 0x1003 && devid <= 0x1016)
return 1;
diff --git a/src/ibdiag_common.c b/src/ibdiag_common.c
index 2a1e971..6809a1e 100644
--- a/src/ibdiag_common.c
+++ b/src/ibdiag_common.c
@@ -530,7 +530,7 @@ int is_port_info_extended_supported(ib_portid_t * dest, int port,
int is_mlnx_ext_port_info_supported(uint32_t devid)
{
if (ibd_ibnetdisc_flags & IBND_CONFIG_MLX_EPI) {
- if ((devid >= 0xc738 && devid <= 0xc73b) || devid == 0xcb20 || devid == 0xcf08)
+ if ((devid >= 0xc738 && devid <= 0xc73b) || devid == 0xcb20 || devid == 0xcf08 || devid == 0x1b02)
return 1;
if (devid >= 0x1003 && devid <= 0x1016)
return 1;
diff --git a/src/vendstat.c b/src/vendstat.c
index fb42a78..284ef93 100644
--- a/src/vendstat.c
+++ b/src/vendstat.c
@@ -149,6 +149,7 @@ static uint16_t ext_fw_info_device[][2] = {
{0xcf08, 0xcf08}, /* Switch-IB2 */
{0x01b3, 0x01b3}, /* IS-4 */
{0x1003, 0x1016}, /* Connect-X */
+ {0x1b02, 0x1b02}, /* Bull */
{0x0000, 0x0000}};
static int is_ext_fw_info_supported(uint16_t device_id) {
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* Re: [PATCH] i40iw: Use correct src address in memcpy to rdma stats counters
From: Shiraz Saleem @ 2016-12-06 17:39 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
e1000-rdma-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org,
cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org
In-Reply-To: <1478883341-22124-1-git-send-email-shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Doug - Can you please pick this up for 4.9? It was a bug introduced in 4.7.
On Fri, Nov 11, 2016 at 09:55:41AM -0700, Saleem, Shiraz wrote:
> hw_stats is a pointer to i40_iw_dev_stats struct in i40iw_get_hw_stats().
> Use hw_stats and not &hw_stats in the memcpy to copy the i40iw device stats
> data into rdma_hw_stats counters.
>
> Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")
>
> Signed-off-by: Shiraz Saleem <shiraz.saleem-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Faisal Latif <faisal.latif-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
> drivers/infiniband/hw/i40iw/i40iw_verbs.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
> index b71394b..02c8f9a 100644
> --- a/drivers/infiniband/hw/i40iw/i40iw_verbs.c
> +++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
> @@ -2498,7 +2498,7 @@ static int i40iw_get_hw_stats(struct ib_device *ibdev,
> return -ENOSYS;
> }
>
> - memcpy(&stats->value[0], &hw_stats, sizeof(*hw_stats));
> + memcpy(&stats->value[0], hw_stats, sizeof(*hw_stats));
>
> return stats->num_counters;
> }
> --
> 2.8.0
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: Enabling peer to peer device transactions for PCIe devices
From: Jason Gunthorpe @ 2016-12-06 17:28 UTC (permalink / raw)
To: Logan Gunthorpe
Cc: Haggai Eran, John.Bridgman-5C7GfCeVMHo@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-nvdimm-y27Ovi1pjclAfugRpC6u6w@public.gmane.org,
Felix.Kuehling-5C7GfCeVMHo@public.gmane.org,
serguei.sagalovitch-5C7GfCeVMHo@public.gmane.org,
Paul.Blinzer-5C7GfCeVMHo@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
Stephen Bates, ben.sander-5C7GfCeVMHo@public.gmane.org,
Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org,
linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Alexander.Deucher-5C7GfCeVMHo@public.gmane.org, Max Gurtovoy,
christian.koenig-5C7GfCeVMHo@public.gmane.org,
"Linux-media-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-media@
In-Reply-To: <ec136c34-417d-8a55-c176-2c1d759a5fb8-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
On Tue, Dec 06, 2016 at 09:51:15AM -0700, Logan Gunthorpe wrote:
> Hey,
>
> On 06/12/16 09:38 AM, Jason Gunthorpe wrote:
> >>> I'm not opposed to mapping /dev/nvmeX. However, the lookup is trivial
> >>> to accomplish in sysfs through /sys/dev/char to find the sysfs path of the
> >>> device-dax instance under the nvme device, or if you already have the nvme
> >>> sysfs path the dax instance(s) will appear under the "dax" sub-directory.
> >>
> >> Personally I think mapping the dax resource in the sysfs tree is a nice
> >> way to do this and a bit more intuitive than mapping a /dev/nvmeX.
> >
> > It is still not at all clear to me what userpsace is supposed to do
> > with this on nvme.. How is the CMB usable from userspace?
>
> The flow is pretty simple. For example to write to NVMe from an RDMA device:
>
> 1) Obtain a chunk of the CMB to use as a buffer(either by mmaping
> /dev/nvmx, the device dax char device or through a block layer interface
> (which sounds like a good suggestion from Christoph, but I'm not really
> sure how it would look).
Okay, so clearly this needs a kernel side NVMe specific allocator
and locking so users don't step on each other..
Or as Christoph says some kind of general mechanism to get these
bounce buffers..
> 2) Create an MR with the buffer and use an RDMA function to fill it with
> data from a remote host. This will cause the RDMA hardware to write
> directly to the memory in the NVMe card.
>
> 3) Using O_DIRECT, write the buffer to a file on the NVMe filesystem.
> When the address reaches hardware the NVMe will recognize it as local
> memory and copy it directly there.
Ah, I see.
As a first draft I'd stick with some kind of API built into the
/dev/nvmeX that backs the filesystem. The user app would fstat the
target file, open /dev/block/MAJOR(st_dev):MINOR(st_dev), do some
ioctl to get a CMB mmap, and then proceed from there..
When that is all working kernel-side, it would make sense to look at a
more general mechanism that could be used unprivileged??
> Thus we are able to transfer data to any file on an NVMe device without
> going through system memory. This has benefits on systems with lots of
> activity in system memory but step 3 is likely to be slowish due to the
> need to pin/unpin the memory for every transaction.
This is similar to the GPU issues too.. On NVMe you don't need to pin
the pages, you just need to lock that VMA so it doesn't get freed from
the NVMe CMB allocator while the IO is running...
Probably in the long run the get_user_pages is going to have to be
pushed down into drivers.. Future MMU coherent IO hardware also does
not need the pinning or other overheads.
Jason
^ permalink raw reply
* Re: Enabling peer to peer device transactions for PCIe devices
From: Christoph Hellwig @ 2016-12-06 17:12 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Haggai Eran, John.Bridgman-5C7GfCeVMHo@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-nvdimm-y27Ovi1pjclAfugRpC6u6w@public.gmane.org,
Felix.Kuehling-5C7GfCeVMHo@public.gmane.org,
serguei.sagalovitch-5C7GfCeVMHo@public.gmane.org,
Paul.Blinzer-5C7GfCeVMHo@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
Stephen Bates, ben.sander-5C7GfCeVMHo@public.gmane.org,
Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org,
linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Alexander.Deucher-5C7GfCeVMHo@public.gmane.org, Max Gurtovoy,
christian.koenig-5C7GfCeVMHo@public.gmane.org,
"Linux-media-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-media@
In-Reply-To: <20161206163850.GC28066-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
On Tue, Dec 06, 2016 at 09:38:50AM -0700, Jason Gunthorpe wrote:
> > > I'm not opposed to mapping /dev/nvmeX. However, the lookup is trivial
> > > to accomplish in sysfs through /sys/dev/char to find the sysfs path of the
> > > device-dax instance under the nvme device, or if you already have the nvme
> > > sysfs path the dax instance(s) will appear under the "dax" sub-directory.
> >
> > Personally I think mapping the dax resource in the sysfs tree is a nice
> > way to do this and a bit more intuitive than mapping a /dev/nvmeX.
>
> It is still not at all clear to me what userpsace is supposed to do
> with this on nvme.. How is the CMB usable from userspace?
I don't think trying to expose it to userspace makes any sense.
Exposing it to in-kernel storage targets on the other hand makes a lot
of sense.
^ permalink raw reply
* [PATCH net-next 7/7] bnxt_en: Add interface to support RDMA driver.
From: Michael Chan @ 2016-12-06 17:09 UTC (permalink / raw)
To: davem; +Cc: netdev, selvin.xavier, somnath.kotur, dledford, linux-rdma
In-Reply-To: <1481044178-25193-1-git-send-email-michael.chan@broadcom.com>
Since the network driver and RDMA driver operate on the same PCI function,
we need to create an interface to allow the RDMA driver to share resources
with the network driver.
1. Create a new bnxt_en_dev struct which will be returned by
bnxt_ulp_probe() upon success. After that, all calls from the RDMA driver
to bnxt_en will pass a pointer to this struct.
2. This struct contains additional function pointers to register, request
msix, send fw messages, register for async events.
3. If the RDMA driver wants to enable RDMA on the function, it needs to
call the function pointer bnxt_register_device(). A ulp_ops structure
is passed for RCU protected upcalls from bnxt_en to the RDMA driver.
4. The RDMA driver can call firmware APIs using the bnxt_send_fw_msg()
function pointer.
5. 1 stats context is reserved when the RDMA driver registers. MSIX
and completion rings are reserved when the RDMA driver calls
bnxt_request_msix() function pointer.
6. When the RDMA driver calls bnxt_unregister_device(), all RDMA resources
will be cleaned up.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/Makefile | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 41 ++-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 6 +
drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 346 ++++++++++++++++++++++++++
drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h | 93 +++++++
5 files changed, 483 insertions(+), 5 deletions(-)
create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
diff --git a/drivers/net/ethernet/broadcom/bnxt/Makefile b/drivers/net/ethernet/broadcom/bnxt/Makefile
index b233a86..6082ed1 100644
--- a/drivers/net/ethernet/broadcom/bnxt/Makefile
+++ b/drivers/net/ethernet/broadcom/bnxt/Makefile
@@ -1,3 +1,3 @@
obj-$(CONFIG_BNXT) += bnxt_en.o
-bnxt_en-y := bnxt.o bnxt_sriov.o bnxt_ethtool.o bnxt_dcb.o
+bnxt_en-y := bnxt.o bnxt_sriov.o bnxt_ethtool.o bnxt_dcb.o bnxt_ulp.o
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index c26735ea..d334da3 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -52,6 +52,7 @@
#include "bnxt_hsi.h"
#include "bnxt.h"
+#include "bnxt_ulp.h"
#include "bnxt_sriov.h"
#include "bnxt_ethtool.h"
#include "bnxt_dcb.h"
@@ -1528,12 +1529,11 @@ static int bnxt_async_event_process(struct bnxt *bp,
set_bit(BNXT_RESET_TASK_SILENT_SP_EVENT, &bp->sp_event);
break;
default:
- netdev_err(bp->dev, "unhandled ASYNC event (id 0x%x)\n",
- event_id);
goto async_event_process_exit;
}
schedule_work(&bp->sp_task);
async_event_process_exit:
+ bnxt_ulp_async_events(bp, cmpl);
return 0;
}
@@ -3547,7 +3547,7 @@ static int bnxt_hwrm_vnic_ctx_alloc(struct bnxt *bp, u16 vnic_id, u16 ctx_idx)
return rc;
}
-static int bnxt_hwrm_vnic_cfg(struct bnxt *bp, u16 vnic_id)
+int bnxt_hwrm_vnic_cfg(struct bnxt *bp, u16 vnic_id)
{
unsigned int ring = 0, grp_idx;
struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
@@ -3595,6 +3595,9 @@ static int bnxt_hwrm_vnic_cfg(struct bnxt *bp, u16 vnic_id)
#endif
if ((bp->flags & BNXT_FLAG_STRIP_VLAN) || def_vlan)
req.flags |= cpu_to_le32(VNIC_CFG_REQ_FLAGS_VLAN_STRIP_MODE);
+ if (!vnic_id && bnxt_ulp_registered(bp->edev, BNXT_ROCE_ULP))
+ req.flags |=
+ cpu_to_le32(VNIC_CFG_REQ_FLAGS_ROCE_DUAL_VNIC_MODE);
return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
}
@@ -4842,6 +4845,16 @@ unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp)
#endif
}
+void bnxt_set_max_func_stat_ctxs(struct bnxt *bp, unsigned int max)
+{
+ if (BNXT_PF(bp))
+ bp->pf.max_stat_ctxs = max;
+#if defined(CONFIG_BNXT_SRIOV)
+ else
+ bp->vf.max_stat_ctxs = max;
+#endif
+}
+
unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp)
{
if (BNXT_PF(bp))
@@ -4851,6 +4864,16 @@ unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp)
#endif
}
+void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max)
+{
+ if (BNXT_PF(bp))
+ bp->pf.max_cp_rings = max;
+#if defined(CONFIG_BNXT_SRIOV)
+ else
+ bp->vf.max_cp_rings = max;
+#endif
+}
+
static unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
{
if (BNXT_PF(bp))
@@ -6767,6 +6790,8 @@ static void bnxt_remove_one(struct pci_dev *pdev)
pci_iounmap(pdev, bp->bar2);
pci_iounmap(pdev, bp->bar1);
pci_iounmap(pdev, bp->bar0);
+ kfree(bp->edev);
+ bp->edev = NULL;
free_netdev(dev);
pci_release_regions(pdev);
@@ -6936,6 +6961,7 @@ void bnxt_restore_pf_fw_resources(struct bnxt *bp)
{
ASSERT_RTNL();
bnxt_hwrm_func_qcaps(bp);
+ bnxt_subtract_ulp_resources(bp, BNXT_ROCE_ULP);
}
static void bnxt_parse_log_pcie_link(struct bnxt *bp)
@@ -7047,6 +7073,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if (rc)
goto init_err;
+ bp->ulp_probe = bnxt_ulp_probe;
+
/* Get the MAX capabilities for this function */
rc = bnxt_hwrm_func_qcaps(bp);
if (rc) {
@@ -7144,12 +7172,15 @@ static pci_ers_result_t bnxt_io_error_detected(struct pci_dev *pdev,
pci_channel_state_t state)
{
struct net_device *netdev = pci_get_drvdata(pdev);
+ struct bnxt *bp = netdev_priv(netdev);
netdev_info(netdev, "PCI I/O error detected\n");
rtnl_lock();
netif_device_detach(netdev);
+ bnxt_ulp_stop(bp);
+
if (state == pci_channel_io_perm_failure) {
rtnl_unlock();
return PCI_ERS_RESULT_DISCONNECT;
@@ -7195,8 +7226,10 @@ static pci_ers_result_t bnxt_io_slot_reset(struct pci_dev *pdev)
if (!err && netif_running(netdev))
err = bnxt_open(netdev);
- if (!err)
+ if (!err) {
result = PCI_ERS_RESULT_RECOVERED;
+ bnxt_ulp_start(bp);
+ }
}
if (result != PCI_ERS_RESULT_RECOVERED && netif_running(netdev))
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index eec2415..16defe9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -972,6 +972,9 @@ struct bnxt {
#define BNXT_SINGLE_PF(bp) (BNXT_PF(bp) && !BNXT_NPAR(bp))
#define BNXT_CHIP_TYPE_NITRO_A0(bp) ((bp)->flags & BNXT_FLAG_CHIP_NITRO_A0)
+ struct bnxt_en_dev *edev;
+ struct bnxt_en_dev * (*ulp_probe)(struct net_device *);
+
struct bnxt_napi **bnapi;
struct bnxt_rx_ring_info *rx_ring;
@@ -1242,9 +1245,12 @@ static inline void bnxt_disable_poll(struct bnxt_napi *bnapi)
int hwrm_send_message_silent(struct bnxt *, void *, u32, int);
int bnxt_hwrm_func_rgtr_async_events(struct bnxt *bp, unsigned long *bmap,
int bmap_size);
+int bnxt_hwrm_vnic_cfg(struct bnxt *bp, u16 vnic_id);
int bnxt_hwrm_set_coal(struct bnxt *);
unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp);
+void bnxt_set_max_func_stat_ctxs(struct bnxt *bp, unsigned int max);
unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp);
+void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max);
void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max);
void bnxt_tx_disable(struct bnxt *bp);
void bnxt_tx_enable(struct bnxt *bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
new file mode 100644
index 0000000..781948f
--- /dev/null
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -0,0 +1,346 @@
+/* Broadcom NetXtreme-C/E network driver.
+ *
+ * Copyright (c) 2016 Broadcom Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/rtnetlink.h>
+#include <linux/bitops.h>
+#include <linux/irq.h>
+#include <asm/byteorder.h>
+#include <linux/bitmap.h>
+
+#include "bnxt_hsi.h"
+#include "bnxt.h"
+#include "bnxt_ulp.h"
+
+static int bnxt_register_dev(struct bnxt_en_dev *edev, int ulp_id,
+ struct bnxt_ulp_ops *ulp_ops, void *handle)
+{
+ struct net_device *dev = edev->net;
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_ulp *ulp;
+
+ ASSERT_RTNL();
+ if (ulp_id >= BNXT_MAX_ULP)
+ return -EINVAL;
+
+ ulp = &edev->ulp_tbl[ulp_id];
+ if (rcu_access_pointer(ulp->ulp_ops)) {
+ netdev_err(bp->dev, "ulp id %d already registered\n", ulp_id);
+ return -EBUSY;
+ }
+ if (ulp_id == BNXT_ROCE_ULP) {
+ unsigned int max_stat_ctxs;
+
+ max_stat_ctxs = bnxt_get_max_func_stat_ctxs(bp);
+ if (max_stat_ctxs <= BNXT_MIN_ROCE_STAT_CTXS ||
+ bp->num_stat_ctxs == max_stat_ctxs)
+ return -ENOMEM;
+ bnxt_set_max_func_stat_ctxs(bp, max_stat_ctxs -
+ BNXT_MIN_ROCE_STAT_CTXS);
+ }
+
+ atomic_set(&ulp->ref_count, 0);
+ ulp->handle = handle;
+ rcu_assign_pointer(ulp->ulp_ops, ulp_ops);
+
+ if (ulp_id == BNXT_ROCE_ULP) {
+ if (test_bit(BNXT_STATE_OPEN, &bp->state))
+ bnxt_hwrm_vnic_cfg(bp, 0);
+ }
+
+ return 0;
+}
+
+static int bnxt_unregister_dev(struct bnxt_en_dev *edev, int ulp_id)
+{
+ struct net_device *dev = edev->net;
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_ulp *ulp;
+ int i;
+
+ ASSERT_RTNL();
+ if (ulp_id >= BNXT_MAX_ULP)
+ return -EINVAL;
+
+ ulp = &edev->ulp_tbl[ulp_id];
+ if (!rcu_access_pointer(ulp->ulp_ops)) {
+ netdev_err(bp->dev, "ulp id %d not registered\n", ulp_id);
+ return -EINVAL;
+ }
+ if (ulp_id == BNXT_ROCE_ULP) {
+ unsigned int max_stat_ctxs;
+
+ max_stat_ctxs = bnxt_get_max_func_stat_ctxs(bp);
+ bnxt_set_max_func_stat_ctxs(bp, max_stat_ctxs + 1);
+ }
+ if (ulp->max_async_event_id)
+ bnxt_hwrm_func_rgtr_async_events(bp, NULL, 0);
+
+ RCU_INIT_POINTER(ulp->ulp_ops, NULL);
+ synchronize_rcu();
+ ulp->max_async_event_id = 0;
+ ulp->async_events_bmap = NULL;
+ while (atomic_read(&ulp->ref_count) != 0 && i < 10) {
+ msleep(100);
+ i++;
+ }
+ return 0;
+}
+
+static int bnxt_req_msix_vecs(struct bnxt_en_dev *edev, int ulp_id,
+ struct bnxt_msix_entry *ent, int num_msix)
+{
+ struct net_device *dev = edev->net;
+ struct bnxt *bp = netdev_priv(dev);
+ int max_idx, max_cp_rings;
+ int avail_msix, i, idx;
+
+ ASSERT_RTNL();
+ if (ulp_id != BNXT_ROCE_ULP)
+ return -EINVAL;
+
+ if (!(bp->flags & BNXT_FLAG_USING_MSIX))
+ return -ENODEV;
+
+ max_cp_rings = bnxt_get_max_func_cp_rings(bp);
+ max_idx = min_t(int, bp->total_irqs, max_cp_rings);
+ avail_msix = max_idx - bp->cp_nr_rings;
+ if (!avail_msix)
+ return -ENOMEM;
+ if (avail_msix > num_msix)
+ avail_msix = num_msix;
+
+ idx = max_idx - avail_msix;
+ for (i = 0; i < avail_msix; i++) {
+ ent[i].vector = bp->irq_tbl[idx + i].vector;
+ ent[i].ring_idx = idx + i;
+ ent[i].db_offset = (idx + i) * 0x80;
+ }
+ bnxt_set_max_func_irqs(bp, max_idx - avail_msix);
+ bnxt_set_max_func_cp_rings(bp, max_cp_rings - avail_msix);
+ edev->ulp_tbl[ulp_id].msix_requested = avail_msix;
+ return avail_msix;
+}
+
+static int bnxt_free_msix_vecs(struct bnxt_en_dev *edev, int ulp_id)
+{
+ struct net_device *dev = edev->net;
+ struct bnxt *bp = netdev_priv(dev);
+ int max_cp_rings, msix_requested;
+
+ ASSERT_RTNL();
+ if (ulp_id != BNXT_ROCE_ULP)
+ return -EINVAL;
+
+ max_cp_rings = bnxt_get_max_func_cp_rings(bp);
+ msix_requested = edev->ulp_tbl[ulp_id].msix_requested;
+ bnxt_set_max_func_cp_rings(bp, max_cp_rings + msix_requested);
+ edev->ulp_tbl[ulp_id].msix_requested = 0;
+ bnxt_set_max_func_irqs(bp, bp->total_irqs);
+ return 0;
+}
+
+void bnxt_subtract_ulp_resources(struct bnxt *bp, int ulp_id)
+{
+ ASSERT_RTNL();
+ if (bnxt_ulp_registered(bp->edev, ulp_id)) {
+ struct bnxt_en_dev *edev = bp->edev;
+ unsigned int msix_req, max;
+
+ msix_req = edev->ulp_tbl[ulp_id].msix_requested;
+ max = bnxt_get_max_func_cp_rings(bp);
+ bnxt_set_max_func_cp_rings(bp, max - msix_req);
+ max = bnxt_get_max_func_stat_ctxs(bp);
+ bnxt_set_max_func_stat_ctxs(bp, max - 1);
+ }
+}
+
+static int bnxt_send_msg(struct bnxt_en_dev *edev, int ulp_id,
+ struct bnxt_fw_msg *fw_msg)
+{
+ struct net_device *dev = edev->net;
+ struct bnxt *bp = netdev_priv(dev);
+ struct input *req;
+ int rc;
+
+ mutex_lock(&bp->hwrm_cmd_lock);
+ req = fw_msg->msg;
+ req->resp_addr = cpu_to_le64(bp->hwrm_cmd_resp_dma_addr);
+ rc = _hwrm_send_message(bp, fw_msg->msg, fw_msg->msg_len,
+ fw_msg->timeout);
+ if (!rc) {
+ struct output *resp = bp->hwrm_cmd_resp_addr;
+ u32 len = le16_to_cpu(resp->resp_len);
+
+ if (fw_msg->resp_max_len < len)
+ len = fw_msg->resp_max_len;
+
+ memcpy(fw_msg->resp, resp, len);
+ }
+ mutex_unlock(&bp->hwrm_cmd_lock);
+ return rc;
+}
+
+static void bnxt_ulp_get(struct bnxt_ulp *ulp)
+{
+ atomic_inc(&ulp->ref_count);
+}
+
+static void bnxt_ulp_put(struct bnxt_ulp *ulp)
+{
+ atomic_dec(&ulp->ref_count);
+}
+
+void bnxt_ulp_stop(struct bnxt *bp)
+{
+ struct bnxt_en_dev *edev = bp->edev;
+ struct bnxt_ulp_ops *ops;
+ int i;
+
+ if (!edev)
+ return;
+
+ for (i = 0; i < BNXT_MAX_ULP; i++) {
+ struct bnxt_ulp *ulp = &edev->ulp_tbl[i];
+
+ rtnl_dereference(ulp->ulp_ops);
+ if (!ops || !ops->ulp_stop)
+ continue;
+ ops->ulp_stop(ulp->handle);
+ }
+}
+
+void bnxt_ulp_start(struct bnxt *bp)
+{
+ struct bnxt_en_dev *edev = bp->edev;
+ struct bnxt_ulp_ops *ops;
+ int i;
+
+ if (!edev)
+ return;
+
+ for (i = 0; i < BNXT_MAX_ULP; i++) {
+ struct bnxt_ulp *ulp = &edev->ulp_tbl[i];
+
+ ops = rtnl_dereference(ulp->ulp_ops);
+ if (!ops || !ops->ulp_start)
+ continue;
+ ops->ulp_start(ulp->handle);
+ }
+}
+
+void bnxt_ulp_sriov_cfg(struct bnxt *bp, int num_vfs)
+{
+ struct bnxt_en_dev *edev = bp->edev;
+ struct bnxt_ulp_ops *ops;
+ int i;
+
+ if (!edev)
+ return;
+
+ for (i = 0; i < BNXT_MAX_ULP; i++) {
+ struct bnxt_ulp *ulp = &edev->ulp_tbl[i];
+
+ rcu_read_lock();
+ ops = rcu_dereference(ulp->ulp_ops);
+ if (!ops || !ops->ulp_sriov_config) {
+ rcu_read_unlock();
+ continue;
+ }
+ bnxt_ulp_get(ulp);
+ rcu_read_unlock();
+ ops->ulp_sriov_config(ulp->handle, num_vfs);
+ bnxt_ulp_put(ulp);
+ }
+}
+
+void bnxt_ulp_async_events(struct bnxt *bp, struct hwrm_async_event_cmpl *cmpl)
+{
+ u16 event_id = le16_to_cpu(cmpl->event_id);
+ struct bnxt_en_dev *edev = bp->edev;
+ struct bnxt_ulp_ops *ops;
+ int i;
+
+ if (!edev)
+ return;
+
+ rcu_read_lock();
+ for (i = 0; i < BNXT_MAX_ULP; i++) {
+ struct bnxt_ulp *ulp = &edev->ulp_tbl[i];
+
+ ops = rcu_dereference(ulp->ulp_ops);
+ if (!ops || !ops->ulp_async_notifier)
+ continue;
+ if (!ulp->async_events_bmap ||
+ event_id > ulp->max_async_event_id)
+ continue;
+
+ /* Read max_async_event_id first before testing the bitmap. */
+ smp_rmb();
+ if (test_bit(event_id, ulp->async_events_bmap))
+ ops->ulp_async_notifier(ulp->handle, cmpl);
+ }
+ rcu_read_unlock();
+}
+
+static int bnxt_register_async_events(struct bnxt_en_dev *edev, int ulp_id,
+ unsigned long *events_bmap, u16 max_id)
+{
+ struct net_device *dev = edev->net;
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_ulp *ulp;
+
+ if (ulp_id >= BNXT_MAX_ULP)
+ return -EINVAL;
+
+ ulp = &edev->ulp_tbl[ulp_id];
+ ulp->async_events_bmap = events_bmap;
+ /* Make sure bnxt_ulp_async_events() sees this order */
+ smp_wmb();
+ ulp->max_async_event_id = max_id;
+ bnxt_hwrm_func_rgtr_async_events(bp, events_bmap, max_id + 1);
+ return 0;
+}
+
+static const struct bnxt_en_ops bnxt_en_ops_tbl = {
+ .bnxt_register_device = bnxt_register_dev,
+ .bnxt_unregister_device = bnxt_unregister_dev,
+ .bnxt_request_msix = bnxt_req_msix_vecs,
+ .bnxt_free_msix = bnxt_free_msix_vecs,
+ .bnxt_send_fw_msg = bnxt_send_msg,
+ .bnxt_register_fw_async_events = bnxt_register_async_events,
+};
+
+struct bnxt_en_dev *bnxt_ulp_probe(struct net_device *dev)
+{
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_en_dev *edev;
+
+ edev = bp->edev;
+ if (!edev) {
+ edev = kzalloc(sizeof(*edev), GFP_KERNEL);
+ if (!edev)
+ return ERR_PTR(-ENOMEM);
+ edev->en_ops = &bnxt_en_ops_tbl;
+ if (bp->flags & BNXT_FLAG_ROCEV1_CAP)
+ edev->flags |= BNXT_EN_FLAG_ROCEV1_CAP;
+ if (bp->flags & BNXT_FLAG_ROCEV2_CAP)
+ edev->flags |= BNXT_EN_FLAG_ROCEV2_CAP;
+ edev->net = dev;
+ edev->pdev = bp->pdev;
+ bp->edev = edev;
+ }
+ return bp->edev;
+}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
new file mode 100644
index 0000000..74f816e
--- /dev/null
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
@@ -0,0 +1,93 @@
+/* Broadcom NetXtreme-C/E network driver.
+ *
+ * Copyright (c) 2016 Broadcom Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#ifndef BNXT_ULP_H
+#define BNXT_ULP_H
+
+#define BNXT_ROCE_ULP 0
+#define BNXT_OTHER_ULP 1
+#define BNXT_MAX_ULP 2
+
+#define BNXT_MIN_ROCE_CP_RINGS 2
+#define BNXT_MIN_ROCE_STAT_CTXS 1
+
+struct hwrm_async_event_cmpl;
+struct bnxt;
+
+struct bnxt_ulp_ops {
+ /* async_notifier() cannot sleep (in BH context) */
+ void (*ulp_async_notifier)(void *, struct hwrm_async_event_cmpl *);
+ void (*ulp_stop)(void *);
+ void (*ulp_start)(void *);
+ void (*ulp_sriov_config)(void *, int);
+};
+
+struct bnxt_msix_entry {
+ u32 vector;
+ u32 ring_idx;
+ u32 db_offset;
+};
+
+struct bnxt_fw_msg {
+ void *msg;
+ int msg_len;
+ void *resp;
+ int resp_max_len;
+ int timeout;
+};
+
+struct bnxt_ulp {
+ void *handle;
+ struct bnxt_ulp_ops __rcu *ulp_ops;
+ unsigned long *async_events_bmap;
+ u16 max_async_event_id;
+ u16 msix_requested;
+ atomic_t ref_count;
+};
+
+struct bnxt_en_dev {
+ struct net_device *net;
+ struct pci_dev *pdev;
+ u32 flags;
+ #define BNXT_EN_FLAG_ROCEV1_CAP 0x1
+ #define BNXT_EN_FLAG_ROCEV2_CAP 0x2
+ #define BNXT_EN_FLAG_ROCE_CAP (BNXT_EN_FLAG_ROCEV1_CAP | \
+ BNXT_EN_FLAG_ROCEV2_CAP)
+ const struct bnxt_en_ops *en_ops;
+ struct bnxt_ulp ulp_tbl[BNXT_MAX_ULP];
+};
+
+struct bnxt_en_ops {
+ int (*bnxt_register_device)(struct bnxt_en_dev *, int,
+ struct bnxt_ulp_ops *, void *);
+ int (*bnxt_unregister_device)(struct bnxt_en_dev *, int);
+ int (*bnxt_request_msix)(struct bnxt_en_dev *, int,
+ struct bnxt_msix_entry *, int);
+ int (*bnxt_free_msix)(struct bnxt_en_dev *, int);
+ int (*bnxt_send_fw_msg)(struct bnxt_en_dev *, int,
+ struct bnxt_fw_msg *);
+ int (*bnxt_register_fw_async_events)(struct bnxt_en_dev *, int,
+ unsigned long *, u16);
+};
+
+static inline bool bnxt_ulp_registered(struct bnxt_en_dev *edev, int ulp_id)
+{
+ if (edev && rcu_access_pointer(edev->ulp_tbl[ulp_id].ulp_ops))
+ return true;
+ return false;
+}
+
+void bnxt_subtract_ulp_resources(struct bnxt *bp, int ulp_id);
+void bnxt_ulp_stop(struct bnxt *bp);
+void bnxt_ulp_start(struct bnxt *bp);
+void bnxt_ulp_sriov_cfg(struct bnxt *bp, int num_vfs);
+void bnxt_ulp_async_events(struct bnxt *bp, struct hwrm_async_event_cmpl *cmpl);
+struct bnxt_en_dev *bnxt_ulp_probe(struct net_device *dev);
+
+#endif
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next 6/7] bnxt_en: Refactor the driver registration function with firmware.
From: Michael Chan @ 2016-12-06 17:09 UTC (permalink / raw)
To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
selvin.xavier-dY08KVG/lbpWk0Htik3J/w,
somnath.kotur-dY08KVG/lbpWk0Htik3J/w,
dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1481044178-25193-1-git-send-email-michael.chan-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
The driver register function with firmware consists of passing version
information and registering for async events. To support the RDMA driver,
the async events that we need to register may change. Separate the
driver register function into 2 parts so that we can just update the
async events for the RDMA driver.
Signed-off-by: Michael Chan <michael.chan-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 34 ++++++++++++++++++++++++++-----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 ++
2 files changed, 31 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7218d65..c26735ea 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3117,27 +3117,46 @@ int hwrm_send_message_silent(struct bnxt *bp, void *msg, u32 msg_len,
return rc;
}
-static int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp)
+int bnxt_hwrm_func_rgtr_async_events(struct bnxt *bp, unsigned long *bmap,
+ int bmap_size)
{
struct hwrm_func_drv_rgtr_input req = {0};
- int i;
DECLARE_BITMAP(async_events_bmap, 256);
u32 *events = (u32 *)async_events_bmap;
+ int i;
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_DRV_RGTR, -1, -1);
req.enables =
- cpu_to_le32(FUNC_DRV_RGTR_REQ_ENABLES_OS_TYPE |
- FUNC_DRV_RGTR_REQ_ENABLES_VER |
- FUNC_DRV_RGTR_REQ_ENABLES_ASYNC_EVENT_FWD);
+ cpu_to_le32(FUNC_DRV_RGTR_REQ_ENABLES_ASYNC_EVENT_FWD);
memset(async_events_bmap, 0, sizeof(async_events_bmap));
for (i = 0; i < ARRAY_SIZE(bnxt_async_events_arr); i++)
__set_bit(bnxt_async_events_arr[i], async_events_bmap);
+ if (bmap && bmap_size) {
+ for (i = 0; i < bmap_size; i++) {
+ if (test_bit(i, bmap))
+ __set_bit(i, async_events_bmap);
+ }
+ }
+
for (i = 0; i < 8; i++)
req.async_event_fwd[i] |= cpu_to_le32(events[i]);
+ return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+}
+
+static int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp)
+{
+ struct hwrm_func_drv_rgtr_input req = {0};
+
+ bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_DRV_RGTR, -1, -1);
+
+ req.enables =
+ cpu_to_le32(FUNC_DRV_RGTR_REQ_ENABLES_OS_TYPE |
+ FUNC_DRV_RGTR_REQ_ENABLES_VER);
+
req.os_type = cpu_to_le16(FUNC_DRV_RGTR_REQ_OS_TYPE_LINUX);
req.ver_maj = DRV_VER_MAJ;
req.ver_min = DRV_VER_MIN;
@@ -3146,6 +3165,7 @@ static int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp)
if (BNXT_PF(bp)) {
DECLARE_BITMAP(vf_req_snif_bmap, 256);
u32 *data = (u32 *)vf_req_snif_bmap;
+ int i;
memset(vf_req_snif_bmap, 0, sizeof(vf_req_snif_bmap));
for (i = 0; i < ARRAY_SIZE(bnxt_vf_req_snif); i++)
@@ -7023,6 +7043,10 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if (rc)
goto init_err;
+ rc = bnxt_hwrm_func_rgtr_async_events(bp, NULL, 0);
+ if (rc)
+ goto init_err;
+
/* Get the MAX capabilities for this function */
rc = bnxt_hwrm_func_qcaps(bp);
if (rc) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index d796836..eec2415 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1240,6 +1240,8 @@ static inline void bnxt_disable_poll(struct bnxt_napi *bnapi)
int _hwrm_send_message(struct bnxt *, void *, u32, int);
int hwrm_send_message(struct bnxt *, void *, u32, int);
int hwrm_send_message_silent(struct bnxt *, void *, u32, int);
+int bnxt_hwrm_func_rgtr_async_events(struct bnxt *bp, unsigned long *bmap,
+ int bmap_size);
int bnxt_hwrm_set_coal(struct bnxt *);
unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp);
unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp);
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH net-next 5/7] bnxt_en: Reserve RDMA resources by default.
From: Michael Chan @ 2016-12-06 17:09 UTC (permalink / raw)
To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
selvin.xavier-dY08KVG/lbpWk0Htik3J/w,
somnath.kotur-dY08KVG/lbpWk0Htik3J/w,
dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1481044178-25193-1-git-send-email-michael.chan-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
If the device supports RDMA, we'll setup network default rings so that
there are enough minimum resources for RDMA, if possible. However, the
user can still increase network rings to the max if he wants. The actual
RDMA resources won't be reserved until the RDMA driver registers.
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Michael Chan <michael.chan-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 58 ++++++++++++++++++++++++++++++-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 9 +++++
2 files changed, 66 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1f6be83..7218d65 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4166,6 +4166,11 @@ static int bnxt_hwrm_func_qcaps(struct bnxt *bp)
if (rc)
goto hwrm_func_qcaps_exit;
+ if (resp->flags & cpu_to_le32(FUNC_QCAPS_RESP_FLAGS_ROCE_V1_SUPPORTED))
+ bp->flags |= BNXT_FLAG_ROCEV1_CAP;
+ if (resp->flags & cpu_to_le32(FUNC_QCAPS_RESP_FLAGS_ROCE_V2_SUPPORTED))
+ bp->flags |= BNXT_FLAG_ROCEV2_CAP;
+
bp->tx_push_thresh = 0;
if (resp->flags &
cpu_to_le32(FUNC_QCAPS_RESP_FLAGS_PUSH_MODE_SUPPORTED))
@@ -4808,6 +4813,24 @@ static int bnxt_setup_int_mode(struct bnxt *bp)
return rc;
}
+unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp)
+{
+ if (BNXT_PF(bp))
+ return bp->pf.max_stat_ctxs;
+#if defined(CONFIG_BNXT_SRIOV)
+ return bp->vf.max_stat_ctxs;
+#endif
+}
+
+unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp)
+{
+ if (BNXT_PF(bp))
+ return bp->pf.max_cp_rings;
+#if defined(CONFIG_BNXT_SRIOV)
+ return bp->vf.max_cp_rings;
+#endif
+}
+
static unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
{
if (BNXT_PF(bp))
@@ -6832,6 +6855,39 @@ int bnxt_get_max_rings(struct bnxt *bp, int *max_rx, int *max_tx, bool shared)
return bnxt_trim_rings(bp, max_rx, max_tx, cp, shared);
}
+static int bnxt_get_dflt_rings(struct bnxt *bp, int *max_rx, int *max_tx,
+ bool shared)
+{
+ int rc;
+
+ rc = bnxt_get_max_rings(bp, max_rx, max_tx, shared);
+ if (rc)
+ return rc;
+
+ if (bp->flags & BNXT_FLAG_ROCE_CAP) {
+ int max_cp, max_stat, max_irq;
+
+ /* Reserve minimum resources for RoCE */
+ max_cp = bnxt_get_max_func_cp_rings(bp);
+ max_stat = bnxt_get_max_func_stat_ctxs(bp);
+ max_irq = bnxt_get_max_func_irqs(bp);
+ if (max_cp <= BNXT_MIN_ROCE_CP_RINGS ||
+ max_irq <= BNXT_MIN_ROCE_CP_RINGS ||
+ max_stat <= BNXT_MIN_ROCE_STAT_CTXS)
+ return 0;
+
+ max_cp -= BNXT_MIN_ROCE_CP_RINGS;
+ max_irq -= BNXT_MIN_ROCE_CP_RINGS;
+ max_stat -= BNXT_MIN_ROCE_STAT_CTXS;
+ max_cp = min_t(int, max_cp, max_irq);
+ max_cp = min_t(int, max_cp, max_stat);
+ rc = bnxt_trim_rings(bp, max_rx, max_tx, max_cp, shared);
+ if (rc)
+ rc = 0;
+ }
+ return rc;
+}
+
static int bnxt_set_dflt_rings(struct bnxt *bp)
{
int dflt_rings, max_rx_rings, max_tx_rings, rc;
@@ -6840,7 +6896,7 @@ static int bnxt_set_dflt_rings(struct bnxt *bp)
if (sh)
bp->flags |= BNXT_FLAG_SHARED_RINGS;
dflt_rings = netif_get_num_default_rss_queues();
- rc = bnxt_get_max_rings(bp, &max_rx_rings, &max_tx_rings, sh);
+ rc = bnxt_get_dflt_rings(bp, &max_rx_rings, &max_tx_rings, sh);
if (rc)
return rc;
bp->rx_nr_rings = min_t(int, dflt_rings, max_rx_rings);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 43a4b17..d796836 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -387,6 +387,9 @@ struct rx_tpa_end_cmp_ext {
#define DB_KEY_TX_PUSH (0x4 << 28)
#define DB_LONG_TX_PUSH (0x2 << 24)
+#define BNXT_MIN_ROCE_CP_RINGS 2
+#define BNXT_MIN_ROCE_STAT_CTXS 1
+
#define INVALID_HW_RING_ID ((u16)-1)
/* The hardware supports certain page sizes. Use the supported page sizes
@@ -953,6 +956,10 @@ struct bnxt {
#define BNXT_FLAG_PORT_STATS 0x400
#define BNXT_FLAG_UDP_RSS_CAP 0x800
#define BNXT_FLAG_EEE_CAP 0x1000
+ #define BNXT_FLAG_ROCEV1_CAP 0x8000
+ #define BNXT_FLAG_ROCEV2_CAP 0x10000
+ #define BNXT_FLAG_ROCE_CAP (BNXT_FLAG_ROCEV1_CAP | \
+ BNXT_FLAG_ROCEV2_CAP)
#define BNXT_FLAG_CHIP_NITRO_A0 0x1000000
#define BNXT_FLAG_ALL_CONFIG_FEATS (BNXT_FLAG_TPA | \
@@ -1234,6 +1241,8 @@ static inline void bnxt_disable_poll(struct bnxt_napi *bnapi)
int hwrm_send_message(struct bnxt *, void *, u32, int);
int hwrm_send_message_silent(struct bnxt *, void *, u32, int);
int bnxt_hwrm_set_coal(struct bnxt *);
+unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp);
+unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp);
void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max);
void bnxt_tx_disable(struct bnxt *bp);
void bnxt_tx_enable(struct bnxt *bp);
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH net-next 4/7] bnxt_en: Improve completion ring allocation for VFs.
From: Michael Chan @ 2016-12-06 17:09 UTC (permalink / raw)
To: davem; +Cc: netdev, selvin.xavier, somnath.kotur, dledford, linux-rdma
In-Reply-To: <1481044178-25193-1-git-send-email-michael.chan@broadcom.com>
All available remaining completion rings not used by the PF should be
made available for the VFs so that there are enough rings in the VF to
support RDMA. The earlier workaround code of capping the rings by the
statistics context is removed.
When SRIOV is disabled, call a new function bnxt_restore_pf_fw_resources()
to restore FW resources. Later on we need to add some logic to account
for RDMA resources.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 +++++++-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 14 ++++----------
3 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 8b00ef4..1f6be83 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4152,7 +4152,7 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
return rc;
}
-int bnxt_hwrm_func_qcaps(struct bnxt *bp)
+static int bnxt_hwrm_func_qcaps(struct bnxt *bp)
{
int rc = 0;
struct hwrm_func_qcaps_input req = {0};
@@ -6856,6 +6856,12 @@ static int bnxt_set_dflt_rings(struct bnxt *bp)
return rc;
}
+void bnxt_restore_pf_fw_resources(struct bnxt *bp)
+{
+ ASSERT_RTNL();
+ bnxt_hwrm_func_qcaps(bp);
+}
+
static void bnxt_parse_log_pcie_link(struct bnxt *bp)
{
enum pcie_link_width width = PCIE_LNK_WIDTH_UNKNOWN;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 0ee2cc4..43a4b17 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1234,7 +1234,6 @@ static inline void bnxt_disable_poll(struct bnxt_napi *bnapi)
int hwrm_send_message(struct bnxt *, void *, u32, int);
int hwrm_send_message_silent(struct bnxt *, void *, u32, int);
int bnxt_hwrm_set_coal(struct bnxt *);
-int bnxt_hwrm_func_qcaps(struct bnxt *);
void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max);
void bnxt_tx_disable(struct bnxt *bp);
void bnxt_tx_enable(struct bnxt *bp);
@@ -1245,4 +1244,5 @@ static inline void bnxt_disable_poll(struct bnxt_napi *bnapi)
int bnxt_close_nic(struct bnxt *, bool, bool);
int bnxt_setup_mq_tc(struct net_device *dev, u8 tc);
int bnxt_get_max_rings(struct bnxt *, int *, int *, bool);
+void bnxt_restore_pf_fw_resources(struct bnxt *bp);
#endif
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index bff626a..c696025 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -420,15 +420,7 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs)
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_CFG, -1, -1);
/* Remaining rings are distributed equally amongs VF's for now */
- /* TODO: the following workaroud is needed to restrict total number
- * of vf_cp_rings not exceed number of HW ring groups. This WA should
- * be removed once new HWRM provides HW ring groups capability in
- * hwrm_func_qcap.
- */
- vf_cp_rings = min_t(u16, pf->max_cp_rings, pf->max_stat_ctxs);
- vf_cp_rings = (vf_cp_rings - bp->cp_nr_rings) / num_vfs;
- /* TODO: restore this logic below once the WA above is removed */
- /* vf_cp_rings = (pf->max_cp_rings - bp->cp_nr_rings) / num_vfs; */
+ vf_cp_rings = (pf->max_cp_rings - bp->cp_nr_rings) / num_vfs;
vf_stat_ctx = (pf->max_stat_ctxs - bp->num_stat_ctxs) / num_vfs;
if (bp->flags & BNXT_FLAG_AGG_RINGS)
vf_rx_rings = (pf->max_rx_rings - bp->rx_nr_rings * 2) /
@@ -590,7 +582,9 @@ void bnxt_sriov_disable(struct bnxt *bp)
bp->pf.active_vfs = 0;
/* Reclaim all resources for the PF. */
- bnxt_hwrm_func_qcaps(bp);
+ rtnl_lock();
+ bnxt_restore_pf_fw_resources(bp);
+ rtnl_unlock();
}
int bnxt_sriov_configure(struct pci_dev *pdev, int num_vfs)
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next 3/7] bnxt_en: Move function reset to bnxt_init_one().
From: Michael Chan @ 2016-12-06 17:09 UTC (permalink / raw)
To: davem; +Cc: netdev, selvin.xavier, somnath.kotur, dledford, linux-rdma
In-Reply-To: <1481044178-25193-1-git-send-email-michael.chan@broadcom.com>
Now that MSIX is enabled in bnxt_init_one(), resources may be allocated by
the RDMA driver before the network device is opened. So we cannot do
function reset in bnxt_open() which will clear all the resources.
The proper place to do function reset now is in bnxt_init_one().
If we get AER, we'll do function reset as well.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 25 ++++++-------------------
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 -
2 files changed, 6 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 9178bf8..8b00ef4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5613,22 +5613,7 @@ int bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
static int bnxt_open(struct net_device *dev)
{
struct bnxt *bp = netdev_priv(dev);
- int rc = 0;
- if (!test_bit(BNXT_STATE_FN_RST_DONE, &bp->state)) {
- rc = bnxt_hwrm_func_reset(bp);
- if (rc) {
- netdev_err(bp->dev, "hwrm chip reset failure rc: %x\n",
- rc);
- rc = -EBUSY;
- return rc;
- }
- /* Do func_reset during the 1st PF open only to prevent killing
- * the VFs when the PF is brought down and up.
- */
- if (BNXT_PF(bp))
- set_bit(BNXT_STATE_FN_RST_DONE, &bp->state);
- }
return __bnxt_open_nic(bp, true, true);
}
@@ -7028,6 +7013,10 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if (rc)
goto init_err;
+ rc = bnxt_hwrm_func_reset(bp);
+ if (rc)
+ goto init_err;
+
rc = bnxt_init_int_mode(bp);
if (rc)
goto init_err;
@@ -7069,7 +7058,6 @@ static pci_ers_result_t bnxt_io_error_detected(struct pci_dev *pdev,
pci_channel_state_t state)
{
struct net_device *netdev = pci_get_drvdata(pdev);
- struct bnxt *bp = netdev_priv(netdev);
netdev_info(netdev, "PCI I/O error detected\n");
@@ -7084,8 +7072,6 @@ static pci_ers_result_t bnxt_io_error_detected(struct pci_dev *pdev,
if (netif_running(netdev))
bnxt_close(netdev);
- /* So that func_reset will be done during slot_reset */
- clear_bit(BNXT_STATE_FN_RST_DONE, &bp->state);
pci_disable_device(pdev);
rtnl_unlock();
@@ -7119,7 +7105,8 @@ static pci_ers_result_t bnxt_io_slot_reset(struct pci_dev *pdev)
} else {
pci_set_master(pdev);
- if (netif_running(netdev))
+ err = bnxt_hwrm_func_reset(bp);
+ if (!err && netif_running(netdev))
err = bnxt_open(netdev);
if (!err)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 1461355..0ee2cc4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1021,7 +1021,6 @@ struct bnxt {
unsigned long state;
#define BNXT_STATE_OPEN 0
#define BNXT_STATE_IN_SP_TASK 1
-#define BNXT_STATE_FN_RST_DONE 2
struct bnxt_irq *irq_tbl;
int total_irqs;
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next 2/7] bnxt_en: Enable MSIX early in bnxt_init_one().
From: Michael Chan @ 2016-12-06 17:09 UTC (permalink / raw)
To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
selvin.xavier-dY08KVG/lbpWk0Htik3J/w,
somnath.kotur-dY08KVG/lbpWk0Htik3J/w,
dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1481044178-25193-1-git-send-email-michael.chan-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
To better support the new RDMA driver, we need to move pci_enable_msix()
from bnxt_open() to bnxt_init_one(). This way, MSIX vectors are available
to the RDMA driver whether the network device is up or down.
Part of the existing bnxt_setup_int_mode() function is now refactored into
a new bnxt_init_int_mode(). bnxt_init_int_mode() is called during
bnxt_init_one() to enable MSIX. The remaining logic in
bnxt_setup_int_mode() to map the IRQs to the completion rings is called
during bnxt_open().
Signed-off-by: Somnath Kotur <somnath.kotur-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Michael Chan <michael.chan-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 183 +++++++++++++++++++-----------
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 +
2 files changed, 115 insertions(+), 69 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6cdfe3e..9178bf8 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4743,6 +4743,80 @@ static int bnxt_trim_rings(struct bnxt *bp, int *rx, int *tx, int max,
return 0;
}
+static void bnxt_setup_msix(struct bnxt *bp)
+{
+ const int len = sizeof(bp->irq_tbl[0].name);
+ struct net_device *dev = bp->dev;
+ int tcs, i;
+
+ tcs = netdev_get_num_tc(dev);
+ if (tcs > 1) {
+ bp->tx_nr_rings_per_tc = bp->tx_nr_rings / tcs;
+ if (bp->tx_nr_rings_per_tc == 0) {
+ netdev_reset_tc(dev);
+ bp->tx_nr_rings_per_tc = bp->tx_nr_rings;
+ } else {
+ int i, off, count;
+
+ bp->tx_nr_rings = bp->tx_nr_rings_per_tc * tcs;
+ for (i = 0; i < tcs; i++) {
+ count = bp->tx_nr_rings_per_tc;
+ off = i * count;
+ netdev_set_tc_queue(dev, i, count, off);
+ }
+ }
+ }
+
+ for (i = 0; i < bp->cp_nr_rings; i++) {
+ char *attr;
+
+ if (bp->flags & BNXT_FLAG_SHARED_RINGS)
+ attr = "TxRx";
+ else if (i < bp->rx_nr_rings)
+ attr = "rx";
+ else
+ attr = "tx";
+
+ snprintf(bp->irq_tbl[i].name, len, "%s-%s-%d", dev->name, attr,
+ i);
+ bp->irq_tbl[i].handler = bnxt_msix;
+ }
+}
+
+static void bnxt_setup_inta(struct bnxt *bp)
+{
+ const int len = sizeof(bp->irq_tbl[0].name);
+
+ if (netdev_get_num_tc(bp->dev))
+ netdev_reset_tc(bp->dev);
+
+ snprintf(bp->irq_tbl[0].name, len, "%s-%s-%d", bp->dev->name, "TxRx",
+ 0);
+ bp->irq_tbl[0].handler = bnxt_inta;
+}
+
+static int bnxt_setup_int_mode(struct bnxt *bp)
+{
+ int rc;
+
+ if (bp->flags & BNXT_FLAG_USING_MSIX)
+ bnxt_setup_msix(bp);
+ else
+ bnxt_setup_inta(bp);
+
+ rc = bnxt_set_real_num_queues(bp);
+ return rc;
+}
+
+static unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
+{
+ if (BNXT_PF(bp))
+ return bp->pf.max_irqs;
+#if defined(CONFIG_BNXT_SRIOV)
+ return bp->vf.max_irqs;
+#endif
+}
+
void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max_irqs)
{
if (BNXT_PF(bp))
@@ -4753,16 +4827,12 @@ void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max_irqs)
#endif
}
-static int bnxt_setup_msix(struct bnxt *bp)
+static int bnxt_init_msix(struct bnxt *bp)
{
- struct msix_entry *msix_ent;
- struct net_device *dev = bp->dev;
int i, total_vecs, rc = 0, min = 1;
- const int len = sizeof(bp->irq_tbl[0].name);
-
- bp->flags &= ~BNXT_FLAG_USING_MSIX;
- total_vecs = bp->cp_nr_rings;
+ struct msix_entry *msix_ent;
+ total_vecs = bnxt_get_max_func_irqs(bp);
msix_ent = kcalloc(total_vecs, sizeof(struct msix_entry), GFP_KERNEL);
if (!msix_ent)
return -ENOMEM;
@@ -4783,8 +4853,10 @@ static int bnxt_setup_msix(struct bnxt *bp)
bp->irq_tbl = kcalloc(total_vecs, sizeof(struct bnxt_irq), GFP_KERNEL);
if (bp->irq_tbl) {
- int tcs;
+ for (i = 0; i < total_vecs; i++)
+ bp->irq_tbl[i].vector = msix_ent[i].vector;
+ bp->total_irqs = total_vecs;
/* Trim rings based upon num of vectors allocated */
rc = bnxt_trim_rings(bp, &bp->rx_nr_rings, &bp->tx_nr_rings,
total_vecs, min == 1);
@@ -4792,43 +4864,10 @@ static int bnxt_setup_msix(struct bnxt *bp)
goto msix_setup_exit;
bp->tx_nr_rings_per_tc = bp->tx_nr_rings;
- tcs = netdev_get_num_tc(dev);
- if (tcs > 1) {
- bp->tx_nr_rings_per_tc = bp->tx_nr_rings / tcs;
- if (bp->tx_nr_rings_per_tc == 0) {
- netdev_reset_tc(dev);
- bp->tx_nr_rings_per_tc = bp->tx_nr_rings;
- } else {
- int i, off, count;
+ bp->cp_nr_rings = (min == 1) ?
+ max_t(int, bp->tx_nr_rings, bp->rx_nr_rings) :
+ bp->tx_nr_rings + bp->rx_nr_rings;
- bp->tx_nr_rings = bp->tx_nr_rings_per_tc * tcs;
- for (i = 0; i < tcs; i++) {
- count = bp->tx_nr_rings_per_tc;
- off = i * count;
- netdev_set_tc_queue(dev, i, count, off);
- }
- }
- }
- bp->cp_nr_rings = total_vecs;
-
- for (i = 0; i < bp->cp_nr_rings; i++) {
- char *attr;
-
- bp->irq_tbl[i].vector = msix_ent[i].vector;
- if (bp->flags & BNXT_FLAG_SHARED_RINGS)
- attr = "TxRx";
- else if (i < bp->rx_nr_rings)
- attr = "rx";
- else
- attr = "tx";
-
- snprintf(bp->irq_tbl[i].name, len,
- "%s-%s-%d", dev->name, attr, i);
- bp->irq_tbl[i].handler = bnxt_msix;
- }
- rc = bnxt_set_real_num_queues(bp);
- if (rc)
- goto msix_setup_exit;
} else {
rc = -ENOMEM;
goto msix_setup_exit;
@@ -4838,52 +4877,54 @@ static int bnxt_setup_msix(struct bnxt *bp)
return 0;
msix_setup_exit:
- netdev_err(bp->dev, "bnxt_setup_msix err: %x\n", rc);
+ netdev_err(bp->dev, "bnxt_init_msix err: %x\n", rc);
+ kfree(bp->irq_tbl);
+ bp->irq_tbl = NULL;
pci_disable_msix(bp->pdev);
kfree(msix_ent);
return rc;
}
-static int bnxt_setup_inta(struct bnxt *bp)
+static int bnxt_init_inta(struct bnxt *bp)
{
- int rc;
- const int len = sizeof(bp->irq_tbl[0].name);
-
- if (netdev_get_num_tc(bp->dev))
- netdev_reset_tc(bp->dev);
-
bp->irq_tbl = kcalloc(1, sizeof(struct bnxt_irq), GFP_KERNEL);
- if (!bp->irq_tbl) {
- rc = -ENOMEM;
- return rc;
- }
+ if (!bp->irq_tbl)
+ return -ENOMEM;
+
+ bp->total_irqs = 1;
bp->rx_nr_rings = 1;
bp->tx_nr_rings = 1;
bp->cp_nr_rings = 1;
bp->tx_nr_rings_per_tc = bp->tx_nr_rings;
bp->flags |= BNXT_FLAG_SHARED_RINGS;
bp->irq_tbl[0].vector = bp->pdev->irq;
- snprintf(bp->irq_tbl[0].name, len,
- "%s-%s-%d", bp->dev->name, "TxRx", 0);
- bp->irq_tbl[0].handler = bnxt_inta;
- rc = bnxt_set_real_num_queues(bp);
- return rc;
+ return 0;
}
-static int bnxt_setup_int_mode(struct bnxt *bp)
+static int bnxt_init_int_mode(struct bnxt *bp)
{
int rc = 0;
if (bp->flags & BNXT_FLAG_MSIX_CAP)
- rc = bnxt_setup_msix(bp);
+ rc = bnxt_init_msix(bp);
if (!(bp->flags & BNXT_FLAG_USING_MSIX) && BNXT_PF(bp)) {
/* fallback to INTA */
- rc = bnxt_setup_inta(bp);
+ rc = bnxt_init_inta(bp);
}
return rc;
}
+static void bnxt_clear_int_mode(struct bnxt *bp)
+{
+ if (bp->flags & BNXT_FLAG_USING_MSIX)
+ pci_disable_msix(bp->pdev);
+
+ kfree(bp->irq_tbl);
+ bp->irq_tbl = NULL;
+ bp->flags &= ~BNXT_FLAG_USING_MSIX;
+}
+
static void bnxt_free_irq(struct bnxt *bp)
{
struct bnxt_irq *irq;
@@ -4902,10 +4943,6 @@ static void bnxt_free_irq(struct bnxt *bp)
free_irq(irq->vector, bp->bnapi[i]);
irq->requested = 0;
}
- if (bp->flags & BNXT_FLAG_USING_MSIX)
- pci_disable_msix(bp->pdev);
- kfree(bp->irq_tbl);
- bp->irq_tbl = NULL;
}
static int bnxt_request_irq(struct bnxt *bp)
@@ -6695,6 +6732,7 @@ static void bnxt_remove_one(struct pci_dev *pdev)
cancel_work_sync(&bp->sp_task);
bp->sp_event = 0;
+ bnxt_clear_int_mode(bp);
bnxt_hwrm_func_drv_unrgtr(bp);
bnxt_free_hwrm_resources(bp);
bnxt_dcb_free(bp);
@@ -6990,10 +7028,14 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if (rc)
goto init_err;
- rc = register_netdev(dev);
+ rc = bnxt_init_int_mode(bp);
if (rc)
goto init_err;
+ rc = register_netdev(dev);
+ if (rc)
+ goto init_err_clr_int;
+
netdev_info(dev, "%s found at mem %lx, node addr %pM\n",
board_info[ent->driver_data].name,
(long)pci_resource_start(pdev, 0), dev->dev_addr);
@@ -7002,6 +7044,9 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
return 0;
+init_err_clr_int:
+ bnxt_clear_int_mode(bp);
+
init_err:
pci_iounmap(pdev, bp->bar0);
pci_release_regions(pdev);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 8327d0d..1461355 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1024,6 +1024,7 @@ struct bnxt {
#define BNXT_STATE_FN_RST_DONE 2
struct bnxt_irq *irq_tbl;
+ int total_irqs;
u8 mac_addr[ETH_ALEN];
#ifdef CONFIG_BNXT_DCB
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH net-next 1/7] bnxt_en: Add bnxt_set_max_func_irqs().
From: Michael Chan @ 2016-12-06 17:09 UTC (permalink / raw)
To: davem; +Cc: netdev, selvin.xavier, somnath.kotur, dledford, linux-rdma
In-Reply-To: <1481044178-25193-1-git-send-email-michael.chan@broadcom.com>
By refactoring existing code into this new function. The new function
will be used in subsequent patches.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 17 +++++++++++------
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 +
2 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e8ab5fd..6cdfe3e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4743,6 +4743,16 @@ static int bnxt_trim_rings(struct bnxt *bp, int *rx, int *tx, int max,
return 0;
}
+void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max_irqs)
+{
+ if (BNXT_PF(bp))
+ bp->pf.max_irqs = max_irqs;
+#if defined(CONFIG_BNXT_SRIOV)
+ else
+ bp->vf.max_irqs = max_irqs;
+#endif
+}
+
static int bnxt_setup_msix(struct bnxt *bp)
{
struct msix_entry *msix_ent;
@@ -6949,12 +6959,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
bnxt_set_tpa_flags(bp);
bnxt_set_ring_params(bp);
- if (BNXT_PF(bp))
- bp->pf.max_irqs = max_irqs;
-#if defined(CONFIG_BNXT_SRIOV)
- else
- bp->vf.max_irqs = max_irqs;
-#endif
+ bnxt_set_max_func_irqs(bp, max_irqs);
bnxt_set_dflt_rings(bp);
/* Default RSS hash cfg. */
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index b4abc1b..8327d0d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1235,6 +1235,7 @@ static inline void bnxt_disable_poll(struct bnxt_napi *bnapi)
int hwrm_send_message_silent(struct bnxt *, void *, u32, int);
int bnxt_hwrm_set_coal(struct bnxt *);
int bnxt_hwrm_func_qcaps(struct bnxt *);
+void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max);
void bnxt_tx_disable(struct bnxt *bp);
void bnxt_tx_enable(struct bnxt *bp);
int bnxt_hwrm_set_pause(struct bnxt *);
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next 0/7] bnxt_en: Add interface to support RDMA driver.
From: Michael Chan @ 2016-12-06 17:09 UTC (permalink / raw)
To: davem; +Cc: netdev, selvin.xavier, somnath.kotur, dledford, linux-rdma
This series adds an interface to support a brand new RDMA driver bnxt_re.
The first step is to re-arrange some code so that pci_enable_msix() can
be called during pci probe. The purpose is to allow the RDMA driver to
initialize and stay initialized whether the netdev is up or down.
Then we make some changes to VF resource allocation so that there is
enough resources to support RDMA.
Finally the last patch adds a simple interface to allow the RDMA driver to
probe and register itself with any bnxt_en devices that support RDMA.
Once registered, the RDMA driver can request MSIX, send fw messages, and
receive some notifications.
David, please consider this series for net-next. Thanks.
Michael Chan (7):
bnxt_en: Add bnxt_set_max_func_irqs().
bnxt_en: Enable MSIX early in bnxt_init_one().
bnxt_en: Move function reset to bnxt_init_one().
bnxt_en: Improve completion ring allocation for VFs.
bnxt_en: Reserve RDMA resources by default.
bnxt_en: Refactor the driver registration function with firmware.
bnxt_en: Add interface to support RDMA driver.
drivers/net/ethernet/broadcom/bnxt/Makefile | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 360 +++++++++++++++++-------
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 22 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 14 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 346 +++++++++++++++++++++++
drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h | 93 ++++++
6 files changed, 722 insertions(+), 115 deletions(-)
create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
--
1.8.3.1
^ permalink raw reply
* Re: [PATCH] net: return value of skb_linearize should be handled in Linux kernel
From: Yuval Shaia @ 2016-12-06 16:57 UTC (permalink / raw)
To: Zhouyi Zhou
Cc: faisal.latif-ral2JQCrhuEAvxtiuMwx3w,
dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w,
jeffrey.t.kirsher-ral2JQCrhuEAvxtiuMwx3w,
QLogic-Storage-Upstream-h88ZbnxC6KDQT0dZR+AlfA,
jejb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
martin.petersen-QHcLZuEGTsvQT0dZR+AlfA,
jth-DgEjT+Ai2ygdnm+yROfE0A, jon.maloy-IzeFyvvaP7pWk0Htik3J/w,
ying.xue-CWA4WttNNZF54TAoqtyWWQ, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
intel-wired-lan-qjLDD68F18P21nG7glBr7A,
netdev-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA,
fcoe-devel-s9riP+hp16TNLxjTenLetw,
tipc-discussion-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
In-Reply-To: <1481008233-16777-1-git-send-email-zhouzhouyi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
On Tue, Dec 06, 2016 at 03:10:33PM +0800, Zhouyi Zhou wrote:
> kmalloc_reserve may fail to allocate memory inside skb_linearize,
> which means skb_linearize's return value should not be ignored.
> Following patch correct the uses of skb_linearize.
>
> Compiled in x86_64
FWIW compiled also on SPARC
Reviewed-by: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>
> Signed-off-by: Zhouyi Zhou <zhouzhouyi-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> drivers/infiniband/hw/nes/nes_nic.c | 5 +++--
> drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c | 6 +++++-
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 +--
> drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 7 +++++--
> drivers/scsi/fcoe/fcoe.c | 5 ++++-
> net/tipc/link.c | 3 ++-
> net/tipc/name_distr.c | 5 ++++-
> 7 files changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
> index 2b27d13..69372ea 100644
> --- a/drivers/infiniband/hw/nes/nes_nic.c
> +++ b/drivers/infiniband/hw/nes/nes_nic.c
> @@ -662,10 +662,11 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
> nesnic->sq_head &= nesnic->sq_size-1;
> }
> } else {
> - nesvnic->linearized_skbs++;
> hoffset = skb_transport_header(skb) - skb->data;
> nhoffset = skb_network_header(skb) - skb->data;
> - skb_linearize(skb);
> + if (skb_linearize(skb))
> + return NETDEV_TX_BUSY;
> + nesvnic->linearized_skbs++;
> skb_set_transport_header(skb, hoffset);
> skb_set_network_header(skb, nhoffset);
> if (!nes_nic_send(skb, netdev))
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
> index 2a653ec..ab787cb 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
> @@ -490,7 +490,11 @@ int ixgbe_fcoe_ddp(struct ixgbe_adapter *adapter,
> */
> if ((fh->fh_r_ctl == FC_RCTL_DD_SOL_DATA) &&
> (fctl & FC_FC_END_SEQ)) {
> - skb_linearize(skb);
> + int err = 0;
> +
> + err = skb_linearize(skb);
> + if (err)
> + return err;
> crc = (struct fcoe_crc_eof *)skb_put(skb, sizeof(*crc));
> crc->fcoe_eof = FC_EOF_T;
> }
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index fee1f29..4926d48 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -2173,8 +2173,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
> total_rx_bytes += ddp_bytes;
> total_rx_packets += DIV_ROUND_UP(ddp_bytes,
> mss);
> - }
> - if (!ddp_bytes) {
> + } else {
> dev_kfree_skb_any(skb);
> continue;
> }
> diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
> index f9ddb61..197d02e 100644
> --- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
> +++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
> @@ -542,8 +542,11 @@ static void bnx2fc_recv_frame(struct sk_buff *skb)
> return;
> }
>
> - if (skb_is_nonlinear(skb))
> - skb_linearize(skb);
> + if (skb_linearize(skb)) {
> + kfree_skb(skb);
> + return;
> + }
> +
> mac = eth_hdr(skb)->h_source;
> dest_mac = eth_hdr(skb)->h_dest;
>
> diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
> index 9bd41a3..f691b97 100644
> --- a/drivers/scsi/fcoe/fcoe.c
> +++ b/drivers/scsi/fcoe/fcoe.c
> @@ -1685,7 +1685,10 @@ static void fcoe_recv_frame(struct sk_buff *skb)
> skb->dev ? skb->dev->name : "<NULL>");
>
> port = lport_priv(lport);
> - skb_linearize(skb); /* check for skb_is_nonlinear is within skb_linearize */
> + if (skb_linearize(skb)) {
> + kfree_skb(skb);
> + return;
> + }
>
> /*
> * Frame length checks and setting up the header pointers
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index bda89bf..077c570 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1446,7 +1446,8 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
> if (tipc_own_addr(l->net) > msg_prevnode(hdr))
> l->net_plane = msg_net_plane(hdr);
>
> - skb_linearize(skb);
> + if (skb_linearize(skb))
> + goto exit;
> hdr = buf_msg(skb);
> data = msg_data(hdr);
>
> diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
> index c1cfd92..4e05d2a 100644
> --- a/net/tipc/name_distr.c
> +++ b/net/tipc/name_distr.c
> @@ -356,7 +356,10 @@ void tipc_named_rcv(struct net *net, struct sk_buff_head *inputq)
>
> spin_lock_bh(&tn->nametbl_lock);
> for (skb = skb_dequeue(inputq); skb; skb = skb_dequeue(inputq)) {
> - skb_linearize(skb);
> + if (skb_linearize(skb)) {
> + kfree_skb(skb);
> + continue;
> + }
> msg = buf_msg(skb);
> mtype = msg_type(msg);
> item = (struct distr_item *)msg_data(msg);
> --
> 1.9.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox