* Re: [PATCH v4 7/8] ib_isert: log the connection reject message
From: Sagi Grimberg @ 2016-10-26 8:30 UTC (permalink / raw)
To: Steve Wise, dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, hch-jcswGhMUV9g,
axboe-b10kYP2dOMg, santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <7baca6d0f38f9556d8b6a14a19b4c2d505503219.1477426743.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Acked-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 0/3] iopmem : A block device for PCIe memory
From: Haggai Eran @ 2016-10-26 8:24 UTC (permalink / raw)
To: Dan Williams, Stephen Bates
Cc: jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
sbates-Rgftl6RXld5BDgjK7y7TUQ, Raj, Ashok,
linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, David Woodhouse,
Jonathan Corbet,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
jim.macdonald-FgSLVYC75IpWk0Htik3J/w,
linux-block-u79uwXL29TY76Z2rM5mHXA, Linux MM, Jens Axboe,
Christoph Hellwig
In-Reply-To: <CAPcyv4gJ_c-6s2BUjsu6okR1EF53R+KNuXnOc5jv0fuwJaa3cQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 10/19/2016 6:51 AM, Dan Williams wrote:
> On Tue, Oct 18, 2016 at 2:42 PM, Stephen Bates <sbates-pv7U853sEMVWk0Htik3J/w@public.gmane.org> wrote:
>> 1. Address Translation. Suggestions have been made that in certain
>> architectures and topologies the dma_addr_t passed to the DMA master
>> in a peer-2-peer transfer will not correctly route to the IO memory
>> intended. However in our testing to date we have not seen this to be
>> an issue, even in systems with IOMMUs and PCIe switches. It is our
>> understanding that an IOMMU only maps system memory and would not
>> interfere with device memory regions.
I'm not sure that's the case. I think it works because with ZONE_DEVICE,
the iommu driver will simply treat a dma_map_page call as any other PFN,
and create a mapping as it does for any memory page.
>> (It certainly has no opportunity
>> to do so if the transfer gets routed through a switch).
It can still go through the IOMMU if you enable ACS upstream forwarding.
> There may still be platforms where peer-to-peer cycles are routed up
> through the root bridge and then back down to target device, but we
> can address that when / if it happens.
I agree.
> I wonder if we could (ab)use a
> software-defined 'pasid' as the requester id for a peer-to-peer
> mapping that needs address translation.
Why would you need that? Isn't it enough to map the peer-to-peer
addresses correctly in the iommu driver?
Haggai
^ permalink raw reply
* [PATCH v3 9/9] IB/mlx5: Simplify completion into a wait_event
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Convert the completion 'mlx5_ib_umr_context:done' to a wait_event as it
just waits for the return value to be filled.
Signed-off-by: Binoy Jayan <binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 +-
drivers/infiniband/hw/mlx5/mr.c | 7 ++++---
include/rdma/ib_verbs.h | 3 ++-
3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index dcdcd19..af682c5 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -524,7 +524,7 @@ struct mlx5_ib_mw {
struct mlx5_ib_umr_context {
struct ib_cqe cqe;
enum ib_wc_status status;
- struct completion done;
+ wait_queue_head_t wq;
};
struct umr_common {
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 19c292a..5228459 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -846,14 +846,14 @@ static void mlx5_ib_umr_done(struct ib_cq *cq, struct ib_wc *wc)
container_of(wc->wr_cqe, struct mlx5_ib_umr_context, cqe);
context->status = wc->status;
- complete(&context->done);
+ wake_up(&context->wq);
}
static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context *context)
{
context->cqe.done = mlx5_ib_umr_done;
context->status = -1;
- init_completion(&context->done);
+ init_waitqueue_head(&context->wq);
}
static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
@@ -866,13 +866,14 @@ static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
mlx5_ib_init_umr_context(&umr_context);
umrwr->wr.wr_cqe = &umr_context.cqe;
+ umr_context.status = IB_WC_STATUS_NONE;
down(&umrc->sem);
err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
if (err) {
mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
} else {
- wait_for_completion(&umr_context.done);
+ wait_event(umr_context.wq, umr_context.status != IB_WC_STATUS_NONE);
if (umr_context.status != IB_WC_SUCCESS) {
mlx5_ib_warn(dev, "reg umr failed (%u)\n",
umr_context.status);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5ad43a4..994d967 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -844,7 +844,8 @@ enum ib_wc_status {
IB_WC_INV_EEC_STATE_ERR,
IB_WC_FATAL_ERR,
IB_WC_RESP_TIMEOUT_ERR,
- IB_WC_GENERAL_ERR
+ IB_WC_GENERAL_ERR,
+ IB_WC_STATUS_NONE
};
const char *__attribute_const__ ib_wc_status_msg(enum ib_wc_status status);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v3 8/9] IB/mlx5: Add helper mlx5_ib_post_send_wait
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Clean up the following common code (to post a list of work requests to the
send queue of the specified QP) at various places and add a helper function
'mlx5_ib_post_send_wait' to implement the same.
- Initialize 'mlx5_ib_umr_context' on stack
- Assign "mlx5_umr_wr:wr:wr_cqe to umr_context.cqe
- Acquire the semaphore
- call ib_post_send with a single ib_send_wr
- wait_for_completion()
- Check for umr_context.status
- Release the semaphore
As semaphores are going away in the future, moving all of these into the
shared helper leaves only a single function using the semaphore, which
can then be rewritten to use something else.
Signed-off-by: Binoy Jayan <binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
drivers/infiniband/hw/mlx5/mr.c | 115 +++++++++++-----------------------------
1 file changed, 32 insertions(+), 83 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index d4ad672..19c292a 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -856,16 +856,40 @@ static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context *context)
init_completion(&context->done);
}
+static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
+ struct mlx5_umr_wr *umrwr)
+{
+ struct umr_common *umrc = &dev->umrc;
+ struct ib_send_wr *bad;
+ int err;
+ struct mlx5_ib_umr_context umr_context;
+
+ mlx5_ib_init_umr_context(&umr_context);
+ umrwr->wr.wr_cqe = &umr_context.cqe;
+
+ down(&umrc->sem);
+ err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
+ if (err) {
+ mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
+ } else {
+ wait_for_completion(&umr_context.done);
+ if (umr_context.status != IB_WC_SUCCESS) {
+ mlx5_ib_warn(dev, "reg umr failed (%u)\n",
+ umr_context.status);
+ err = -EFAULT;
+ }
+ }
+ up(&umrc->sem);
+ return err;
+}
+
static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
u64 virt_addr, u64 len, int npages,
int page_shift, int order, int access_flags)
{
struct mlx5_ib_dev *dev = to_mdev(pd->device);
struct device *ddev = dev->ib_dev.dma_device;
- struct umr_common *umrc = &dev->umrc;
- struct mlx5_ib_umr_context umr_context;
struct mlx5_umr_wr umrwr = {};
- struct ib_send_wr *bad;
struct mlx5_ib_mr *mr;
struct ib_sge sg;
int size;
@@ -894,24 +918,12 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
if (err)
goto free_mr;
- mlx5_ib_init_umr_context(&umr_context);
-
- umrwr.wr.wr_cqe = &umr_context.cqe;
prep_umr_reg_wqe(pd, &umrwr.wr, &sg, dma, npages, mr->mmkey.key,
page_shift, virt_addr, len, access_flags);
- down(&umrc->sem);
- err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
- if (err) {
- mlx5_ib_warn(dev, "post send failed, err %d\n", err);
+ err = mlx5_ib_post_send_wait(dev, &umrwr);
+ if (err != -EFAULT)
goto unmap_dma;
- } else {
- wait_for_completion(&umr_context.done);
- if (umr_context.status != IB_WC_SUCCESS) {
- mlx5_ib_warn(dev, "reg umr failed\n");
- err = -EFAULT;
- }
- }
mr->mmkey.iova = virt_addr;
mr->mmkey.size = len;
@@ -920,7 +932,6 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
mr->live = 1;
unmap_dma:
- up(&umrc->sem);
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
kfree(mr_pas);
@@ -940,13 +951,10 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
{
struct mlx5_ib_dev *dev = mr->dev;
struct device *ddev = dev->ib_dev.dma_device;
- struct umr_common *umrc = &dev->umrc;
- struct mlx5_ib_umr_context umr_context;
struct ib_umem *umem = mr->umem;
int size;
__be64 *pas;
dma_addr_t dma;
- struct ib_send_wr *bad;
struct mlx5_umr_wr wr;
struct ib_sge sg;
int err = 0;
@@ -1011,10 +1019,7 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
dma_sync_single_for_device(ddev, dma, size, DMA_TO_DEVICE);
- mlx5_ib_init_umr_context(&umr_context);
-
memset(&wr, 0, sizeof(wr));
- wr.wr.wr_cqe = &umr_context.cqe;
sg.addr = dma;
sg.length = ALIGN(npages * sizeof(u64),
@@ -1031,19 +1036,7 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
wr.mkey = mr->mmkey.key;
wr.target.offset = start_page_index;
- down(&umrc->sem);
- err = ib_post_send(umrc->qp, &wr.wr, &bad);
- if (err) {
- mlx5_ib_err(dev, "UMR post send failed, err %d\n", err);
- } else {
- wait_for_completion(&umr_context.done);
- if (umr_context.status != IB_WC_SUCCESS) {
- mlx5_ib_err(dev, "UMR completion failed, code %d\n",
- umr_context.status);
- err = -EFAULT;
- }
- }
- up(&umrc->sem);
+ err = mlx5_ib_post_send_wait(dev, &wr);
}
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
@@ -1210,39 +1203,14 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
{
struct mlx5_core_dev *mdev = dev->mdev;
- struct umr_common *umrc = &dev->umrc;
- struct mlx5_ib_umr_context umr_context;
struct mlx5_umr_wr umrwr = {};
- struct ib_send_wr *bad;
- int err;
if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
return 0;
- mlx5_ib_init_umr_context(&umr_context);
-
- umrwr.wr.wr_cqe = &umr_context.cqe;
prep_umr_unreg_wqe(dev, &umrwr.wr, mr->mmkey.key);
- down(&umrc->sem);
- err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
- if (err) {
- up(&umrc->sem);
- mlx5_ib_dbg(dev, "err %d\n", err);
- goto error;
- } else {
- wait_for_completion(&umr_context.done);
- up(&umrc->sem);
- }
- if (umr_context.status != IB_WC_SUCCESS) {
- mlx5_ib_warn(dev, "unreg umr failed\n");
- err = -EFAULT;
- goto error;
- }
- return 0;
-
-error:
- return err;
+ return mlx5_ib_post_send_wait(dev, &umrwr);
}
static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
@@ -1251,19 +1219,13 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
{
struct mlx5_ib_dev *dev = to_mdev(pd->device);
struct device *ddev = dev->ib_dev.dma_device;
- struct mlx5_ib_umr_context umr_context;
- struct ib_send_wr *bad;
struct mlx5_umr_wr umrwr = {};
struct ib_sge sg;
- struct umr_common *umrc = &dev->umrc;
dma_addr_t dma = 0;
__be64 *mr_pas = NULL;
int size;
int err;
- mlx5_ib_init_umr_context(&umr_context);
-
- umrwr.wr.wr_cqe = &umr_context.cqe;
umrwr.wr.send_flags = MLX5_IB_SEND_UMR_FAIL_IF_FREE;
if (flags & IB_MR_REREG_TRANS) {
@@ -1291,21 +1253,8 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
}
/* post send request to UMR QP */
- down(&umrc->sem);
- err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
+ err = mlx5_ib_post_send_wait(dev, &umrwr);
- if (err) {
- mlx5_ib_warn(dev, "post send failed, err %d\n", err);
- } else {
- wait_for_completion(&umr_context.done);
- if (umr_context.status != IB_WC_SUCCESS) {
- mlx5_ib_warn(dev, "reg umr failed (%u)\n",
- umr_context.status);
- err = -EFAULT;
- }
- }
-
- up(&umrc->sem);
if (flags & IB_MR_REREG_TRANS) {
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
kfree(mr_pas);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v3 7/9] IB/mthca: Replace counting semaphore event_sem with wait_event
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan@linaro.org>
Counting semaphores are going away in the future, so replace the semaphore
mthca_cmd::event_sem with a conditional wait_event.
Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
drivers/infiniband/hw/mthca/mthca_cmd.c | 45 ++++++++++++++++++++++-----------
drivers/infiniband/hw/mthca/mthca_dev.h | 3 ++-
2 files changed, 32 insertions(+), 16 deletions(-)
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index 49c6e19..fab5509 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -405,6 +405,32 @@ void mthca_cmd_event(struct mthca_dev *dev,
complete(&context->done);
}
+static inline struct mthca_cmd_context *
+mthca_try_get_context(struct mthca_cmd *cmd)
+{
+ struct mthca_cmd_context *context = NULL;
+ spin_lock(&cmd->context_lock);
+
+ if (cmd->free_head < 0)
+ goto out;
+
+ context = &cmd->context[cmd->free_head];
+ context->token += cmd->token_mask + 1;
+ cmd->free_head = context->next;
+out:
+ spin_unlock(&cmd->context_lock);
+ return context;
+}
+
+/* wait for and acquire a free context */
+static inline struct mthca_cmd_context *
+mthca_get_free_context(struct mthca_cmd *cmd)
+{
+ struct mthca_cmd_context *context;
+ wait_event(cmd->wq, (context = mthca_try_get_context(cmd)));
+ return context;
+}
+
static int mthca_cmd_wait(struct mthca_dev *dev,
u64 in_param,
u64 *out_param,
@@ -417,15 +443,7 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
int err = 0;
struct mthca_cmd_context *context;
- down(&dev->cmd.event_sem);
-
- spin_lock(&dev->cmd.context_lock);
- BUG_ON(dev->cmd.free_head < 0);
- context = &dev->cmd.context[dev->cmd.free_head];
- context->token += dev->cmd.token_mask + 1;
- dev->cmd.free_head = context->next;
- spin_unlock(&dev->cmd.context_lock);
-
+ context = mthca_get_free_context(&dev->cmd);
init_completion(&context->done);
err = mthca_cmd_post(dev, in_param,
@@ -458,8 +476,8 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
context->next = dev->cmd.free_head;
dev->cmd.free_head = context - dev->cmd.context;
spin_unlock(&dev->cmd.context_lock);
+ wake_up(&dev->cmd.wq);
- up(&dev->cmd.event_sem);
return err;
}
@@ -571,7 +589,7 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
dev->cmd.context[dev->cmd.max_cmds - 1].next = -1;
dev->cmd.free_head = 0;
- sema_init(&dev->cmd.event_sem, dev->cmd.max_cmds);
+ init_waitqueue_head(&dev->cmd.wq);
spin_lock_init(&dev->cmd.context_lock);
for (dev->cmd.token_mask = 1;
@@ -590,12 +608,9 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
*/
void mthca_cmd_use_polling(struct mthca_dev *dev)
{
- int i;
-
dev->cmd.flags &= ~MTHCA_CMD_USE_EVENTS;
- for (i = 0; i < dev->cmd.max_cmds; ++i)
- down(&dev->cmd.event_sem);
+ dev->cmd.free_head = -1;
kfree(dev->cmd.context);
}
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 87ab964..2fc86db 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -46,6 +46,7 @@
#include <linux/list.h>
#include <linux/semaphore.h>
+#include <rdma/ib_sa.h>
#include "mthca_provider.h"
#include "mthca_doorbell.h"
@@ -121,7 +122,7 @@ struct mthca_cmd {
struct pci_pool *pool;
struct mutex hcr_mutex;
struct mutex poll_mutex;
- struct semaphore event_sem;
+ wait_queue_head_t wq;
int max_cmds;
spinlock_t context_lock;
int free_head;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
^ permalink raw reply related
* [PATCH v3 6/9] IB/hns: Replace counting semaphore event_sem with wait_event
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan@linaro.org>
Counting semaphores are going away in the future, so replace the semaphore
mthca_cmd::event_sem with a conditional wait_event.
Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
drivers/infiniband/hw/hns/hns_roce_cmd.c | 44 ++++++++++++++++++++---------
drivers/infiniband/hw/hns/hns_roce_device.h | 2 +-
2 files changed, 31 insertions(+), 15 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 51a0675..e3025ff 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -189,6 +189,32 @@ void hns_roce_cmd_event(struct hns_roce_dev *hr_dev, u16 token, u8 status,
complete(&context->done);
}
+static inline struct hns_roce_cmd_context *
+hns_roce_try_get_context(struct hns_roce_cmdq *cmd)
+{
+ struct hns_roce_cmd_context *context = NULL;
+ spin_lock(&cmd->context_lock);
+
+ if (cmd->free_head < 0)
+ goto out;
+
+ context = &cmd->context[cmd->free_head];
+ context->token += cmd->token_mask + 1;
+ cmd->free_head = context->next;
+out:
+ spin_unlock(&cmd->context_lock);
+ return context;
+}
+
+/* wait for and acquire a free context */
+static inline struct hns_roce_cmd_context *
+hns_roce_get_free_context(struct hns_roce_cmdq *cmd)
+{
+ struct hns_roce_cmd_context *context;
+ wait_event(cmd->wq, (context = hns_roce_try_get_context(cmd)));
+ return context;
+}
+
/* this should be called with "use_events" */
static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
u64 out_param, unsigned long in_modifier,
@@ -200,13 +226,7 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
struct hns_roce_cmd_context *context;
int ret = 0;
- spin_lock(&cmd->context_lock);
- WARN_ON(cmd->free_head < 0);
- context = &cmd->context[cmd->free_head];
- context->token += cmd->token_mask + 1;
- cmd->free_head = context->next;
- spin_unlock(&cmd->context_lock);
-
+ context = hns_roce_get_free_context(cmd);
init_completion(&context->done);
ret = hns_roce_cmd_mbox_post_hw(hr_dev, in_param, out_param,
@@ -238,6 +258,7 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
context->next = cmd->free_head;
cmd->free_head = context - cmd->context;
spin_unlock(&cmd->context_lock);
+ wake_up(&cmd->wq);
return ret;
}
@@ -248,10 +269,8 @@ static int hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
{
int ret = 0;
- down(&hr_dev->cmd.event_sem);
ret = __hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
in_modifier, op_modifier, op, timeout);
- up(&hr_dev->cmd.event_sem);
return ret;
}
@@ -313,7 +332,7 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
hr_cmd->context[hr_cmd->max_cmds - 1].next = -1;
hr_cmd->free_head = 0;
- sema_init(&hr_cmd->event_sem, hr_cmd->max_cmds);
+ init_waitqueue_head(&hr_cmd->wq);
spin_lock_init(&hr_cmd->context_lock);
hr_cmd->token_mask = CMD_TOKEN_MASK;
@@ -325,12 +344,9 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
{
struct hns_roce_cmdq *hr_cmd = &hr_dev->cmd;
- int i;
hr_cmd->use_events = 0;
-
- for (i = 0; i < hr_cmd->max_cmds; ++i)
- down(&hr_cmd->event_sem);
+ hr_cmd->free_head = -1;
kfree(hr_cmd->context);
}
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 2afe075..ac95f52 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -364,7 +364,7 @@ struct hns_roce_cmdq {
* Event mode: cmd register mutex protection,
* ensure to not exceed max_cmds and user use limit region
*/
- struct semaphore event_sem;
+ wait_queue_head_t wq;
int max_cmds;
spinlock_t context_lock;
int free_head;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
^ permalink raw reply related
* [PATCH v3 5/9] IB/isert: Replace semaphore sem with completion
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
The semaphore 'sem' in isert_device is used as completion, so convert
it to struct completion. Semaphores are going away in the future.
Signed-off-by: Binoy Jayan <binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
drivers/infiniband/ulp/isert/ib_isert.c | 6 +++---
drivers/infiniband/ulp/isert/ib_isert.h | 3 ++-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 6dd43f6..de80f56 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -619,7 +619,7 @@
mutex_unlock(&isert_np->mutex);
isert_info("np %p: Allow accept_np to continue\n", isert_np);
- up(&isert_np->sem);
+ complete(&isert_np->comp);
}
static void
@@ -2311,7 +2311,7 @@ struct rdma_cm_id *
isert_err("Unable to allocate struct isert_np\n");
return -ENOMEM;
}
- sema_init(&isert_np->sem, 0);
+ init_completion(&isert_np->comp);
mutex_init(&isert_np->mutex);
INIT_LIST_HEAD(&isert_np->accepted);
INIT_LIST_HEAD(&isert_np->pending);
@@ -2427,7 +2427,7 @@ struct rdma_cm_id *
int ret;
accept_wait:
- ret = down_interruptible(&isert_np->sem);
+ ret = wait_for_completion_interruptible(&isert_np->comp);
if (ret)
return -ENODEV;
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index c02ada5..a1277c0 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -3,6 +3,7 @@
#include <linux/in6.h>
#include <rdma/ib_verbs.h>
#include <rdma/rdma_cm.h>
+#include <linux/completion.h>
#include <rdma/rw.h>
#include <scsi/iser.h>
@@ -190,7 +191,7 @@ struct isert_device {
struct isert_np {
struct iscsi_np *np;
- struct semaphore sem;
+ struct completion comp;
struct rdma_cm_id *cm_id;
struct mutex mutex;
struct list_head accepted;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v3 4/9] IB/mthca: Replace semaphore poll_sem with mutex
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan@linaro.org>
The semaphore 'poll_sem' is a simple mutex, so it should be written as one.
Semaphores are going away in the future. So replace the semaphore 'poll_sem'
with a mutex. Also, remove mutex_[un]lock from mthca_cmd_use_events and
mthca_cmd_use_polling respectively.
Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
drivers/infiniband/hw/mthca/mthca_cmd.c | 10 +++-------
drivers/infiniband/hw/mthca/mthca_cmd.h | 1 +
drivers/infiniband/hw/mthca/mthca_dev.h | 2 +-
3 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index c7f49bb..49c6e19 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -347,7 +347,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
unsigned long end;
u8 status;
- down(&dev->cmd.poll_sem);
+ mutex_lock(&dev->cmd.poll_mutex);
err = mthca_cmd_post(dev, in_param,
out_param ? *out_param : 0,
@@ -382,7 +382,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
}
out:
- up(&dev->cmd.poll_sem);
+ mutex_unlock(&dev->cmd.poll_mutex);
return err;
}
@@ -520,7 +520,7 @@ static int mthca_cmd_imm(struct mthca_dev *dev,
int mthca_cmd_init(struct mthca_dev *dev)
{
mutex_init(&dev->cmd.hcr_mutex);
- sema_init(&dev->cmd.poll_sem, 1);
+ mutex_init(&dev->cmd.poll_mutex);
dev->cmd.flags = 0;
dev->hcr = ioremap(pci_resource_start(dev->pdev, 0) + MTHCA_HCR_BASE,
@@ -582,8 +582,6 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
dev->cmd.flags |= MTHCA_CMD_USE_EVENTS;
- down(&dev->cmd.poll_sem);
-
return 0;
}
@@ -600,8 +598,6 @@ void mthca_cmd_use_polling(struct mthca_dev *dev)
down(&dev->cmd.event_sem);
kfree(dev->cmd.context);
-
- up(&dev->cmd.poll_sem);
}
struct mthca_mailbox *mthca_alloc_mailbox(struct mthca_dev *dev,
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.h b/drivers/infiniband/hw/mthca/mthca_cmd.h
index d2e5b19..a7f197e 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.h
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.h
@@ -35,6 +35,7 @@
#ifndef MTHCA_CMD_H
#define MTHCA_CMD_H
+#include <linux/mutex.h>
#include <rdma/ib_verbs.h>
#define MTHCA_MAILBOX_SIZE 4096
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 4393a02..87ab964 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -120,7 +120,7 @@ enum {
struct mthca_cmd {
struct pci_pool *pool;
struct mutex hcr_mutex;
- struct semaphore poll_sem;
+ struct mutex poll_mutex;
struct semaphore event_sem;
int max_cmds;
spinlock_t context_lock;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
^ permalink raw reply related
* [PATCH v3 3/9] IB/hns: Replace semaphore poll_sem with mutex
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan@linaro.org>
The semaphore 'poll_sem' is a simple mutex, so it should be written as one.
Semaphores are going away in the future. So replace the semaphore 'poll_sem'
with a mutex. Also, remove mutex_[un]lock from mthca_cmd_use_events and
mthca_cmd_use_polling respectively.
Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
drivers/infiniband/hw/hns/hns_roce_cmd.c | 11 ++++-------
drivers/infiniband/hw/hns/hns_roce_device.h | 3 ++-
2 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 2a0b6c0..51a0675 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -119,7 +119,7 @@ static int hns_roce_cmd_mbox_post_hw(struct hns_roce_dev *hr_dev, u64 in_param,
return ret;
}
-/* this should be called with "poll_sem" */
+/* this should be called with "poll_mutex" */
static int __hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
u64 out_param, unsigned long in_modifier,
u8 op_modifier, u16 op,
@@ -167,10 +167,10 @@ static int hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
{
int ret;
- down(&hr_dev->cmd.poll_sem);
+ mutex_lock(&hr_dev->cmd.poll_mutex);
ret = __hns_roce_cmd_mbox_poll(hr_dev, in_param, out_param, in_modifier,
op_modifier, op, timeout);
- up(&hr_dev->cmd.poll_sem);
+ mutex_unlock(&hr_dev->cmd.poll_mutex);
return ret;
}
@@ -275,7 +275,7 @@ int hns_roce_cmd_init(struct hns_roce_dev *hr_dev)
struct device *dev = &hr_dev->pdev->dev;
mutex_init(&hr_dev->cmd.hcr_mutex);
- sema_init(&hr_dev->cmd.poll_sem, 1);
+ mutex_init(&hr_dev->cmd.poll_mutex);
hr_dev->cmd.use_events = 0;
hr_dev->cmd.toggle = 1;
hr_dev->cmd.max_cmds = CMD_MAX_NUM;
@@ -319,8 +319,6 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
hr_cmd->token_mask = CMD_TOKEN_MASK;
hr_cmd->use_events = 1;
- down(&hr_cmd->poll_sem);
-
return 0;
}
@@ -335,7 +333,6 @@ void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
down(&hr_cmd->event_sem);
kfree(hr_cmd->context);
- up(&hr_cmd->poll_sem);
}
struct hns_roce_cmd_mailbox
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 3417315..2afe075 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -34,6 +34,7 @@
#define _HNS_ROCE_DEVICE_H
#include <rdma/ib_verbs.h>
+#include <linux/mutex.h>
#define DRV_NAME "hns_roce"
@@ -358,7 +359,7 @@ struct hns_roce_cmdq {
struct dma_pool *pool;
u8 __iomem *hcr;
struct mutex hcr_mutex;
- struct semaphore poll_sem;
+ struct mutex poll_mutex;
/*
* Event mode: cmd register mutex protection,
* ensure to not exceed max_cmds and user use limit region
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
^ permalink raw reply related
* [PATCH v3 2/9] IB/core: Replace semaphore sm_sem with an atomic wait
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma, linux-kernel, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan@linaro.org>
The semaphore 'sm_sem' is used for an exclusive ownership of the device
so model the same as an atomic variable with an associated wait_event.
Semaphores are going away in the future.
Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
drivers/infiniband/core/user_mad.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 415a318..6101c0a 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -67,6 +67,8 @@ enum {
IB_UMAD_MINOR_BASE = 0
};
+#define UMAD_F_CLAIM 0x01
+
/*
* Our lifetime rules for these structs are the following:
* device special file is opened, we take a reference on the
@@ -87,7 +89,8 @@ struct ib_umad_port {
struct cdev sm_cdev;
struct device *sm_dev;
- struct semaphore sm_sem;
+ wait_queue_head_t wq;
+ unsigned long flags;
struct mutex file_mutex;
struct list_head file_list;
@@ -1030,12 +1033,14 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
port = container_of(inode->i_cdev, struct ib_umad_port, sm_cdev);
if (filp->f_flags & O_NONBLOCK) {
- if (down_trylock(&port->sm_sem)) {
+ if (test_and_set_bit(UMAD_F_CLAIM, &port->flags)) {
ret = -EAGAIN;
goto fail;
}
} else {
- if (down_interruptible(&port->sm_sem)) {
+ if (wait_event_interruptible(port->wq,
+ !test_and_set_bit(UMAD_F_CLAIM,
+ &port->flags))) {
ret = -ERESTARTSYS;
goto fail;
}
@@ -1060,7 +1065,8 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
ib_modify_port(port->ib_dev, port->port_num, 0, &props);
err_up_sem:
- up(&port->sm_sem);
+ clear_bit(UMAD_F_CLAIM, &port->flags);
+ wake_up(&port->wq);
fail:
return ret;
@@ -1079,7 +1085,8 @@ static int ib_umad_sm_close(struct inode *inode, struct file *filp)
ret = ib_modify_port(port->ib_dev, port->port_num, 0, &props);
mutex_unlock(&port->file_mutex);
- up(&port->sm_sem);
+ clear_bit(UMAD_F_CLAIM, &port->flags);
+ wake_up(&port->wq);
kobject_put(&port->umad_dev->kobj);
@@ -1177,7 +1184,8 @@ static int ib_umad_init_port(struct ib_device *device, int port_num,
port->ib_dev = device;
port->port_num = port_num;
- sema_init(&port->sm_sem, 1);
+ init_waitqueue_head(&port->wq);
+ __clear_bit(UMAD_F_CLAIM, &port->flags);
mutex_init(&port->file_mutex);
INIT_LIST_HEAD(&port->file_list);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
^ permalink raw reply related
* [PATCH v3 1/9] IB/core: iwpm_nlmsg_request: Replace semaphore with completion
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Binoy Jayan
In-Reply-To: <1477468874-16328-1-git-send-email-binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Semaphore sem in iwpm_nlmsg_request is used as completion, so
convert it to a struct completion type. Semaphores are going
away in the future.
Signed-off-by: Binoy Jayan <binoy.jayan-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
drivers/infiniband/core/iwpm_msg.c | 8 ++++----
drivers/infiniband/core/iwpm_util.c | 7 +++----
drivers/infiniband/core/iwpm_util.h | 3 ++-
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/core/iwpm_msg.c b/drivers/infiniband/core/iwpm_msg.c
index 1c41b95..761358f 100644
--- a/drivers/infiniband/core/iwpm_msg.c
+++ b/drivers/infiniband/core/iwpm_msg.c
@@ -394,7 +394,7 @@ int iwpm_register_pid_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found nlmsg_request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_register_pid_cb);
@@ -463,7 +463,7 @@ int iwpm_add_mapping_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_add_mapping_cb);
@@ -555,7 +555,7 @@ int iwpm_add_and_query_mapping_cb(struct sk_buff *skb,
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_add_and_query_mapping_cb);
@@ -749,7 +749,7 @@ int iwpm_mapping_error_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_mapping_error_cb);
diff --git a/drivers/infiniband/core/iwpm_util.c b/drivers/infiniband/core/iwpm_util.c
index ade71e7..08ddd2e 100644
--- a/drivers/infiniband/core/iwpm_util.c
+++ b/drivers/infiniband/core/iwpm_util.c
@@ -323,8 +323,7 @@ struct iwpm_nlmsg_request *iwpm_get_nlmsg_request(__u32 nlmsg_seq,
nlmsg_request->nl_client = nl_client;
nlmsg_request->request_done = 0;
nlmsg_request->err_code = 0;
- sema_init(&nlmsg_request->sem, 1);
- down(&nlmsg_request->sem);
+ init_completion(&nlmsg_request->comp);
return nlmsg_request;
}
@@ -368,8 +367,8 @@ int iwpm_wait_complete_req(struct iwpm_nlmsg_request *nlmsg_request)
{
int ret;
- ret = down_timeout(&nlmsg_request->sem, IWPM_NL_TIMEOUT);
- if (ret) {
+ ret = wait_for_completion_timeout(&nlmsg_request->comp, IWPM_NL_TIMEOUT);
+ if (!ret) {
ret = -EINVAL;
pr_info("%s: Timeout %d sec for netlink request (seq = %u)\n",
__func__, (IWPM_NL_TIMEOUT/HZ), nlmsg_request->nlmsg_seq);
diff --git a/drivers/infiniband/core/iwpm_util.h b/drivers/infiniband/core/iwpm_util.h
index af1fc14..ea6c299 100644
--- a/drivers/infiniband/core/iwpm_util.h
+++ b/drivers/infiniband/core/iwpm_util.h
@@ -43,6 +43,7 @@
#include <linux/delay.h>
#include <linux/workqueue.h>
#include <linux/mutex.h>
+#include <linux/completion.h>
#include <linux/jhash.h>
#include <linux/kref.h>
#include <net/netlink.h>
@@ -69,7 +70,7 @@ struct iwpm_nlmsg_request {
u8 nl_client;
u8 request_done;
u16 err_code;
- struct semaphore sem;
+ struct completion comp;
struct kref kref;
};
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v3 0/9] infiniband: Remove semaphores
From: Binoy Jayan @ 2016-10-26 8:01 UTC (permalink / raw)
To: Doug Ledford, Sean Hefty, Hal Rosenstock
Cc: Arnd Bergmann, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Binoy Jayan
Hi,
These are a set of patches [v3] which removes semaphores from infiniband.
These are part of a bigger effort to eliminate all semaphores from the
linux kernel.
v2 -> v3:
IB/mlx5: Move '&umr_context' into helper fn
IB/mthca: Restructure mthca_cmd.c to manage free_head
IB/hns: Restructure hns_roce_cmd.c to manage free_head
IB/core: Convert completion to wait_event
IB/mlx5: Simplify completion into a wait_event
Thanks,
Binoy
Binoy Jayan (9):
IB/core: iwpm_nlmsg_request: Replace semaphore with completion
IB/core: Replace semaphore sm_sem with an atomic wait
IB/hns: Replace semaphore poll_sem with mutex
IB/mthca: Replace semaphore poll_sem with mutex
IB/isert: Replace semaphore sem with completion
IB/hns: Replace counting semaphore event_sem with wait_event
IB/mthca: Replace counting semaphore event_sem with wait_event
IB/mlx5: Add helper mlx5_ib_post_send_wait
IB/mlx5: Simplify completion into a wait_event
drivers/infiniband/core/iwpm_msg.c | 8 +-
drivers/infiniband/core/iwpm_util.c | 7 +-
drivers/infiniband/core/iwpm_util.h | 3 +-
drivers/infiniband/core/user_mad.c | 20 +++--
drivers/infiniband/hw/hns/hns_roce_cmd.c | 55 ++++++++-----
drivers/infiniband/hw/hns/hns_roce_device.h | 5 +-
drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 +-
drivers/infiniband/hw/mlx5/mr.c | 120 ++++++++--------------------
drivers/infiniband/hw/mthca/mthca_cmd.c | 55 ++++++++-----
drivers/infiniband/hw/mthca/mthca_cmd.h | 1 +
drivers/infiniband/hw/mthca/mthca_dev.h | 5 +-
drivers/infiniband/ulp/isert/ib_isert.c | 6 +-
drivers/infiniband/ulp/isert/ib_isert.h | 3 +-
include/rdma/ib_verbs.h | 3 +-
14 files changed, 140 insertions(+), 153 deletions(-)
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [For help] rdma-roce build quesiton
From: oulijun @ 2016-10-26 7:23 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Linuxarm, linux-rdma
Hi, Jason
I am building my userspace library code using cmake.
after i fix some lines, it is failed.
I directly called the min() from the ccan/minmax.h, the min()
as follows:
#if HAVE_BUILTIN_TYPES_COMPATIBLE_P
#define MINMAX_ASSERT_COMPATIBLE(a, b) \
BUILD_ASSERT(__builtin_types_compatible_p(a, b))
#else
#define MINMAX_ASSERT_COMPATIBLE(a, b) \
do { } while (0)
#endif
#define min(a, b) \
({ \
typeof(a) _a = (a); \
typeof(b) _b = (b); \
MINMAX_ASSERT_COMPATIBLE(typeof(_a), typeof(_b)); \
_a < _b ? _a : _b; \
})
and use #include <ccan/minmax.h> in my .c file where is used the min()
according to the modification, I use the cmd as follows:
CC=aarch64-linux-gnu-gcc cmake -GNinja -DENABLE_RESOLVE_NEIGH=0 -DHAVE_ARCH_ARM64=1 ..
the build is fail and the print log as follows:
error: size of unnamed array is negative
attr->cap.max_recv_wr = min(context->max_qp_wr, attr->cap.max_recv_wr);
Now, after fixed, the error is elimed. the fix as follows:
Fix the HAVE_BUILTIN_TYPES_COMPATIBLE_P for 0 in config.h.in or use the origin definition in my .h file
the origin defintion:
#define min(a, b) \
({ \
typeof(a) _a = (a); \
typeof(b) _b = (b); \
_a < _b ? _a : _b; \
})
but I think that the above modification is not a better approach
I try to change the Optimization Option(from -O0 to -O2) in aarch64-linux-gnu-gcc by finded some material.
CC=aarch64-linux-gnu-gcc cmake -O2 -GNinja -DENABLE_RESOLVE_NEIGH=0 -DHAVE_ARCH_ARM64=1 ..
But the error is not elimed.
Can you give me some guide?
thanks
Lijun ou
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] IB/mlx4: avoid a -Wmaybe-uninitialize warning
From: Yishai Hadas @ 2016-10-26 6:51 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Yishai Hadas, David S. Miller, Jack Morgenstein, Or Gerlitz,
Eran Ben Elisha, Moshe Shemesh, Christophe Jaillet, Moni Shoua,
netdev, linux-rdma, linux-kernel
In-Reply-To: <20161025161632.411899-1-arnd@arndb.de>
On 10/25/2016 7:16 PM, Arnd Bergmann wrote:
> There is an old warning about mlx4_SW2HW_EQ_wrapper on x86:
>
> ethernet/mellanox/mlx4/resource_tracker.c: In function ‘mlx4_SW2HW_EQ_wrapper’:
> ethernet/mellanox/mlx4/resource_tracker.c:3071:10: error: ‘eq’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>
> The problem here is that gcc won't track the state of the variable
> across a spin_unlock. Moving the assignment out of the lock is
> safe here and avoids the warning.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
> ---
> drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> index 84d7857ccc27..c548beaaf910 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> @@ -1605,13 +1605,14 @@ static int eq_res_start_move_to(struct mlx4_dev *dev, int slave, int index,
> r->com.from_state = r->com.state;
> r->com.to_state = state;
> r->com.state = RES_EQ_BUSY;
> - if (eq)
> - *eq = r;
> }
> }
>
> spin_unlock_irq(mlx4_tlock(dev));
>
> + if (!err && eq)
> + *eq = r;
> +
> return err;
> }
>
>
^ permalink raw reply
* [PATCH v2] Avoid possible hang on device removal
From: Mustafa Ismail @ 2016-10-25 23:35 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW,
leon-DgEjT+Ai2ygdnm+yROfE0A
When we get an RDMA_CM_EVENT_DEVICE_REMOVAL the cm_thread will
exit and because flush errors are ignored the cb->sem may not get signaled.
So just signal on device removal event.
v1 -> v2: Add Fixes tag
Fixes: 612eae1f6fe3 ("rping: ignore flushed completions")
Signed-off-by: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
librdmacm/examples/rping.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/librdmacm/examples/rping.c b/librdmacm/examples/rping.c
index ad38f6d..53c1525 100644
--- a/librdmacm/examples/rping.c
+++ b/librdmacm/examples/rping.c
@@ -224,6 +224,8 @@ static int rping_cma_event_handler(struct rdma_cm_id *cma_id,
case RDMA_CM_EVENT_DEVICE_REMOVAL:
fprintf(stderr, "cma detected device removal!!!!\n");
+ cb->state = ERROR;
+ sem_post(&cb->sem);
ret = -1;
break;
--
2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* Re: [PATCH] Avoid possible hang on device removal
From: Jason Gunthorpe @ 2016-10-25 22:43 UTC (permalink / raw)
To: Steve Wise
Cc: 'Mustafa Ismail', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb,
dledford-H+wXaHxf7aLQT0dZR+AlfA, leon-DgEjT+Ai2ygdnm+yROfE0A
In-Reply-To: <01ec01d22f02$53012630$f9037290$@opengridcomputing.com>
On Tue, Oct 25, 2016 at 03:56:57PM -0500, Steve Wise wrote:
> Reviewed-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
>
> Also, this fixes a previous commit, so you could add this tag:
>
> Fixes 612eae1f6fe3 ("rping: ignore flushed completions")
>
> However that commit id is from the previous librdmacm repo...not
> sure how useful it is?
It still exists in the new repo, I'd include the fixes line.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 0/3] iopmem : A block device for PCIe memory
From: Dave Chinner @ 2016-10-25 21:19 UTC (permalink / raw)
To: Stephen Bates
Cc: Christoph Hellwig, Dan Williams, linux-kernel@vger.kernel.org,
linux-nvdimm@lists.01.org, linux-rdma, linux-block, Linux MM,
Ross Zwisler, Matthew Wilcox, jgunthorpe, haggaie, Jens Axboe,
Jonathan Corbet, jim.macdonald, sbates, Logan Gunthorpe,
David Woodhouse, Raj, Ashok
In-Reply-To: <20161025115043.GA14986@cgy1-donard.priv.deltatee.com>
On Tue, Oct 25, 2016 at 05:50:43AM -0600, Stephen Bates wrote:
> Hi Dave and Christoph
>
> On Fri, Oct 21, 2016 at 10:12:53PM +1100, Dave Chinner wrote:
> > On Fri, Oct 21, 2016 at 02:57:14AM -0700, Christoph Hellwig wrote:
> > > On Fri, Oct 21, 2016 at 10:22:39AM +1100, Dave Chinner wrote:
> > > > You do realise that local filesystems can silently change the
> > > > location of file data at any point in time, so there is no such
> > > > thing as a "stable mapping" of file data to block device addresses
> > > > in userspace?
> > > >
> > > > If you want remote access to the blocks owned and controlled by a
> > > > filesystem, then you need to use a filesystem with a remote locking
> > > > mechanism to allow co-ordinated, coherent access to the data in
> > > > those blocks. Anything else is just asking for ongoing, unfixable
> > > > filesystem corruption or data leakage problems (i.e. security
> > > > issues).
> > >
>
> Dave are you saying that even for local mappings of files on a DAX
> capable system it is possible for the mappings to move on you unless
> the FS supports locking?
Yes.
> Does that not mean DAX on such FS is
> inherently broken?
No. DAX is accessed through a virtual mapping layer that abstracts
the physical location from userspace applications.
Example: think copy-on-write overwrites. It occurs atomically from
the perspective of userspace and starts by invalidating any current
mappings userspace has of that physical location. The location is
changes, the data copied in, and then when the locks are released
userspace can fault in a new page table mapping on the next
access....
> > > And at least for XFS we have such a mechanism :) E.g. I have a
> > > prototype of a pNFS layout that uses XFS+DAX to allow clients to do
> > > RDMA directly to XFS files, with the same locking mechanism we use
> > > for the current block and scsi layout in xfs_pnfs.c.
>
> Thanks for fixing this issue on XFS Christoph! I assume this problem
> continues to exist on the other DAX capable FS?
Yes, but it they implement the exportfs API that supplies this
capability, they'll be able to use pNFS, too.
> One more reason to consider a move to /dev/dax I guess ;-)...
That doesn't get rid of the need for sane access control arbitration
across all machines that are directly accessing the storage. That's
the problem pNFS solves, regardless of whether your direct access
target is a filesystem, a block device or object storage...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply
* RE: [PATCH] Avoid possible hang on device removal
From: Steve Wise @ 2016-10-25 20:56 UTC (permalink / raw)
To: 'Mustafa Ismail', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, leon-DgEjT+Ai2ygdnm+yROfE0A
In-Reply-To: <1477428628-11504-1-git-send-email-mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> When we get an RDMA_CM_EVENT_DEVICE_REMOVAL the cm_thread will
> exit and because flush errors are ignored the cb->sem may not get signaled.
> So just signal on device removal event.
>
> Signed-off-by: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Looks good.
Reviewed-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Also, this fixes a previous commit, so you could add this tag:
Fixes 612eae1f6fe3 ("rping: ignore flushed completions")
However that commit id is from the previous librdmacm repo...not sure how useful
it is?
Steve.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH] Avoid possible hang on device removal
From: Mustafa Ismail @ 2016-10-25 20:50 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW,
leon-DgEjT+Ai2ygdnm+yROfE0A
When we get an RDMA_CM_EVENT_DEVICE_REMOVAL the cm_thread will
exit and because flush errors are ignored the cb->sem may not get signaled.
So just signal on device removal event.
Signed-off-by: Mustafa Ismail <mustafa.ismail-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
librdmacm/examples/rping.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/librdmacm/examples/rping.c b/librdmacm/examples/rping.c
index ad38f6d..53c1525 100644
--- a/librdmacm/examples/rping.c
+++ b/librdmacm/examples/rping.c
@@ -224,6 +224,8 @@ static int rping_cma_event_handler(struct rdma_cm_id *cma_id,
case RDMA_CM_EVENT_DEVICE_REMOVAL:
fprintf(stderr, "cma detected device removal!!!!\n");
+ cb->state = ERROR;
+ sem_post(&cb->sem);
ret = -1;
break;
--
2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* mlx4 RoCE mode without OFED?
From: Robert LeBlanc @ 2016-10-25 20:28 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
I've been trying to get our ConnectX-3 card to run RoCE (Ethernet mode
is working) [0], but I can't pass the roce_mode to the mlx4_core
module even with 4.8.4. Do you have to use OFED to use RoCE with
ConnectX-3? According to the spec sheet, the card should support RoCE
[1]. We generally run 4.4.x kernel and so getting OFED to compile is
quite the challenge, so we are looking for an upstream solution. It
seems that commit 3afd8362fabd167bb04f79501f21dd67aa9cb99f added some
bits to add roce_mode to the module, but I don't see it in the module.
Linux localhost 4.8.4 #3 SMP Tue Oct 25 13:28:15 MDT 2016 x86_64
x86_64 x86_64 GNU/Linux
# mstflint -d 02:00.0 q
Image type: FS2
FW Version: 2.35.5100
Rom Info: type=PXE version=3.4.648 devid=4099
Device ID: 4099
Description: Node Port1 Port2 Sys image
GUIDs: 0cc47affff4fe9fc 0cc47affff4fe9fd 0cc47affff4fe9fe
0cc47affff4fe9ff
MACs: 0cc47a4fe9fd 0cc47a4fe9fe
VSD: n/a
PSID: SM_2221000001000
[Tue Oct 25 13:40:21 2016] mlx4_core: unknown parameter 'roce_mode' ignored
[Tue Oct 25 13:40:21 2016] mlx4_core: unknown parameter 'roce_mode' ignored
# modinfo mlx4_core
filename:
/lib/modules/4.8.4/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko
version: 2.2-1
license: Dual BSD/GPL
description: Mellanox ConnectX HCA low-level driver
author: Roland Dreier
srcversion: BB58E84E637E4E5EC69D04E
alias: pci:v000015B3d00001010sv*sd*bc*sc*i*
alias: pci:v000015B3d0000100Fsv*sd*bc*sc*i*
alias: pci:v000015B3d0000100Esv*sd*bc*sc*i*
alias: pci:v000015B3d0000100Dsv*sd*bc*sc*i*
alias: pci:v000015B3d0000100Csv*sd*bc*sc*i*
alias: pci:v000015B3d0000100Bsv*sd*bc*sc*i*
alias: pci:v000015B3d0000100Asv*sd*bc*sc*i*
alias: pci:v000015B3d00001009sv*sd*bc*sc*i*
alias: pci:v000015B3d00001008sv*sd*bc*sc*i*
alias: pci:v000015B3d00001007sv*sd*bc*sc*i*
alias: pci:v000015B3d00001006sv*sd*bc*sc*i*
alias: pci:v000015B3d00001005sv*sd*bc*sc*i*
alias: pci:v000015B3d00001004sv*sd*bc*sc*i*
alias: pci:v000015B3d00001003sv*sd*bc*sc*i*
alias: pci:v000015B3d00001002sv*sd*bc*sc*i*
alias: pci:v000015B3d0000676Esv*sd*bc*sc*i*
alias: pci:v000015B3d00006746sv*sd*bc*sc*i*
alias: pci:v000015B3d00006764sv*sd*bc*sc*i*
alias: pci:v000015B3d0000675Asv*sd*bc*sc*i*
alias: pci:v000015B3d00006372sv*sd*bc*sc*i*
alias: pci:v000015B3d00006750sv*sd*bc*sc*i*
alias: pci:v000015B3d00006368sv*sd*bc*sc*i*
alias: pci:v000015B3d0000673Csv*sd*bc*sc*i*
alias: pci:v000015B3d00006732sv*sd*bc*sc*i*
alias: pci:v000015B3d00006354sv*sd*bc*sc*i*
alias: pci:v000015B3d0000634Asv*sd*bc*sc*i*
alias: pci:v000015B3d00006340sv*sd*bc*sc*i*
depends:
intree: Y
vermagic: 4.8.4 SMP mod_unload
parm: debug_level:Enable debug tracing if > 0 (int)
parm: msi_x:attempt to use MSI-X if nonzero (int)
parm: num_vfs:enable #num_vfs functions if num_vfs > 0
num_vfs=port1,port2,port1+2 (array of byte)
parm: probe_vf:number of vfs to probe by pf driver (num_vfs > 0)
probe_vf=port1,port2,port1+2 (array of byte)
parm: log_num_mgm_entry_size:log mgm size, that defines the
num of qp per mcg, for example: 10 gives 248.range: 7 <=
log_num_mgm_entry_size <= 12. To activate device managed flow steering
when available, set to
-1 (int)
parm: enable_64b_cqe_eqe:Enable 64 byte CQEs/EQEs when the
FW supports this (default: True) (bool)
parm: enable_4k_uar:Enable using 4K UAR. Should not be
enabled if have VFs which do not support 4K UARs (default: false)
(bool)
parm: log_num_mac:Log2 max number of MACs per ETH port (1-7) (int)
parm: log_num_vlan:Log2 max number of VLANs per ETH port (0-7) (int)
parm: use_prio:Enable steering by VLAN priority on ETH ports
(deprecated) (bool)
parm: log_mtts_per_seg:Log2 number of MTT entries per
segment (1-7) (int)
parm: port_type_array:Array of port types: HW_DEFAULT (0) is
default 1 for IB, 2 for Ethernet (array of int)
parm: enable_qos:Enable Enhanced QoS support (default: on) (bool)
parm: internal_err_reset:Reset device on internal errors if
non-zero (default 1) (int)
Thanks,
Robert LeBlanc
[0] https://community.mellanox.com/docs/DOC-1444
[1] http://www.mellanox.com/related-docs/prod_adapter_cards/PB_ConnectX3_VPI_Card.pdf
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH v4 0/8] connect reject event helpers
From: Steve Wise @ 2016-10-25 20:19 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
While reviewing:
http://lists.infradead.org/pipermail/linux-nvme/2016-October/006681.html
I decided to propose transport-agnostic helper functions to better
handle connection reject event information. Included are patches
for nvme/fabrics, iser, and rds to utilize the new helpers.
Doug, this series applies on top of v4.9-rc2 as well as sagi's nvmf-4.9-rc
branch as of today. The series should be targeted for 4.10.
Changes since v3:
- added reviewed-by/acked-by tags
- positive logic in reject_msg helpers
- cleaned up reject message strings
- added patches to use helpers in isert and nvmet_rdma
- fixed zero-day build warning
Changes since v2:
- reworked ibcm/iwcm_reject_msg() as per Christoph's recommendation
- use ibcm_ and iwcm_ prefix instead of ib_ and iw_ for reject_msg funcs
- change rdma_consumer_reject() to rdma_is_consumer_reject()
- add rdma_consumer_reject_data() helper function to return private
data/len
- use new helpers in nvme_rdma, ib_iser, and rdma_rds
- in nvme_rdma, add strings for nvme_rdma_cm_status values
---
Steve Wise (8):
rdma_cm: add rdma_reject_msg() helper function
rdma_cm: add rdma_is_consumer_reject() helper function
rdma_cm: add rdma_consumer_reject_data helper function
nvme-rdma: use rdma connection reject helper functions
ib_iser: log the connection reject message
rds_rdma: log the connection reject message
ib_isert: log the connection reject message
nvmet_rdma: log the connection reject message
drivers/infiniband/core/cm.c | 48 ++++++++++++++++++++++++++++++++
drivers/infiniband/core/cma.c | 43 ++++++++++++++++++++++++++++
drivers/infiniband/core/iwcm.c | 21 ++++++++++++++
drivers/infiniband/ulp/iser/iser_verbs.c | 5 +++-
drivers/infiniband/ulp/isert/ib_isert.c | 2 ++
drivers/nvme/host/rdma.c | 46 ++++++++++++++++++++++++------
drivers/nvme/target/rdma.c | 3 ++
include/rdma/ib_cm.h | 6 ++++
include/rdma/iw_cm.h | 6 ++++
include/rdma/rdma_cm.h | 25 +++++++++++++++++
net/rds/rdma_transport.c | 5 +++-
11 files changed, 200 insertions(+), 10 deletions(-)
--
2.7.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH v2 2/8] IB/core: Replace semaphore sm_sem with completion
From: Arnd Bergmann @ 2016-10-25 20:15 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Binoy Jayan, Jack Wang, Doug Ledford, Sean Hefty, Hal Rosenstock,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Linux kernel mailing list
In-Reply-To: <20161025154822.GA27240-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
On Tuesday, October 25, 2016 9:48:22 AM CEST Jason Gunthorpe wrote:
>
> Using a completion to model exclusive ownership seems convoluted to
> me - is that a thing now? What about an atomic?
I also totally missed how this is used. I initially mentioned to Binoy
that almost all semaphores can either be converted to a mutex or a
completion unless they rely on the counting behavior of the semaphore.
This driver clearly falls into another category and none of the above
apply here.
> open:
>
> while (true) {
> wait_event_interruptible(priv->queue,test_bit(CLAIMED_BIT, &priv->flags));
> if (!test_and_set_bit(CLAIMED_BIT, &priv->flags))
> break;
> }
I'd fold the test_and_set_bit() into the wait_event() and get rid of the
loop here, otherwise this looks nice.
I'm generally a bit suspicious of open() functions that try to
protect you from having multiple file descriptors open, as dup()
or fork() will also result in extra file descriptors, but this
use here seems harmless, as the device itself only supports
open() and close() and the only thing it does is to keep track
of whether it is open or not.
It's also interesting that the ib_modify_port() function also
keeps track of the a flag that is set in open() and cleared
in close(), so in principle we should be able to unify this
with the semaphore, but the wait_event() approach is certainly
much simpler.
Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH 7/7] IB/hfi1: Remove incorrect IS_ERR check
From: Dennis Dalessandro @ 2016-10-25 20:12 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick, Dan Carpenter
In-Reply-To: <20161025194241.15765.8903.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
Remove IS_ERR check from caching code as the function being called does
not actually return error pointers.
Fixes: f19bd643dbde: "IB/hfi1: Prevent NULL pointer deferences in caching code"
Reported-by: Dan Carpenter <dan.carpenter-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/user_sdma.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index a761f80..77697d6 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -1144,7 +1144,7 @@ static int pin_vector_pages(struct user_sdma_request *req,
rb_node = hfi1_mmu_rb_extract(pq->handler,
(unsigned long)iovec->iov.iov_base,
iovec->iov.iov_len);
- if (rb_node && !IS_ERR(rb_node))
+ if (rb_node)
node = container_of(rb_node, struct sdma_mmu_node, rb);
else
rb_node = NULL;
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 6/7] IB/hfi1: Prevent hardware counter names from being cut off
From: Dennis Dalessandro @ 2016-10-25 20:12 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jianxin Xiong
In-Reply-To: <20161025194241.15765.8903.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
From: Jianxin Xiong <jianxin.xiong-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Increase the size of the buffer that is used to construct per-VL
and per-SDMA counter names.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Jianxin Xiong <jianxin.xiong-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/chip.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 3371361..8dee6c2 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -12098,7 +12098,7 @@ static void update_synth_timer(unsigned long opaque)
mod_timer(&dd->synth_stats_timer, jiffies + HZ * SYNTH_CNT_TIME);
}
-#define C_MAX_NAME 13 /* 12 chars + one for /0 */
+#define C_MAX_NAME 16 /* 15 chars + one for /0 */
static int init_cntrs(struct hfi1_devdata *dd)
{
int i, rcv_ctxts, j;
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH 5/7] IB/hfi1: Optimize pio_buf and send_context structs
From: Dennis Dalessandro @ 2016-10-25 20:12 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Mike Marciniszyn,
Sebastian Sanchez
In-Reply-To: <20161025194241.15765.8903.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
From: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Both pio_buf and send_context structs have oversized
fields and have cachelines that can be optimized.
Reduce oversized fields for both structs.
Make sure pio_buf struct fits within a cacheline.
Move read-only fields to their own cacheline in
send_context struct.
All of this will avoid cacheline trading as the ring
progresses and pio buffers/send contexts are used.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/hfi1/pio.c | 5 ++---
drivers/infiniband/hw/hfi1/pio.h | 29 +++++++++++++++--------------
drivers/infiniband/hw/hfi1/pio_copy.c | 22 +++++++++++-----------
3 files changed, 28 insertions(+), 28 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/pio.c b/drivers/infiniband/hw/hfi1/pio.c
index 00a0e9e..92701f1 100644
--- a/drivers/infiniband/hw/hfi1/pio.c
+++ b/drivers/infiniband/hw/hfi1/pio.c
@@ -758,6 +758,7 @@ struct send_context *sc_alloc(struct hfi1_devdata *dd, int type,
sc->hw_context = hw_context;
cr_group_addresses(sc, &dma);
sc->credits = sci->credits;
+ sc->size = sc->credits * PIO_BLOCK_SIZE;
/* PIO Send Memory Address details */
#define PIO_ADDR_CONTEXT_MASK 0xfful
@@ -1463,9 +1464,7 @@ retry:
/* finish filling in the buffer outside the lock */
pbuf->start = sc->base_addr + fill_wrap * PIO_BLOCK_SIZE;
- pbuf->size = sc->credits * PIO_BLOCK_SIZE;
- pbuf->end = sc->base_addr + pbuf->size;
- pbuf->block_count = blocks;
+ pbuf->end = sc->base_addr + sc->size;
pbuf->qw_written = 0;
pbuf->carry_bytes = 0;
pbuf->carry.val64 = 0;
diff --git a/drivers/infiniband/hw/hfi1/pio.h b/drivers/infiniband/hw/hfi1/pio.h
index 498b548..867e5ff 100644
--- a/drivers/infiniband/hw/hfi1/pio.h
+++ b/drivers/infiniband/hw/hfi1/pio.h
@@ -83,43 +83,43 @@ struct pio_buf {
void *arg; /* argument for cb */
void __iomem *start; /* buffer start address */
void __iomem *end; /* context end address */
- unsigned long size; /* context size, in bytes */
unsigned long sent_at; /* buffer is sent when <= free */
- u32 block_count; /* size of buffer, in blocks */
- u32 qw_written; /* QW written so far */
- u32 carry_bytes; /* number of valid bytes in carry */
union mix carry; /* pending unwritten bytes */
+ u16 qw_written; /* QW written so far */
+ u8 carry_bytes; /* number of valid bytes in carry */
};
/* cache line aligned pio buffer array */
union pio_shadow_ring {
struct pio_buf pbuf;
- u64 unused[16]; /* cache line spacer */
} ____cacheline_aligned;
/* per-NUMA send context */
struct send_context {
/* read-only after init */
struct hfi1_devdata *dd; /* device */
- void __iomem *base_addr; /* start of PIO memory */
union pio_shadow_ring *sr; /* shadow ring */
+ void __iomem *base_addr; /* start of PIO memory */
+ u32 __percpu *buffers_allocated;/* count of buffers allocated */
+ u32 size; /* context size, in bytes */
- struct work_struct halt_work; /* halted context work queue entry */
- unsigned long flags; /* flags */
int node; /* context home node */
- int type; /* context type */
- u32 sw_index; /* software index number */
- u32 hw_context; /* hardware context number */
- u32 credits; /* number of blocks in context */
u32 sr_size; /* size of the shadow ring */
- u32 group; /* credit return group */
+ u16 flags; /* flags */
+ u8 type; /* context type */
+ u8 sw_index; /* software index number */
+ u8 hw_context; /* hardware context number */
+ u8 group; /* credit return group */
+
/* allocator fields */
spinlock_t alloc_lock ____cacheline_aligned_in_smp;
u32 sr_head; /* shadow ring head */
unsigned long fill; /* official alloc count */
unsigned long alloc_free; /* copy of free (less cache thrash) */
- u32 __percpu *buffers_allocated;/* count of buffers allocated */
u32 fill_wrap; /* tracks fill within ring */
+ u32 credits; /* number of blocks in context */
+ /* adding a new field here would make it part of this cacheline */
+
/* releaser fields */
spinlock_t release_lock ____cacheline_aligned_in_smp;
u32 sr_tail; /* shadow ring tail */
@@ -131,6 +131,7 @@ struct send_context {
u32 credit_intr_count; /* count of credit intr users */
u64 credit_ctrl; /* cache for credit control */
wait_queue_head_t halt_wait; /* wait until kernel sees interrupt */
+ struct work_struct halt_work; /* halted context work queue entry */
};
/* send context flags */
diff --git a/drivers/infiniband/hw/hfi1/pio_copy.c b/drivers/infiniband/hw/hfi1/pio_copy.c
index aa77736..03024ce 100644
--- a/drivers/infiniband/hw/hfi1/pio_copy.c
+++ b/drivers/infiniband/hw/hfi1/pio_copy.c
@@ -129,8 +129,8 @@ void pio_copy(struct hfi1_devdata *dd, struct pio_buf *pbuf, u64 pbc,
dest += sizeof(u64);
}
- dest -= pbuf->size;
- dend -= pbuf->size;
+ dest -= pbuf->sc->size;
+ dend -= pbuf->sc->size;
}
/* write 8-byte non-SOP, non-wrap chunk data */
@@ -361,8 +361,8 @@ void seg_pio_copy_start(struct pio_buf *pbuf, u64 pbc,
dest += sizeof(u64);
}
- dest -= pbuf->size;
- dend -= pbuf->size;
+ dest -= pbuf->sc->size;
+ dend -= pbuf->sc->size;
}
/* write 8-byte non-SOP, non-wrap chunk data */
@@ -458,8 +458,8 @@ static void mid_copy_mix(struct pio_buf *pbuf, const void *from, size_t nbytes)
dest += sizeof(u64);
}
- dest -= pbuf->size;
- dend -= pbuf->size;
+ dest -= pbuf->sc->size;
+ dend -= pbuf->sc->size;
}
/* write 8-byte non-SOP, non-wrap chunk data */
@@ -492,7 +492,7 @@ static void mid_copy_mix(struct pio_buf *pbuf, const void *from, size_t nbytes)
*/
/* adjust if we have wrapped */
if (dest >= pbuf->end)
- dest -= pbuf->size;
+ dest -= pbuf->sc->size;
/* jump to the SOP range if within the first block */
else if (pbuf->qw_written < PIO_BLOCK_QWS)
dest += SOP_DISTANCE;
@@ -584,8 +584,8 @@ static void mid_copy_straight(struct pio_buf *pbuf,
dest += sizeof(u64);
}
- dest -= pbuf->size;
- dend -= pbuf->size;
+ dest -= pbuf->sc->size;
+ dend -= pbuf->sc->size;
}
/* write 8-byte non-SOP, non-wrap chunk data */
@@ -666,7 +666,7 @@ void seg_pio_copy_mid(struct pio_buf *pbuf, const void *from, size_t nbytes)
*/
/* adjust if we've wrapped */
if (dest >= pbuf->end)
- dest -= pbuf->size;
+ dest -= pbuf->sc->size;
/* jump to SOP range if within the first block */
else if (pbuf->qw_written < PIO_BLOCK_QWS)
dest += SOP_DISTANCE;
@@ -719,7 +719,7 @@ void seg_pio_copy_end(struct pio_buf *pbuf)
*/
/* adjust if we have wrapped */
if (dest >= pbuf->end)
- dest -= pbuf->size;
+ dest -= pbuf->sc->size;
/* jump to the SOP range if within the first block */
else if (pbuf->qw_written < PIO_BLOCK_QWS)
dest += SOP_DISTANCE;
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox