* RE: Configuration of cq->cqe is lower than entries by 1
From: Amrani, Ram @ 2016-11-14 12:05 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <20161114120153.GC4240-2ukJVAZIZ/Y@public.gmane.org>
> There is addition of 1 in mlx4_ib_create_cq():
> 192 entries = roundup_pow_of_two(entries + 1);
> 193 cq->ibcq.cqe = entries - 1;
I thought something else might hide there.
Thanks,
Ram
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: Configuration of cq->cqe is lower than entries by 1
From: Leon Romanovsky @ 2016-11-14 12:01 UTC (permalink / raw)
To: Amrani, Ram; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <SN1PR07MB22076C5DE03F1939603C554CF8BC0-mikhvbZlbf8TSoR2DauN2+FPX92sqiQdvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 1058 bytes --]
On Mon, Nov 14, 2016 at 11:07:53AM +0000, Amrani, Ram wrote:
> Hi Leon, All,
> While inspecting MLX code as well as other vendors' I see that the actual number of cq->cqe is configured to be less by 1 than 'entries'. Why is that?
There is addition of 1 in mlx4_ib_create_cq():
192 entries = roundup_pow_of_two(entries + 1);
193 cq->ibcq.cqe = entries - 1;
The same goes for mlx4_alloc_resize_buf, just earlier in the stack.
>
> e.g.
> struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
> const struct ib_cq_init_attr *attr,
> struct ib_ucontext *context,
> struct ib_udata *udata)
> {
> ...
> cq->ibcq.cqe = entries - 1;
> ...
> }
>
>
> static int mlx4_alloc_resize_buf(struct mlx4_ib_dev *dev, struct mlx4_ib_cq *cq,
> int entries)
> {
> ...
> cq->resize_buf->cqe = entries - 1; // this is later copied to cq->ibcq.cqe
> ...
> }
>
> Thanks,
> Ram
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
* Re: [PATCH v6 12/16] IB/pvrdma: Add Queue Pair support
From: Yuval Shaia @ 2016-11-14 11:34 UTC (permalink / raw)
To: Adit Ranadive
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
pv-drivers-pghWNbHTmq7QT0dZR+AlfA, jhansen-pghWNbHTmq7QT0dZR+AlfA,
asarwade-pghWNbHTmq7QT0dZR+AlfA,
georgezhang-pghWNbHTmq7QT0dZR+AlfA,
bryantan-pghWNbHTmq7QT0dZR+AlfA
In-Reply-To: <6a643e92376856394d45638d80a90619d3abac37.1475458407.git.aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
On Sun, Oct 02, 2016 at 07:10:32PM -0700, Adit Ranadive wrote:
> This patch adds the ability to create, modify, query and destroy QPs. The
> PVRDMA device supports RC, UD and GSI QPs.
>
> Reviewed-by: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> Reviewed-by: Jorgen Hansen <jhansen-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
> Reviewed-by: George Zhang <georgezhang-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
> Reviewed-by: Aditya Sarwade <asarwade-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
> Reviewed-by: Bryan Tan <bryantan-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
> Signed-off-by: Adit Ranadive <aditr-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
> ---
> Changes v5->v6:
> - Removed a duplicate include of the ABI header.
>
> Changes v4->v5:
> - Updated include for headers in UAPI folder.
> - Update to pvrdma_cmd_post for creating/destroying/querying/modifying QPs.
> - Use the pvrdma_sge struct when posting WRs/allocating QP memory.
> - Removed two set but unused variables.
>
> Changes v3->v4:
> - Removed an unnecessary switch case.
> - Unified the returns in pvrdma_create_qp to use one exit point.
> - Renamed pvrdma_flush_cqe to _pvrdma_flush_cqe since we need a lock to
> be held when calling this.
> - Updated to use wrapper for UAR write for QP.
> - Updated conversion function to func_name(dst, src) format.
> - Renamed max_gs to max_sg.
> - Renamed cap variable to req_cap in pvrdma_set_sq/rq_size.
> - Changed dev_warn to dev_warn_ratelimited in pvrdma_post_send/recv.
> - Added nesting locking for flushing CQs when destroying/resetting a QP.
> - Added missing ret value.
>
> Changes v2->v3:
> - Removed boolean in pvrdma_cmd_post.
> ---
> drivers/infiniband/hw/pvrdma/pvrdma_qp.c | 972 +++++++++++++++++++++++++++++++
> 1 file changed, 972 insertions(+)
> create mode 100644 drivers/infiniband/hw/pvrdma/pvrdma_qp.c
>
> diff --git a/drivers/infiniband/hw/pvrdma/pvrdma_qp.c b/drivers/infiniband/hw/pvrdma/pvrdma_qp.c
> new file mode 100644
> index 0000000..c8c01e5
> --- /dev/null
> +++ b/drivers/infiniband/hw/pvrdma/pvrdma_qp.c
> @@ -0,0 +1,972 @@
> +/*
> + * Copyright (c) 2012-2016 VMware, Inc. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of EITHER the GNU General Public License
> + * version 2 as published by the Free Software Foundation or the BSD
> + * 2-Clause License. This program is distributed in the hope that it
> + * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED
> + * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
> + * See the GNU General Public License version 2 for more details at
> + * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program available in the file COPYING in the main
> + * directory of this source tree.
> + *
> + * The BSD 2-Clause License
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * - Redistributions of source code must retain the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer.
> + *
> + * - Redistributions in binary form must reproduce the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer in the documentation and/or other materials
> + * provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
> + * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> + * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
> + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
> + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
> + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
> + * OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <asm/page.h>
> +#include <linux/io.h>
> +#include <linux/wait.h>
> +#include <rdma/ib_addr.h>
> +#include <rdma/ib_smi.h>
> +#include <rdma/ib_user_verbs.h>
> +
> +#include "pvrdma.h"
> +
> +static inline void get_cqs(struct pvrdma_qp *qp, struct pvrdma_cq **send_cq,
> + struct pvrdma_cq **recv_cq)
> +{
> + *send_cq = to_vcq(qp->ibqp.send_cq);
> + *recv_cq = to_vcq(qp->ibqp.recv_cq);
> +}
> +
> +static void pvrdma_lock_cqs(struct pvrdma_cq *scq, struct pvrdma_cq *rcq,
> + unsigned long *scq_flags,
> + unsigned long *rcq_flags)
> + __acquires(scq->cq_lock) __acquires(rcq->cq_lock)
> +{
> + if (scq == rcq) {
> + spin_lock_irqsave(&scq->cq_lock, *scq_flags);
> + __acquire(rcq->cq_lock);
> + } else if (scq->cq_handle < rcq->cq_handle) {
> + spin_lock_irqsave(&scq->cq_lock, *scq_flags);
> + spin_lock_irqsave_nested(&rcq->cq_lock, *rcq_flags,
> + SINGLE_DEPTH_NESTING);
> + } else {
> + spin_lock_irqsave(&rcq->cq_lock, *rcq_flags);
> + spin_lock_irqsave_nested(&scq->cq_lock, *scq_flags,
> + SINGLE_DEPTH_NESTING);
> + }
> +}
> +
> +static void pvrdma_unlock_cqs(struct pvrdma_cq *scq, struct pvrdma_cq *rcq,
> + unsigned long *scq_flags,
> + unsigned long *rcq_flags)
> + __releases(scq->cq_lock) __releases(rcq->cq_lock)
> +{
> + if (scq == rcq) {
> + __release(rcq->cq_lock);
> + spin_unlock_irqrestore(&scq->cq_lock, *scq_flags);
> + } else if (scq->cq_handle < rcq->cq_handle) {
> + spin_unlock_irqrestore(&rcq->cq_lock, *rcq_flags);
> + spin_unlock_irqrestore(&scq->cq_lock, *scq_flags);
> + } else {
> + spin_unlock_irqrestore(&scq->cq_lock, *scq_flags);
> + spin_unlock_irqrestore(&rcq->cq_lock, *rcq_flags);
> + }
> +}
> +
> +static void pvrdma_reset_qp(struct pvrdma_qp *qp)
> +{
> + struct pvrdma_cq *scq, *rcq;
> + unsigned long scq_flags, rcq_flags;
> +
> + /* Clean up cqes */
> + get_cqs(qp, &scq, &rcq);
> + pvrdma_lock_cqs(scq, rcq, &scq_flags, &rcq_flags);
> +
> + _pvrdma_flush_cqe(qp, scq);
> + if (scq != rcq)
> + _pvrdma_flush_cqe(qp, rcq);
> +
> + pvrdma_unlock_cqs(scq, rcq, &scq_flags, &rcq_flags);
> +
> + /*
> + * Reset queuepair. The checks are because usermode queuepairs won't
> + * have kernel ringstates.
> + */
> + if (qp->rq.ring) {
> + atomic_set(&qp->rq.ring->cons_head, 0);
> + atomic_set(&qp->rq.ring->prod_tail, 0);
> + }
> + if (qp->sq.ring) {
> + atomic_set(&qp->sq.ring->cons_head, 0);
> + atomic_set(&qp->sq.ring->prod_tail, 0);
> + }
> +}
> +
> +static int pvrdma_set_rq_size(struct pvrdma_dev *dev,
> + struct ib_qp_cap *req_cap,
> + struct pvrdma_qp *qp)
> +{
> + if (req_cap->max_recv_wr > dev->dsr->caps.max_qp_wr ||
> + req_cap->max_recv_sge > dev->dsr->caps.max_sge) {
> + dev_warn(&dev->pdev->dev, "recv queue size invalid\n");
> + return -EINVAL;
> + }
> +
> + qp->rq.wqe_cnt = roundup_pow_of_two(max(1U, req_cap->max_recv_wr));
> + qp->rq.max_sg = roundup_pow_of_two(max(1U, req_cap->max_recv_sge));
> +
> + /* Write back */
> + req_cap->max_recv_wr = qp->rq.wqe_cnt;
> + req_cap->max_recv_sge = qp->rq.max_sg;
> +
> + qp->rq.wqe_size = roundup_pow_of_two(sizeof(struct pvrdma_rq_wqe_hdr) +
> + sizeof(struct pvrdma_sge) *
> + qp->rq.max_sg);
> + qp->npages_recv = (qp->rq.wqe_cnt * qp->rq.wqe_size + PAGE_SIZE - 1) /
> + PAGE_SIZE;
> +
> + return 0;
> +}
> +
> +static int pvrdma_set_sq_size(struct pvrdma_dev *dev, struct ib_qp_cap *req_cap,
> + enum ib_qp_type type, struct pvrdma_qp *qp)
> +{
> + if (req_cap->max_send_wr > dev->dsr->caps.max_qp_wr ||
> + req_cap->max_send_sge > dev->dsr->caps.max_sge) {
> + dev_warn(&dev->pdev->dev, "send queue size invalid\n");
> + return -EINVAL;
> + }
> +
> + qp->sq.wqe_cnt = roundup_pow_of_two(max(1U, req_cap->max_send_wr));
> + qp->sq.max_sg = roundup_pow_of_two(max(1U, req_cap->max_send_sge));
> +
> + /* Write back */
> + req_cap->max_send_wr = qp->sq.wqe_cnt;
> + req_cap->max_send_sge = qp->sq.max_sg;
> +
> + qp->sq.wqe_size = roundup_pow_of_two(sizeof(struct pvrdma_sq_wqe_hdr) +
> + sizeof(struct pvrdma_sge) *
> + qp->sq.max_sg);
> + /* Note: one extra page for the header. */
> + qp->npages_send = 1 + (qp->sq.wqe_cnt * qp->sq.wqe_size +
> + PAGE_SIZE - 1) / PAGE_SIZE;
> +
> + return 0;
> +}
> +
> +/**
> + * pvrdma_create_qp - create queue pair
> + * @pd: protection domain
> + * @init_attr: queue pair attributes
> + * @udata: user data
> + *
> + * @return: the ib_qp pointer on success, otherwise returns an errno.
> + */
> +struct ib_qp *pvrdma_create_qp(struct ib_pd *pd,
> + struct ib_qp_init_attr *init_attr,
> + struct ib_udata *udata)
> +{
> + struct pvrdma_qp *qp = NULL;
> + struct pvrdma_dev *dev = to_vdev(pd->device);
> + union pvrdma_cmd_req req;
> + union pvrdma_cmd_resp rsp;
> + struct pvrdma_cmd_create_qp *cmd = &req.create_qp;
> + struct pvrdma_cmd_create_qp_resp *resp = &rsp.create_qp_resp;
> + struct pvrdma_create_qp ucmd;
> + unsigned long flags;
> + int ret;
> +
> + if (init_attr->create_flags) {
> + dev_warn(&dev->pdev->dev,
> + "invalid create queuepair flags %#x\n",
> + init_attr->create_flags);
> + return ERR_PTR(-EINVAL);
> + }
> +
> + if (init_attr->qp_type != IB_QPT_RC &&
> + init_attr->qp_type != IB_QPT_UD &&
> + init_attr->qp_type != IB_QPT_GSI) {
> + dev_warn(&dev->pdev->dev, "queuepair type %d not supported\n",
> + init_attr->qp_type);
> + return ERR_PTR(-EINVAL);
> + }
> +
> + if (!atomic_add_unless(&dev->num_qps, 1, dev->dsr->caps.max_qp))
> + return ERR_PTR(-ENOMEM);
> +
> + switch (init_attr->qp_type) {
> + case IB_QPT_GSI:
> + if (init_attr->port_num == 0 ||
> + init_attr->port_num > pd->device->phys_port_cnt ||
> + udata) {
> + dev_warn(&dev->pdev->dev, "invalid queuepair attrs\n");
> + ret = -EINVAL;
> + goto err_qp;
> + }
> + /* fall through */
> + case IB_QPT_RC:
> + case IB_QPT_UD:
> + qp = kzalloc(sizeof(*qp), GFP_KERNEL);
> + if (!qp) {
> + ret = -ENOMEM;
> + goto err_qp;
> + }
> +
> + spin_lock_init(&qp->sq.lock);
> + spin_lock_init(&qp->rq.lock);
> + mutex_init(&qp->mutex);
> + atomic_set(&qp->refcnt, 1);
> + init_waitqueue_head(&qp->wait);
> +
> + qp->state = IB_QPS_RESET;
> +
> + if (pd->uobject && udata) {
> + dev_dbg(&dev->pdev->dev,
> + "create queuepair from user space\n");
> +
> + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
> + ret = -EFAULT;
> + goto err_qp;
> + }
> +
> + /* set qp->sq.wqe_cnt, shift, buf_size.. */
> + qp->rumem = ib_umem_get(pd->uobject->context,
> + ucmd.rbuf_addr,
> + ucmd.rbuf_size, 0, 0);
> + if (IS_ERR(qp->rumem)) {
> + ret = PTR_ERR(qp->rumem);
> + goto err_qp;
> + }
> +
> + qp->sumem = ib_umem_get(pd->uobject->context,
> + ucmd.sbuf_addr,
> + ucmd.sbuf_size, 0, 0);
> + if (IS_ERR(qp->sumem)) {
> + ib_umem_release(qp->rumem);
> + ret = PTR_ERR(qp->sumem);
> + goto err_qp;
> + }
> +
> + qp->npages_send = ib_umem_page_count(qp->sumem);
> + qp->npages_recv = ib_umem_page_count(qp->rumem);
> + qp->npages = qp->npages_send + qp->npages_recv;
> + } else {
> + qp->is_kernel = true;
> +
> + ret = pvrdma_set_sq_size(to_vdev(pd->device),
> + &init_attr->cap,
> + init_attr->qp_type, qp);
> + if (ret)
> + goto err_qp;
> +
> + ret = pvrdma_set_rq_size(to_vdev(pd->device),
> + &init_attr->cap, qp);
> + if (ret)
> + goto err_qp;
> +
> + qp->npages = qp->npages_send + qp->npages_recv;
> +
> + /* Skip header page. */
> + qp->sq.offset = PAGE_SIZE;
> +
> + /* Recv queue pages are after send pages. */
> + qp->rq.offset = qp->npages_send * PAGE_SIZE;
> + }
> +
> + if (qp->npages < 0 || qp->npages > PVRDMA_PAGE_DIR_MAX_PAGES) {
> + dev_warn(&dev->pdev->dev,
> + "overflow pages in queuepair\n");
> + ret = -EINVAL;
> + goto err_umem;
> + }
> +
> + ret = pvrdma_page_dir_init(dev, &qp->pdir, qp->npages,
> + qp->is_kernel);
> + if (ret) {
> + dev_warn(&dev->pdev->dev,
> + "could not allocate page directory\n");
> + goto err_umem;
> + }
> +
> + if (!qp->is_kernel) {
> + pvrdma_page_dir_insert_umem(&qp->pdir, qp->sumem, 0);
> + pvrdma_page_dir_insert_umem(&qp->pdir, qp->rumem,
> + qp->npages_send);
> + } else {
> + /* Ring state is always the first page. */
> + qp->sq.ring = qp->pdir.pages[0];
> + qp->rq.ring = &qp->sq.ring[1];
> + }
> + break;
> + default:
> + ret = -EINVAL;
> + goto err_qp;
> + }
> +
> + /* Not supported */
> + init_attr->cap.max_inline_data = 0;
> +
> + memset(cmd, 0, sizeof(*cmd));
> + cmd->hdr.cmd = PVRDMA_CMD_CREATE_QP;
> + cmd->pd_handle = to_vpd(pd)->pd_handle;
> + cmd->send_cq_handle = to_vcq(init_attr->send_cq)->cq_handle;
> + cmd->recv_cq_handle = to_vcq(init_attr->recv_cq)->cq_handle;
> + cmd->max_send_wr = init_attr->cap.max_send_wr;
> + cmd->max_recv_wr = init_attr->cap.max_recv_wr;
> + cmd->max_send_sge = init_attr->cap.max_send_sge;
> + cmd->max_recv_sge = init_attr->cap.max_recv_sge;
> + cmd->max_inline_data = init_attr->cap.max_inline_data;
> + cmd->sq_sig_all = (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) ? 1 : 0;
> + cmd->qp_type = ib_qp_type_to_pvrdma(init_attr->qp_type);
> + cmd->access_flags = IB_ACCESS_LOCAL_WRITE;
> + cmd->total_chunks = qp->npages;
> + cmd->send_chunks = qp->npages_send - 1;
> + cmd->pdir_dma = qp->pdir.dir_dma;
> +
> + dev_dbg(&dev->pdev->dev, "create queuepair with %d, %d, %d, %d\n",
> + cmd->max_send_wr, cmd->max_recv_wr, cmd->max_send_sge,
> + cmd->max_recv_sge);
> +
> + ret = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_CREATE_QP_RESP);
> + if (ret < 0) {
> + dev_warn(&dev->pdev->dev,
> + "could not create queuepair, error: %d\n", ret);
> + goto err_pdir;
> + }
> +
> + /* max_send_wr/_recv_wr/_send_sge/_recv_sge/_inline_data */
> + qp->qp_handle = resp->qpn;
> + qp->port = init_attr->port_num;
> + qp->ibqp.qp_num = resp->qpn;
> + spin_lock_irqsave(&dev->qp_tbl_lock, flags);
> + dev->qp_tbl[qp->qp_handle % dev->dsr->caps.max_qp] = qp;
> + spin_unlock_irqrestore(&dev->qp_tbl_lock, flags);
> +
> + return &qp->ibqp;
> +
> +err_pdir:
> + pvrdma_page_dir_cleanup(dev, &qp->pdir);
> +err_umem:
> + if (pd->uobject && udata) {
> + if (qp->rumem)
> + ib_umem_release(qp->rumem);
> + if (qp->sumem)
> + ib_umem_release(qp->sumem);
> + }
> +err_qp:
> + kfree(qp);
> + atomic_dec(&dev->num_qps);
> +
> + return ERR_PTR(ret);
> +}
> +
> +static void pvrdma_free_qp(struct pvrdma_qp *qp)
> +{
> + struct pvrdma_dev *dev = to_vdev(qp->ibqp.device);
> + struct pvrdma_cq *scq;
> + struct pvrdma_cq *rcq;
> + unsigned long flags, scq_flags, rcq_flags;
> +
> + /* In case cq is polling */
> + get_cqs(qp, &scq, &rcq);
> + pvrdma_lock_cqs(scq, rcq, &scq_flags, &rcq_flags);
> +
> + _pvrdma_flush_cqe(qp, scq);
> + if (scq != rcq)
> + _pvrdma_flush_cqe(qp, rcq);
> +
> + spin_lock_irqsave(&dev->qp_tbl_lock, flags);
> + dev->qp_tbl[qp->qp_handle] = NULL;
> + spin_unlock_irqrestore(&dev->qp_tbl_lock, flags);
> +
> + pvrdma_unlock_cqs(scq, rcq, &scq_flags, &rcq_flags);
> +
> + atomic_dec(&qp->refcnt);
> + wait_event(qp->wait, !atomic_read(&qp->refcnt));
> +
> + pvrdma_page_dir_cleanup(dev, &qp->pdir);
> +
> + kfree(qp);
> +
> + atomic_dec(&dev->num_qps);
> +}
> +
> +/**
> + * pvrdma_destroy_qp - destroy a queue pair
> + * @qp: the queue pair to destroy
> + *
> + * @return: 0 on success.
> + */
> +int pvrdma_destroy_qp(struct ib_qp *qp)
> +{
> + struct pvrdma_qp *vqp = to_vqp(qp);
> + union pvrdma_cmd_req req;
> + struct pvrdma_cmd_destroy_qp *cmd = &req.destroy_qp;
> + int ret;
> +
> + memset(cmd, 0, sizeof(*cmd));
> + cmd->hdr.cmd = PVRDMA_CMD_DESTROY_QP;
> + cmd->qp_handle = vqp->qp_handle;
> +
> + ret = pvrdma_cmd_post(to_vdev(qp->device), &req, NULL, 0);
> + if (ret < 0)
> + dev_warn(&to_vdev(qp->device)->pdev->dev,
> + "destroy queuepair failed, error: %d\n", ret);
> +
> + pvrdma_free_qp(vqp);
> +
> + return 0;
> +}
> +
> +/**
> + * pvrdma_modify_qp - modify queue pair attributes
> + * @ibqp: the queue pair
> + * @attr: the new queue pair's attributes
> + * @attr_mask: attributes mask
> + * @udata: user data
> + *
> + * @returns 0 on success, otherwise returns an errno.
> + */
> +int pvrdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
> + int attr_mask, struct ib_udata *udata)
> +{
> + struct pvrdma_dev *dev = to_vdev(ibqp->device);
> + struct pvrdma_qp *qp = to_vqp(ibqp);
> + union pvrdma_cmd_req req;
> + union pvrdma_cmd_resp rsp;
> + struct pvrdma_cmd_modify_qp *cmd = &req.modify_qp;
> + int cur_state, next_state;
> + int ret;
> +
> + /* Sanity checking. Should need lock here */
> + mutex_lock(&qp->mutex);
> + cur_state = (attr_mask & IB_QP_CUR_STATE) ? attr->cur_qp_state :
> + qp->state;
> + next_state = (attr_mask & IB_QP_STATE) ? attr->qp_state : cur_state;
> +
> + if (!ib_modify_qp_is_ok(cur_state, next_state, ibqp->qp_type,
> + attr_mask, IB_LINK_LAYER_ETHERNET)) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + if (attr_mask & IB_QP_PORT) {
> + if (attr->port_num == 0 ||
> + attr->port_num > ibqp->device->phys_port_cnt) {
> + ret = -EINVAL;
> + goto out;
> + }
> + }
> +
> + if (attr_mask & IB_QP_MIN_RNR_TIMER) {
> + if (attr->min_rnr_timer > 31) {
> + ret = -EINVAL;
> + goto out;
> + }
> + }
> +
> + if (attr_mask & IB_QP_PKEY_INDEX) {
> + if (attr->pkey_index >= dev->dsr->caps.max_pkeys) {
> + ret = -EINVAL;
> + goto out;
> + }
> + }
> +
> + if (attr_mask & IB_QP_QKEY)
> + qp->qkey = attr->qkey;
> +
> + if (cur_state == next_state && cur_state == IB_QPS_RESET) {
> + ret = 0;
> + goto out;
> + }
> +
> + qp->state = next_state;
> + memset(cmd, 0, sizeof(*cmd));
> + cmd->hdr.cmd = PVRDMA_CMD_MODIFY_QP;
> + cmd->qp_handle = qp->qp_handle;
> + cmd->attr_mask = ib_qp_attr_mask_to_pvrdma(attr_mask);
> + cmd->attrs.qp_state = ib_qp_state_to_pvrdma(attr->qp_state);
> + cmd->attrs.cur_qp_state =
> + ib_qp_state_to_pvrdma(attr->cur_qp_state);
> + cmd->attrs.path_mtu = ib_mtu_to_pvrdma(attr->path_mtu);
> + cmd->attrs.path_mig_state =
> + ib_mig_state_to_pvrdma(attr->path_mig_state);
> + cmd->attrs.qkey = attr->qkey;
> + cmd->attrs.rq_psn = attr->rq_psn;
> + cmd->attrs.sq_psn = attr->sq_psn;
> + cmd->attrs.dest_qp_num = attr->dest_qp_num;
> + cmd->attrs.qp_access_flags =
> + ib_access_flags_to_pvrdma(attr->qp_access_flags);
> + cmd->attrs.pkey_index = attr->pkey_index;
> + cmd->attrs.alt_pkey_index = attr->alt_pkey_index;
> + cmd->attrs.en_sqd_async_notify = attr->en_sqd_async_notify;
> + cmd->attrs.sq_draining = attr->sq_draining;
> + cmd->attrs.max_rd_atomic = attr->max_rd_atomic;
> + cmd->attrs.max_dest_rd_atomic = attr->max_dest_rd_atomic;
> + cmd->attrs.min_rnr_timer = attr->min_rnr_timer;
> + cmd->attrs.port_num = attr->port_num;
> + cmd->attrs.timeout = attr->timeout;
> + cmd->attrs.retry_cnt = attr->retry_cnt;
> + cmd->attrs.rnr_retry = attr->rnr_retry;
> + cmd->attrs.alt_port_num = attr->alt_port_num;
> + cmd->attrs.alt_timeout = attr->alt_timeout;
> + ib_qp_cap_to_pvrdma(&cmd->attrs.cap, &attr->cap);
> + ib_ah_attr_to_pvrdma(&cmd->attrs.ah_attr, &attr->ah_attr);
> + ib_ah_attr_to_pvrdma(&cmd->attrs.alt_ah_attr, &attr->alt_ah_attr);
> +
> + ret = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_MODIFY_QP_RESP);
> + if (ret < 0) {
> + dev_warn(&dev->pdev->dev,
> + "could not modify queuepair, error: %d\n", ret);
> + } else if (rsp.hdr.err > 0) {
> + dev_warn(&dev->pdev->dev,
> + "cannot modify queuepair, error: %d\n", rsp.hdr.err);
> + ret = -EINVAL;
> + }
> +
> + if (ret == 0 && next_state == IB_QPS_RESET)
> + pvrdma_reset_qp(qp);
> +
> +out:
> + mutex_unlock(&qp->mutex);
> +
> + return ret;
> +}
> +
> +static inline void *get_sq_wqe(struct pvrdma_qp *qp, int n)
> +{
> + return pvrdma_page_dir_get_ptr(&qp->pdir,
> + qp->sq.offset + n * qp->sq.wqe_size);
> +}
> +
> +static inline void *get_rq_wqe(struct pvrdma_qp *qp, int n)
> +{
> + return pvrdma_page_dir_get_ptr(&qp->pdir,
> + qp->rq.offset + n * qp->rq.wqe_size);
> +}
> +
> +static int set_reg_seg(struct pvrdma_sq_wqe_hdr *wqe_hdr, struct ib_reg_wr *wr)
> +{
> + struct pvrdma_user_mr *mr = to_vmr(wr->mr);
> +
> + wqe_hdr->wr.fast_reg.iova_start = mr->ibmr.iova;
> + wqe_hdr->wr.fast_reg.pl_pdir_dma = mr->pdir.dir_dma;
> + wqe_hdr->wr.fast_reg.page_shift = mr->page_shift;
> + wqe_hdr->wr.fast_reg.page_list_len = mr->npages;
> + wqe_hdr->wr.fast_reg.length = mr->ibmr.length;
> + wqe_hdr->wr.fast_reg.access_flags = wr->access;
> + wqe_hdr->wr.fast_reg.rkey = wr->key;
> +
> + return pvrdma_page_dir_insert_page_list(&mr->pdir, mr->pages,
> + mr->npages);
> +}
> +
> +/**
> + * pvrdma_post_send - post send work request entries on a QP
> + * @ibqp: the QP
> + * @wr: work request list to post
> + * @bad_wr: the first bad WR returned
> + *
> + * @return: 0 on success, otherwise errno returned.
> + */
> +int pvrdma_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
> + struct ib_send_wr **bad_wr)
> +{
> + struct pvrdma_qp *qp = to_vqp(ibqp);
> + struct pvrdma_dev *dev = to_vdev(ibqp->device);
> + unsigned long flags;
> + struct pvrdma_sq_wqe_hdr *wqe_hdr;
> + struct pvrdma_sge *sge;
> + int i, index;
> + int nreq;
> + int ret;
> +
> + /*
> + * In states lower than RTS, we can fail immediately. In other states,
> + * just post and let the device figure it out.
> + */
> + if (qp->state < IB_QPS_RTS) {
> + *bad_wr = wr;
> + return -EINVAL;
> + }
> +
> + spin_lock_irqsave(&qp->sq.lock, flags);
> +
> + index = pvrdma_idx(&qp->sq.ring->prod_tail, qp->sq.wqe_cnt);
Not sure if it was discussed so posting "just in case".
I believe it is unlikely that index will go out of range but since
pvrdma_idx might return PVRDMA_INVALID_IDX i suggest to add a check here.
Something like:
if (unlikely(index == PVRDMA_INVALID_IDX)) {
*bad_wr = wr;
return -EFAULT;
}
Same goes with pvrdma_post_recv
> + for (nreq = 0; wr; nreq++, wr = wr->next) {
> + unsigned int tail;
> +
> + if (unlikely(!pvrdma_idx_ring_has_space(
> + qp->sq.ring, qp->sq.wqe_cnt, &tail))) {
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "send queue is full\n");
> + *bad_wr = wr;
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + if (unlikely(wr->num_sge > qp->sq.max_sg || wr->num_sge < 0)) {
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "send SGE overflow\n");
> + *bad_wr = wr;
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + if (unlikely(wr->opcode < 0)) {
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "invalid send opcode\n");
> + *bad_wr = wr;
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + /*
> + * Only support UD, RC.
> + * Need to check opcode table for thorough checking.
> + * opcode _UD _UC _RC
> + * _SEND x x x
> + * _SEND_WITH_IMM x x x
> + * _RDMA_WRITE x x
> + * _RDMA_WRITE_WITH_IMM x x
> + * _LOCAL_INV x x
> + * _SEND_WITH_INV x x
> + * _RDMA_READ x
> + * _ATOMIC_CMP_AND_SWP x
> + * _ATOMIC_FETCH_AND_ADD x
> + * _MASK_ATOMIC_CMP_AND_SWP x
> + * _MASK_ATOMIC_FETCH_AND_ADD x
> + * _REG_MR x
> + *
> + */
> + if (qp->ibqp.qp_type != IB_QPT_UD &&
> + qp->ibqp.qp_type != IB_QPT_RC &&
> + wr->opcode != IB_WR_SEND) {
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "unsupported queuepair type\n");
> + *bad_wr = wr;
> + ret = -EINVAL;
> + goto out;
> + } else if (qp->ibqp.qp_type == IB_QPT_UD ||
> + qp->ibqp.qp_type == IB_QPT_GSI) {
> + if (wr->opcode != IB_WR_SEND &&
> + wr->opcode != IB_WR_SEND_WITH_IMM) {
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "invalid send opcode\n");
> + *bad_wr = wr;
> + ret = -EINVAL;
> + goto out;
> + }
> + }
> +
> + wqe_hdr = (struct pvrdma_sq_wqe_hdr *)get_sq_wqe(qp, index);
> + memset(wqe_hdr, 0, sizeof(*wqe_hdr));
> + wqe_hdr->wr_id = wr->wr_id;
> + wqe_hdr->num_sge = wr->num_sge;
> + wqe_hdr->opcode = ib_wr_opcode_to_pvrdma(wr->opcode);
> + wqe_hdr->send_flags = ib_send_flags_to_pvrdma(wr->send_flags);
> + if (wr->opcode == IB_WR_SEND_WITH_IMM ||
> + wr->opcode == IB_WR_RDMA_WRITE_WITH_IMM)
> + wqe_hdr->ex.imm_data = wr->ex.imm_data;
> +
> + switch (qp->ibqp.qp_type) {
> + case IB_QPT_GSI:
> + case IB_QPT_UD:
> + if (unlikely(!ud_wr(wr)->ah)) {
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "invalid address handle\n");
> + *bad_wr = wr;
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + /*
> + * Use qkey from qp context if high order bit set,
> + * otherwise from work request.
> + */
> + wqe_hdr->wr.ud.remote_qpn = ud_wr(wr)->remote_qpn;
> + wqe_hdr->wr.ud.remote_qkey =
> + ud_wr(wr)->remote_qkey & 0x80000000 ?
> + qp->qkey : ud_wr(wr)->remote_qkey;
> + wqe_hdr->wr.ud.av = to_vah(ud_wr(wr)->ah)->av;
> +
> + break;
> + case IB_QPT_RC:
> + switch (wr->opcode) {
> + case IB_WR_RDMA_READ:
> + case IB_WR_RDMA_WRITE:
> + case IB_WR_RDMA_WRITE_WITH_IMM:
> + wqe_hdr->wr.rdma.remote_addr =
> + rdma_wr(wr)->remote_addr;
> + wqe_hdr->wr.rdma.rkey = rdma_wr(wr)->rkey;
> + break;
> + case IB_WR_LOCAL_INV:
> + case IB_WR_SEND_WITH_INV:
> + wqe_hdr->ex.invalidate_rkey =
> + wr->ex.invalidate_rkey;
> + break;
> + case IB_WR_ATOMIC_CMP_AND_SWP:
> + case IB_WR_ATOMIC_FETCH_AND_ADD:
> + wqe_hdr->wr.atomic.remote_addr =
> + atomic_wr(wr)->remote_addr;
> + wqe_hdr->wr.atomic.rkey = atomic_wr(wr)->rkey;
> + wqe_hdr->wr.atomic.compare_add =
> + atomic_wr(wr)->compare_add;
> + if (wr->opcode == IB_WR_ATOMIC_CMP_AND_SWP)
> + wqe_hdr->wr.atomic.swap =
> + atomic_wr(wr)->swap;
> + break;
> + case IB_WR_REG_MR:
> + ret = set_reg_seg(wqe_hdr, reg_wr(wr));
> + if (ret < 0) {
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "Failed to set fast register work request\n");
> + *bad_wr = wr;
> + goto out;
> + }
> + break;
> + default:
> + break;
> + }
> +
> + break;
> + default:
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "invalid queuepair type\n");
> + ret = -EINVAL;
> + *bad_wr = wr;
> + goto out;
> + }
> +
> + sge = (struct pvrdma_sge *)(wqe_hdr + 1);
> + for (i = 0; i < wr->num_sge; i++) {
> + /* Need to check wqe_size 0 or max size */
> + sge->addr = wr->sg_list[i].addr;
> + sge->length = wr->sg_list[i].length;
> + sge->lkey = wr->sg_list[i].lkey;
> + sge++;
> + }
> +
> + /* Make sure wqe is written before index update */
> + smp_wmb();
> +
> + index++;
> + if (unlikely(index >= qp->sq.wqe_cnt))
> + index = 0;
> + /* Update shared sq ring */
> + pvrdma_idx_ring_inc(&qp->sq.ring->prod_tail,
> + qp->sq.wqe_cnt);
> + }
> +
> + ret = 0;
> +
> +out:
> + spin_unlock_irqrestore(&qp->sq.lock, flags);
> +
> + if (!ret)
> + pvrdma_write_uar_qp(dev, PVRDMA_UAR_QP_SEND | qp->qp_handle);
> +
> + return ret;
> +}
> +
> +/**
> + * pvrdma_post_receive - post receive work request entries on a QP
> + * @ibqp: the QP
> + * @wr: the work request list to post
> + * @bad_wr: the first bad WR returned
> + *
> + * @return: 0 on success, otherwise errno returned.
> + */
> +int pvrdma_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr,
> + struct ib_recv_wr **bad_wr)
> +{
> + struct pvrdma_dev *dev = to_vdev(ibqp->device);
> + unsigned long flags;
> + struct pvrdma_qp *qp = to_vqp(ibqp);
> + struct pvrdma_rq_wqe_hdr *wqe_hdr;
> + struct pvrdma_sge *sge;
> + int index, nreq;
> + int ret = 0;
> + int i;
> +
> + /*
> + * In the RESET state, we can fail immediately. For other states,
> + * just post and let the device figure it out.
> + */
> + if (qp->state == IB_QPS_RESET) {
> + *bad_wr = wr;
> + return -EINVAL;
> + }
> +
> + spin_lock_irqsave(&qp->rq.lock, flags);
> +
> + index = pvrdma_idx(&qp->rq.ring->prod_tail, qp->rq.wqe_cnt);
> + for (nreq = 0; wr; nreq++, wr = wr->next) {
> + unsigned int tail;
> +
> + if (unlikely(wr->num_sge > qp->rq.max_sg ||
> + wr->num_sge < 0)) {
> + ret = -EINVAL;
> + *bad_wr = wr;
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "recv SGE overflow\n");
> + goto out;
> + }
> +
> + if (unlikely(!pvrdma_idx_ring_has_space(
> + qp->rq.ring, qp->rq.wqe_cnt, &tail))) {
> + ret = -ENOMEM;
> + *bad_wr = wr;
> + dev_warn_ratelimited(&dev->pdev->dev,
> + "recv queue full\n");
> + goto out;
> + }
> +
> + wqe_hdr = (struct pvrdma_rq_wqe_hdr *)get_rq_wqe(qp, index);
> + wqe_hdr->wr_id = wr->wr_id;
> + wqe_hdr->num_sge = wr->num_sge;
> + wqe_hdr->total_len = 0;
> +
> + sge = (struct pvrdma_sge *)(wqe_hdr + 1);
> + for (i = 0; i < wr->num_sge; i++) {
> + sge->addr = wr->sg_list[i].addr;
> + sge->length = wr->sg_list[i].length;
> + sge->lkey = wr->sg_list[i].lkey;
> + sge++;
> + }
> +
> + /* Make sure wqe is written before index update */
> + smp_wmb();
> +
> + index++;
> + if (unlikely(index >= qp->rq.wqe_cnt))
> + index = 0;
> + /* Update shared rq ring */
> + pvrdma_idx_ring_inc(&qp->rq.ring->prod_tail,
> + qp->rq.wqe_cnt);
> + }
> +
> + spin_unlock_irqrestore(&qp->rq.lock, flags);
> +
> + pvrdma_write_uar_qp(dev, PVRDMA_UAR_QP_RECV | qp->qp_handle);
> +
> + return ret;
> +
> +out:
> + spin_unlock_irqrestore(&qp->rq.lock, flags);
> +
> + return ret;
> +}
> +
> +/**
> + * pvrdma_query_qp - query a queue pair's attributes
> + * @ibqp: the queue pair to query
> + * @attr: the queue pair's attributes
> + * @attr_mask: attributes mask
> + * @init_attr: initial queue pair attributes
> + *
> + * @returns 0 on success, otherwise returns an errno.
> + */
> +int pvrdma_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
> + int attr_mask, struct ib_qp_init_attr *init_attr)
> +{
> + struct pvrdma_dev *dev = to_vdev(ibqp->device);
> + struct pvrdma_qp *qp = to_vqp(ibqp);
> + union pvrdma_cmd_req req;
> + union pvrdma_cmd_resp rsp;
> + struct pvrdma_cmd_query_qp *cmd = &req.query_qp;
> + struct pvrdma_cmd_query_qp_resp *resp = &rsp.query_qp_resp;
> + int ret = 0;
> +
> + mutex_lock(&qp->mutex);
> +
> + if (qp->state == IB_QPS_RESET) {
> + attr->qp_state = IB_QPS_RESET;
> + goto out;
> + }
> +
> + memset(cmd, 0, sizeof(*cmd));
> + cmd->hdr.cmd = PVRDMA_CMD_QUERY_QP;
> + cmd->qp_handle = qp->qp_handle;
> + cmd->attr_mask = ib_qp_attr_mask_to_pvrdma(attr_mask);
> +
> + ret = pvrdma_cmd_post(dev, &req, &rsp, PVRDMA_CMD_QUERY_QP_RESP);
> + if (ret < 0) {
> + dev_warn(&dev->pdev->dev,
> + "could not query queuepair, error: %d\n", ret);
> + goto out;
> + }
> +
> + attr->qp_state = pvrdma_qp_state_to_ib(resp->attrs.qp_state);
> + attr->cur_qp_state =
> + pvrdma_qp_state_to_ib(resp->attrs.cur_qp_state);
> + attr->path_mtu = pvrdma_mtu_to_ib(resp->attrs.path_mtu);
> + attr->path_mig_state =
> + pvrdma_mig_state_to_ib(resp->attrs.path_mig_state);
> + attr->qkey = resp->attrs.qkey;
> + attr->rq_psn = resp->attrs.rq_psn;
> + attr->sq_psn = resp->attrs.sq_psn;
> + attr->dest_qp_num = resp->attrs.dest_qp_num;
> + attr->qp_access_flags =
> + pvrdma_access_flags_to_ib(resp->attrs.qp_access_flags);
> + attr->pkey_index = resp->attrs.pkey_index;
> + attr->alt_pkey_index = resp->attrs.alt_pkey_index;
> + attr->en_sqd_async_notify = resp->attrs.en_sqd_async_notify;
> + attr->sq_draining = resp->attrs.sq_draining;
> + attr->max_rd_atomic = resp->attrs.max_rd_atomic;
> + attr->max_dest_rd_atomic = resp->attrs.max_dest_rd_atomic;
> + attr->min_rnr_timer = resp->attrs.min_rnr_timer;
> + attr->port_num = resp->attrs.port_num;
> + attr->timeout = resp->attrs.timeout;
> + attr->retry_cnt = resp->attrs.retry_cnt;
> + attr->rnr_retry = resp->attrs.rnr_retry;
> + attr->alt_port_num = resp->attrs.alt_port_num;
> + attr->alt_timeout = resp->attrs.alt_timeout;
> + pvrdma_qp_cap_to_ib(&attr->cap, &resp->attrs.cap);
> + pvrdma_ah_attr_to_ib(&attr->ah_attr, &resp->attrs.ah_attr);
> + pvrdma_ah_attr_to_ib(&attr->alt_ah_attr, &resp->attrs.alt_ah_attr);
> +
> + qp->state = attr->qp_state;
> +
> + ret = 0;
> +
> +out:
> + attr->cur_qp_state = attr->qp_state;
> +
> + init_attr->event_handler = qp->ibqp.event_handler;
> + init_attr->qp_context = qp->ibqp.qp_context;
> + init_attr->send_cq = qp->ibqp.send_cq;
> + init_attr->recv_cq = qp->ibqp.recv_cq;
> + init_attr->srq = qp->ibqp.srq;
> + init_attr->xrcd = NULL;
> + init_attr->cap = attr->cap;
> + init_attr->sq_sig_type = 0;
> + init_attr->qp_type = qp->ibqp.qp_type;
> + init_attr->create_flags = 0;
> + init_attr->port_num = qp->port;
> +
> + mutex_unlock(&qp->mutex);
> + return ret;
> +}
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Configuration of cq->cqe is lower than entries by 1
From: Amrani, Ram @ 2016-11-14 11:07 UTC (permalink / raw)
To: Leon Romanovsky,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Hi Leon, All,
While inspecting MLX code as well as other vendors' I see that the actual number of cq->cqe is configured to be less by 1 than 'entries'. Why is that?
e.g.
struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
struct ib_udata *udata)
{
...
cq->ibcq.cqe = entries - 1;
...
}
static int mlx4_alloc_resize_buf(struct mlx4_ib_dev *dev, struct mlx4_ib_cq *cq,
int entries)
{
...
cq->resize_buf->cqe = entries - 1; // this is later copied to cq->ibcq.cqe
...
}
Thanks,
Ram
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] IB/usnic: simplify IS_ERR_OR_NULL to IS_ERR
From: Leon Romanovsky @ 2016-11-14 6:00 UTC (permalink / raw)
To: Julia Lawall
Cc: Christian Benvenuti, kernel-janitors, Dave Goodell, Doug Ledford,
Sean Hefty, Hal Rosenstock, linux-rdma, linux-kernel,
Christophe JAILLET
In-Reply-To: <1478891066-16093-1-git-send-email-Julia.Lawall@lip6.fr>
[-- Attachment #1: Type: text/plain, Size: 889 bytes --]
On Fri, Nov 11, 2016 at 08:04:26PM +0100, Julia Lawall wrote:
> The function usnic_ib_qp_grp_get_chunk only returns an ERR_PTR value or a
> valid pointer, never NULL. The same is true of get_qp_res_chunk, which
> just returns the result of calling usnic_ib_qp_grp_get_chunk. Simplify
> IS_ERR_OR_NULL to IS_ERR in both cases.
>
> The semantic patch that makes this change is as follows:
> (http://coccinelle.lip6.fr/)
>
> // <smpl>
> @@
> expression t,e;
> @@
>
> t = \(usnic_ib_qp_grp_get_chunk(...)\|get_qp_res_chunk(...)\)
> ... when != t=e
> - IS_ERR_OR_NULL(t)
> + IS_ERR(t)
>
> @@
> expression t,e,e1;
> @@
>
> t = \(usnic_ib_qp_grp_get_chunk(...)\|get_qp_res_chunk(...)\)
> ... when != t=e
> ?- t ? PTR_ERR(t) : e1
> + PTR_ERR(t)
> ... when any
> // </smpl>
>
> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Thanks, Julia.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
* Re: [PATCH v5 rdma-core 0/7] libhns: userspace library for hns
From: Leon Romanovsky @ 2016-11-14 5:53 UTC (permalink / raw)
To: Lijun Ou
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1479033360-56035-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 1012 bytes --]
On Sun, Nov 13, 2016 at 06:35:53PM +0800, Lijun Ou wrote:
> This patch series introduces userspace library for hns RoCE driver.
>
> changes v4 -> v5:
> 1. elminate the warning when CFLAGS equal to -m32
>
> changes v3 -> v4:
> 1. eliminate the warning by Travis CI testing
>
> changes v2 -> v3:
> 1. Fix the code style, for example, if (addr == NULL)
> 2. Fix the bug for hns_roce_u_reg_mr
>
> changes v1 -> v2:
> 1. Delete the min() definition and instead of ccan header
> 2. Delete the CHECK_C_SOURCE_COMPILES
> 3. sort the c file in rdma_provider()
> 4. Delete the unused code in hns_roce_u_db.h
>
> Lijun Ou (7):
> libhns: Add initial main frame
> libhns: Add verbs of querying device and querying port
> libhns: Add verbs of pd and mr support
> libhns: Add verbs of cq support
> libhns: Add verbs of qp support
> libhns: Add verbs of post_send and post_recv support
> libhns: Add consolidated repo for userspace library of hns
Thanks, applied.
https://github.com/linux-rdma/rdma-core/pull/38
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
* Re: [PATCH, RESEND] IB/srpt: Report login failures only once
From: Max Gurtovoy @ 2016-11-13 17:29 UTC (permalink / raw)
To: Bart Van Assche, Doug Ledford
Cc: Nicholas A. Bellinger, Christoph Hellwig, Sagi Grimberg,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <7737d8fc-d41d-0755-d7d2-a3a2b9b6a76e-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
On 11/12/2016 2:36 AM, Bart Van Assche wrote:
> Report the following message only once if no ACL has been configured
> yet for an initiator port:
>
> "Rejected login because no ACL has been configured yet for initiator %s.\n"
>
> Signed-off-by: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> Cc: Nicholas Bellinger <nab-IzHhD5pYlfBP7FQvKIMDCQ@public.gmane.org>
> Cc: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
> Cc: Sagi Grimberg <sagig-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
>
Looks good.
Reviewed-by: Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* RE: [PATCH rdma-core] qede: fix general protection fault may occur on probe
From: Amrani, Ram @ 2016-11-13 16:42 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: Elior, Ariel, Kalderon, Michal, Mintz, Yuval,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1477400039-16925-1-git-send-email-Ram.Amrani-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
> The recent introduction of qedr driver support in qede causes a GPF when
> probing the driver in a server without a RoCE enabled QLogic NIC. This fix avoids
> using an uninitialized pointer in such a case. Caught by the kernel test robot.
>
> Signed-off-by: Ram Amrani <Ram.Amrani-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
> ---
> drivers/net/ethernet/qlogic/qede/qede_roce.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/qlogic/qede/qede_roce.c
> b/drivers/net/ethernet/qlogic/qede/qede_roce.c
> index 9867f96..4927271 100644
> --- a/drivers/net/ethernet/qlogic/qede/qede_roce.c
> +++ b/drivers/net/ethernet/qlogic/qede/qede_roce.c
> @@ -191,8 +191,8 @@ int qede_roce_register_driver(struct qedr_driver *drv)
> }
> mutex_unlock(&qedr_dev_list_lock);
>
> - DP_INFO(edev, "qedr: discovered and registered %d RoCE funcs\n",
> - qedr_counter);
> + pr_notice("qedr: discovered and registered %d RoCE funcs\n",
> + qedr_counter);
>
> return 0;
> }
> --
> 1.8.3.1
Hi Doug,
Can you update if this patch taken and if not, when do you expect it will be?
Thanks,
Ram
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH v5 rdma-core 7/7] libhns: Add consolidated repo for userspace library of hns
From: Lijun Ou @ 2016-11-13 10:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1479033360-56035-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch configures the consolidated repo to build userspace
library of hns(libhns).
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
v5/v4/v3:
- No change over v2
v2:
- Delete the CHECK_C_SOURCE_COMPILES and sort the .c file
v1:
- The initial submit
---
CMakeLists.txt | 1 +
MAINTAINERS | 6 ++++++
README.md | 1 +
providers/hns/CMakeLists.txt | 6 ++++++
4 files changed, 14 insertions(+)
create mode 100644 providers/hns/CMakeLists.txt
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 0ac7477..3c2aa79 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -328,6 +328,7 @@ add_subdirectory(libibcm)
add_subdirectory(providers/cxgb3)
add_subdirectory(providers/cxgb4)
add_subdirectory(providers/hfi1verbs)
+add_subdirectory(providers/hns)
add_subdirectory(providers/i40iw)
add_subdirectory(providers/ipathverbs)
add_subdirectory(providers/mlx4)
diff --git a/MAINTAINERS b/MAINTAINERS
index d83de10..bc6eb50 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -57,6 +57,12 @@ S: Supported
L: intel-opa-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org (moderated for non-subscribers)
F: providers/hfi1verbs/
+HNS USERSPACE PROVIDER (for hns-roce.ko)
+M: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
+M: Wei Hu(Xavier) <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
+S: Supported
+F: providers/hns/
+
I40IW USERSPACE PROVIDER (for i40iw.ko)
M: Tatyana Nikolova <Tatyana.E.Nikolova-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
S: Supported
diff --git a/README.md b/README.md
index 3a13042..e3bc33f 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,7 @@ is included:
- iw_cxgb3.ko
- iw_cxgb4.ko
- hfi1.ko
+ - hns-roce.ko
- i40iw.ko
- ib_qib.ko
- mlx4_ib.ko
diff --git a/providers/hns/CMakeLists.txt b/providers/hns/CMakeLists.txt
new file mode 100644
index 0000000..19a793e
--- /dev/null
+++ b/providers/hns/CMakeLists.txt
@@ -0,0 +1,6 @@
+rdma_provider(hns
+ hns_roce_u.c
+ hns_roce_u_buf.c
+ hns_roce_u_hw_v1.c
+ hns_roce_u_verbs.c
+)
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 rdma-core 6/7] libhns: Add verbs of post_send and post_recv support
From: Lijun Ou @ 2016-11-13 10:35 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1479033360-56035-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch mainly introduces the verbs of posting send
and psoting recv.
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
v5/v4/v3/v2:
- No change over the v1
v1:
- The initial submit
---
providers/hns/hns_roce_u.c | 2 +
providers/hns/hns_roce_u.h | 8 +
providers/hns/hns_roce_u_hw_v1.c | 314 +++++++++++++++++++++++++++++++++++++++
providers/hns/hns_roce_u_hw_v1.h | 79 ++++++++++
4 files changed, 403 insertions(+)
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index de2fd57..281f9f4 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -131,6 +131,8 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
context->ibv_ctx.ops.query_qp = hns_roce_u_query_qp;
context->ibv_ctx.ops.modify_qp = hr_dev->u_hw->modify_qp;
context->ibv_ctx.ops.destroy_qp = hr_dev->u_hw->destroy_qp;
+ context->ibv_ctx.ops.post_send = hr_dev->u_hw->post_send;
+ context->ibv_ctx.ops.post_recv = hr_dev->u_hw->post_recv;
if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
goto tptr_free;
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index 02b9251..4a6ed8e 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -51,6 +51,10 @@
#define PFX "hns: "
+#ifndef likely
+#define likely(x) __builtin_expect(!!(x), 1)
+#endif
+
#define roce_get_field(origin, mask, shift) \
(((origin) & (mask)) >> (shift))
@@ -171,6 +175,10 @@ struct hns_roce_qp {
struct hns_roce_u_hw {
int (*poll_cq)(struct ibv_cq *ibvcq, int ne, struct ibv_wc *wc);
int (*arm_cq)(struct ibv_cq *ibvcq, int solicited);
+ int (*post_send)(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+ struct ibv_send_wr **bad_wr);
+ int (*post_recv)(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
+ struct ibv_recv_wr **bad_wr);
int (*modify_qp)(struct ibv_qp *qp, struct ibv_qp_attr *attr,
int attr_mask);
int (*destroy_qp)(struct ibv_qp *ibqp);
diff --git a/providers/hns/hns_roce_u_hw_v1.c b/providers/hns/hns_roce_u_hw_v1.c
index e5c7f6a..e5cfe48 100644
--- a/providers/hns/hns_roce_u_hw_v1.c
+++ b/providers/hns/hns_roce_u_hw_v1.c
@@ -37,6 +37,59 @@
#include "hns_roce_u_hw_v1.h"
#include "hns_roce_u.h"
+static inline void set_raddr_seg(struct hns_roce_wqe_raddr_seg *rseg,
+ uint64_t remote_addr, uint32_t rkey)
+{
+ rseg->raddr = remote_addr;
+ rseg->rkey = rkey;
+ rseg->len = 0;
+}
+
+static void set_data_seg(struct hns_roce_wqe_data_seg *dseg, struct ibv_sge *sg)
+{
+
+ dseg->lkey = sg->lkey;
+ dseg->addr = sg->addr;
+ dseg->len = sg->length;
+}
+
+static void hns_roce_update_rq_head(struct hns_roce_context *ctx,
+ unsigned int qpn, unsigned int rq_head)
+{
+ struct hns_roce_rq_db rq_db;
+
+ rq_db.u32_4 = 0;
+ rq_db.u32_8 = 0;
+
+ roce_set_field(rq_db.u32_4, RQ_DB_U32_4_RQ_HEAD_M,
+ RQ_DB_U32_4_RQ_HEAD_S, rq_head);
+ roce_set_field(rq_db.u32_8, RQ_DB_U32_8_QPN_M, RQ_DB_U32_8_QPN_S, qpn);
+ roce_set_field(rq_db.u32_8, RQ_DB_U32_8_CMD_M, RQ_DB_U32_8_CMD_S, 1);
+ roce_set_bit(rq_db.u32_8, RQ_DB_U32_8_HW_SYNC_S, 1);
+
+ hns_roce_write64((uint32_t *)&rq_db, ctx, ROCEE_DB_OTHERS_L_0_REG);
+}
+
+static void hns_roce_update_sq_head(struct hns_roce_context *ctx,
+ unsigned int qpn, unsigned int port,
+ unsigned int sl, unsigned int sq_head)
+{
+ struct hns_roce_sq_db sq_db;
+
+ sq_db.u32_4 = 0;
+ sq_db.u32_8 = 0;
+
+ roce_set_field(sq_db.u32_4, SQ_DB_U32_4_SQ_HEAD_M,
+ SQ_DB_U32_4_SQ_HEAD_S, sq_head);
+ roce_set_field(sq_db.u32_4, SQ_DB_U32_4_PORT_M, SQ_DB_U32_4_PORT_S,
+ port);
+ roce_set_field(sq_db.u32_4, SQ_DB_U32_4_SL_M, SQ_DB_U32_4_SL_S, sl);
+ roce_set_field(sq_db.u32_8, SQ_DB_U32_8_QPN_M, SQ_DB_U32_8_QPN_S, qpn);
+ roce_set_bit(sq_db.u32_8, SQ_DB_U32_8_HW_SYNC, 1);
+
+ hns_roce_write64((uint32_t *)&sq_db, ctx, ROCEE_DB_SQ_L_0_REG);
+}
+
static void hns_roce_update_cq_cons_index(struct hns_roce_context *ctx,
struct hns_roce_cq *cq)
{
@@ -126,6 +179,16 @@ static struct hns_roce_cqe *next_cqe_sw(struct hns_roce_cq *cq)
return get_sw_cqe(cq, cq->cons_index);
}
+static void *get_recv_wqe(struct hns_roce_qp *qp, int n)
+{
+ if ((n < 0) || (n > qp->rq.wqe_cnt)) {
+ printf("rq wqe index:%d,rq wqe cnt:%d\r\n", n, qp->rq.wqe_cnt);
+ return NULL;
+ }
+
+ return qp->buf.buf + qp->rq.offset + (n << qp->rq.wqe_shift);
+}
+
static void *get_send_wqe(struct hns_roce_qp *qp, int n)
{
if ((n < 0) || (n > qp->sq.wqe_cnt)) {
@@ -136,6 +199,26 @@ static void *get_send_wqe(struct hns_roce_qp *qp, int n)
return (void *)(qp->buf.buf + qp->sq.offset + (n << qp->sq.wqe_shift));
}
+static int hns_roce_wq_overflow(struct hns_roce_wq *wq, int nreq,
+ struct hns_roce_cq *cq)
+{
+ unsigned int cur;
+
+ cur = wq->head - wq->tail;
+ if (cur + nreq < wq->max_post)
+ return 0;
+
+ /* While the num of wqe exceeds cap of the device, cq will be locked */
+ pthread_spin_lock(&cq->lock);
+ cur = wq->head - wq->tail;
+ pthread_spin_unlock(&cq->lock);
+
+ printf("wq:(head = %d, tail = %d, max_post = %d), nreq = 0x%x\n",
+ wq->head, wq->tail, wq->max_post, nreq);
+
+ return cur + nreq >= wq->max_post;
+}
+
static struct hns_roce_qp *hns_roce_find_qp(struct hns_roce_context *ctx,
uint32_t qpn)
{
@@ -372,6 +455,144 @@ static int hns_roce_u_v1_arm_cq(struct ibv_cq *ibvcq, int solicited)
return 0;
}
+static int hns_roce_u_v1_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+ struct ibv_send_wr **bad_wr)
+{
+ unsigned int ind;
+ void *wqe;
+ int nreq;
+ int ps_opcode, i;
+ int ret = 0;
+ struct hns_roce_wqe_ctrl_seg *ctrl = NULL;
+ struct hns_roce_wqe_data_seg *dseg = NULL;
+ struct hns_roce_qp *qp = to_hr_qp(ibvqp);
+ struct hns_roce_context *ctx = to_hr_ctx(ibvqp->context);
+
+ pthread_spin_lock(&qp->sq.lock);
+
+ /* check that state is OK to post send */
+ ind = qp->sq.head;
+
+ for (nreq = 0; wr; ++nreq, wr = wr->next) {
+ if (hns_roce_wq_overflow(&qp->sq, nreq,
+ to_hr_cq(qp->ibv_qp.send_cq))) {
+ ret = -1;
+ *bad_wr = wr;
+ goto out;
+ }
+ if (wr->num_sge > qp->sq.max_gs) {
+ ret = -1;
+ *bad_wr = wr;
+ printf("wr->num_sge(<=%d) = %d, check failed!\r\n",
+ qp->sq.max_gs, wr->num_sge);
+ goto out;
+ }
+
+ ctrl = wqe = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1));
+ memset(ctrl, 0, sizeof(struct hns_roce_wqe_ctrl_seg));
+
+ qp->sq.wrid[ind & (qp->sq.wqe_cnt - 1)] = wr->wr_id;
+ for (i = 0; i < wr->num_sge; i++)
+ ctrl->msg_length += wr->sg_list[i].length;
+
+
+ ctrl->flag |= ((wr->send_flags & IBV_SEND_SIGNALED) ?
+ HNS_ROCE_WQE_CQ_NOTIFY : 0) |
+ (wr->send_flags & IBV_SEND_SOLICITED ?
+ HNS_ROCE_WQE_SE : 0) |
+ ((wr->opcode == IBV_WR_SEND_WITH_IMM ||
+ wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM) ?
+ HNS_ROCE_WQE_IMM : 0) |
+ (wr->send_flags & IBV_SEND_FENCE ?
+ HNS_ROCE_WQE_FENCE : 0);
+
+ if (wr->opcode == IBV_WR_SEND_WITH_IMM ||
+ wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM)
+ ctrl->imm_data = wr->imm_data;
+
+ wqe += sizeof(struct hns_roce_wqe_ctrl_seg);
+
+ /* set remote addr segment */
+ switch (ibvqp->qp_type) {
+ case IBV_QPT_RC:
+ switch (wr->opcode) {
+ case IBV_WR_RDMA_READ:
+ ps_opcode = HNS_ROCE_WQE_OPCODE_RDMA_READ;
+ set_raddr_seg(wqe, wr->wr.rdma.remote_addr,
+ wr->wr.rdma.rkey);
+ break;
+ case IBV_WR_RDMA_WRITE:
+ case IBV_WR_RDMA_WRITE_WITH_IMM:
+ ps_opcode = HNS_ROCE_WQE_OPCODE_RDMA_WRITE;
+ set_raddr_seg(wqe, wr->wr.rdma.remote_addr,
+ wr->wr.rdma.rkey);
+ break;
+ case IBV_WR_SEND:
+ case IBV_WR_SEND_WITH_IMM:
+ ps_opcode = HNS_ROCE_WQE_OPCODE_SEND;
+ break;
+ case IBV_WR_ATOMIC_CMP_AND_SWP:
+ case IBV_WR_ATOMIC_FETCH_AND_ADD:
+ default:
+ ps_opcode = HNS_ROCE_WQE_OPCODE_MASK;
+ break;
+ }
+ ctrl->flag |= (ps_opcode);
+ wqe += sizeof(struct hns_roce_wqe_raddr_seg);
+ break;
+ case IBV_QPT_UC:
+ case IBV_QPT_UD:
+ default:
+ break;
+ }
+
+ dseg = wqe;
+
+ /* Inline */
+ if (wr->send_flags & IBV_SEND_INLINE && wr->num_sge) {
+ if (ctrl->msg_length > qp->max_inline_data) {
+ ret = -1;
+ *bad_wr = wr;
+ printf("inline data len(1-32)=%d, send_flags = 0x%x, check failed!\r\n",
+ wr->send_flags, ctrl->msg_length);
+ return ret;
+ }
+
+ for (i = 0; i < wr->num_sge; i++) {
+ memcpy(wqe,
+ ((void *) (uintptr_t) wr->sg_list[i].addr),
+ wr->sg_list[i].length);
+ wqe = wqe + wr->sg_list[i].length;
+ }
+
+ ctrl->flag |= HNS_ROCE_WQE_INLINE;
+ } else {
+ /* set sge */
+ for (i = 0; i < wr->num_sge; i++)
+ set_data_seg(dseg+i, wr->sg_list + i);
+
+ ctrl->flag |= wr->num_sge << HNS_ROCE_WQE_SGE_NUM_BIT;
+ }
+
+ ind++;
+ }
+
+out:
+ /* Set DB return */
+ if (likely(nreq)) {
+ qp->sq.head += nreq;
+ wmb();
+
+ hns_roce_update_sq_head(ctx, qp->ibv_qp.qp_num,
+ qp->port_num - 1, qp->sl,
+ qp->sq.head & ((qp->sq.wqe_cnt << 1) - 1));
+ }
+
+ pthread_spin_unlock(&qp->sq.lock);
+
+ return ret;
+}
+
static void __hns_roce_v1_cq_clean(struct hns_roce_cq *cq, uint32_t qpn,
struct hns_roce_srq *srq)
{
@@ -515,9 +736,102 @@ static int hns_roce_u_v1_destroy_qp(struct ibv_qp *ibqp)
return ret;
}
+static int hns_roce_u_v1_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
+ struct ibv_recv_wr **bad_wr)
+{
+ int ret = 0;
+ int nreq;
+ int ind;
+ struct ibv_sge *sg;
+ struct hns_roce_rc_rq_wqe *rq_wqe;
+ struct hns_roce_qp *qp = to_hr_qp(ibvqp);
+ struct hns_roce_context *ctx = to_hr_ctx(ibvqp->context);
+
+ pthread_spin_lock(&qp->rq.lock);
+
+ /* check that state is OK to post receive */
+ ind = qp->rq.head & (qp->rq.wqe_cnt - 1);
+
+ for (nreq = 0; wr; ++nreq, wr = wr->next) {
+ if (hns_roce_wq_overflow(&qp->rq, nreq,
+ to_hr_cq(qp->ibv_qp.recv_cq))) {
+ ret = -1;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ if (wr->num_sge > qp->rq.max_gs) {
+ ret = -1;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ rq_wqe = get_recv_wqe(qp, ind);
+ if (wr->num_sge > HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM) {
+ ret = -1;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ if (wr->num_sge == HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM) {
+ roce_set_field(rq_wqe->u32_2,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_M,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_S,
+ HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM);
+ sg = wr->sg_list;
+
+ rq_wqe->va0 = (sg->addr);
+ rq_wqe->l_key0 = (sg->lkey);
+ rq_wqe->length0 = (sg->length);
+
+ sg = wr->sg_list + 1;
+
+ rq_wqe->va1 = (sg->addr);
+ rq_wqe->l_key1 = (sg->lkey);
+ rq_wqe->length1 = (sg->length);
+ } else if (wr->num_sge == HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM - 1) {
+ roce_set_field(rq_wqe->u32_2,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_M,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_S,
+ HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM - 1);
+ sg = wr->sg_list;
+
+ rq_wqe->va0 = (sg->addr);
+ rq_wqe->l_key0 = (sg->lkey);
+ rq_wqe->length0 = (sg->length);
+
+ } else if (wr->num_sge == HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM - 2) {
+ roce_set_field(rq_wqe->u32_2,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_M,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_S,
+ HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM - 2);
+ }
+
+ qp->rq.wrid[ind] = wr->wr_id;
+
+ ind = (ind + 1) & (qp->rq.wqe_cnt - 1);
+ }
+
+out:
+ if (nreq) {
+ qp->rq.head += nreq;
+
+ wmb();
+
+ hns_roce_update_rq_head(ctx, qp->ibv_qp.qp_num,
+ qp->rq.head & ((qp->rq.wqe_cnt << 1) - 1));
+ }
+
+ pthread_spin_unlock(&qp->rq.lock);
+
+ return ret;
+}
+
struct hns_roce_u_hw hns_roce_u_hw_v1 = {
.poll_cq = hns_roce_u_v1_poll_cq,
.arm_cq = hns_roce_u_v1_arm_cq,
+ .post_send = hns_roce_u_v1_post_send,
+ .post_recv = hns_roce_u_v1_post_recv,
.modify_qp = hns_roce_u_v1_modify_qp,
.destroy_qp = hns_roce_u_v1_destroy_qp,
};
diff --git a/providers/hns/hns_roce_u_hw_v1.h b/providers/hns/hns_roce_u_hw_v1.h
index b249f54..128c66f 100644
--- a/providers/hns/hns_roce_u_hw_v1.h
+++ b/providers/hns/hns_roce_u_hw_v1.h
@@ -39,9 +39,15 @@
#define HNS_ROCE_CQE_IS_SQ 0
#define HNS_ROCE_RC_WQE_INLINE_DATA_MAX_LEN 32
+#define HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM 2
enum {
+ HNS_ROCE_WQE_INLINE = 1 << 31,
+ HNS_ROCE_WQE_SE = 1 << 30,
+ HNS_ROCE_WQE_SGE_NUM_BIT = 24,
HNS_ROCE_WQE_IMM = 1 << 23,
+ HNS_ROCE_WQE_FENCE = 1 << 21,
+ HNS_ROCE_WQE_CQ_NOTIFY = 1 << 20,
HNS_ROCE_WQE_OPCODE_SEND = 0 << 16,
HNS_ROCE_WQE_OPCODE_RDMA_READ = 1 << 16,
HNS_ROCE_WQE_OPCODE_RDMA_WRITE = 2 << 16,
@@ -52,6 +58,20 @@ enum {
struct hns_roce_wqe_ctrl_seg {
__be32 sgl_pa_h;
__be32 flag;
+ __be32 imm_data;
+ __be32 msg_length;
+};
+
+struct hns_roce_wqe_data_seg {
+ __be64 addr;
+ __be32 lkey;
+ __be32 len;
+};
+
+struct hns_roce_wqe_raddr_seg {
+ __be32 rkey;
+ __be32 len;
+ __be64 raddr;
};
enum {
@@ -102,6 +122,43 @@ struct hns_roce_cq_db {
#define CQ_DB_U32_8_HW_SYNC_S 31
+struct hns_roce_rq_db {
+ unsigned int u32_4;
+ unsigned int u32_8;
+};
+
+#define RQ_DB_U32_4_RQ_HEAD_S 0
+#define RQ_DB_U32_4_RQ_HEAD_M (((1UL << 15) - 1) << RQ_DB_U32_4_RQ_HEAD_S)
+
+#define RQ_DB_U32_8_QPN_S 0
+#define RQ_DB_U32_8_QPN_M (((1UL << 24) - 1) << RQ_DB_U32_8_QPN_S)
+
+#define RQ_DB_U32_8_CMD_S 28
+#define RQ_DB_U32_8_CMD_M (((1UL << 3) - 1) << RQ_DB_U32_8_CMD_S)
+
+#define RQ_DB_U32_8_HW_SYNC_S 31
+
+struct hns_roce_sq_db {
+ unsigned int u32_4;
+ unsigned int u32_8;
+};
+
+#define SQ_DB_U32_4_SQ_HEAD_S 0
+#define SQ_DB_U32_4_SQ_HEAD_M (((1UL << 15) - 1) << SQ_DB_U32_4_SQ_HEAD_S)
+
+#define SQ_DB_U32_4_SL_S 16
+#define SQ_DB_U32_4_SL_M (((1UL << 2) - 1) << SQ_DB_U32_4_SL_S)
+
+#define SQ_DB_U32_4_PORT_S 18
+#define SQ_DB_U32_4_PORT_M (((1UL << 3) - 1) << SQ_DB_U32_4_PORT_S)
+
+#define SQ_DB_U32_4_DIRECT_WQE_S 31
+
+#define SQ_DB_U32_8_QPN_S 0
+#define SQ_DB_U32_8_QPN_M (((1UL << 24) - 1) << SQ_DB_U32_8_QPN_S)
+
+#define SQ_DB_U32_8_HW_SYNC 31
+
struct hns_roce_cqe {
unsigned int cqe_byte_4;
union {
@@ -160,4 +217,26 @@ struct hns_roce_rc_send_wqe {
unsigned int length1;
};
+struct hns_roce_rc_rq_wqe {
+ unsigned int u32_0;
+ unsigned int sgl_ba_31_0;
+ unsigned int u32_2;
+ unsigned int rvd_5;
+ unsigned int rvd_6;
+ unsigned int rvd_7;
+ unsigned int rvd_8;
+ unsigned int rvd_9;
+
+ uint64_t va0;
+ unsigned int l_key0;
+ unsigned int length0;
+
+ uint64_t va1;
+ unsigned int l_key1;
+ unsigned int length1;
+};
+#define RC_RQ_WQE_NUMBER_OF_DATA_SEG_S 16
+#define RC_RQ_WQE_NUMBER_OF_DATA_SEG_M \
+ (((1UL << 6) - 1) << RC_RQ_WQE_NUMBER_OF_DATA_SEG_S)
+
#endif /* _HNS_ROCE_U_HW_V1_H */
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 rdma-core 5/7] libhns: Add verbs of qp support
From: Lijun Ou @ 2016-11-13 10:35 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1479033360-56035-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch mainly introduces the relatived qp verbs for userspace
library of hns, include:
1. create_qp
2. query_qp
3. modify_qp
4. destroy_qp
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
v5/v4/v3:
- No change over v2
v2:
- Delete the min() and use the ccan header
v1:
- The initial submit
---
providers/hns/hns_roce_u.c | 5 +
providers/hns/hns_roce_u.h | 45 +++++++
providers/hns/hns_roce_u_abi.h | 8 ++
providers/hns/hns_roce_u_hw_v1.c | 155 +++++++++++++++++++++++
providers/hns/hns_roce_u_verbs.c | 259 ++++++++++++++++++++++++++++++++++++++-
5 files changed, 471 insertions(+), 1 deletion(-)
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index 1877218..de2fd57 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -127,6 +127,11 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
context->ibv_ctx.ops.cq_event = hns_roce_u_cq_event;
context->ibv_ctx.ops.destroy_cq = hns_roce_u_destroy_cq;
+ context->ibv_ctx.ops.create_qp = hns_roce_u_create_qp;
+ context->ibv_ctx.ops.query_qp = hns_roce_u_query_qp;
+ context->ibv_ctx.ops.modify_qp = hr_dev->u_hw->modify_qp;
+ context->ibv_ctx.ops.destroy_qp = hr_dev->u_hw->destroy_qp;
+
if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
goto tptr_free;
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index c3e364d..02b9251 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -44,6 +44,7 @@
#define HNS_ROCE_MAX_CQ_NUM 0x10000
#define HNS_ROCE_MIN_CQE_NUM 0x40
+#define HNS_ROCE_MIN_WQE_NUM 0x20
#define HNS_ROCE_CQ_DB_BUF_SIZE ((HNS_ROCE_MAX_CQ_NUM >> 11) << 12)
#define HNS_ROCE_TPTR_OFFSET 0x1000
#define HNS_ROCE_HW_VER1 ('h' << 24 | 'i' << 16 | '0' << 8 | '6')
@@ -128,10 +129,29 @@ struct hns_roce_cq {
int arm_sn;
};
+struct hns_roce_srq {
+ struct ibv_srq ibv_srq;
+ struct hns_roce_buf buf;
+ pthread_spinlock_t lock;
+ unsigned long *wrid;
+ unsigned int srqn;
+ int max;
+ unsigned int max_gs;
+ int wqe_shift;
+ int head;
+ int tail;
+ unsigned int *db;
+ unsigned short counter;
+};
+
struct hns_roce_wq {
unsigned long *wrid;
+ pthread_spinlock_t lock;
unsigned int wqe_cnt;
+ int max_post;
+ unsigned int head;
unsigned int tail;
+ unsigned int max_gs;
int wqe_shift;
int offset;
};
@@ -139,14 +159,21 @@ struct hns_roce_wq {
struct hns_roce_qp {
struct ibv_qp ibv_qp;
struct hns_roce_buf buf;
+ int max_inline_data;
+ int buf_size;
unsigned int sq_signal_bits;
struct hns_roce_wq sq;
struct hns_roce_wq rq;
+ int port_num;
+ int sl;
};
struct hns_roce_u_hw {
int (*poll_cq)(struct ibv_cq *ibvcq, int ne, struct ibv_wc *wc);
int (*arm_cq)(struct ibv_cq *ibvcq, int solicited);
+ int (*modify_qp)(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ int attr_mask);
+ int (*destroy_qp)(struct ibv_qp *ibqp);
};
static inline unsigned long align(unsigned long val, unsigned long align)
@@ -174,6 +201,16 @@ static inline struct hns_roce_cq *to_hr_cq(struct ibv_cq *ibv_cq)
return container_of(ibv_cq, struct hns_roce_cq, ibv_cq);
}
+static inline struct hns_roce_srq *to_hr_srq(struct ibv_srq *ibv_srq)
+{
+ return container_of(ibv_srq, struct hns_roce_srq, ibv_srq);
+}
+
+static inline struct hns_roce_qp *to_hr_qp(struct ibv_qp *ibv_qp)
+{
+ return container_of(ibv_qp, struct hns_roce_qp, ibv_qp);
+}
+
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr);
int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
@@ -193,10 +230,18 @@ struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
int hns_roce_u_destroy_cq(struct ibv_cq *cq);
void hns_roce_u_cq_event(struct ibv_cq *cq);
+struct ibv_qp *hns_roce_u_create_qp(struct ibv_pd *pd,
+ struct ibv_qp_init_attr *attr);
+
+int hns_roce_u_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr,
+ int attr_mask, struct ibv_qp_init_attr *init_attr);
+
int hns_roce_alloc_buf(struct hns_roce_buf *buf, unsigned int size,
int page_size);
void hns_roce_free_buf(struct hns_roce_buf *buf);
+void hns_roce_init_qp_indices(struct hns_roce_qp *qp);
+
extern struct hns_roce_u_hw hns_roce_u_hw_v1;
#endif /* _HNS_ROCE_U_H */
diff --git a/providers/hns/hns_roce_u_abi.h b/providers/hns/hns_roce_u_abi.h
index 1e62a7e..e78f967 100644
--- a/providers/hns/hns_roce_u_abi.h
+++ b/providers/hns/hns_roce_u_abi.h
@@ -58,4 +58,12 @@ struct hns_roce_create_cq_resp {
__u32 reserved;
};
+struct hns_roce_create_qp {
+ struct ibv_create_qp ibv_cmd;
+ __u64 buf_addr;
+ __u8 log_sq_bb_count;
+ __u8 log_sq_stride;
+ __u8 reserved[5];
+};
+
#endif /* _HNS_ROCE_U_ABI_H */
diff --git a/providers/hns/hns_roce_u_hw_v1.c b/providers/hns/hns_roce_u_hw_v1.c
index 39a67b1..e5c7f6a 100644
--- a/providers/hns/hns_roce_u_hw_v1.c
+++ b/providers/hns/hns_roce_u_hw_v1.c
@@ -149,6 +149,16 @@ static struct hns_roce_qp *hns_roce_find_qp(struct hns_roce_context *ctx,
}
}
+static void hns_roce_clear_qp(struct hns_roce_context *ctx, uint32_t qpn)
+{
+ int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift;
+
+ if (!--ctx->qp_table[tind].refcnt)
+ free(ctx->qp_table[tind].table);
+ else
+ ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = NULL;
+}
+
static int hns_roce_v1_poll_one(struct hns_roce_cq *cq,
struct hns_roce_qp **cur_qp, struct ibv_wc *wc)
{
@@ -362,7 +372,152 @@ static int hns_roce_u_v1_arm_cq(struct ibv_cq *ibvcq, int solicited)
return 0;
}
+static void __hns_roce_v1_cq_clean(struct hns_roce_cq *cq, uint32_t qpn,
+ struct hns_roce_srq *srq)
+{
+ int nfreed = 0;
+ uint32_t prod_index;
+ uint8_t owner_bit = 0;
+ struct hns_roce_cqe *cqe, *dest;
+ struct hns_roce_context *ctx = to_hr_ctx(cq->ibv_cq.context);
+
+ for (prod_index = cq->cons_index; get_sw_cqe(cq, prod_index);
+ ++prod_index)
+ if (prod_index == cq->cons_index + cq->ibv_cq.cqe)
+ break;
+
+ while ((int) --prod_index - (int) cq->cons_index >= 0) {
+ cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe);
+ if ((roce_get_field(cqe->cqe_byte_16, CQE_BYTE_16_LOCAL_QPN_M,
+ CQE_BYTE_16_LOCAL_QPN_S) & 0xffffff) == qpn) {
+ ++nfreed;
+ } else if (nfreed) {
+ dest = get_cqe(cq,
+ (prod_index + nfreed) & cq->ibv_cq.cqe);
+ owner_bit = roce_get_bit(dest->cqe_byte_4,
+ CQE_BYTE_4_OWNER_S);
+ memcpy(dest, cqe, sizeof(*cqe));
+ roce_set_bit(dest->cqe_byte_4, CQE_BYTE_4_OWNER_S,
+ owner_bit);
+ }
+ }
+
+ if (nfreed) {
+ cq->cons_index += nfreed;
+ wmb();
+ hns_roce_update_cq_cons_index(ctx, cq);
+ }
+}
+
+static void hns_roce_v1_cq_clean(struct hns_roce_cq *cq, unsigned int qpn,
+ struct hns_roce_srq *srq)
+{
+ pthread_spin_lock(&cq->lock);
+ __hns_roce_v1_cq_clean(cq, qpn, srq);
+ pthread_spin_unlock(&cq->lock);
+}
+
+static int hns_roce_u_v1_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ int attr_mask)
+{
+ int ret;
+ struct ibv_modify_qp cmd;
+ struct hns_roce_qp *hr_qp = to_hr_qp(qp);
+
+ ret = ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof(cmd));
+
+ if (!ret && (attr_mask & IBV_QP_STATE) &&
+ attr->qp_state == IBV_QPS_RESET) {
+ hns_roce_v1_cq_clean(to_hr_cq(qp->recv_cq), qp->qp_num,
+ qp->srq ? to_hr_srq(qp->srq) : NULL);
+ if (qp->send_cq != qp->recv_cq)
+ hns_roce_v1_cq_clean(to_hr_cq(qp->send_cq), qp->qp_num,
+ NULL);
+
+ hns_roce_init_qp_indices(to_hr_qp(qp));
+ }
+
+ if (!ret && (attr_mask & IBV_QP_PORT)) {
+ hr_qp->port_num = attr->port_num;
+ printf("hr_qp->port_num= 0x%x\n", hr_qp->port_num);
+ }
+
+ hr_qp->sl = attr->ah_attr.sl;
+
+ return ret;
+}
+
+static void hns_roce_lock_cqs(struct ibv_qp *qp)
+{
+ struct hns_roce_cq *send_cq = to_hr_cq(qp->send_cq);
+ struct hns_roce_cq *recv_cq = to_hr_cq(qp->recv_cq);
+
+ if (send_cq == recv_cq) {
+ pthread_spin_lock(&send_cq->lock);
+ } else if (send_cq->cqn < recv_cq->cqn) {
+ pthread_spin_lock(&send_cq->lock);
+ pthread_spin_lock(&recv_cq->lock);
+ } else {
+ pthread_spin_lock(&recv_cq->lock);
+ pthread_spin_lock(&send_cq->lock);
+ }
+}
+
+static void hns_roce_unlock_cqs(struct ibv_qp *qp)
+{
+ struct hns_roce_cq *send_cq = to_hr_cq(qp->send_cq);
+ struct hns_roce_cq *recv_cq = to_hr_cq(qp->recv_cq);
+
+ if (send_cq == recv_cq) {
+ pthread_spin_unlock(&send_cq->lock);
+ } else if (send_cq->cqn < recv_cq->cqn) {
+ pthread_spin_unlock(&recv_cq->lock);
+ pthread_spin_unlock(&send_cq->lock);
+ } else {
+ pthread_spin_unlock(&send_cq->lock);
+ pthread_spin_unlock(&recv_cq->lock);
+ }
+}
+
+static int hns_roce_u_v1_destroy_qp(struct ibv_qp *ibqp)
+{
+ int ret;
+ struct hns_roce_qp *qp = to_hr_qp(ibqp);
+
+ pthread_mutex_lock(&to_hr_ctx(ibqp->context)->qp_table_mutex);
+ ret = ibv_cmd_destroy_qp(ibqp);
+ if (ret) {
+ pthread_mutex_unlock(&to_hr_ctx(ibqp->context)->qp_table_mutex);
+ return ret;
+ }
+
+ hns_roce_lock_cqs(ibqp);
+
+ __hns_roce_v1_cq_clean(to_hr_cq(ibqp->recv_cq), ibqp->qp_num,
+ ibqp->srq ? to_hr_srq(ibqp->srq) : NULL);
+
+ if (ibqp->send_cq != ibqp->recv_cq)
+ __hns_roce_v1_cq_clean(to_hr_cq(ibqp->send_cq), ibqp->qp_num,
+ NULL);
+
+ hns_roce_clear_qp(to_hr_ctx(ibqp->context), ibqp->qp_num);
+
+ hns_roce_unlock_cqs(ibqp);
+ pthread_mutex_unlock(&to_hr_ctx(ibqp->context)->qp_table_mutex);
+
+ free(qp->sq.wrid);
+ if (qp->rq.wqe_cnt)
+ free(qp->rq.wrid);
+
+ hns_roce_free_buf(&qp->buf);
+ free(qp);
+
+ return ret;
+}
+
struct hns_roce_u_hw hns_roce_u_hw_v1 = {
.poll_cq = hns_roce_u_v1_poll_cq,
.arm_cq = hns_roce_u_v1_arm_cq,
+ .modify_qp = hns_roce_u_v1_modify_qp,
+ .destroy_qp = hns_roce_u_v1_destroy_qp,
};
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index c9324dd..0b8f444 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -38,11 +38,19 @@
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
-
+#include <ccan/minmax.h>
#include "hns_roce_u.h"
#include "hns_roce_u_abi.h"
#include "hns_roce_u_hw_v1.h"
+void hns_roce_init_qp_indices(struct hns_roce_qp *qp)
+{
+ qp->sq.head = 0;
+ qp->sq.tail = 0;
+ qp->rq.head = 0;
+ qp->rq.tail = 0;
+}
+
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr)
{
@@ -163,6 +171,29 @@ static int align_cq_size(int req)
return nent;
}
+static int align_qp_size(int req)
+{
+ int nent;
+
+ for (nent = HNS_ROCE_MIN_WQE_NUM; nent < req; nent <<= 1)
+ ;
+
+ return nent;
+}
+
+static void hns_roce_set_sq_sizes(struct hns_roce_qp *qp,
+ struct ibv_qp_cap *cap, enum ibv_qp_type type)
+{
+ struct hns_roce_context *ctx = to_hr_ctx(qp->ibv_qp.context);
+
+ qp->sq.max_gs = 2;
+ cap->max_send_sge = min(ctx->max_sge, qp->sq.max_gs);
+ qp->sq.max_post = min(ctx->max_qp_wr, qp->sq.wqe_cnt);
+ cap->max_send_wr = qp->sq.max_post;
+ qp->max_inline_data = 32;
+ cap->max_inline_data = qp->max_inline_data;
+}
+
static int hns_roce_verify_cq(int *cqe, struct hns_roce_context *context)
{
if (*cqe < HNS_ROCE_MIN_CQE_NUM) {
@@ -189,6 +220,17 @@ static int hns_roce_alloc_cq_buf(struct hns_roce_device *dev,
return 0;
}
+static void hns_roce_calc_sq_wqe_size(struct ibv_qp_cap *cap,
+ enum ibv_qp_type type,
+ struct hns_roce_qp *qp)
+{
+ int size = sizeof(struct hns_roce_rc_send_wqe);
+
+ for (qp->sq.wqe_shift = 6; 1 << qp->sq.wqe_shift < size;
+ qp->sq.wqe_shift++)
+ ;
+}
+
struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
struct ibv_comp_channel *channel,
int comp_vector)
@@ -266,3 +308,218 @@ int hns_roce_u_destroy_cq(struct ibv_cq *cq)
return ret;
}
+
+static int hns_roce_verify_qp(struct ibv_qp_init_attr *attr,
+ struct hns_roce_context *context)
+{
+ if (attr->cap.max_send_wr < HNS_ROCE_MIN_WQE_NUM) {
+ fprintf(stderr,
+ "max_send_wr = %d, less than minimum WQE number.\n",
+ attr->cap.max_send_wr);
+ attr->cap.max_send_wr = HNS_ROCE_MIN_WQE_NUM;
+ }
+
+ if (attr->cap.max_recv_wr < HNS_ROCE_MIN_WQE_NUM) {
+ fprintf(stderr,
+ "max_recv_wr = %d, less than minimum WQE number.\n",
+ attr->cap.max_recv_wr);
+ attr->cap.max_recv_wr = HNS_ROCE_MIN_WQE_NUM;
+ }
+
+ if (attr->cap.max_recv_sge < 1)
+ attr->cap.max_recv_sge = 1;
+ if (attr->cap.max_send_wr > context->max_qp_wr ||
+ attr->cap.max_recv_wr > context->max_qp_wr ||
+ attr->cap.max_send_sge > context->max_sge ||
+ attr->cap.max_recv_sge > context->max_sge)
+ return -1;
+
+ if ((attr->qp_type != IBV_QPT_RC) && (attr->qp_type != IBV_QPT_UD))
+ return -1;
+
+ if ((attr->qp_type == IBV_QPT_RC) &&
+ (attr->cap.max_inline_data > HNS_ROCE_RC_WQE_INLINE_DATA_MAX_LEN))
+ return -1;
+
+ if (attr->qp_type == IBV_QPT_UC)
+ return -1;
+
+ return 0;
+}
+
+static int hns_roce_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap,
+ enum ibv_qp_type type, struct hns_roce_qp *qp)
+{
+ qp->sq.wrid =
+ (unsigned long *)malloc(qp->sq.wqe_cnt * sizeof(uint64_t));
+ if (!qp->sq.wrid)
+ return -1;
+
+ if (qp->rq.wqe_cnt) {
+ qp->rq.wrid = malloc(qp->rq.wqe_cnt * sizeof(uint64_t));
+ if (!qp->rq.wrid) {
+ free(qp->sq.wrid);
+ return -1;
+ }
+ }
+
+ for (qp->rq.wqe_shift = 4;
+ 1 << qp->rq.wqe_shift < sizeof(struct hns_roce_rc_send_wqe);
+ qp->rq.wqe_shift++)
+ ;
+
+ qp->buf_size = align((qp->sq.wqe_cnt << qp->sq.wqe_shift), 0x1000) +
+ (qp->rq.wqe_cnt << qp->rq.wqe_shift);
+
+ if (qp->rq.wqe_shift > qp->sq.wqe_shift) {
+ qp->rq.offset = 0;
+ qp->sq.offset = qp->rq.wqe_cnt << qp->rq.wqe_shift;
+ } else {
+ qp->rq.offset = align((qp->sq.wqe_cnt << qp->sq.wqe_shift),
+ 0x1000);
+ qp->sq.offset = 0;
+ }
+
+ if (hns_roce_alloc_buf(&qp->buf, align(qp->buf_size, 0x1000),
+ to_hr_dev(pd->context->device)->page_size)) {
+ free(qp->sq.wrid);
+ free(qp->rq.wrid);
+ return -1;
+ }
+
+ memset(qp->buf.buf, 0, qp->buf_size);
+
+ return 0;
+}
+
+static int hns_roce_store_qp(struct hns_roce_context *ctx, uint32_t qpn,
+ struct hns_roce_qp *qp)
+{
+ int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift;
+
+ if (!ctx->qp_table[tind].refcnt) {
+ ctx->qp_table[tind].table = calloc(ctx->qp_table_mask + 1,
+ sizeof(struct hns_roce_qp *));
+ if (!ctx->qp_table[tind].table)
+ return -1;
+ }
+
+ ++ctx->qp_table[tind].refcnt;
+ ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = qp;
+
+ return 0;
+}
+
+struct ibv_qp *hns_roce_u_create_qp(struct ibv_pd *pd,
+ struct ibv_qp_init_attr *attr)
+{
+ int ret;
+ struct hns_roce_qp *qp = NULL;
+ struct hns_roce_create_qp cmd;
+ struct ibv_create_qp_resp resp;
+ struct hns_roce_context *context = to_hr_ctx(pd->context);
+
+ if (hns_roce_verify_qp(attr, context)) {
+ fprintf(stderr, "hns_roce_verify_sizes failed!\n");
+ return NULL;
+ }
+
+ qp = malloc(sizeof(*qp));
+ if (!qp) {
+ fprintf(stderr, "malloc failed!\n");
+ return NULL;
+ }
+
+ hns_roce_calc_sq_wqe_size(&attr->cap, attr->qp_type, qp);
+ qp->sq.wqe_cnt = align_qp_size(attr->cap.max_send_wr);
+ qp->rq.wqe_cnt = align_qp_size(attr->cap.max_recv_wr);
+
+ if (hns_roce_alloc_qp_buf(pd, &attr->cap, attr->qp_type, qp)) {
+ fprintf(stderr, "hns_roce_alloc_qp_buf failed!\n");
+ goto err;
+ }
+
+ hns_roce_init_qp_indices(qp);
+
+ if (pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE) ||
+ pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE)) {
+ fprintf(stderr, "pthread_spin_init failed!\n");
+ goto err_free;
+ }
+
+ cmd.buf_addr = (uintptr_t) qp->buf.buf;
+ cmd.log_sq_stride = qp->sq.wqe_shift;
+ for (cmd.log_sq_bb_count = 0; qp->sq.wqe_cnt > 1 << cmd.log_sq_bb_count;
+ ++cmd.log_sq_bb_count)
+ ;
+
+ memset(cmd.reserved, 0, sizeof(cmd.reserved));
+
+ pthread_mutex_lock(&to_hr_ctx(pd->context)->qp_table_mutex);
+
+ ret = ibv_cmd_create_qp(pd, &qp->ibv_qp, attr, &cmd.ibv_cmd,
+ sizeof(cmd), &resp, sizeof(resp));
+ if (ret) {
+ fprintf(stderr, "ibv_cmd_create_qp failed!\n");
+ goto err_rq_db;
+ }
+
+ ret = hns_roce_store_qp(to_hr_ctx(pd->context), qp->ibv_qp.qp_num, qp);
+ if (ret) {
+ fprintf(stderr, "hns_roce_store_qp failed!\n");
+ goto err_destroy;
+ }
+ pthread_mutex_unlock(&to_hr_ctx(pd->context)->qp_table_mutex);
+
+ qp->rq.wqe_cnt = attr->cap.max_recv_wr;
+ qp->rq.max_gs = attr->cap.max_recv_sge;
+
+ /* adjust rq maxima to not exceed reported device maxima */
+ attr->cap.max_recv_wr = min(context->max_qp_wr, attr->cap.max_recv_wr);
+ attr->cap.max_recv_sge = min(context->max_sge, attr->cap.max_recv_sge);
+
+ qp->rq.max_post = attr->cap.max_recv_wr;
+ hns_roce_set_sq_sizes(qp, &attr->cap, attr->qp_type);
+
+ qp->sq_signal_bits = attr->sq_sig_all ? 0 : 1;
+
+ return &qp->ibv_qp;
+
+err_destroy:
+ ibv_cmd_destroy_qp(&qp->ibv_qp);
+
+err_rq_db:
+ pthread_mutex_unlock(&to_hr_ctx(pd->context)->qp_table_mutex);
+
+err_free:
+ free(qp->sq.wrid);
+ if (qp->rq.wqe_cnt)
+ free(qp->rq.wrid);
+ hns_roce_free_buf(&qp->buf);
+
+err:
+ free(qp);
+
+ return NULL;
+}
+
+int hns_roce_u_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr,
+ int attr_mask, struct ibv_qp_init_attr *init_attr)
+{
+ int ret;
+ struct ibv_query_qp cmd;
+ struct hns_roce_qp *qp = to_hr_qp(ibqp);
+
+ ret = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd,
+ sizeof(cmd));
+ if (ret)
+ return ret;
+
+ init_attr->cap.max_send_wr = qp->sq.max_post;
+ init_attr->cap.max_send_sge = qp->sq.max_gs;
+ init_attr->cap.max_inline_data = qp->max_inline_data;
+
+ attr->cap = init_attr->cap;
+
+ return ret;
+}
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 rdma-core 4/7] libhns: Add verbs of cq support
From: Lijun Ou @ 2016-11-13 10:35 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1479033360-56035-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch mainly introduces the relatived cq verbs for userspace
of hns, include:
1. create_cq
2. poll_cq
3. req_notify_cq
4. cq_event
5. destroy_cq
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
v5:
- Eliminate the warning when CFLAGS equal to -m32
v4:
- Eliminate the warning by Travis CI testing
v3:
- No change over the v2
v2:
- Delete the unused code
v1:
- The initial submit
---
providers/hns/hns_roce_u.c | 57 +++++-
providers/hns/hns_roce_u.h | 94 ++++++++++
providers/hns/hns_roce_u_abi.h | 12 ++
providers/hns/hns_roce_u_buf.c | 61 +++++++
providers/hns/hns_roce_u_db.h | 54 ++++++
providers/hns/hns_roce_u_hw_v1.c | 368 +++++++++++++++++++++++++++++++++++++++
providers/hns/hns_roce_u_hw_v1.h | 163 +++++++++++++++++
providers/hns/hns_roce_u_verbs.c | 116 ++++++++++++
8 files changed, 920 insertions(+), 5 deletions(-)
create mode 100644 providers/hns/hns_roce_u_buf.c
create mode 100644 providers/hns/hns_roce_u_db.h
create mode 100644 providers/hns/hns_roce_u_hw_v1.c
create mode 100644 providers/hns/hns_roce_u_hw_v1.h
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index 53e2720..1877218 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -46,15 +46,19 @@
static const struct {
char hid[HID_LEN];
+ void *data;
+ int version;
} acpi_table[] = {
- {"acpi:HISI00D1:"},
- {},
+ {"acpi:HISI00D1:", &hns_roce_u_hw_v1, HNS_ROCE_HW_VER1},
+ {},
};
static const struct {
char compatible[DEV_MATCH_LEN];
+ void *data;
+ int version;
} dt_table[] = {
- {"hisilicon,hns-roce-v1"},
+ {"hisilicon,hns-roce-v1", &hns_roce_u_hw_v1, HNS_ROCE_HW_VER1},
{},
};
@@ -93,6 +97,21 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
goto err_free;
}
+ if (hr_dev->hw_version == HNS_ROCE_HW_VER1) {
+ /*
+ * when vma->vm_pgoff is 1, the cq_tptr_base includes 64K CQ,
+ * a pointer of CQ need 2B size
+ */
+ context->cq_tptr_base = mmap(NULL, HNS_ROCE_CQ_DB_BUF_SIZE,
+ PROT_READ | PROT_WRITE, MAP_SHARED,
+ cmd_fd, HNS_ROCE_TPTR_OFFSET);
+ if (context->cq_tptr_base == MAP_FAILED) {
+ fprintf(stderr,
+ PFX "Warning: Failed to mmap cq_tptr page.\n");
+ goto db_free;
+ }
+ }
+
pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE);
context->ibv_ctx.ops.query_device = hns_roce_u_query_device;
@@ -102,6 +121,12 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
context->ibv_ctx.ops.reg_mr = hns_roce_u_reg_mr;
context->ibv_ctx.ops.dereg_mr = hns_roce_u_dereg_mr;
+ context->ibv_ctx.ops.create_cq = hns_roce_u_create_cq;
+ context->ibv_ctx.ops.poll_cq = hr_dev->u_hw->poll_cq;
+ context->ibv_ctx.ops.req_notify_cq = hr_dev->u_hw->arm_cq;
+ context->ibv_ctx.ops.cq_event = hns_roce_u_cq_event;
+ context->ibv_ctx.ops.destroy_cq = hns_roce_u_destroy_cq;
+
if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
goto tptr_free;
@@ -112,6 +137,16 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
return &context->ibv_ctx;
tptr_free:
+ if (hr_dev->hw_version == HNS_ROCE_HW_VER1) {
+ if (munmap(context->cq_tptr_base, HNS_ROCE_CQ_DB_BUF_SIZE))
+ fprintf(stderr, PFX "Warning: Munmap tptr failed.\n");
+ context->cq_tptr_base = NULL;
+ }
+
+db_free:
+ munmap(context->uar, to_hr_dev(ibdev)->page_size);
+ context->uar = NULL;
+
err_free:
free(context);
return NULL;
@@ -122,6 +157,8 @@ static void hns_roce_free_context(struct ibv_context *ibctx)
struct hns_roce_context *context = to_hr_ctx(ibctx);
munmap(context->uar, to_hr_dev(ibctx->device)->page_size);
+ if (to_hr_dev(ibctx->device)->hw_version == HNS_ROCE_HW_VER1)
+ munmap(context->cq_tptr_base, HNS_ROCE_CQ_DB_BUF_SIZE);
context->uar = NULL;
@@ -140,18 +177,26 @@ static struct ibv_device *hns_roce_driver_init(const char *uverbs_sys_path,
struct hns_roce_device *dev;
char value[128];
int i;
+ void *u_hw;
+ int hw_version;
if (ibv_read_sysfs_file(uverbs_sys_path, "device/modalias",
value, sizeof(value)) > 0)
for (i = 0; i < sizeof(acpi_table) / sizeof(acpi_table[0]); ++i)
- if (!strcmp(value, acpi_table[i].hid))
+ if (!strcmp(value, acpi_table[i].hid)) {
+ u_hw = acpi_table[i].data;
+ hw_version = acpi_table[i].version;
goto found;
+ }
if (ibv_read_sysfs_file(uverbs_sys_path, "device/of_node/compatible",
value, sizeof(value)) > 0)
for (i = 0; i < sizeof(dt_table) / sizeof(dt_table[0]); ++i)
- if (!strcmp(value, dt_table[i].compatible))
+ if (!strcmp(value, dt_table[i].compatible)) {
+ u_hw = dt_table[i].data;
+ hw_version = dt_table[i].version;
goto found;
+ }
return NULL;
@@ -164,6 +209,8 @@ found:
}
dev->ibv_dev.ops = hns_roce_dev_ops;
+ dev->u_hw = (struct hns_roce_u_hw *)u_hw;
+ dev->hw_version = hw_version;
dev->page_size = sysconf(_SC_PAGESIZE);
return &dev->ibv_dev;
}
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index 5b73794..c3e364d 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -40,18 +40,53 @@
#include <infiniband/verbs.h>
#include <ccan/container_of.h>
+#define HNS_ROCE_CQE_ENTRY_SIZE 0x20
+
+#define HNS_ROCE_MAX_CQ_NUM 0x10000
+#define HNS_ROCE_MIN_CQE_NUM 0x40
+#define HNS_ROCE_CQ_DB_BUF_SIZE ((HNS_ROCE_MAX_CQ_NUM >> 11) << 12)
+#define HNS_ROCE_TPTR_OFFSET 0x1000
#define HNS_ROCE_HW_VER1 ('h' << 24 | 'i' << 16 | '0' << 8 | '6')
#define PFX "hns: "
+#define roce_get_field(origin, mask, shift) \
+ (((origin) & (mask)) >> (shift))
+
+#define roce_get_bit(origin, shift) \
+ roce_get_field((origin), (1ul << (shift)), (shift))
+
+#define roce_set_field(origin, mask, shift, val) \
+ do { \
+ (origin) &= (~(mask)); \
+ (origin) |= (((unsigned int)(val) << (shift)) & (mask)); \
+ } while (0)
+
+#define roce_set_bit(origin, shift, val) \
+ roce_set_field((origin), (1ul << (shift)), (shift), (val))
+
enum {
HNS_ROCE_QP_TABLE_BITS = 8,
HNS_ROCE_QP_TABLE_SIZE = 1 << HNS_ROCE_QP_TABLE_BITS,
};
+/* operation type list */
+enum {
+ /* rq&srq operation */
+ HNS_ROCE_OPCODE_SEND_DATA_RECEIVE = 0x06,
+ HNS_ROCE_OPCODE_RDMA_WITH_IMM_RECEIVE = 0x07,
+};
+
struct hns_roce_device {
struct ibv_device ibv_dev;
int page_size;
+ struct hns_roce_u_hw *u_hw;
+ int hw_version;
+};
+
+struct hns_roce_buf {
+ void *buf;
+ unsigned int length;
};
struct hns_roce_context {
@@ -59,7 +94,10 @@ struct hns_roce_context {
void *uar;
pthread_spinlock_t uar_lock;
+ void *cq_tptr_base;
+
struct {
+ struct hns_roce_qp **table;
int refcnt;
} qp_table[HNS_ROCE_QP_TABLE_SIZE];
@@ -78,6 +116,44 @@ struct hns_roce_pd {
unsigned int pdn;
};
+struct hns_roce_cq {
+ struct ibv_cq ibv_cq;
+ struct hns_roce_buf buf;
+ pthread_spinlock_t lock;
+ unsigned int cqn;
+ unsigned int cq_depth;
+ unsigned int cons_index;
+ unsigned int *set_ci_db;
+ unsigned int *arm_db;
+ int arm_sn;
+};
+
+struct hns_roce_wq {
+ unsigned long *wrid;
+ unsigned int wqe_cnt;
+ unsigned int tail;
+ int wqe_shift;
+ int offset;
+};
+
+struct hns_roce_qp {
+ struct ibv_qp ibv_qp;
+ struct hns_roce_buf buf;
+ unsigned int sq_signal_bits;
+ struct hns_roce_wq sq;
+ struct hns_roce_wq rq;
+};
+
+struct hns_roce_u_hw {
+ int (*poll_cq)(struct ibv_cq *ibvcq, int ne, struct ibv_wc *wc);
+ int (*arm_cq)(struct ibv_cq *ibvcq, int solicited);
+};
+
+static inline unsigned long align(unsigned long val, unsigned long align)
+{
+ return (val + align - 1) & ~(align - 1);
+}
+
static inline struct hns_roce_device *to_hr_dev(struct ibv_device *ibv_dev)
{
return container_of(ibv_dev, struct hns_roce_device, ibv_dev);
@@ -93,6 +169,11 @@ static inline struct hns_roce_pd *to_hr_pd(struct ibv_pd *ibv_pd)
return container_of(ibv_pd, struct hns_roce_pd, ibv_pd);
}
+static inline struct hns_roce_cq *to_hr_cq(struct ibv_cq *ibv_cq)
+{
+ return container_of(ibv_cq, struct hns_roce_cq, ibv_cq);
+}
+
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr);
int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
@@ -105,4 +186,17 @@ struct ibv_mr *hns_roce_u_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
int access);
int hns_roce_u_dereg_mr(struct ibv_mr *mr);
+struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
+ struct ibv_comp_channel *channel,
+ int comp_vector);
+
+int hns_roce_u_destroy_cq(struct ibv_cq *cq);
+void hns_roce_u_cq_event(struct ibv_cq *cq);
+
+int hns_roce_alloc_buf(struct hns_roce_buf *buf, unsigned int size,
+ int page_size);
+void hns_roce_free_buf(struct hns_roce_buf *buf);
+
+extern struct hns_roce_u_hw hns_roce_u_hw_v1;
+
#endif /* _HNS_ROCE_U_H */
diff --git a/providers/hns/hns_roce_u_abi.h b/providers/hns/hns_roce_u_abi.h
index 0a0cd0c..1e62a7e 100644
--- a/providers/hns/hns_roce_u_abi.h
+++ b/providers/hns/hns_roce_u_abi.h
@@ -46,4 +46,16 @@ struct hns_roce_alloc_pd_resp {
__u32 reserved;
};
+struct hns_roce_create_cq {
+ struct ibv_create_cq ibv_cmd;
+ __u64 buf_addr;
+ __u64 db_addr;
+};
+
+struct hns_roce_create_cq_resp {
+ struct ibv_create_cq_resp ibv_resp;
+ __u32 cqn;
+ __u32 reserved;
+};
+
#endif /* _HNS_ROCE_U_ABI_H */
diff --git a/providers/hns/hns_roce_u_buf.c b/providers/hns/hns_roce_u_buf.c
new file mode 100644
index 0000000..f92ea65
--- /dev/null
+++ b/providers/hns/hns_roce_u_buf.c
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <errno.h>
+#include <sys/mman.h>
+
+#include "hns_roce_u.h"
+
+int hns_roce_alloc_buf(struct hns_roce_buf *buf, unsigned int size,
+ int page_size)
+{
+ int ret;
+
+ buf->length = align(size, page_size);
+ buf->buf = mmap(NULL, buf->length, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ if (buf->buf == MAP_FAILED)
+ return errno;
+
+ ret = ibv_dontfork_range(buf->buf, size);
+ if (ret)
+ munmap(buf->buf, buf->length);
+
+ return ret;
+}
+
+void hns_roce_free_buf(struct hns_roce_buf *buf)
+{
+ ibv_dofork_range(buf->buf, buf->length);
+
+ munmap(buf->buf, buf->length);
+}
diff --git a/providers/hns/hns_roce_u_db.h b/providers/hns/hns_roce_u_db.h
new file mode 100644
index 0000000..76d13ce
--- /dev/null
+++ b/providers/hns/hns_roce_u_db.h
@@ -0,0 +1,54 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/types.h>
+
+#include "hns_roce_u.h"
+
+#ifndef _HNS_ROCE_U_DB_H
+#define _HNS_ROCE_U_DB_H
+
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define HNS_ROCE_PAIR_TO_64(val) ((uint64_t) val[1] << 32 | val[0])
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define HNS_ROCE_PAIR_TO_64(val) ((uint64_t) val[0] << 32 | val[1])
+#else
+#error __BYTE_ORDER not defined
+#endif
+
+static inline void hns_roce_write64(uint32_t val[2],
+ struct hns_roce_context *ctx, int offset)
+{
+ *(volatile uint64_t *) (ctx->uar + offset) = HNS_ROCE_PAIR_TO_64(val);
+}
+
+#endif /* _HNS_ROCE_U_DB_H */
diff --git a/providers/hns/hns_roce_u_hw_v1.c b/providers/hns/hns_roce_u_hw_v1.c
new file mode 100644
index 0000000..39a67b1
--- /dev/null
+++ b/providers/hns/hns_roce_u_hw_v1.c
@@ -0,0 +1,368 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <malloc.h>
+#include "hns_roce_u_db.h"
+#include "hns_roce_u_hw_v1.h"
+#include "hns_roce_u.h"
+
+static void hns_roce_update_cq_cons_index(struct hns_roce_context *ctx,
+ struct hns_roce_cq *cq)
+{
+ struct hns_roce_cq_db cq_db;
+
+ cq_db.u32_4 = 0;
+ cq_db.u32_8 = 0;
+
+ roce_set_bit(cq_db.u32_8, CQ_DB_U32_8_HW_SYNC_S, 1);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CMD_M, CQ_DB_U32_8_CMD_S, 3);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CMD_MDF_M,
+ CQ_DB_U32_8_CMD_MDF_S, 0);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CQN_M, CQ_DB_U32_8_CQN_S,
+ cq->cqn);
+ roce_set_field(cq_db.u32_4, CQ_DB_U32_4_CONS_IDX_M,
+ CQ_DB_U32_4_CONS_IDX_S,
+ cq->cons_index & ((cq->cq_depth << 1) - 1));
+
+ hns_roce_write64((uint32_t *)&cq_db, ctx, ROCEE_DB_OTHERS_L_0_REG);
+}
+
+static void hns_roce_handle_error_cqe(struct hns_roce_cqe *cqe,
+ struct ibv_wc *wc)
+{
+ switch (roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_STATUS_OF_THE_OPERATION_M,
+ CQE_BYTE_4_STATUS_OF_THE_OPERATION_S) &
+ HNS_ROCE_CQE_STATUS_MASK) {
+ fprintf(stderr, PFX "error cqe!\n");
+ case HNS_ROCE_CQE_SYNDROME_LOCAL_LENGTH_ERR:
+ wc->status = IBV_WC_LOC_LEN_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_LOCAL_QP_OP_ERR:
+ wc->status = IBV_WC_LOC_QP_OP_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_LOCAL_PROT_ERR:
+ wc->status = IBV_WC_LOC_PROT_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_WR_FLUSH_ERR:
+ wc->status = IBV_WC_WR_FLUSH_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_MEM_MANAGE_OPERATE_ERR:
+ wc->status = IBV_WC_MW_BIND_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_BAD_RESP_ERR:
+ wc->status = IBV_WC_BAD_RESP_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_LOCAL_ACCESS_ERR:
+ wc->status = IBV_WC_LOC_ACCESS_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR:
+ wc->status = IBV_WC_REM_INV_REQ_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_REMOTE_ACCESS_ERR:
+ wc->status = IBV_WC_REM_ACCESS_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_REMOTE_OP_ERR:
+ wc->status = IBV_WC_REM_OP_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR:
+ wc->status = IBV_WC_RETRY_EXC_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_RNR_RETRY_EXC_ERR:
+ wc->status = IBV_WC_RNR_RETRY_EXC_ERR;
+ break;
+ default:
+ wc->status = IBV_WC_GENERAL_ERR;
+ break;
+ }
+}
+
+static struct hns_roce_cqe *get_cqe(struct hns_roce_cq *cq, int entry)
+{
+ return cq->buf.buf + entry * HNS_ROCE_CQE_ENTRY_SIZE;
+}
+
+static void *get_sw_cqe(struct hns_roce_cq *cq, int n)
+{
+ struct hns_roce_cqe *cqe = get_cqe(cq, n & cq->ibv_cq.cqe);
+
+ return (!!(roce_get_bit(cqe->cqe_byte_4, CQE_BYTE_4_OWNER_S)) ^
+ !!(n & (cq->ibv_cq.cqe + 1))) ? cqe : NULL;
+}
+
+static struct hns_roce_cqe *next_cqe_sw(struct hns_roce_cq *cq)
+{
+ return get_sw_cqe(cq, cq->cons_index);
+}
+
+static void *get_send_wqe(struct hns_roce_qp *qp, int n)
+{
+ if ((n < 0) || (n > qp->sq.wqe_cnt)) {
+ printf("sq wqe index:%d,sq wqe cnt:%d\r\n", n, qp->sq.wqe_cnt);
+ return NULL;
+ }
+
+ return (void *)(qp->buf.buf + qp->sq.offset + (n << qp->sq.wqe_shift));
+}
+
+static struct hns_roce_qp *hns_roce_find_qp(struct hns_roce_context *ctx,
+ uint32_t qpn)
+{
+ int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift;
+
+ if (ctx->qp_table[tind].refcnt) {
+ return ctx->qp_table[tind].table[qpn & ctx->qp_table_mask];
+ } else {
+ printf("hns_roce_find_qp fail!\n");
+ return NULL;
+ }
+}
+
+static int hns_roce_v1_poll_one(struct hns_roce_cq *cq,
+ struct hns_roce_qp **cur_qp, struct ibv_wc *wc)
+{
+ uint32_t qpn;
+ int is_send;
+ uint16_t wqe_ctr;
+ uint32_t local_qpn;
+ struct hns_roce_wq *wq = NULL;
+ struct hns_roce_cqe *cqe = NULL;
+ struct hns_roce_wqe_ctrl_seg *sq_wqe = NULL;
+
+ /* According to CI, find the relative cqe */
+ cqe = next_cqe_sw(cq);
+ if (!cqe)
+ return CQ_EMPTY;
+
+ /* Get the next cqe, CI will be added gradually */
+ ++cq->cons_index;
+
+ rmb();
+
+ qpn = roce_get_field(cqe->cqe_byte_16, CQE_BYTE_16_LOCAL_QPN_M,
+ CQE_BYTE_16_LOCAL_QPN_S);
+
+ is_send = (roce_get_bit(cqe->cqe_byte_4, CQE_BYTE_4_SQ_RQ_FLAG_S) ==
+ HNS_ROCE_CQE_IS_SQ);
+
+ local_qpn = roce_get_field(cqe->cqe_byte_16, CQE_BYTE_16_LOCAL_QPN_M,
+ CQE_BYTE_16_LOCAL_QPN_S);
+
+ /* if qp is zero, it will not get the correct qpn */
+ if (!*cur_qp ||
+ (local_qpn & HNS_ROCE_CQE_QPN_MASK) != (*cur_qp)->ibv_qp.qp_num) {
+
+ *cur_qp = hns_roce_find_qp(to_hr_ctx(cq->ibv_cq.context),
+ qpn & 0xffffff);
+ if (!*cur_qp) {
+ fprintf(stderr, PFX "can't find qp!\n");
+ return CQ_POLL_ERR;
+ }
+ }
+ wc->qp_num = qpn & 0xffffff;
+
+ if (is_send) {
+ wq = &(*cur_qp)->sq;
+ /*
+ * if sq_signal_bits is 1, the tail pointer first update to
+ * the wqe corresponding the current cqe
+ */
+ if ((*cur_qp)->sq_signal_bits) {
+ wqe_ctr = (uint16_t)(roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_WQE_INDEX_M,
+ CQE_BYTE_4_WQE_INDEX_S));
+ /*
+ * wq->tail will plus a positive number every time,
+ * when wq->tail exceeds 32b, it is 0 and acc
+ */
+ wq->tail += (wqe_ctr - (uint16_t) wq->tail) &
+ (wq->wqe_cnt - 1);
+ }
+ /* write the wr_id of wq into the wc */
+ wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+ ++wq->tail;
+ } else {
+ wq = &(*cur_qp)->rq;
+ wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+ ++wq->tail;
+ }
+
+ /*
+ * HW maintains wc status, set the err type and directly return, after
+ * generated the incorrect CQE
+ */
+ if (roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_STATUS_OF_THE_OPERATION_M,
+ CQE_BYTE_4_STATUS_OF_THE_OPERATION_S) != HNS_ROCE_CQE_SUCCESS) {
+ hns_roce_handle_error_cqe(cqe, wc);
+ return CQ_OK;
+ }
+ wc->status = IBV_WC_SUCCESS;
+
+ /*
+ * According to the opcode type of cqe, mark the opcode and other
+ * information of wc
+ */
+ if (is_send) {
+ /* Get opcode and flag before update the tail point for send */
+ sq_wqe = (struct hns_roce_wqe_ctrl_seg *)
+ get_send_wqe(*cur_qp, roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_WQE_INDEX_M,
+ CQE_BYTE_4_WQE_INDEX_S));
+ switch (sq_wqe->flag & HNS_ROCE_WQE_OPCODE_MASK) {
+ case HNS_ROCE_WQE_OPCODE_SEND:
+ wc->opcode = IBV_WC_SEND;
+ break;
+ case HNS_ROCE_WQE_OPCODE_RDMA_READ:
+ wc->opcode = IBV_WC_RDMA_READ;
+ wc->byte_len = cqe->byte_cnt;
+ break;
+ case HNS_ROCE_WQE_OPCODE_RDMA_WRITE:
+ wc->opcode = IBV_WC_RDMA_WRITE;
+ break;
+ case HNS_ROCE_WQE_OPCODE_BIND_MW2:
+ wc->opcode = IBV_WC_BIND_MW;
+ break;
+ default:
+ wc->status = IBV_WC_GENERAL_ERR;
+ break;
+ }
+ wc->wc_flags = (sq_wqe->flag & HNS_ROCE_WQE_IMM ?
+ IBV_WC_WITH_IMM : 0);
+ } else {
+ /* Get opcode and flag in rq&srq */
+ wc->byte_len = (cqe->byte_cnt);
+
+ switch (roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_OPERATION_TYPE_M,
+ CQE_BYTE_4_OPERATION_TYPE_S) &
+ HNS_ROCE_CQE_OPCODE_MASK) {
+ case HNS_ROCE_OPCODE_RDMA_WITH_IMM_RECEIVE:
+ wc->opcode = IBV_WC_RECV_RDMA_WITH_IMM;
+ wc->wc_flags = IBV_WC_WITH_IMM;
+ wc->imm_data = cqe->immediate_data;
+ break;
+ case HNS_ROCE_OPCODE_SEND_DATA_RECEIVE:
+ if (roce_get_bit(cqe->cqe_byte_4,
+ CQE_BYTE_4_IMMEDIATE_DATA_FLAG_S)) {
+ wc->opcode = IBV_WC_RECV;
+ wc->wc_flags = IBV_WC_WITH_IMM;
+ wc->imm_data = cqe->immediate_data;
+ } else {
+ wc->opcode = IBV_WC_RECV;
+ wc->wc_flags = 0;
+ }
+ break;
+ default:
+ wc->status = IBV_WC_GENERAL_ERR;
+ break;
+ }
+ }
+
+ return CQ_OK;
+}
+
+static int hns_roce_u_v1_poll_cq(struct ibv_cq *ibvcq, int ne,
+ struct ibv_wc *wc)
+{
+ int npolled;
+ int err = CQ_OK;
+ struct hns_roce_qp *qp = NULL;
+ struct hns_roce_cq *cq = to_hr_cq(ibvcq);
+ struct hns_roce_context *ctx = to_hr_ctx(ibvcq->context);
+ struct hns_roce_device *dev = to_hr_dev(ibvcq->context->device);
+
+ pthread_spin_lock(&cq->lock);
+
+ for (npolled = 0; npolled < ne; ++npolled) {
+ err = hns_roce_v1_poll_one(cq, &qp, wc + npolled);
+ if (err != CQ_OK)
+ break;
+ }
+
+ if (npolled) {
+ if (dev->hw_version == HNS_ROCE_HW_VER1) {
+ *cq->set_ci_db = (unsigned short)(cq->cons_index &
+ ((cq->cq_depth << 1) - 1));
+ mb();
+ }
+
+ hns_roce_update_cq_cons_index(ctx, cq);
+ }
+
+ pthread_spin_unlock(&cq->lock);
+
+ return err == CQ_POLL_ERR ? err : npolled;
+}
+
+/**
+ * hns_roce_u_v1_arm_cq - request completion notification on a CQ
+ * @ibvcq: The completion queue to request notification for.
+ * @solicited: If non-zero, a event will be generated only for
+ * the next solicited CQ entry. If zero, any CQ entry,
+ * solicited or not, will generate an event
+ */
+static int hns_roce_u_v1_arm_cq(struct ibv_cq *ibvcq, int solicited)
+{
+ uint32_t ci;
+ uint32_t solicited_flag;
+ struct hns_roce_cq_db cq_db;
+ struct hns_roce_cq *cq = to_hr_cq(ibvcq);
+
+ ci = cq->cons_index & ((cq->cq_depth << 1) - 1);
+ solicited_flag = solicited ? HNS_ROCE_CQ_DB_REQ_SOL :
+ HNS_ROCE_CQ_DB_REQ_NEXT;
+
+ cq_db.u32_4 = 0;
+ cq_db.u32_8 = 0;
+
+ roce_set_bit(cq_db.u32_8, CQ_DB_U32_8_HW_SYNC_S, 1);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CMD_M, CQ_DB_U32_8_CMD_S, 3);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CMD_MDF_M,
+ CQ_DB_U32_8_CMD_MDF_S, 1);
+ roce_set_bit(cq_db.u32_8, CQ_DB_U32_8_NOTIFY_TYPE_S, solicited_flag);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CQN_M, CQ_DB_U32_8_CQN_S,
+ cq->cqn);
+ roce_set_field(cq_db.u32_4, CQ_DB_U32_4_CONS_IDX_M,
+ CQ_DB_U32_4_CONS_IDX_S, ci);
+
+ hns_roce_write64((uint32_t *)&cq_db, to_hr_ctx(ibvcq->context),
+ ROCEE_DB_OTHERS_L_0_REG);
+ return 0;
+}
+
+struct hns_roce_u_hw hns_roce_u_hw_v1 = {
+ .poll_cq = hns_roce_u_v1_poll_cq,
+ .arm_cq = hns_roce_u_v1_arm_cq,
+};
diff --git a/providers/hns/hns_roce_u_hw_v1.h b/providers/hns/hns_roce_u_hw_v1.h
new file mode 100644
index 0000000..b249f54
--- /dev/null
+++ b/providers/hns/hns_roce_u_hw_v1.h
@@ -0,0 +1,163 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _HNS_ROCE_U_HW_V1_H
+#define _HNS_ROCE_U_HW_V1_H
+
+#define HNS_ROCE_CQ_DB_REQ_SOL 1
+#define HNS_ROCE_CQ_DB_REQ_NEXT 0
+
+#define HNS_ROCE_CQE_IS_SQ 0
+
+#define HNS_ROCE_RC_WQE_INLINE_DATA_MAX_LEN 32
+
+enum {
+ HNS_ROCE_WQE_IMM = 1 << 23,
+ HNS_ROCE_WQE_OPCODE_SEND = 0 << 16,
+ HNS_ROCE_WQE_OPCODE_RDMA_READ = 1 << 16,
+ HNS_ROCE_WQE_OPCODE_RDMA_WRITE = 2 << 16,
+ HNS_ROCE_WQE_OPCODE_BIND_MW2 = 6 << 16,
+ HNS_ROCE_WQE_OPCODE_MASK = 15 << 16,
+};
+
+struct hns_roce_wqe_ctrl_seg {
+ __be32 sgl_pa_h;
+ __be32 flag;
+};
+
+enum {
+ CQ_OK = 0,
+ CQ_EMPTY = -1,
+ CQ_POLL_ERR = -2,
+};
+
+enum {
+ HNS_ROCE_CQE_QPN_MASK = 0x3ffff,
+ HNS_ROCE_CQE_STATUS_MASK = 0x1f,
+ HNS_ROCE_CQE_OPCODE_MASK = 0xf,
+};
+
+enum {
+ HNS_ROCE_CQE_SUCCESS,
+ HNS_ROCE_CQE_SYNDROME_LOCAL_LENGTH_ERR,
+ HNS_ROCE_CQE_SYNDROME_LOCAL_QP_OP_ERR,
+ HNS_ROCE_CQE_SYNDROME_LOCAL_PROT_ERR,
+ HNS_ROCE_CQE_SYNDROME_WR_FLUSH_ERR,
+ HNS_ROCE_CQE_SYNDROME_MEM_MANAGE_OPERATE_ERR,
+ HNS_ROCE_CQE_SYNDROME_BAD_RESP_ERR,
+ HNS_ROCE_CQE_SYNDROME_LOCAL_ACCESS_ERR,
+ HNS_ROCE_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR,
+ HNS_ROCE_CQE_SYNDROME_REMOTE_ACCESS_ERR,
+ HNS_ROCE_CQE_SYNDROME_REMOTE_OP_ERR,
+ HNS_ROCE_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR,
+ HNS_ROCE_CQE_SYNDROME_RNR_RETRY_EXC_ERR,
+};
+
+struct hns_roce_cq_db {
+ unsigned int u32_4;
+ unsigned int u32_8;
+};
+#define CQ_DB_U32_4_CONS_IDX_S 0
+#define CQ_DB_U32_4_CONS_IDX_M (((1UL << 16) - 1) << CQ_DB_U32_4_CONS_IDX_S)
+
+#define CQ_DB_U32_8_CQN_S 0
+#define CQ_DB_U32_8_CQN_M (((1UL << 16) - 1) << CQ_DB_U32_8_CQN_S)
+
+#define CQ_DB_U32_8_NOTIFY_TYPE_S 16
+
+#define CQ_DB_U32_8_CMD_MDF_S 24
+#define CQ_DB_U32_8_CMD_MDF_M (((1UL << 4) - 1) << CQ_DB_U32_8_CMD_MDF_S)
+
+#define CQ_DB_U32_8_CMD_S 28
+#define CQ_DB_U32_8_CMD_M (((1UL << 3) - 1) << CQ_DB_U32_8_CMD_S)
+
+#define CQ_DB_U32_8_HW_SYNC_S 31
+
+struct hns_roce_cqe {
+ unsigned int cqe_byte_4;
+ union {
+ unsigned int r_key;
+ unsigned int immediate_data;
+ };
+ unsigned int byte_cnt;
+ unsigned int cqe_byte_16;
+ unsigned int cqe_byte_20;
+ unsigned int s_mac_l;
+ unsigned int cqe_byte_28;
+ unsigned int reserved;
+};
+#define CQE_BYTE_4_OPERATION_TYPE_S 0
+#define CQE_BYTE_4_OPERATION_TYPE_M \
+ (((1UL << 4) - 1) << CQE_BYTE_4_OPERATION_TYPE_S)
+
+#define CQE_BYTE_4_OWNER_S 7
+
+#define CQE_BYTE_4_STATUS_OF_THE_OPERATION_S 8
+#define CQE_BYTE_4_STATUS_OF_THE_OPERATION_M \
+ (((1UL << 5) - 1) << CQE_BYTE_4_STATUS_OF_THE_OPERATION_S)
+
+#define CQE_BYTE_4_SQ_RQ_FLAG_S 14
+
+#define CQE_BYTE_4_IMMEDIATE_DATA_FLAG_S 15
+
+#define CQE_BYTE_4_WQE_INDEX_S 16
+#define CQE_BYTE_4_WQE_INDEX_M (((1UL << 14) - 1) << CQE_BYTE_4_WQE_INDEX_S)
+
+#define CQE_BYTE_16_LOCAL_QPN_S 0
+#define CQE_BYTE_16_LOCAL_QPN_M (((1UL << 24) - 1) << CQE_BYTE_16_LOCAL_QPN_S)
+
+#define ROCEE_DB_SQ_L_0_REG 0x230
+
+#define ROCEE_DB_OTHERS_L_0_REG 0x238
+
+struct hns_roce_rc_send_wqe {
+ unsigned int sgl_ba_31_0;
+ unsigned int u32_1;
+ union {
+ unsigned int r_key;
+ unsigned int immediate_data;
+ };
+ unsigned int msg_length;
+ unsigned int rvd_3;
+ unsigned int rvd_4;
+ unsigned int rvd_5;
+ unsigned int rvd_6;
+ uint64_t va0;
+ unsigned int l_key0;
+ unsigned int length0;
+
+ uint64_t va1;
+ unsigned int l_key1;
+ unsigned int length1;
+};
+
+#endif /* _HNS_ROCE_U_HW_V1_H */
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index c163d3c..c9324dd 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -40,6 +40,8 @@
#include <unistd.h>
#include "hns_roce_u.h"
+#include "hns_roce_u_abi.h"
+#include "hns_roce_u_hw_v1.h"
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr)
@@ -150,3 +152,117 @@ int hns_roce_u_dereg_mr(struct ibv_mr *mr)
return ret;
}
+
+static int align_cq_size(int req)
+{
+ int nent;
+
+ for (nent = HNS_ROCE_MIN_CQE_NUM; nent < req; nent <<= 1)
+ ;
+
+ return nent;
+}
+
+static int hns_roce_verify_cq(int *cqe, struct hns_roce_context *context)
+{
+ if (*cqe < HNS_ROCE_MIN_CQE_NUM) {
+ fprintf(stderr, "cqe = %d, less than minimum CQE number.\n",
+ *cqe);
+ *cqe = HNS_ROCE_MIN_CQE_NUM;
+ }
+
+ if (*cqe > context->max_cqe)
+ return -1;
+
+ return 0;
+}
+
+static int hns_roce_alloc_cq_buf(struct hns_roce_device *dev,
+ struct hns_roce_buf *buf, int nent)
+{
+ if (hns_roce_alloc_buf(buf,
+ align(nent * HNS_ROCE_CQE_ENTRY_SIZE, dev->page_size),
+ dev->page_size))
+ return -1;
+ memset(buf->buf, 0, nent * HNS_ROCE_CQE_ENTRY_SIZE);
+
+ return 0;
+}
+
+struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
+ struct ibv_comp_channel *channel,
+ int comp_vector)
+{
+ struct hns_roce_create_cq cmd;
+ struct hns_roce_create_cq_resp resp;
+ struct hns_roce_cq *cq;
+ int ret;
+
+ if (hns_roce_verify_cq(&cqe, to_hr_ctx(context)))
+ return NULL;
+
+ cq = malloc(sizeof(*cq));
+ if (!cq)
+ return NULL;
+
+ cq->cons_index = 0;
+
+ if (pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE))
+ goto err;
+
+ cqe = align_cq_size(cqe);
+
+ if (hns_roce_alloc_cq_buf(to_hr_dev(context->device), &cq->buf, cqe))
+ goto err;
+
+ cmd.buf_addr = (uintptr_t) cq->buf.buf;
+
+ ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector,
+ &cq->ibv_cq, &cmd.ibv_cmd, sizeof(cmd),
+ &resp.ibv_resp, sizeof(resp));
+ if (ret)
+ goto err_db;
+
+ cq->cqn = resp.cqn;
+ cq->cq_depth = cqe;
+
+ if (to_hr_dev(context->device)->hw_version == HNS_ROCE_HW_VER1)
+ cq->set_ci_db = to_hr_ctx(context)->cq_tptr_base + cq->cqn * 2;
+ else
+ cq->set_ci_db = to_hr_ctx(context)->uar +
+ ROCEE_DB_OTHERS_L_0_REG;
+
+ cq->arm_db = cq->set_ci_db;
+ cq->arm_sn = 1;
+ *(cq->set_ci_db) = 0;
+ *(cq->arm_db) = 0;
+
+ return &cq->ibv_cq;
+
+err_db:
+ hns_roce_free_buf(&cq->buf);
+
+err:
+ free(cq);
+
+ return NULL;
+}
+
+void hns_roce_u_cq_event(struct ibv_cq *cq)
+{
+ to_hr_cq(cq)->arm_sn++;
+}
+
+int hns_roce_u_destroy_cq(struct ibv_cq *cq)
+{
+ int ret;
+
+ ret = ibv_cmd_destroy_cq(cq);
+ if (ret)
+ return ret;
+
+ hns_roce_free_buf(&to_hr_cq(cq)->buf);
+ free(to_hr_cq(cq));
+
+ return ret;
+}
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 rdma-core 3/7] libhns: Add verbs of pd and mr support
From: Lijun Ou @ 2016-11-13 10:35 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1479033360-56035-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch mainly introduces the verbs with pd and mr,
included alloc_pd, dealloc_pd, reg_mr and dereg_mr.
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
v5/v4:
- No change over v3
v3:
This fixes the comments given by Leon Romanovsky on PATCHv2
v2:
- No change over v1
v1:
- The initial submit
---
providers/hns/hns_roce_u.c | 4 ++
providers/hns/hns_roce_u.h | 18 +++++++++
providers/hns/hns_roce_u_abi.h | 6 +++
providers/hns/hns_roce_u_verbs.c | 79 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 107 insertions(+)
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index c0f6fe9..53e2720 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -97,6 +97,10 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
context->ibv_ctx.ops.query_device = hns_roce_u_query_device;
context->ibv_ctx.ops.query_port = hns_roce_u_query_port;
+ context->ibv_ctx.ops.alloc_pd = hns_roce_u_alloc_pd;
+ context->ibv_ctx.ops.dealloc_pd = hns_roce_u_free_pd;
+ context->ibv_ctx.ops.reg_mr = hns_roce_u_reg_mr;
+ context->ibv_ctx.ops.dereg_mr = hns_roce_u_dereg_mr;
if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
goto tptr_free;
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index aa58ee6..5b73794 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -73,6 +73,11 @@ struct hns_roce_context {
int max_cqe;
};
+struct hns_roce_pd {
+ struct ibv_pd ibv_pd;
+ unsigned int pdn;
+};
+
static inline struct hns_roce_device *to_hr_dev(struct ibv_device *ibv_dev)
{
return container_of(ibv_dev, struct hns_roce_device, ibv_dev);
@@ -83,8 +88,21 @@ static inline struct hns_roce_context *to_hr_ctx(struct ibv_context *ibv_ctx)
return container_of(ibv_ctx, struct hns_roce_context, ibv_ctx);
}
+static inline struct hns_roce_pd *to_hr_pd(struct ibv_pd *ibv_pd)
+{
+ return container_of(ibv_pd, struct hns_roce_pd, ibv_pd);
+}
+
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr);
int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
struct ibv_port_attr *attr);
+
+struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context);
+int hns_roce_u_free_pd(struct ibv_pd *pd);
+
+struct ibv_mr *hns_roce_u_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
+ int access);
+int hns_roce_u_dereg_mr(struct ibv_mr *mr);
+
#endif /* _HNS_ROCE_U_H */
diff --git a/providers/hns/hns_roce_u_abi.h b/providers/hns/hns_roce_u_abi.h
index 4bfc8fa..0a0cd0c 100644
--- a/providers/hns/hns_roce_u_abi.h
+++ b/providers/hns/hns_roce_u_abi.h
@@ -40,4 +40,10 @@ struct hns_roce_alloc_ucontext_resp {
__u32 qp_tab_size;
};
+struct hns_roce_alloc_pd_resp {
+ struct ibv_alloc_pd_resp ibv_resp;
+ __u32 pdn;
+ __u32 reserved;
+};
+
#endif /* _HNS_ROCE_U_ABI_H */
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index 2d95f20..c163d3c 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -71,3 +71,82 @@ int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd));
}
+
+struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context)
+{
+ struct ibv_alloc_pd cmd;
+ struct hns_roce_pd *pd;
+ struct hns_roce_alloc_pd_resp resp;
+
+ pd = (struct hns_roce_pd *)malloc(sizeof(*pd));
+ if (!pd)
+ return NULL;
+
+ if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd),
+ &resp.ibv_resp, sizeof(resp))) {
+ free(pd);
+ return NULL;
+ }
+
+ pd->pdn = resp.pdn;
+
+ return &pd->ibv_pd;
+}
+
+int hns_roce_u_free_pd(struct ibv_pd *pd)
+{
+ int ret;
+
+ ret = ibv_cmd_dealloc_pd(pd);
+ if (ret)
+ return ret;
+
+ free(to_hr_pd(pd));
+
+ return ret;
+}
+
+struct ibv_mr *hns_roce_u_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
+ int access)
+{
+ int ret;
+ struct ibv_mr *mr;
+ struct ibv_reg_mr cmd;
+ struct ibv_reg_mr_resp resp;
+
+ if (!addr) {
+ fprintf(stderr, "2nd parm addr is NULL!\n");
+ return NULL;
+ }
+
+ if (!length) {
+ fprintf(stderr, "3st parm length is 0!\n");
+ return NULL;
+ }
+
+ mr = malloc(sizeof(*mr));
+ if (!mr)
+ return NULL;
+
+ ret = ibv_cmd_reg_mr(pd, addr, length, (uintptr_t) addr, access, mr,
+ &cmd, sizeof(cmd), &resp, sizeof(resp));
+ if (ret) {
+ free(mr);
+ return NULL;
+ }
+
+ return mr;
+}
+
+int hns_roce_u_dereg_mr(struct ibv_mr *mr)
+{
+ int ret;
+
+ ret = ibv_cmd_dereg_mr(mr);
+ if (ret)
+ return ret;
+
+ free(mr);
+
+ return ret;
+}
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 rdma-core 2/7] libhns: Add verbs of querying device and querying port
From: Lijun Ou @ 2016-11-13 10:35 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1479033360-56035-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch mainly introduces query verbs for querying device
and querying port.
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
v5:
Change the type of raw_fw_ver for uint64_t
v4/v3/v2:
- No change over the v1
v1:
- The initial submit
---
providers/hns/hns_roce_u.c | 7 ++++
providers/hns/hns_roce_u.h | 4 +++
providers/hns/hns_roce_u_verbs.c | 73 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 84 insertions(+)
create mode 100644 providers/hns/hns_roce_u_verbs.c
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index bda4dd8..c0f6fe9 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -95,12 +95,19 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE);
+ context->ibv_ctx.ops.query_device = hns_roce_u_query_device;
+ context->ibv_ctx.ops.query_port = hns_roce_u_query_port;
+
+ if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
+ goto tptr_free;
+
context->max_qp_wr = dev_attrs.max_qp_wr;
context->max_sge = dev_attrs.max_sge;
context->max_cqe = dev_attrs.max_cqe;
return &context->ibv_ctx;
+tptr_free:
err_free:
free(context);
return NULL;
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index 3eef171..aa58ee6 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -83,4 +83,8 @@ static inline struct hns_roce_context *to_hr_ctx(struct ibv_context *ibv_ctx)
return container_of(ibv_ctx, struct hns_roce_context, ibv_ctx);
}
+int hns_roce_u_query_device(struct ibv_context *context,
+ struct ibv_device_attr *attr);
+int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
+ struct ibv_port_attr *attr);
#endif /* _HNS_ROCE_U_H */
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
new file mode 100644
index 0000000..2d95f20
--- /dev/null
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -0,0 +1,73 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <pthread.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include "hns_roce_u.h"
+
+int hns_roce_u_query_device(struct ibv_context *context,
+ struct ibv_device_attr *attr)
+{
+ int ret;
+ struct ibv_query_device cmd;
+ uint64_t raw_fw_ver;
+ unsigned int major, minor, sub_minor;
+
+ ret = ibv_cmd_query_device(context, attr, &raw_fw_ver, &cmd,
+ sizeof(cmd));
+ if (ret)
+ return ret;
+
+ major = (raw_fw_ver >> 32) & 0xffff;
+ minor = (raw_fw_ver >> 16) & 0xffff;
+ sub_minor = raw_fw_ver & 0xffff;
+
+ snprintf(attr->fw_ver, sizeof(attr->fw_ver), "%d.%d.%03d", major, minor,
+ sub_minor);
+
+ return 0;
+}
+
+int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
+ struct ibv_port_attr *attr)
+{
+ struct ibv_query_port cmd;
+
+ return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd));
+}
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 rdma-core 1/7] libhns: Add initial main frame
From: Lijun Ou @ 2016-11-13 10:35 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1479033360-56035-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch mainly introduces initial main frame for
userspace library of hns.
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
v5/v4/v3/v2:
- No change over the v1
v1:
- The initial submit
---
providers/hns/hns_roce_u.c | 163 +++++++++++++++++++++++++++++++++++++++++
providers/hns/hns_roce_u.h | 86 ++++++++++++++++++++++
providers/hns/hns_roce_u_abi.h | 43 +++++++++++
3 files changed, 292 insertions(+)
create mode 100644 providers/hns/hns_roce_u.c
create mode 100644 providers/hns/hns_roce_u.h
create mode 100644 providers/hns/hns_roce_u_abi.h
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
new file mode 100644
index 0000000..bda4dd8
--- /dev/null
+++ b/providers/hns/hns_roce_u.c
@@ -0,0 +1,163 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <pthread.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include "hns_roce_u.h"
+#include "hns_roce_u_abi.h"
+
+#define HID_LEN 15
+#define DEV_MATCH_LEN 128
+
+static const struct {
+ char hid[HID_LEN];
+} acpi_table[] = {
+ {"acpi:HISI00D1:"},
+ {},
+};
+
+static const struct {
+ char compatible[DEV_MATCH_LEN];
+} dt_table[] = {
+ {"hisilicon,hns-roce-v1"},
+ {},
+};
+
+static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
+ int cmd_fd)
+{
+ int i;
+ struct ibv_get_context cmd;
+ struct ibv_device_attr dev_attrs;
+ struct hns_roce_context *context;
+ struct hns_roce_alloc_ucontext_resp resp;
+ struct hns_roce_device *hr_dev = to_hr_dev(ibdev);
+
+ context = calloc(1, sizeof(*context));
+ if (!context)
+ return NULL;
+
+ context->ibv_ctx.cmd_fd = cmd_fd;
+ if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof(cmd),
+ &resp.ibv_resp, sizeof(resp)))
+ goto err_free;
+
+ context->num_qps = resp.qp_tab_size;
+ context->qp_table_shift = ffs(context->num_qps) - 1 -
+ HNS_ROCE_QP_TABLE_BITS;
+ context->qp_table_mask = (1 << context->qp_table_shift) - 1;
+
+ pthread_mutex_init(&context->qp_table_mutex, NULL);
+ for (i = 0; i < HNS_ROCE_QP_TABLE_SIZE; ++i)
+ context->qp_table[i].refcnt = 0;
+
+ context->uar = mmap(NULL, to_hr_dev(ibdev)->page_size,
+ PROT_READ | PROT_WRITE, MAP_SHARED, cmd_fd, 0);
+ if (context->uar == MAP_FAILED) {
+ fprintf(stderr, PFX "Warning: failed to mmap() uar page.\n");
+ goto err_free;
+ }
+
+ pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE);
+
+ context->max_qp_wr = dev_attrs.max_qp_wr;
+ context->max_sge = dev_attrs.max_sge;
+ context->max_cqe = dev_attrs.max_cqe;
+
+ return &context->ibv_ctx;
+
+err_free:
+ free(context);
+ return NULL;
+}
+
+static void hns_roce_free_context(struct ibv_context *ibctx)
+{
+ struct hns_roce_context *context = to_hr_ctx(ibctx);
+
+ munmap(context->uar, to_hr_dev(ibctx->device)->page_size);
+
+ context->uar = NULL;
+
+ free(context);
+ context = NULL;
+}
+
+static struct ibv_device_ops hns_roce_dev_ops = {
+ .alloc_context = hns_roce_alloc_context,
+ .free_context = hns_roce_free_context
+};
+
+static struct ibv_device *hns_roce_driver_init(const char *uverbs_sys_path,
+ int abi_version)
+{
+ struct hns_roce_device *dev;
+ char value[128];
+ int i;
+
+ if (ibv_read_sysfs_file(uverbs_sys_path, "device/modalias",
+ value, sizeof(value)) > 0)
+ for (i = 0; i < sizeof(acpi_table) / sizeof(acpi_table[0]); ++i)
+ if (!strcmp(value, acpi_table[i].hid))
+ goto found;
+
+ if (ibv_read_sysfs_file(uverbs_sys_path, "device/of_node/compatible",
+ value, sizeof(value)) > 0)
+ for (i = 0; i < sizeof(dt_table) / sizeof(dt_table[0]); ++i)
+ if (!strcmp(value, dt_table[i].compatible))
+ goto found;
+
+ return NULL;
+
+found:
+ dev = malloc(sizeof(struct hns_roce_device));
+ if (!dev) {
+ fprintf(stderr, PFX "Fatal: couldn't allocate device for %s\n",
+ uverbs_sys_path);
+ return NULL;
+ }
+
+ dev->ibv_dev.ops = hns_roce_dev_ops;
+ dev->page_size = sysconf(_SC_PAGESIZE);
+ return &dev->ibv_dev;
+}
+
+static __attribute__((constructor)) void hns_roce_register_driver(void)
+{
+ ibv_register_driver("hns", hns_roce_driver_init);
+}
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
new file mode 100644
index 0000000..3eef171
--- /dev/null
+++ b/providers/hns/hns_roce_u.h
@@ -0,0 +1,86 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _HNS_ROCE_U_H
+#define _HNS_ROCE_U_H
+
+#include <stddef.h>
+
+#include <infiniband/driver.h>
+#include <infiniband/arch.h>
+#include <infiniband/verbs.h>
+#include <ccan/container_of.h>
+
+#define HNS_ROCE_HW_VER1 ('h' << 24 | 'i' << 16 | '0' << 8 | '6')
+
+#define PFX "hns: "
+
+enum {
+ HNS_ROCE_QP_TABLE_BITS = 8,
+ HNS_ROCE_QP_TABLE_SIZE = 1 << HNS_ROCE_QP_TABLE_BITS,
+};
+
+struct hns_roce_device {
+ struct ibv_device ibv_dev;
+ int page_size;
+};
+
+struct hns_roce_context {
+ struct ibv_context ibv_ctx;
+ void *uar;
+ pthread_spinlock_t uar_lock;
+
+ struct {
+ int refcnt;
+ } qp_table[HNS_ROCE_QP_TABLE_SIZE];
+
+ pthread_mutex_t qp_table_mutex;
+
+ int num_qps;
+ int qp_table_shift;
+ int qp_table_mask;
+ unsigned int max_qp_wr;
+ unsigned int max_sge;
+ int max_cqe;
+};
+
+static inline struct hns_roce_device *to_hr_dev(struct ibv_device *ibv_dev)
+{
+ return container_of(ibv_dev, struct hns_roce_device, ibv_dev);
+}
+
+static inline struct hns_roce_context *to_hr_ctx(struct ibv_context *ibv_ctx)
+{
+ return container_of(ibv_ctx, struct hns_roce_context, ibv_ctx);
+}
+
+#endif /* _HNS_ROCE_U_H */
diff --git a/providers/hns/hns_roce_u_abi.h b/providers/hns/hns_roce_u_abi.h
new file mode 100644
index 0000000..4bfc8fa
--- /dev/null
+++ b/providers/hns/hns_roce_u_abi.h
@@ -0,0 +1,43 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _HNS_ROCE_U_ABI_H
+#define _HNS_ROCE_U_ABI_H
+
+#include <infiniband/kern-abi.h>
+
+struct hns_roce_alloc_ucontext_resp {
+ struct ibv_get_context_resp ibv_resp;
+ __u32 qp_tab_size;
+};
+
+#endif /* _HNS_ROCE_U_ABI_H */
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH v5 rdma-core 0/7] libhns: userspace library for hns
From: Lijun Ou @ 2016-11-13 10:35 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
This patch series introduces userspace library for hns RoCE driver.
changes v4 -> v5:
1. elminate the warning when CFLAGS equal to -m32
changes v3 -> v4:
1. eliminate the warning by Travis CI testing
changes v2 -> v3:
1. Fix the code style, for example, if (addr == NULL)
2. Fix the bug for hns_roce_u_reg_mr
changes v1 -> v2:
1. Delete the min() definition and instead of ccan header
2. Delete the CHECK_C_SOURCE_COMPILES
3. sort the c file in rdma_provider()
4. Delete the unused code in hns_roce_u_db.h
Lijun Ou (7):
libhns: Add initial main frame
libhns: Add verbs of querying device and querying port
libhns: Add verbs of pd and mr support
libhns: Add verbs of cq support
libhns: Add verbs of qp support
libhns: Add verbs of post_send and post_recv support
libhns: Add consolidated repo for userspace library of hns
CMakeLists.txt | 1 +
MAINTAINERS | 6 +
README.md | 1 +
providers/hns/CMakeLists.txt | 6 +
providers/hns/hns_roce_u.c | 228 +++++++++++
providers/hns/hns_roce_u.h | 255 ++++++++++++
providers/hns/hns_roce_u_abi.h | 69 ++++
providers/hns/hns_roce_u_buf.c | 61 +++
providers/hns/hns_roce_u_db.h | 54 +++
providers/hns/hns_roce_u_hw_v1.c | 837 +++++++++++++++++++++++++++++++++++++++
providers/hns/hns_roce_u_hw_v1.h | 242 +++++++++++
providers/hns/hns_roce_u_verbs.c | 525 ++++++++++++++++++++++++
12 files changed, 2285 insertions(+)
create mode 100644 providers/hns/CMakeLists.txt
create mode 100644 providers/hns/hns_roce_u.c
create mode 100644 providers/hns/hns_roce_u.h
create mode 100644 providers/hns/hns_roce_u_abi.h
create mode 100644 providers/hns/hns_roce_u_buf.c
create mode 100644 providers/hns/hns_roce_u_db.h
create mode 100644 providers/hns/hns_roce_u_hw_v1.c
create mode 100644 providers/hns/hns_roce_u_hw_v1.h
create mode 100644 providers/hns/hns_roce_u_verbs.c
--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH v4 rdma-core 0/7] libhns: userspace library for hns
From: oulijun @ 2016-11-13 10:24 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <20161111130350.GI28957-2ukJVAZIZ/Y@public.gmane.org>
在 2016/11/11 21:03, Leon Romanovsky 写道:
> On Fri, Nov 11, 2016 at 12:57:23PM +0800, Lijun Ou wrote:
>> This patch series introduces userspace library for hns RoCE driver.
>>
>> changes v3 -> v4:
>> 1. eliminate the warning by Travis CI testing
>
> It still fails.
> https://travis-ci.org/linux-rdma/rdma-core/builds/175060470
>
> Please test it locally.
>
> Thanks.
>
Hi, leon
I have passed the Travis CI testing according to your advice [2] and
sent the patchv5
Lijun Ou
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH v3 rdma-core 0/7] libhns: userspace library for hns
From: oulijun @ 2016-11-13 10:22 UTC (permalink / raw)
To: Leon Romanovsky
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <20161111124240.GG28957-2ukJVAZIZ/Y@public.gmane.org>
在 2016/11/11 20:42, Leon Romanovsky 写道:
> On Fri, Nov 11, 2016 at 02:37:46PM +0800, oulijun wrote:
>> 在 2016/11/11 0:15, Leon Romanovsky 写道:
>>> On Thu, Nov 10, 2016 at 08:46:10PM +0800, Lijun Ou wrote:
>>>> This patch series introduces userspace library for hns RoCE driver.
>>>>
>>>> changes v2 -> v3:
>>>> 1. Fix the code style, for example, if (addr == NULL)
>>>> 2. Fix the bug for hns_roce_u_reg_mr
>>>>
>>>> changes v1 -> v2:
>>>> 1. Delete the min() definition and instead of ccan header
>>>> 2. Delete the CHECK_C_SOURCE_COMPILES
>>>> 3. sort the c file in rdma_provider()
>>>> 4. Delete the unused code in hns_roce_u_db.h
>>>>
>>>> Lijun Ou (7):
>>>> libhns: Add initial main frame
>>>> libhns: Add verbs of querying device and querying port
>>>> libhns: Add verbs of pd and mr support
>>>> libhns: Add verbs of cq support
>>>> libhns: Add verbs of qp support
>>>> libhns: Add verbs of post_send and post_recv support
>>>> libhns: Add consolidated repo for userspace library of hns
>>>
>>> Hi Lijun,
>>>
>>> I tried to take your patch set, but it fails to pass our Travis CI
>>> tests.
>>>
>>> https://github.com/linux-rdma/rdma-core/pull/38
>>> https://travis-ci.org/linux-rdma/rdma-core/builds/174815897
>>>
>>> You need to fix it prior to our acceptance.
>>>
>>> Thanks
>>>
>> Hi, Leon
>> I have checked the patches and fixed it and send the patchv4. I think that
>> the patchv4 will not have errors and warnings for travis CI testing, but
>> I don't have the Travis CI environment for testing it.
>> I have tried to install the test environment according to your test log and it
>> is fail.
>> In addtion that, I have tested it for patchv1/patchv2/patch3 by use the following way and it
>> is ok
>> 1. directly use the default script file build.sh
>> ./build.sh
>> 2. use CC=aarch64-linux-gnu-gcc build
>> mkdir build
>> cd build
>> CC=aarch64-linux-gnu-gcc cmake -Gninja -DENABLE_RESOLVE_NEIGH=0 ..
>
> Generally speaking, you have 4 possible ways to check your code prior
> your submission:
> 1. Use Travis CI locally [1].
> 2. Fork rdma-core to your personal github account which can be connected
> to Travis CI for free.
> 3. Install all required tools and run/compile locally
> 4. Send pull request to rdma-core via github interface and it will check
> automatically.
>
> First 3 options are the best.
>
> [1]
> https://docs.travis-ci.com/user/common-build-problems/#Troubleshooting-Locally-in-a-Docker-Image
>
Thanks for your guide. I have test it according the [2] and sent patchv5
Lijun Ou
>>
>> lijun Ou
>>>
>>>>
>>>> CMakeLists.txt | 1 +
>>>> MAINTAINERS | 6 +
>>>> README.md | 1 +
>>>> providers/hns/CMakeLists.txt | 6 +
>>>> providers/hns/hns_roce_u.c | 228 +++++++++++
>>>> providers/hns/hns_roce_u.h | 255 ++++++++++++
>>>> providers/hns/hns_roce_u_abi.h | 69 ++++
>>>> providers/hns/hns_roce_u_buf.c | 61 +++
>>>> providers/hns/hns_roce_u_db.h | 54 +++
>>>> providers/hns/hns_roce_u_hw_v1.c | 839 +++++++++++++++++++++++++++++++++++++++
>>>> providers/hns/hns_roce_u_hw_v1.h | 242 +++++++++++
>>>> providers/hns/hns_roce_u_verbs.c | 525 ++++++++++++++++++++++++
>>>> 12 files changed, 2287 insertions(+)
>>>> create mode 100644 providers/hns/CMakeLists.txt
>>>> create mode 100644 providers/hns/hns_roce_u.c
>>>> create mode 100644 providers/hns/hns_roce_u.h
>>>> create mode 100644 providers/hns/hns_roce_u_abi.h
>>>> create mode 100644 providers/hns/hns_roce_u_buf.c
>>>> create mode 100644 providers/hns/hns_roce_u_db.h
>>>> create mode 100644 providers/hns/hns_roce_u_hw_v1.c
>>>> create mode 100644 providers/hns/hns_roce_u_hw_v1.h
>>>> create mode 100644 providers/hns/hns_roce_u_verbs.c
>>>>
>>>> --
>>>> 1.9.1
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH v3 rdma-core 0/7] libhns: userspace library for hns
From: oulijun @ 2016-11-13 10:21 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <20161111160202.GA22935-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
在 2016/11/12 0:02, Jason Gunthorpe 写道:
> On Fri, Nov 11, 2016 at 02:42:40PM +0200, Leon Romanovsky wrote:
>
>>> I have checked the patches and fixed it and send the patchv4. I think that
>>> the patchv4 will not have errors and warnings for travis CI testing, but
>>> I don't have the Travis CI environment for testing it.
>>> I have tried to install the test environment according to your test log and it
>>> is fail.
>>> In addtion that, I have tested it for patchv1/patchv2/patch3 by use the following way and it
>>> is ok
>>> 1. directly use the default script file build.sh
>>> ./build.sh
>>> 2. use CC=aarch64-linux-gnu-gcc build
>>> mkdir build
>>> cd build
>>> CC=aarch64-linux-gnu-gcc cmake -Gninja -DENABLE_RESOLVE_NEIGH=0 ..
>
> For the 32 bit bugs all you need to do is run on x86 with the 32 bit
> compiler support installed:
>
> $ CFLAGS=-m32 cmake -GNinja ..
> $ ninja
>
>> 4. Send pull request to rdma-core via github interface and it will check
>> automatically.
>
> This is certainly a better option than having Leon test your patches,
> just keep force pushing to your own pull request until travis is
> happy.
>
> Make sure the 32 bit problems are fixed properly, use inttypes.h,
> uintptr, and so forth.
>
> Jason
>
Thanks your guide. I have test it by travis CI successfully and sent tha patchV5
Lijun Ou
> .
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* I Hope You Get My Message This Time
From: Mr Friedrich Mayrhofer @ 2016-11-12 6:41 UTC (permalink / raw)
--
This is the second time i am sending you this mail.
I, Friedrich Mayrhofer Donate $ 1,000,000.00 to You, Email Me
personally for more details.
Regards.
Friedrich Mayrhofer
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH rdma-next V1 06/17] IB/hfi1: Remove debug prints after allocation failure
From: Dalessandro, Dennis @ 2016-11-12 4:04 UTC (permalink / raw)
To: leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1478184265-9620-7-git-send-email-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 466 bytes --]
On Thu, 2016-11-03 at 16:44 +0200, Leon Romanovsky wrote:
> The prints after [k|v][m|z|c]alloc() functions are not needed,
> because in case of failure, allocator will print their internal
> error prints anyway.
>
> Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>N§²æìr¸yúèØb²X¬¶Ç§vØ^)Þº{.nÇ+·¥{±Ù{ayº\x1dÊÚë,j\a¢f£¢·h»öì\x17/oSc¾Ú³9uÀ¦æåÈ&jw¨®\x03(éÝ¢j"ú\x1a¶^[m§ÿïêäz¹Þàþf£¢·h§~m
^ permalink raw reply
* [PATCH, RESEND] IB/srpt: Report login failures only once
From: Bart Van Assche @ 2016-11-12 0:36 UTC (permalink / raw)
To: Doug Ledford
Cc: Nicholas A. Bellinger, Christoph Hellwig, Sagi Grimberg,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Report the following message only once if no ACL has been configured
yet for an initiator port:
"Rejected login because no ACL has been configured yet for initiator %s.\n"
Signed-off-by: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
Cc: Nicholas Bellinger <nab-IzHhD5pYlfBP7FQvKIMDCQ@public.gmane.org>
Cc: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Cc: Sagi Grimberg <sagig-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
---
drivers/infiniband/ulp/srpt/ib_srpt.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index b268aaa..0951e4f 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1842,7 +1842,6 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
struct srpt_rdma_ch *ch, *tmp_ch;
u32 it_iu_len;
int i, ret = 0;
- unsigned char *p;
WARN_ON_ONCE(irqs_disabled());
@@ -1996,21 +1995,18 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
be64_to_cpu(*(__be64 *)(ch->i_port_id + 8)));
pr_debug("registering session %s\n", ch->sess_name);
- p = &ch->sess_name[0];
-try_again:
ch->sess = target_alloc_session(&sport->port_tpg_1, 0, 0,
- TARGET_PROT_NORMAL, p, ch, NULL);
+ TARGET_PROT_NORMAL, ch->sess_name, ch,
+ NULL);
+ /* Retry without leading "0x" */
+ if (IS_ERR(ch->sess))
+ ch->sess = target_alloc_session(&sport->port_tpg_1, 0, 0,
+ TARGET_PROT_NORMAL,
+ ch->sess_name + 2, ch, NULL);
if (IS_ERR(ch->sess)) {
- pr_info("Rejected login because no ACL has been"
- " configured yet for initiator %s.\n", p);
- /*
- * XXX: Hack to retry of ch->i_port_id without leading '0x'
- */
- if (p == &ch->sess_name[0]) {
- p += 2;
- goto try_again;
- }
+ pr_info("Rejected login because no ACL has been configured yet for initiator %s.\n",
+ ch->sess_name);
rej->reason = cpu_to_be32((PTR_ERR(ch->sess) == -ENOMEM) ?
SRP_LOGIN_REJ_INSUFFICIENT_RESOURCES :
SRP_LOGIN_REJ_CHANNEL_LIMIT_REACHED);
--
2.10.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* Re: Problems trying to bridge/route RoCE
From: Robert LeBlanc @ 2016-11-11 20:45 UTC (permalink / raw)
To: Majd Dibbiny; +Cc: Parav Pandit, linux-rdma
In-Reply-To: <45EBC19B-6779-4AA6-BEC6-A61103A2C501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
How do you add GRH for iSER? Does it happen automatically? I thought
that is what default_roce_mode would do. What am I missing here?
My testbed had to be torn down today, so I've got to set it up again
on different hardware. So I won't be able to really test things until
next week, until then I'll try to understand it as much as I can.
Thank you,
Robert LeBlanc
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Fri, Nov 11, 2016 at 1:34 AM, Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>
> On Nov 11, 2016, at 12:33 AM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
>
> I found a ConnectX-3 (non-pro) and wired it up. So in bridge mode, it
> seems like I can get ib_read_bw to work (still with a warm-up error
> message), but as router, I'm still having trouble.
>
> 192.168.21.17 ----- Linux bridge ------ 192.168.21.18
>
> # ib_read_bw -d mlx5_0 -F -a 192.168.21.17
>
> Hi Robert,
>
> You should provide the gid index parameter which adds GRH to the packet in
> order to work with RoCE.
>
> In the perftest suite it's -x parameter.
>
> If you are trying to pass traffic between different subnets, then you need
> to run routable roce traffic and thus using RoCE v2 gid index.
>
> Also, if you are using rdma-cm, you need to configure the rdma-cm default
> gid type to v2 as well using configfs.
>
> ---------------------------------------------------------------------------------------
> Device not recognized to implement inline feature. Disabling it
> ------I
> ---------------------------------------------------------------------------------
>
> RDMA_Read BW Test
> Dual-port : OFF Device : mlx5_0
> Number of qps : 1 Transport type : IB
> Connection type : RC Using SRQ : OFF
> TX depth : 128
> CQ Moderation : 100
> Mtu : 1024[B]
> Link type : Ethernet
> Gid index : 0
> Outstand reads : 16
> rdma_cm QPs : OFF
> Data ex. method : Ethernet
> ---------------------------------------------------------------------------------------
> local address: LID 0000 QPN 0x0135 PSN 0x12f108 OUT 0x10 RKey
> 0x009f79 VAddr 0x007f1c82d1f000
> GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:18
> remote address: LID 0000 QPN 0x0175 PSN 0x37982e OUT 0x10 RKey
> 0x00eac9 VAddr 0x007f54c1405000
> GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:17
> ---------------------------------------------------------------------------------------
> #bytes #iterations BW peak[MB/sec] BW average[MB/sec]
> MsgRate[Mpps]
> Conflicting CPU frequency values detected: 3698.669000 != 3102.661000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.86 differs from nominal 3698.67
> MHz
> 2 1000 0.65 0.65 0.341088
> Conflicting CPU frequency values detected: 3699.310000 != 1199.920000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3500.01 differs from nominal 3699.31
> MHz
> 4 1000 0.10 0.10 0.025750
> Conflicting CPU frequency values detected: 3681.579000 != 1199.920000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.99 differs from nominal 3681.58
> MHz
> 8 1000 2.77 2.77 0.363689
> Conflicting CPU frequency values detected: 3602.325000 != 3265.655000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.99 differs from nominal 3602.32
> MHz
> 16 1000 5.37 5.36 0.351569
> Conflicting CPU frequency values detected: 3600.830000 != 3265.655000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.97 differs from nominal 3600.83
> MHz
> 32 1000 11.30 11.29 0.370062
> Conflicting CPU frequency values detected: 3599.761000 != 3265.655000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3500.01 differs from nominal 3599.76
> MHz
> 64 1000 22.39 22.28 0.365108
> Conflicting CPU frequency values detected: 3599.975000 != 3265.655000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3500.01 differs from nominal 3599.97
> MHz
> 128 1000 45.09 45.08 0.369316
> Conflicting CPU frequency values detected: 3599.761000 != 3265.655000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.99 differs from nominal 3599.76
> MHz
> 256 1000 89.55 89.54 0.366765
> Conflicting CPU frequency values detected: 3599.761000 != 2280.212000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3500 differs from nominal 3599.76 MHz
> 512 1000 179.65 179.64 0.367907
> Conflicting CPU frequency values detected: 3599.761000 != 1200.347000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.99 differs from nominal 3599.76
> MHz
> 1024 1000 361.00 360.98 0.369639
> Conflicting CPU frequency values detected: 3601.043000 != 1751.495000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3500.01 differs from nominal 3601.04
> MHz
> 2048 1000 492.15 491.42 0.251606
> Conflicting CPU frequency values detected: 3698.028000 != 3601.470000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3500.01 differs from nominal 3698.03
> MHz
> 4096 1000 617.10 615.00 0.157440
> Conflicting CPU frequency values detected: 3684.356000 != 3600.189000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3500 differs from nominal 3684.36 MHz
> 8192 1000 679.31 679.30 0.086951
> Conflicting CPU frequency values detected: 3646.759000 != 1877.532000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.98 differs from nominal 3646.76
> MHz
> 16384 1000 722.86 722.85 0.046262
> Conflicting CPU frequency values detected: 3599.975000 != 2271.881000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.99 differs from nominal 3599.97
> MHz
> 32768 1000 742.08 742.08 0.023746
> Conflicting CPU frequency values detected: 3602.966000 != 1933.929000
> Test integrity may be harmed !
> Warning: measured timestamp frequency 3499.97 differs from nominal 3602.97
> MHz
> 65536 1000 763.25 762.52 0.012200
> mlx5: prv-0-18-roberttest.betterservers.com: got completion with error:
> 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000 00000000
> 00000000 00008813 10000135 4680fcd2
> Problems with warm up
>
>
> === Router config ===
> 192.168.21.17 ------ 192.168.21.11 (Linux router) 192.168.22.11 ------
> 192.168.21.18
>
> #192.168.22.18
> # ping 192.168.21.17
> PING 192.168.21.17 (192.168.21.17) 56(84) bytes of data.
> 64 bytes from 192.168.21.17: icmp_seq=1 ttl=63 time=0.191 ms
> ^C
> --- 192.168.21.17 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.191/0.191/0.191/0.000 ms
>
> #192.168.21.17
> # route -n | grep 168
> 192.168.21.0 0.0.0.0 255.255.255.0 U 0 0 0 eth2
> 192.168.22.0 192.168.21.11 255.255.255.0 UG 0 0 0 eth2
>
> #192.168.22.18
> # route -n | grep 168
> 192.168.21.0 192.168.22.11 255.255.255.0 UG 0 0 0 eth2
> 192.168.22.0 0.0.0.0 255.255.255.0 U 0 0 0 eth2
>
> #192.168.22.18
> # ib_read_bw -d mlx5_0 -F -a 192.168.21.17
> ---------------------------------------------------------------------------------------
> Device not recognized to implement inline feature. Disabling it
> ---------------------------------------------------------------------------------------
> RDMA_Read BW Test
> Dual-port : OFF Device : mlx5_0
> Number of qps : 1 Transport type : IB
> Connection type : RC Using SRQ : OFF
> TX depth : 128
> CQ Moderation : 100
> Mtu : 1024[B]
> Link type : Ethernet
> Gid index : 0
> Outstand reads : 16
> rdma_cm QPs : OFF
> Data ex. method : Ethernet
> ---------------------------------------------------------------------------------------
> local address: LID 0000 QPN 0x013a PSN 0x676912 OUT 0x10 RKey
> 0x00dfd3 VAddr 0x007fe67aee8000
> GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:22:18
> remote address: LID 0000 QPN 0x017a PSN 0x4256ce OUT 0x10 RKey
> 0x012985 VAddr 0x007f59de5bf000
> GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:21:17
> ---------------------------------------------------------------------------------------
> #bytes #iterations BW peak[MB/sec] BW average[MB/sec]
> MsgRate[Mpps]
> Problems with warm up
>
>
> #192.168.21.17
> # cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
> RoCE v2
>
> #192.168.22.18
> # cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
> RoCE v2
>
> With routing, I'm not seeing any RoCE traffic with tcpdump on the
> interfaces. With bridge mode, I do see the RoCE traffic, but it looks
> like RoCE v1 traffic.
>
> [snip]
> 14:55:06.010682 0c:c4:7a:89:f7:06 > 0c:c4:7a:89:f6:f6, ethertype
> Unknown (0x8915), length 78:
> 0x0000: 6010 0000 0018 1b40 0000 0000 0000 0000 `......@........
> 0x0010: 0000 ffff c0a8 1511 0000 0000 0000 0000 ................
> 0x0020: 0000 ffff c0a8 1512 1060 ffff 0000 013e .........`.....>
> 0x0030: 00e5 7b6c 0000 0411 0000 0000 60bb 6a87 ..{l........`.j.
> [snip]
>
> I can get iSER to kind of work...
>
> In bridge mode and running fio on the iSER target, I'm getting
> messages in dmesg:
> [Thu Nov 10 15:14:17 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
> [Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:14:17 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:14:17 2016] 00000000 08007806 2500014f a7a758d2
> [Thu Nov 10 15:14:17 2016] iser: iser_err_comp: memreg failure: memory
> management operation error (6) vend_err 78
> [Thu Nov 10 15:14:17 2016] connection82:0: detected conn error (1011)
> [Thu Nov 10 15:14:24 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
> [Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:14:24 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:14:24 2016] 00000000 08007806 25000150 3471eed2
> ...
>
> In routed mode I also get the same messages, but the device goes
> offline and crashes fio
>
> [Thu Nov 10 15:09:13 2016] mlx5_0:dump_cqe:263:(pid 0): dump error cqe
> [Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:09:13 2016] 00000000 00000000 00000000 00000000
> [Thu Nov 10 15:09:13 2016] 00000000 08007806 25000149 5a524ad2
> [Thu Nov 10 15:09:13 2016] iser: iser_err_comp: memreg failure: memory
> management operation error (6) vend_err 78
> [Thu Nov 10 15:09:13 2016] connection80:0: detected conn error (1011)
> [Thu Nov 10 15:09:18 2016] session80: session recovery timed out after 5
> secs
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
> [Thu Nov 10 15:09:18 2016] scsi_io_completion: 23 callbacks suppressed
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] FAILED Result:
> hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] CDB: Read(10) 28 00 09
> 9f 97 18 00 01 48 00
> [Thu Nov 10 15:09:18 2016] blk_update_request: 23 callbacks suppressed
> [Thu Nov 10 15:09:18 2016] blk_update_request: I/O error, dev sdab,
> sector 161453848
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] FAILED Result:
> hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] CDB: Read(10) 28 00 07
> bf 98 60 00 00 a8 00
> [Thu Nov 10 15:09:18 2016] blk_update_request: I/O error, dev sdab,
> sector 129996896
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: rejecting I/O to offline device
> [Thu Nov 10 15:09:18 2016] sd 13:0:0:0: [sdab] killing request
> ...
>
> This is all using ConnectX-4 LX cards on the target and initiator and
> the 3.8.5 kernel.
>
> Any ideas of what may be causing these issues?
>
> Thank you,
> Robert LeBlanc
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Nov 3, 2016 at 11:38 AM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org>
> wrote:
>
> That box has a build-in ConnectX-3 card that we aren't using in this
>
> test so the mlx4 modules are loaded. I unloaded mlx4_ib, no luck. I
>
> also tried to unload the mlx5_ib driver and it also unloaded mlx5_core
>
> and my interfaces were gone. It seems like I can't only unload
>
> mlx5_ib.
>
>
> With mlx4_ib unloaded I still can't rping or ib_read_bw (connects, but
>
> get messages like:
>
> ethernet_read_keys: Couldn't read remote address
>
> Unable to read to socket/rdam_cm
>
> Failed to exchange data between server and clients
>
> Problems with warm up) same as before.
>
> ----------------
>
> Robert LeBlanc
>
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
>
> On Thu, Nov 3, 2016 at 11:16 AM, Parav Pandit <pandit.parav-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> wrote:
>
> Hi Robert,
>
>
> Can you please unload the mlx4_ib module in the bridge/router box and
>
> give it a quick try?
>
>
> Parav
>
>
> On Thu, Nov 3, 2016 at 10:32 PM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org>
> wrote:
>
> I'm trying to do some testing of RoCE v2 and so I put a LInux box
>
> between two RoCE machines. I think the ConnectX-4 Lx card in the
>
> bridge/router is intercepting the RoCE traffic and so it is not being
>
> bridged/routed. I don't see any traffic using tcpdump which seems to
>
> confirm this. I thought I could change the UDP port that the card is
>
> looking for RoCE traffic to something not in use [0], but rr_proto is
>
> not a valid parameter for the inbox mlx5_core module on 4.8.5. I can
>
> ping across the bridge/router so I know that it is setup correctly,
>
> just RDMA is not working.
>
>
> Any ideas on how to pass RoCE traffic like normal traffic? The reason
>
> we are using a Linux box is that we can use netem to understand how
>
> RoCE behaves in different situations.
>
>
> [0] https://community.mellanox.com/docs/DOC-1444
>
>
> Thank you
>
> ----------------
>
> Robert LeBlanc
>
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
> --
>
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH rdma-core 0/8] libpvrdma: userspace library for PVRDMA
From: Adit Ranadive @ 2016-11-11 19:07 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Christoph Hellwig, Doug Ledford,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, pv-drivers
In-Reply-To: <20161109175142.GA13467-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
On Wed, Nov 09, 2016 at 10:51:42AM -0700, Jason Gunthorpe wrote:
> On Wed, Nov 09, 2016 at 09:39:49AM -0800, Adit Ranadive wrote:
> > On Tue, Nov 08, 2016 at 5:17:15PM -0800, Christoph Hellwig wrote:
> > > FYI, the convention used in scsi is vmw_pvscsi.c, so naming the
> > > RDMA equivalent vmw_pvrdma would make a lot of sense and still
> > > be reasonably short.
> >
> > Thanks Christoph. We agree with that as well. Then the driver path
> > (drivers/infiniband/hw/vmw_pvrdma/...) and the module name would
> > reflect vmw_pvrdma.
> > What about the user library? libpvrdma -> libvmwpvrdma?
>
> I am encouraging people to match the name of the module in the
> library.
>
> Jason
>
Hi Doug,
We would like to get the driver in 4.10 if its possible. I can
respin the patches (with the new driver name/location) based on your
for-4.10 branch or another branch if you prefer.
Thanks,
Adit
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH] IB/usnic: simplify IS_ERR_OR_NULL to IS_ERR
From: Julia Lawall @ 2016-11-11 19:04 UTC (permalink / raw)
To: Christian Benvenuti
Cc: kernel-janitors, Dave Goodell, Doug Ledford, Sean Hefty,
Hal Rosenstock, linux-rdma, linux-kernel, Christophe JAILLET
The function usnic_ib_qp_grp_get_chunk only returns an ERR_PTR value or a
valid pointer, never NULL. The same is true of get_qp_res_chunk, which
just returns the result of calling usnic_ib_qp_grp_get_chunk. Simplify
IS_ERR_OR_NULL to IS_ERR in both cases.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression t,e;
@@
t = \(usnic_ib_qp_grp_get_chunk(...)\|get_qp_res_chunk(...)\)
... when != t=e
- IS_ERR_OR_NULL(t)
+ IS_ERR(t)
@@
expression t,e,e1;
@@
t = \(usnic_ib_qp_grp_get_chunk(...)\|get_qp_res_chunk(...)\)
... when != t=e
?- t ? PTR_ERR(t) : e1
+ PTR_ERR(t)
... when any
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
---
drivers/infiniband/hw/usnic/usnic_ib_qp_grp.c | 12 ++++++------
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 12 ++++++------
2 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_qp_grp.c b/drivers/infiniband/hw/usnic/usnic_ib_qp_grp.c
index 0e813ec..092d4e1 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_qp_grp.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_qp_grp.c
@@ -117,10 +117,10 @@ static int enable_qp_grp(struct usnic_ib_qp_grp *qp_grp)
vnic_idx = usnic_vnic_get_index(qp_grp->vf->vnic);
res_chunk = get_qp_res_chunk(qp_grp);
- if (IS_ERR_OR_NULL(res_chunk)) {
+ if (IS_ERR(res_chunk)) {
usnic_err("Unable to get qp res with err %ld\n",
PTR_ERR(res_chunk));
- return res_chunk ? PTR_ERR(res_chunk) : -ENOMEM;
+ return PTR_ERR(res_chunk);
}
for (i = 0; i < res_chunk->cnt; i++) {
@@ -158,10 +158,10 @@ static int disable_qp_grp(struct usnic_ib_qp_grp *qp_grp)
vnic_idx = usnic_vnic_get_index(qp_grp->vf->vnic);
res_chunk = get_qp_res_chunk(qp_grp);
- if (IS_ERR_OR_NULL(res_chunk)) {
+ if (IS_ERR(res_chunk)) {
usnic_err("Unable to get qp res with err %ld\n",
PTR_ERR(res_chunk));
- return res_chunk ? PTR_ERR(res_chunk) : -ENOMEM;
+ return PTR_ERR(res_chunk);
}
for (i = 0; i < res_chunk->cnt; i++) {
@@ -186,11 +186,11 @@ static int init_filter_action(struct usnic_ib_qp_grp *qp_grp,
struct usnic_vnic_res_chunk *res_chunk;
res_chunk = usnic_ib_qp_grp_get_chunk(qp_grp, USNIC_VNIC_RES_TYPE_RQ);
- if (IS_ERR_OR_NULL(res_chunk)) {
+ if (IS_ERR(res_chunk)) {
usnic_err("Unable to get %s with err %ld\n",
usnic_vnic_res_type_to_str(USNIC_VNIC_RES_TYPE_RQ),
PTR_ERR(res_chunk));
- return res_chunk ? PTR_ERR(res_chunk) : -ENOMEM;
+ return PTR_ERR(res_chunk);
}
uaction->vnic_idx = usnic_vnic_get_index(qp_grp->vf->vnic);
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index a5bfbba..79766db 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -87,12 +87,12 @@ static int usnic_ib_fill_create_qp_resp(struct usnic_ib_qp_grp *qp_grp,
resp.bar_len = bar->len;
chunk = usnic_ib_qp_grp_get_chunk(qp_grp, USNIC_VNIC_RES_TYPE_RQ);
- if (IS_ERR_OR_NULL(chunk)) {
+ if (IS_ERR(chunk)) {
usnic_err("Failed to get chunk %s for qp_grp %d with err %ld\n",
usnic_vnic_res_type_to_str(USNIC_VNIC_RES_TYPE_RQ),
qp_grp->grp_id,
PTR_ERR(chunk));
- return chunk ? PTR_ERR(chunk) : -ENOMEM;
+ return PTR_ERR(chunk);
}
WARN_ON(chunk->type != USNIC_VNIC_RES_TYPE_RQ);
@@ -101,12 +101,12 @@ static int usnic_ib_fill_create_qp_resp(struct usnic_ib_qp_grp *qp_grp,
resp.rq_idx[i] = chunk->res[i]->vnic_idx;
chunk = usnic_ib_qp_grp_get_chunk(qp_grp, USNIC_VNIC_RES_TYPE_WQ);
- if (IS_ERR_OR_NULL(chunk)) {
+ if (IS_ERR(chunk)) {
usnic_err("Failed to get chunk %s for qp_grp %d with err %ld\n",
usnic_vnic_res_type_to_str(USNIC_VNIC_RES_TYPE_WQ),
qp_grp->grp_id,
PTR_ERR(chunk));
- return chunk ? PTR_ERR(chunk) : -ENOMEM;
+ return PTR_ERR(chunk);
}
WARN_ON(chunk->type != USNIC_VNIC_RES_TYPE_WQ);
@@ -115,12 +115,12 @@ static int usnic_ib_fill_create_qp_resp(struct usnic_ib_qp_grp *qp_grp,
resp.wq_idx[i] = chunk->res[i]->vnic_idx;
chunk = usnic_ib_qp_grp_get_chunk(qp_grp, USNIC_VNIC_RES_TYPE_CQ);
- if (IS_ERR_OR_NULL(chunk)) {
+ if (IS_ERR(chunk)) {
usnic_err("Failed to get chunk %s for qp_grp %d with err %ld\n",
usnic_vnic_res_type_to_str(USNIC_VNIC_RES_TYPE_CQ),
qp_grp->grp_id,
PTR_ERR(chunk));
- return chunk ? PTR_ERR(chunk) : -ENOMEM;
+ return PTR_ERR(chunk);
}
WARN_ON(chunk->type != USNIC_VNIC_RES_TYPE_CQ);
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox